Fix: Use locale.getencoding() on Python 3.11+ to avoid DeprecationWarning
Addresses pypa/pip#13922 with an 11-line bug fix.
The Bug
Repo: pypa/pip
Issue: #13922
Status: PR-submitted
PR: https://github.com/pypa/pip/pull/14104
Description: Use locale.getencoding() on Python 3.11+ to avoid DeprecationWarning
Fix scope: 11 lines added and 2 removed, in src/pip/_internal/configuration.py, the _decode_req_file helper, and the TestParseRequirements tests
Root Cause
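Two code paths call locale.getpreferredencoding(False) to pick an encoding: class Configuration in src/pip/_internal/configuration.py, when it reads a config file, and _decode_req_file, as the fallback when a requirements file cannot be decoded with the default encoding. According to the PR title, that call produces a DeprecationWarning on Python 3.11+, where locale.getencoding() is available as the replacement. locale.getencoding() was only added in Python 3.11, so the old call cannot simply be swapped out while older interpreters are supported. In Python, this pattern is easy to miss because test suites usually assert on return values and not on the warnings emitted along the way.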
The fix is a moderate change: it branches on sys.version_info at the two call sites and leaves the surrounding code alone, which limits the risk of introducing new bugs.
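Impact: Going by the PR title, the practical effect is a warning on Python 3.11+ rather than incorrect output. It becomes a hard failure only when warnings are promoted to errors, for example with -W error. The 11 added lines switch to locale.getencoding() on those versions and keep the old call on older interpreters. Nothing in the diff points to crashes or data corruption.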
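The version guard described above can be sketched as a small helper. This is an illustration only: get_locale_encoding is a hypothetical name, and the PR inlines the branch at each call site instead of adding a function like this.

```python
import locale
import sys


def get_locale_encoding() -> str:
    """Return the locale encoding, preferring the newer API when it exists.

    Hypothetical helper for illustration; not part of pip.
    """
    if sys.version_info >= (3, 11):
        # locale.getencoding() was added in Python 3.11.
        encoding = locale.getencoding()
    else:
        encoding = locale.getpreferredencoding(False)
    # Fall back to the interpreter default if the locale reports nothing.
    return encoding or sys.getdefaultencoding()


print(get_locale_encoding())
```

The two calls are not strictly interchangeable: per the Python documentation, locale.getencoding() ignores Python UTF-8 Mode, while locale.getpreferredencoding(False) returns "utf-8" when UTF-8 Mode is enabled.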
The Fix
This is a moderate fix: each changed line is scoped either to the locale-encoding lookup or to the tests that mock it.
@@ -293,7 +293,10 @@ class Configuration:
         # Doing this is useful when modifying and saving files, where we don't
         # need to construct a parser.
         if os.path.exists(fname):
-            locale_encoding = locale.getpreferredencoding(False)
+            if sys.version_info >= (3, 11):
+                locale_encoding = locale.getencoding()
+            else:
+                locale_encoding = locale.getpreferredencoding(False)
             try:
                 parser.read(fname, encoding=locale_encoding)
             except UnicodeDecodeError:
@@ -607,7 +607,11 @@ def _decode_req_file(data: bytes, url: str) -> str:
     try:
         return data.decode(DEFAULT_ENCODING)
     except UnicodeDecodeError:
-        locale_encoding = locale.getpreferredencoding(False) or sys.getdefaultencoding()
+        if sys.version_info >= (3, 11):
+            locale_encoding = locale.getencoding()
+        else:
+            locale_encoding = locale.getpreferredencoding(False)
+        locale_encoding = locale_encoding or sys.getdefaultencoding()
         logging.warning(
             "unable to decode data from %s with default encoding %s, "
             "falling back to encoding from locale: %s. "
@@ -1037,6 +1037,7 @@ class TestParseRequirements:
         with (
             caplog.at_level(logging.WARNING),
             mock.patch("locale.getpreferredencoding", return_value=locale_encoding),
+            mock.patch("locale.getencoding", return_value=locale_encoding),
         ):
             reqs = tuple(parse_reqfile(req_file.resolve(), session=session))
@@ -1070,5 +1071,6 @@ class TestParseRequirements:
         with (
             pytest.raises(UnicodeDecodeError),
             mock.patch("locale.getpreferredencoding", return_value=encoding),
+            mock.patch("locale.getencoding", return_value=encoding),
         ):
             next(parse_reqfile(req_file.resolve(), session=session))
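One detail in the test hunks is worth a note. mock.patch raises AttributeError when the target attribute does not exist (unless create=True is passed), so patching locale.getencoding unconditionally would fail on an interpreter older than 3.11. Whether pip's test matrix still includes such versions is not stated here. A minimal sketch of a version-aware alternative follows, with read_locale_encoding and patch_locale_encoding as hypothetical names that do not appear in the PR:

```python
import locale
import sys
from unittest import mock


def read_locale_encoding() -> str:
    # Hypothetical stand-in for the code under test; mirrors the PR's branch.
    if sys.version_info >= (3, 11):
        return locale.getencoding()
    return locale.getpreferredencoding(False)


def patch_locale_encoding(value: str):
    """Patch whichever locale function the running interpreter will call."""
    target = (
        "locale.getencoding"
        if sys.version_info >= (3, 11)
        else "locale.getpreferredencoding"
    )
    return mock.patch(target, return_value=value)


with patch_locale_encoding("cp1252"):
    # The patched function is looked up at call time, so the mock is used.
    assert read_locale_encoding() == "cp1252"
```

Patching only the function that will actually be called also keeps the test honest: it fails if the production code stops using the expected API on that interpreter.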
Pattern & Takeaways
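Pattern: A version-gated API replacement in class Configuration and _decode_req_file. The code called an API that newer Python versions warn about, and the tests mocked only that older call. The moderate fix demonstrates that the most reliable approach is to change the minimum necessary code: branch on sys.version_info and leave everything else as it was.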
Key insight: The interpreter version is itself a boundary condition. Code that runs cleanly on one Python release can start warning on the next, and example-based tests rarely notice unless the suite runs on every supported version with warnings enabled. Code review should ask: (1) What happens with empty/null input? (2) What happens at iteration boundaries? (3) What happens with unexpected types? (4) What happens on the oldest and the newest supported interpreter?
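One way to catch this class of problem early is to promote warnings to errors inside a test. The sketch below uses only the standard warnings module; legacy_call and emits_deprecation are hypothetical names that stand in for a deprecated API and a test helper.

```python
import warnings


def legacy_call() -> str:
    # Hypothetical stand-in for an API that a newer release deprecates.
    warnings.warn("legacy_call() is deprecated", DeprecationWarning, stacklevel=2)
    return "utf-8"


def emits_deprecation(func) -> bool:
    """Return True if calling func triggers a DeprecationWarning."""
    with warnings.catch_warnings():
        # Inside this block, DeprecationWarning is raised as an exception.
        warnings.simplefilter("error", DeprecationWarning)
        try:
            func()
        except DeprecationWarning:
            return True
    return False


print(emits_deprecation(legacy_call))   # True
print(emits_deprecation(lambda: "ok"))  # False
```

Running the whole suite with python -W error::DeprecationWarning has a similar effect without touching the test code.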
Transfer Potential
Varies. The details of a fix like this are repo-specific, but the pattern is general: guard the new API behind a sys.version_info check, keep the old call for older interpreters, and update any mocks to match. The minimal-change principle and boundary-condition thinking transfer to any codebase that supports several Python versions.
Auto-generated from PR #13922.