Fix: Use locale.getencoding() on Python 3.11+ to avoid DeprecationWarning

Fixes pypa/pip#13922 with an 11-line change.

The Bug

Repo: pypa/pip
Issue: #13922
Status: PR submitted
PR: https://github.com/pypa/pip/pull/14104

Description: Use locale.getencoding() on Python 3.11+ to avoid DeprecationWarning

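Fix scope: 11 lines added and 2 removed, in the Configuration class (src/pip/_internal/configuration.py), in _decode_req_file, and in two TestParseRequirements tests.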

Root Cause

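pip reads the locale encoding at the two call sites shown in the diff: the Configuration class in src/pip/_internal/configuration.py, when it loads an existing config file, and _decode_req_file, when decoding with the default encoding fails. Both called locale.getpreferredencoding(False) unconditionally. Per the issue, that call produces a DeprecationWarning on newer Python versions. locale.getencoding(), added in Python 3.11, is the replacement, so the old call only needs to remain as a fallback for older interpreters.

The fix is small: each call site gets a sys.version_info >= (3, 11) branch, and nothing around it is refactored. In _decode_req_file, the "or sys.getdefaultencoding()" fallback moves to its own line so that it applies to both branches.

Impact: The change matters to users who run pip on Python 3.11 or newer with warnings visible or escalated to errors (for example with -W error), where the old call could add noise or turn into a failure. The encoding pip ends up using should normally be the same, with one caveat: locale.getencoding() ignores Python UTF-8 Mode, while locale.getpreferredencoding(False) returns "utf-8" when that mode is enabled.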
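Whether the old call warns depends on the interpreter version and the active warning filters, so it can be worth checking directly. The following is a minimal sketch, not part of the PR; the function name warnings_from_getpreferredencoding is hypothetical. It records whatever warnings the legacy call raises and returns their category names, without assuming which category (if any) appears.

```python
import locale
import warnings


def warnings_from_getpreferredencoding() -> list:
    """Return the category names of any warnings raised by the legacy call."""
    with warnings.catch_warnings(record=True) as caught:
        # Surface warnings that the default filters would otherwise hide.
        warnings.simplefilter("always")
        locale.getpreferredencoding(False)
    return [w.category.__name__ for w in caught]


if __name__ == "__main__":
    # An empty list means this interpreter raised no warning for the call.
    print(warnings_from_getpreferredencoding())
```

Running the same script under the Python versions pip supports gives a quick picture of where the version gate actually changes anything.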

The Fix

The diff below shows both call sites and the two test updates.

@@ -293,7 +293,10 @@ class Configuration:
         # Doing this is useful when modifying and saving files, where we don't
         # need to construct a parser.
         if os.path.exists(fname):
-            locale_encoding = locale.getpreferredencoding(False)
+            if sys.version_info >= (3, 11):
+                locale_encoding = locale.getencoding()
+            else:
+                locale_encoding = locale.getpreferredencoding(False)
             try:
                 parser.read(fname, encoding=locale_encoding)
             except UnicodeDecodeError:
@@ -607,7 +607,11 @@ def _decode_req_file(data: bytes, url: str) -> str:
     try:
         return data.decode(DEFAULT_ENCODING)
     except UnicodeDecodeError:
-        locale_encoding = locale.getpreferredencoding(False) or sys.getdefaultencoding()
+        if sys.version_info >= (3, 11):
+            locale_encoding = locale.getencoding()
+        else:
+            locale_encoding = locale.getpreferredencoding(False)
+        locale_encoding = locale_encoding or sys.getdefaultencoding()
         logging.warning(
             "unable to decode data from %s with default encoding %s, "
             "falling back to encoding from locale: %s. "
@@ -1037,6 +1037,7 @@ class TestParseRequirements:
         with (
             caplog.at_level(logging.WARNING),
             mock.patch("locale.getpreferredencoding", return_value=locale_encoding),
+            mock.patch("locale.getencoding", return_value=locale_encoding),
         ):
             reqs = tuple(parse_reqfile(req_file.resolve(), session=session))
 
@@ -1070,5 +1071,6 @@ class TestParseRequirements:
         with (
             pytest.raises(UnicodeDecodeError),
             mock.patch("locale.getpreferredencoding", return_value=encoding),
+            mock.patch("locale.getencoding", return_value=encoding),
         ):
             next(parse_reqfile(req_file.resolve(), session=session))
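
The same version check appears at both call sites. One possible follow-up, which the PR does not do, would be to move it into a single helper. The sketch below assumes a hypothetical function named get_locale_encoding; it is not an existing pip API, and it applies the sys.getdefaultencoding() fallback in all cases, which the diff only does in _decode_req_file.

```python
import locale
import sys


def get_locale_encoding() -> str:
    """Return the locale encoding, preferring the Python 3.11+ API.

    Hypothetical helper for illustration; not part of pip.
    """
    if sys.version_info >= (3, 11):
        # locale.getencoding() was added in Python 3.11.
        encoding = locale.getencoding()
    else:
        encoding = locale.getpreferredencoding(False)
    # Mirror the fallback used in _decode_req_file for an empty result.
    return encoding or sys.getdefaultencoding()


if __name__ == "__main__":
    print(get_locale_encoding())
```

Keeping the branch inline, as the PR does, avoids adding a new internal function for two call sites; a helper would mainly pay off if more call sites appear.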

Pattern & Takeaways

Pattern: A version-gated replacement of a standard-library call. The new API is used where it exists (Python 3.11+), the old one stays as the fallback, and the surrounding logic is left alone.

Key insight: When production code switches to a different function, tests that mock the old one must mock the new one as well. Both TestParseRequirements tests patched locale.getpreferredencoding. Without the added locale.getencoding patch, the Python 3.11+ branch would read the real locale of the test machine, and the tests would no longer control the encoding under test.
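One detail to watch when patching a function that only exists on newer interpreters: mock.patch raises AttributeError if the target attribute is missing, unless create=True is passed. The sketch below illustrates this under the assumption that the tests may also run on Python older than 3.11. The names locale_encoding and lookup_with_fake_locale are hypothetical stand-ins, not pip code.

```python
import locale
import sys
from unittest import mock


def locale_encoding() -> str:
    # Hypothetical stand-in for the version-gated lookup in the diff.
    if sys.version_info >= (3, 11):
        return locale.getencoding()
    return locale.getpreferredencoding(False)


def lookup_with_fake_locale(fake: str) -> str:
    """Run the lookup with both locale functions patched to return `fake`."""
    with mock.patch("locale.getpreferredencoding", return_value=fake):
        # create=True lets the patch succeed on interpreters older than
        # 3.11, where locale.getencoding does not exist.
        with mock.patch("locale.getencoding", return_value=fake, create=True):
            return locale_encoding()


if __name__ == "__main__":
    print(lookup_with_fake_locale("cp1252"))
```

Patching both functions means the test controls the encoding whichever branch the interpreter takes.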

Transfer Potential

The details are specific to pip, but the approach applies wherever a library must support several Python versions while moving off an API that warns: gate on sys.version_info, keep the old call as the fallback, and update mocks alongside the call sites.


Auto-generated from the PR for issue #13922.