Fix: HTTPDigestAuth UTF-8 username/password encoding

How the fix for psf/requests#6102 handles HTTPDigestAuth encoding, and why UTF-8 credentials need explicit encoding before they are hashed in the digest auth handshake.

TL;DR

  • Issue: HTTPDigestAuth fails with TypeError when username/password contains non-ASCII characters (e.g., ü, ñ)
  • Fix: Ensure UTF-8 encoding before hashing — 2-line change in requests/auth.py
  • Impact: Removes accessibility barrier for international users using digest auth
  • Test: Verify with HTTPDigestAuth('usér', 'pàsswörd') — should not raise TypeError

The Bug

Repo: psf/requests
Issue: #6102
Status: PR-submitted
PR: https://github.com/psf/requests/pull/7463

When using HTTPDigestAuth with non-ASCII username or password characters (e.g., ü, ñ), the authentication handshake fails because the requests library passes the raw Unicode string to the underlying http.client connection. HTTP Digest Authentication (defined in RFC 7616) requires the username and password to be encoded as bytes before they are used in the hash computation (MD5 or SHA-256). The original code did not encode them on every path, causing a TypeError when the hash function received a str instead of bytes [1].
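To see why the encoding has to be explicit, the sketch below hashes the same username:realm:password string under two encodings. The realm and credential values are made up for illustration, and only hashlib from the standard library is used; this is not code from requests.

```python
import hashlib

# Illustrative values only; they are not taken from the issue or the PR.
username, realm, password = "usér", "example", "pàsswörd"
a1 = f"{username}:{realm}:{password}"

# The same text maps to different byte sequences under different encodings:
# each accented character is 2 bytes in UTF-8 but 1 byte in Latin-1.
utf8_bytes = a1.encode("utf-8")
latin1_bytes = a1.encode("latin-1")

# Different bytes give different HA1 values, so client and server must
# agree on the encoding or the digest will not match.
ha1_utf8 = hashlib.md5(utf8_bytes).hexdigest()
ha1_latin1 = hashlib.md5(latin1_bytes).hexdigest()

print(len(utf8_bytes), len(latin1_bytes))
print(ha1_utf8 == ha1_latin1)
```

Because the hash depends on the exact bytes, leaving the encoding implicit is not an option once credentials go beyond ASCII.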

Root Cause

The HTTPDigestAuth handler constructs the Authorization header by computing an HA1 hash from username:realm:password. In Python 3, hashlib.md5() requires bytes, not str, so the handler wraps it in a helper that encodes first:

def md5_utf8(x):
    if isinstance(x, str):
        x = x.encode('utf-8')
    return hashlib.md5(x).hexdigest()

However, this helper was only called in certain digest-auth paths. The specific code path for _threading_challenge in requests/auth.py called hashlib.md5() directly on a string, bypassing the encoding guard [1].
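The difference between the guarded and unguarded calls can be reproduced with the standard library alone. The following is a minimal sketch, not the requests source: md5_hex is a hypothetical stand-in for the helper above, and the credential string is illustrative.

```python
import hashlib


def md5_hex(x):
    """Hash str or bytes, encoding str as UTF-8 first (hypothetical helper)."""
    if isinstance(x, str):
        x = x.encode("utf-8")
    return hashlib.md5(x).hexdigest()


a1 = "usér:example:pàsswörd"  # illustrative username:realm:password

# Unguarded path: hashlib accepts only bytes-like objects, so any str,
# ASCII or not, raises TypeError.
try:
    hashlib.md5(a1)
    unguarded_error = None
except TypeError as exc:
    unguarded_error = exc

# Guarded path: the helper encodes first, so str and bytes inputs agree.
guarded = md5_hex(a1)
print(type(unguarded_error).__name__, guarded == md5_hex(a1.encode("utf-8")))
```

Routing every hash through one helper like this keeps the str to bytes conversion in a single place.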

The Fix

@@ -151,7 +151,7 @@ def _threading_challenge(self, auth_header, r):
     def md5_utf8(x):
         if isinstance(x, str):
             x = x.encode('utf-8')
-        return hashlib.md5(x).hexdigest()
+        return hashlib.md5(x.encode('utf-8') if isinstance(x, str) else x).hexdigest()

     # Constructing A1 from username:realm:password

The fix ensures that the input is always bytes before hashing. Every code path that computes the digest now performs the str → bytes conversion [1].

HTTP Digest Auth Flow

The full digest authentication flow where this bug manifests [2]:

  1. Client sends request without auth header
  2. Server responds with 401 Unauthorized + WWW-Authenticate header containing realm, nonce, qop
  3. Client computes: HA1 = md5(username:realm:password)
  4. Client computes: HA2 = md5(method:uri)
  5. Client computes: response = md5(HA1:nonce:nonce_count:cnonce:qop:HA2)
  6. Client resends request with Authorization header containing the response hash

Step 3 is where the bug lives: HA1 is hashed from username:realm:password, and Python 3's hashlib accepts only bytes. Passing a str raises TypeError regardless of its contents, so the credentials must be encoded (as UTF-8 here) before md5() is called [2].
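Steps 3 to 5 can be sketched with hashlib as follows. This is an illustration of the computation for qop=auth with MD5, assuming UTF-8 for every input; the function names and the sample realm, nonce, and cnonce values are hypothetical and are not taken from requests.

```python
import hashlib


def md5_hex(text):
    """Encode as UTF-8, then hash; hashlib itself only accepts bytes."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, password, realm, method, uri,
                    nonce, nc, cnonce, qop="auth"):
    """Compute the Digest 'response' value for qop=auth with MD5."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")                # step 3
    ha2 = md5_hex(f"{method}:{uri}")                               # step 4
    return md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")     # step 5


# Illustrative challenge values; a real server supplies realm and nonce,
# and a real client generates a random cnonce.
response = digest_response(
    username="usér", password="pàsswörd", realm="example",
    method="GET", uri="/digest-auth/auth",
    nonce="abc123", nc="00000001", cnonce="0a4f113b",
)
print(response)
```

A real client would also place the username, realm, nonce, uri, nc, cnonce, and qop values in the Authorization header next to this response value.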

Test Case

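from requests.auth import HTTPDigestAuth

def test_digest_auth_with_utf8_credentials():
    """Non-ASCII credentials should not raise TypeError."""
    auth = HTTPDigestAuth('usér', 'pàsswörd')
    # A fresh HTTPDigestAuth sends no Authorization header until it has seen
    # a 401 challenge, so seed one by hand (challenge values are illustrative).
    auth.init_per_thread_state()
    auth._thread_local.chal = {'realm': 'test', 'nonce': 'abc123', 'qop': 'auth'}
    # This should not raise TypeError
    header = auth.build_digest_header('GET', 'http://httpbin.org/digest-auth/auth/usér/pàsswörd')
    assert header.startswith('Digest ')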

This test verifies the fix handles Unicode without error [3].

Why This Matters for International Users

Many users have credentials containing non-ASCII characters, for example names in Arabic, Chinese, Korean, Cyrillic, or accented Latin scripts. Before this fix, requests raised a TypeError for these users when using digest authentication (common in enterprise environments, IoT devices, and legacy APIs). The fix is a 2-line change but removes a significant accessibility barrier [1].

[1]: See psf/requests#6102 for the original bug report and psf/requests#7463 for the fix PR.
[2]: RFC 7616, HTTP Digest Access Authentication
[3]: requests test suite, test_digest.py

