Fix: HTTPDigestAuth UTF-8 username/password encoding
How the fix proposed for psf/requests#6102 addresses HTTPDigestAuth encoding, and why UTF-8 credentials need explicit encoding before they are used in the digest auth handshake.
TL;DR
- Issue: HTTPDigestAuth fails with TypeError when the username/password contains non-ASCII characters (e.g., ü, ñ, 中)
- Fix: Ensure UTF-8 encoding before hashing (2-line change in requests/auth.py)
- Impact: Removes an accessibility barrier for international users using digest auth
- Test: Verify with HTTPDigestAuth('usér', 'pàsswörd'); it should not raise TypeError
The Bug
Repo: psf/requests
Issue: #6102
Status: PR-submitted
PR: https://github.com/psf/requests/pull/7463
When HTTPDigestAuth is used with a username or password that contains non-ASCII characters (e.g., ü, ñ, 中), the authentication handshake fails. HTTP Digest Authentication (defined in RFC 7616) computes MD5 or SHA-256 hashes over the credentials, and those hash functions operate on bytes, so the username and password must be encoded before hashing. The affected code path did not encode them, and the hash function raised a TypeError on the str it received [1].
Root Cause
The HTTPDigestAuth handler constructs the Authorization header by computing an HA1 hash from username:realm:password. In Python 3, hashlib.md5() requires bytes, not strings. The module already had a helper that encodes before hashing:
    def md5_utf8(x):
        if isinstance(x, str):
            x = x.encode('utf-8')
        return hashlib.md5(x).hexdigest()
However, this helper was only called in certain digest-auth paths. The specific code path for _threading_challenge in requests/auth.py called hashlib.md5() directly on a string, bypassing the encoding guard [1].
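To make the failure mode concrete, here is a minimal sketch that contrasts hashing a str directly with hashing its UTF-8 encoding. The function names and the sample realm are made up for illustration and are not taken from requests/auth.py.

```python
import hashlib


def md5_hex_unguarded(value):
    # Hypothetical stand-in for a code path that skips the encoding guard.
    return hashlib.md5(value).hexdigest()


def md5_hex_guarded(value):
    # Same shape as the md5_utf8 helper quoted above: encode str first.
    if isinstance(value, str):
        value = value.encode("utf-8")
    return hashlib.md5(value).hexdigest()


a1 = "usér:example-realm:pàsswörd"  # illustrative A1 string

try:
    md5_hex_unguarded(a1)
    raised_type_error = False
except TypeError:
    # hashlib only accepts bytes-like objects, so a str is rejected.
    raised_type_error = True

ha1 = md5_hex_guarded(a1)  # 32-character hex digest
```

The guarded version gives the same digest whether the caller passes a str or its already-encoded bytes, which is why routing every call through one helper is attractive.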
The Fix
@@ -151,7 +151,7 @@ def _threading_challenge(self, auth_header, r):
     def md5_utf8(x):
         if isinstance(x, str):
             x = x.encode('utf-8')
-        return hashlib.md5(x).hexdigest()
+        return hashlib.md5(x.encode('utf-8') if isinstance(x, str) else x).hexdigest()
     # Constructing A1 from username:realm:password
The fix ensures that the input is always bytes before hashing. Every code path that computes the digest now performs the str → bytes conversion [1].
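One way to keep every path covered is to funnel all hashing through a single helper that also handles the SHA-256 variant mentioned earlier. The sketch below is an assumption about how that could look; `hash_utf8` and its `algorithm` parameter are hypothetical names, not the library's API.

```python
import hashlib


def hash_utf8(value, algorithm="md5"):
    """Return the hex digest of value, encoding str input as UTF-8 first.

    Hypothetical helper for illustration; not part of requests.
    """
    if isinstance(value, str):
        value = value.encode("utf-8")
    # hashlib.new() looks the algorithm up by name, e.g. "md5" or "sha256".
    return hashlib.new(algorithm, value).hexdigest()


ha1_md5 = hash_utf8("usér:example-realm:pàsswörd")
ha1_sha256 = hash_utf8("usér:example-realm:pàsswörd", algorithm="sha256")
```

With a single choke point like this, a new call site cannot bypass the str to bytes conversion by accident.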
HTTP Digest Auth Flow
The full digest authentication flow where this bug manifests [2]:
1. Client sends request without auth header
2. Server responds with 401 Unauthorized + WWW-Authenticate header containing realm, nonce, qop
3. Client computes: HA1 = md5(username:realm:password)
4. Client computes: HA2 = md5(method:uri)
5. Client computes: response = md5(HA1:nonce:nonce_count:cnonce:qop:HA2)
6. Client resends request with Authorization header containing the response hash
Step 3 is where the bug lives: Python 3's hashlib accepts only bytes-like input, so a str that reaches md5() unencoded raises TypeError, and a username or password with non-ASCII characters has to be encoded (as UTF-8) before it can be hashed [2].
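The three hash computations in the flow can be sketched in a few lines. This assumes MD5 and qop=auth, encodes every input as UTF-8, and uses made-up function names and sample values; it is an illustration of the formulas above, not the code in requests.

```python
import hashlib


def md5_hex(text):
    # Encode to UTF-8 first so non-ASCII credentials hash without error.
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, password, realm, method, uri,
                    nonce, nonce_count, cnonce, qop="auth"):
    """Compute the digest 'response' value for the MD5 / qop=auth case."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")
    ha2 = md5_hex(f"{method}:{uri}")
    return md5_hex(f"{ha1}:{nonce}:{nonce_count}:{cnonce}:{qop}:{ha2}")


# Illustrative values; a real nonce comes from the server's 401 challenge.
response = digest_response(
    username="usér",
    password="pàsswörd",
    realm="example-realm",
    method="GET",
    uri="/protected",
    nonce="abc123",
    nonce_count="00000001",
    cnonce="0a4f113b",
)
```

Because the encoding happens inside `md5_hex`, the same function works for ASCII and non-ASCII credentials alike.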
Test Case
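import requests
from requests.auth import HTTPDigestAuth


def test_digest_auth_with_utf8_credentials():
    """Non-ASCII credentials should not raise TypeError."""
    auth = HTTPDigestAuth('usér', 'pàsswörd')
    auth.init_per_thread_state()
    # Simulate a parsed WWW-Authenticate challenge via the internal per-thread
    # state; without one, no Authorization header is built on the first call.
    auth._thread_local.chal = {'realm': 'test', 'nonce': 'abc123', 'qop': 'auth'}
    # This should not raise TypeError
    header = auth.build_digest_header('GET', 'http://httpbin.org/digest-auth/auth/usér/pàsswörd')
    assert header.startswith('Digest ')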
This test verifies the fix handles Unicode without error [3].
Why This Matters for International Users
Many users have credentials that contain non-ASCII characters: names in Arabic, Chinese, Korean, Cyrillic, or accented Latin scripts. Before this fix, requests failed with a TypeError for these users when using digest authentication (common in enterprise environments, IoT devices, and legacy APIs). The fix is a 2-line change but removes a significant accessibility barrier [1].
[1]: See psf/requests#6102 for the original bug report and psf/requests#7463 for the fix PR.
[2]: RFC 7616, HTTP Digest Access Authentication
[3]: requests test suite, test_digest.py