Skip to content

fix: add support for non-latin1 characters in wsgi module #3648

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

tnoff
Copy link

@tnoff tnoff commented Jul 22, 2025

Description

If a non latin1 character is passed as a URL, you can see the following error

Traceback (most recent call last):
  File "/home/tnorth/.envs/website/lib/python3.12/site-packages/django/core/handlers/exception.py", line 55, in inner
    response = get_response(request)
               ^^^^^^^^^^^^^^^^^^^^^
  File "/home/tnorth/.envs/website/lib/python3.12/site-packages/opentelemetry/instrumentation/django/middleware/otel_middleware.py", line 91, in __call__
    self.process_request(request)
  File "/home/tnorth/.envs/website/lib/python3.12/site-packages/opentelemetry/instrumentation/django/middleware/otel_middleware.py", line 217, in process_request
    attributes = collect_request_attributes(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tnorth/.envs/website/lib/python3.12/site-packages/opentelemetry/instrumentation/wsgi/__init__.py", line 374, in collect_request_attributes
    wsgiref_util.request_uri(environ)
  File "/usr/lib/python3.12/wsgiref/util.py", line 61, in request_uri
    path_info = quote(environ.get('PATH_INFO',''), safe='/;=,', encoding='latin1')
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/urllib/parse.py", line 917, in quote
    string = string.encode(encoding, errors)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 1-5: ordinal not in range(256)

The latin1 bit seems to be hardcoded in the underlying library

Fixes # (issue)

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)

How Has This Been Tested?

Added unit test

Does This PR Require a Core Repo Change?

  • Yes. - Link to PR:
  • No.

Checklist:

See contributing.md for styleguide, changelog guidelines, and more.

  • Followed the style guidelines of this project
  • Changelogs have been updated
  • Unit tests have been added
  • Documentation has been updated

Copy link

linux-foundation-easycla bot commented Jul 22, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

@xrmx
Copy link
Contributor

xrmx commented Jul 22, 2025

Puzzled by this, if it is wsgiref raising non-latin1 chars if django allows them maybe we should use something else instead of wsgiref in the first place? Have you looked at what django is doing? PEP-3333 also seems to support what we are doing https://peps.python.org/pep-3333/#unicode-issues

@tnoff
Copy link
Author

tnoff commented Jul 23, 2025

@xrmx If I'm reading this all correctly it looks like django has a lot of string handling under the hood to handle this sort of thing

https://github.com/django/django/blob/77d455ae73b177a32102f0b248828b5d63c0aa24/django/http/request.py#L215
https://github.com/django/django/blob/main/django/utils/encoding.py#L191

@xrmx
Copy link
Contributor

xrmx commented Jul 23, 2025

@tnoff
Copy link
Author

tnoff commented Jul 23, 2025

@xrmx yeah I updated yesterday with some bits pretty close to the repercent_broken_unicode function https://github.com/django/django/blob/main/django/utils/encoding.py#L213

@tnoff
Copy link
Author

tnoff commented Aug 4, 2025

@xrmx sorry for ping but did you get a chance to check out the revisions I made?

@xrmx
Copy link
Contributor

xrmx commented Aug 4, 2025

@xrmx sorry for ping but did you get a chance to check out the revisions I made?

No, sorry

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants