Use os.fsencode to encode bytes #5662

snejus · 2025-03-10T07:49:18Z

No description provided.

github-actions · 2025-03-10T07:49:32Z

Thank you for the PR! The changelog has not been updated, so here is a friendly reminder to check if you need to add an entry.

wisp3rwind

I'm definitely out of my depth here; to review this properly, I'm lacking knowledge on the whole en/decoding business.

That said, I left a comment on one change that seemed unclear.

Superficially, this seems plausible as a fix to #5648, since that issue is probably caused by partial conversion of en/decoding usage.

wisp3rwind · 2025-03-12T15:08:01Z

beets/autotag/hooks.py

+    str1 = unidecode(str1)
+    str2 = unidecode(str2)


Nice catch that this was a no-op these days!

wisp3rwind · 2025-03-12T15:18:20Z

beets/dbcore/query.py

@@ -216,7 +217,7 @@ def value_match(cls, pattern: P, value: Any):
        """Determine whether the value matches the pattern. The value
        may have any type.
        """
-        return cls.string_match(pattern, util.as_string(value))
+        return cls.string_match(pattern, os.fsdecode(str(value or "")))


I'm not sure about this; it definitely deviates from the previous behaviour:

str(b'abc') == "b'abc'"

Does it really make sense to involve fsdecode here? The value has no immediate relation to the filesystem; it is meant to match database fields (path queries are handled separately in library.PathQuery). Previously, the ignore error handler was used, so decoding errors could not occur. Now I'm not sure what happens on decoding errors: surrogates, maybe? What would that mean for string_match?
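
To make the concern concrete, here is a minimal sketch of what the new expression does with a few values. `new_conversion` copies the expression from the diff; `decode_ignoring_errors` is a hypothetical stand-in for a bytes-aware conversion using the `ignore` handler (UTF-8 assumed), not the actual `util.as_string` implementation.

```python
import os


def new_conversion(value):
    # Same expression as in the diff: str() runs first, so os.fsdecode
    # only ever receives a str and returns it unchanged.
    return os.fsdecode(str(value or ""))


def decode_ignoring_errors(value):
    # Hypothetical stand-in for the earlier behaviour (an assumption, not
    # the real util.as_string): decode bytes, silently dropping bad bytes.
    if isinstance(value, bytes):
        return value.decode("utf-8", "ignore")
    return str(value)


for value in (b"abc", b"caf\xe9", "abc", 0):
    print(repr(value), repr(new_conversion(value)), repr(decode_ignoring_errors(value)))
```

Because `str()` is applied before `os.fsdecode`, the latter never sees bytes here, so this line cannot introduce surrogates by itself. The visible differences are the `b'...'` repr for bytes values and falsy values such as `0` collapsing to an empty string.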

FlorentLM · 2025-03-19T19:57:56Z

Damn I was about to do this exact thing, glad I checked the issues first haha

Beets is unusable for me as-is with the current util._fsencoding function: there are surrogate characters in place of some Swedish characters in an Opeth album, parsing fails on them, and I can't import the rest of the music.

Using os.fsencode works.
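
For context, a minimal sketch of the failure mode reported in #5648, assuming the path bytes are Latin-1 encoded while the decoder expects UTF-8. The byte value is chosen to match the `\udce9` in the error message; the name itself is made up.

```python
raw = b"caf\xe9"  # hypothetical name; 0xE9 is "é" in Latin-1, invalid on its own in UTF-8

# Decoding with surrogateescape maps the undecodable byte to the lone
# surrogate U+DCE9 instead of failing.
text = raw.decode("utf-8", "surrogateescape")

# A strict UTF-8 encode of that string fails with "surrogates not allowed".
try:
    text.encode("utf-8")
    strict_error = None
except UnicodeEncodeError as exc:
    strict_error = exc

# Encoding with the same handler restores the original bytes.
restored = text.encode("utf-8", "surrogateescape")
print(repr(text), strict_error, restored)
```

On POSIX, `os.fsencode` and `os.fsdecode` apply the `surrogateescape` handler with the filesystem encoding, which would explain why they round-trip such names where a strict encode does not.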

Copilot

Pull Request Overview

This PR updates the codebase to use os.fsencode and os.fsdecode for handling bytes and string conversions, replacing custom encoding routines. Key changes include the removal of arg_encoding/_fsencoding functions, uniform conversions in command, query, and utility functions, and corresponding updates in tests and plugins.

Reviewed Changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated no comments.

Show a summary per file

File	Description
beetsplug/convert.py	Replace custom decoding with os.fsdecode for command and argument processing.
beets/dbcore/query.py	Update value conversion using os.fsdecode instead of util.as_string.
beets/util/__init__.py	Remove arg_encoding/_fsencoding and simplify bytestring_path and displayable_path.
Various test files	Adjust tests to match new encoding conversion methods.
beetsplug/hook.py, ipfs.py, etc	Update encoding usage consistently across plugins and command handling.

Comments suppressed due to low confidence (2)

beets/util/__init__.py:391

The previous try/except fallback to utf-8 was removed in bytestring_path. Please verify that os.fsencode handles all edge cases in non-standard locales or with unusual inputs.

return os.fsencode(str_path)
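
A few of those edge cases can be checked directly. The sketch below only exercises documented `os.fsencode` behaviour; the sample paths are made up, and the round-trip check is restricted to POSIX because Windows uses a different error handler.

```python
import os
from pathlib import PurePosixPath

# bytes are returned unchanged; str and os.PathLike objects are encoded.
from_bytes = os.fsencode(b"music/abc")
from_str = os.fsencode("music/abc")
from_pathlike = os.fsencode(PurePosixPath("music/abc"))


def roundtrips(raw: bytes) -> bool:
    # True if decoding and re-encoding gives back the same bytes.
    return os.fsencode(os.fsdecode(raw)) == raw


print(from_bytes, from_str, from_pathlike)
if os.name == "posix":
    print(roundtrips(b"caf\xe9"))
```

Values that are neither `str`, `bytes`, nor `os.PathLike` raise `TypeError`, so any caller that used to pass other types into bytestring_path would need a separate check.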

beets/dbcore/query.py:220

[nitpick] While os.fsdecode returns a string unchanged if given a str, wrapping str(value) in os.fsdecode may be redundant; consider clarifying the intent with a comment.

return cls.string_match(pattern, os.fsdecode(str(value or "")))

Encode paths using os.fsencode

0c00854

snejus requested review from JOJ0 and wisp3rwind March 10, 2025 07:49

snejus linked an issue Mar 10, 2025 that may be closed by this pull request

UnicodeEncodeError: 'utf-8' codec can't encode character '\udce9' in position 14: surrogates not allowed #5648

Open

snejus mentioned this pull request Mar 10, 2025

UnicodeEncodeError: 'utf-8' codec can't encode character '\udce9' in position 14: surrogates not allowed #5648

Open

snejus added 3 commits March 10, 2025 08:13

Say bye to util._fsencoding

b7a36bc

Remove arg_encoding

2d4269f

Remove as_string

0f8a191

wisp3rwind reviewed Mar 12, 2025

snejus requested a review from Copilot March 20, 2025 17:31

Copilot AI reviewed Mar 20, 2025
