gh-95004: specialize access to enums and fix scaling on free-threading #148184

Open · kumaraditya303 wants to merge 7 commits into python:main from kumaraditya303:enums

Conversation

@kumaraditya303 (Contributor) commented Apr 6, 2026
@Fidget-Spinner (Member) left a comment:
Thanks for doing this, I have just 3 comments.

@Fidget-Spinner (Member) left a comment:
Pretty close, just one question and one minor nit.

case MUTABLE:
    // special case for enums, which have Py_TYPE(descr) == cls,
    // so guarding on the type version is sufficient
    if (Py_TYPE(descr) != cls) {
@Fidget-Spinner (Member):
Just checking: Py_TYPE(descr) cannot change, right? Like, you can't change the __class__ of it later? Or does it not matter because we _GUARD_TYPE_VERSION, so it's protected by the version check?

@kumaraditya303 (Contributor, Author):
Yeah, Py_TYPE(descr) is protected by the version check of _GUARD_TYPE_VERSION
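The property being relied on here — that looking up a member on an enum class yields an object whose type is the class itself — can be checked from Python. A minimal illustration, not part of this PR:

```python
from enum import Enum

class Color(Enum):
    RED = 1
    GREEN = 2

# The descriptor found on the class for "RED" is the member itself,
# and the member's type is the enum class. This is what the C code
# above expresses as Py_TYPE(descr) == cls.
assert Color.__dict__["RED"] is Color.RED
assert type(Color.RED) is Color
```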

@colesbury (Contributor) left a comment:
I just tried this on a 96 core AWS machine and the scaling is not great: ~2.9x faster.

  • Given that, I'm pretty ambivalent about the change.
  • I don't like to add benchmarks to Tools/ftscalingbench/ftscalingbench.py that don't scale well.

@kumaraditya303 (Contributor, Author):

> I just tried this on a 96 core AWS machine and the scaling is not great: ~2.9x faster.

On my mac I see 4.2x faster:

./python.exe Tools/ftscalingbench/ftscalingbench.py
Running benchmarks with 10 threads
object_cfunction           3.2x faster
cmodule_function           2.8x faster
object_lookup_special      4.3x faster
context_manager            4.9x faster
mult_constant              2.3x faster
generator                  2.3x faster
pymethod                   3.0x faster
pyfunction                 2.7x faster
module_function            2.7x faster
load_string_const          4.1x faster
load_tuple_const           3.6x faster
create_pyobject            4.4x faster
create_closure             4.4x faster
create_dict                3.8x faster
create_frozendict          4.0x faster
thread_local_read          3.5x faster
method_caller              3.3x faster
instantiate_dataclass      4.9x faster
instantiate_namedtuple     4.9x faster
instantiate_typing_namedtuple  4.9x faster
super_call                 4.6x faster
classmethod_call           3.8x faster
staticmethod_call          3.5x faster
deepcopy                   1.8x slower
setattr_non_interned       4.5x faster
enum_attr                  4.2x faster

> I don't like to add benchmarks to Tools/ftscalingbench/ftscalingbench.py that don't scale well

I can remove the benchmark if you prefer.
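For context, a benchmark of this shape can be sketched as follows. This is an illustrative reconstruction of the workload, not the actual enum_attr code from Tools/ftscalingbench/ftscalingbench.py; names and loop counts are assumptions:

```python
import threading
from enum import Enum

class Color(Enum):
    RED = 1
    GREEN = 2
    BLUE = 3

def enum_attr(loops: int = 100_000) -> None:
    # Each iteration performs LOAD_ATTR lookups on the enum class,
    # which is the access pattern this PR specializes.
    for _ in range(loops):
        Color.RED
        Color.GREEN
        Color.BLUE

# The real harness times a single-threaded run against N threads
# running the same workload and reports the speedup ratio.
threads = [threading.Thread(target=enum_attr) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```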

@Fidget-Spinner (Member) commented Apr 6, 2026

Could it be that the Enum itself is not using deferred refcounting? It's a LOAD_GLOBAL_MODULE, which increfs and then decrefs at each LOAD_ATTR.
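The access pattern being described can be seen in the bytecode: each `Color.RED` in a function body compiles to a LOAD_GLOBAL (which the adaptive interpreter specializes to LOAD_GLOBAL_MODULE) followed by a LOAD_ATTR. A minimal illustration with dis; the specialized opnames only appear after warm-up, so the base opnames are checked here:

```python
import dis
from enum import Enum

class Color(Enum):
    RED = 1

def f():
    return Color.RED

# Unspecialized instruction stream: LOAD_GLOBAL for "Color",
# then LOAD_ATTR for "RED".
opnames = [i.opname for i in dis.get_instructions(f)]
assert "LOAD_GLOBAL" in opnames
assert "LOAD_ATTR" in opnames
```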

@Fidget-Spinner (Member):

I found the problem:

It seems that this perpetually deopts at the first _GUARD_TYPE_VERSION. Then, that causes a re-specialization, which is obviously bottlenecked on a lot of things. So it seems the current specialization/deopt needs to be fixed.

Investigating now.

@Fidget-Spinner (Member) commented Apr 6, 2026

Before:

taskset -c 0,2,4,6,8,10 ./python Tools/ftscalingbench/ftscalingbench.py enum_attr -t 6 --scale 10000
Running benchmarks with 6 threads
enum_attr                  4.2x faster

After:

Running benchmarks with 6 threads
enum_attr                  6.0x faster

Seems the pre-existing specialization for METACLASS_CHECK was bugged. Diff to fix this:

diff --git a/Python/specialize.c b/Python/specialize.c
index 355b6eabdb7..bfa7b8148e4 100644
--- a/Python/specialize.c
+++ b/Python/specialize.c
@@ -1220,13 +1220,14 @@ specialize_class_load_attr(PyObject *owner, _Py_CODEUNIT *instr,
 #ifdef Py_GIL_DISABLED
             maybe_enable_deferred_ref_count(descr);
 #endif
-            write_u32(cache->type_version, tp_version);
             write_ptr(cache->descr, descr);
             if (metaclass_check) {
-                write_u32(cache->keys_version, meta_version);
+                write_u32(cache->keys_version, tp_version);
+                write_u32(cache->type_version, meta_version);
                 specialize(instr, LOAD_ATTR_CLASS_WITH_METACLASS_CHECK);
             }
             else {
+                write_u32(cache->type_version, tp_version);
                 specialize(instr, LOAD_ATTR_CLASS);
             }
             Py_XDECREF(descr);

This seems to be bugged in 3.14 as well: 5d3201f

@Fidget-Spinner (Member):

@colesbury can you please try this again? Thank you!
