[red-knot] treat annotated assignments without RHS in stubs as bindings #16409

mishamsk · 2025-02-27T02:12:34Z

Summary

Make a small-ish change, marking all declarations in stubs from annotated assignments as bound, even if there is no RHS. This is done by changing the implementation of category, that now takes whether it is in a stub context as an argument. I think this reflects logic encapsulation better then checking for stub context in downstream code, combining the category return with the stub flag.

However, current state of red knot would only reveal one behavior change: avoid a very rare (maybe even non-existent) false negative when a stub uses a global variable defined without a default value in the same stub.

With the current symbol boundness inference logic is not changed, unbound class and instance attributes do not emit diagnostics => there is no apparent difference between stubs & regular modules.

Fixes #16264

Test Plan

A couple of new mdtest sections.

mishamsk · 2025-02-27T04:10:19Z

@carljm need your help here before I mark this for review.

I've implemented the change, however I wanted to make sure that I fully understand what you meant in the original issue.

1. The mention of Protocol case

You mentioned Protocol as an instance of a special form in non-annotation position that will not be properly handled by knot. I was very surprised to see how it is typed in the typeshed. It doesn't match the actual implementation. In the implementation SpecialForm is not used as a base class, moreover it explicitly defines __mro_entires__ that will raise if one attempts to subclass it. It is used to mark special forms used only in annotation position. Protocol on the other hand is a proper subclass of Generic and can be used as a base class.

Running mypy on a subset of typing.pyi yields something along these lines:

ttt.pyi:7: error: Variable "ttt.Protocol" is not valid as a type  [valid-type]
ttt.pyi:7: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases
ttt.pyi:7: error: Invalid base class "Protocol"  [misc]

and knot reports roughly equivalent diagnostic. Which seems legit to me.

What am I missing? What was the case you were thinking about? Check out the new test I've added, that's the only thing I could come up with.

2. Class/Instance variables

The result of this change will be invisible unless/until access to unbound class/instance variables will start emitting diagnostics. I assume it wasn't the goal of this issue/PR to tackle the re-design of symbol boundness classification. Am I right?

codspeed-hq · 2025-02-27T04:25:02Z

CodSpeed Performance Report

Merging #16409 will improve performances by 5.09%

_{Comparing mishamsk:16264-treat-declarations-in-stubs-as-bindings (f8e7136) with main (980faff)}

Summary

⚡ 1 improvements
✅ 31 untouched benchmarks

Benchmarks breakdown

	Benchmark	`BASE`	`HEAD`	Change
⚡	`red_knot_check_file[incremental]`	5.5 ms	5.2 ms	+5.09%

carljm

@carljm need your help here before I mark this for review.

Oops, once I was here I kinda forgot it wasn't marked ready for review, and I went ahead and looked it over and commented; sorry if some of my comments were things you already knew about and planned to change.

FWIW the code changes here look great to me! My inline comments are only about the tests. So all in all I think this PR is very close to ready for merge.

1. The mention of Protocol case

You mentioned Protocol as an instance of a special form in non-annotation position that will not be properly handled by knot. I was very surprised to see how it is typed in the typeshed. It doesn't match the actual implementation. In the implementation SpecialForm is not used as a base class, moreover it explicitly defines __mro_entires__ that will raise if one attempts to subclass it. It is used to mark special forms used only in annotation position. Protocol on the other hand is a proper subclass of Generic and can be used as a base class.

_SpecialForm is used to annotate symbols in typeshed that type checkers cannot handle according to the "normal" rules, and will need to special case. Protocol is one of these symbols. The annotation of _SpecialForm is not intended to suggest that type-checkers should handle Protocol as if it were literally an instance of _SpecialForm at runtime. Instead it just means "you're on your own, this is going to need special-casing directly in the type checker."

You can look at our KnownClass struct (and its usage) for how we handle special-casing of many of these special forms (as well as other builtin types) already. We don't yet recognize or support typing.Protocol in KnownClass, so we currently just infer it as an instance of _SpecialForm, but this is something that will have to change for us to add any support for protocols.

It's kind of confusing that I mentioned Protocol in the issue description even though we don't have support for it yet; sorry about that. I think what happened is @sharkdp tried adding minimal recognition of Protocol in an earlier PR, and ran into the bug you are fixing in this PR, and because of that bug we never landed the Protocol recognition.

You can already see this bug with Protocol:

from typing import SupportsInt
reveal_type(SupportsInt.__mro__)

On main this reveals tuple[Literal[SupportsInt], Unknown, Literal[object]]. The Unknown in the middle of that MRO is the bug you are fixing in this PR. It's currently Unknown because local uses of a name in the same scope consider only bindings, not declarations, and (until this PR) Protocol: _SpecialForm is not considered a binding, even in a stub file, so we just consider uses of Protocol in the global scope of typing.pyi as uses of an unbound name, resulting in the type Unknown. Of course, if we fix that bug without also adding recognition of Protocol and support for using it as a base class, then we'd instead just have an error about an instance of _SpecialForm not being a valid base class, and we'd still end up with Unknown in the MRO.

All that said, I don't actually think we should add support for Protocol and a test for this in this PR. Your simpler test with uses_constant = CONSTANT is testing the same behavior fix in a simpler way, and is IMO sufficient for this PR.

Running mypy on a subset of typing.pyi yields something along these lines:
ttt.pyi:7: error: Variable "ttt.Protocol" is not valid as a type  [valid-type]
ttt.pyi:7: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases
ttt.pyi:7: error: Invalid base class "Protocol"  [misc]
and knot reports roughly equivalent diagnostic. Which seems legit to me.

This kind of an experiment is I think perhaps more confusing than useful, because type checkers (including both us and mypy) special-case many names in typing.pyi, and do so based on those names occurring in a module named typing. So the behavior seen when you check that same code in a module not named typing bears little resemblance to how the same type checker treats the actual typing module. (I guess this is relevant if you want to know how a type checker would treat the same code without any special-casing, but I don't think that tells us much here.)

2. Class/Instance variables

The result of this change will be invisible unless/until access to unbound class/instance variables will start emitting diagnostics.

No, I think the change is already visible in how we interpret x: int in a class body in a stub file. Before this PR, we would consider x to be an instance-only variable (because no binding in class body) and would issue an error if its accessed directly on the class. After this PR, we should consider it a class-or-instance var, and allow access on the class. This change is visible in the removed __module__ false positive in the benchmark, and I've also suggested in my inline comments that we test it explicitly.

I assume it wasn't the goal of this issue/PR to tackle the re-design of symbol boundness classification. Am I right?

Correct, that should be a separate PR.

crates/red_knot_python_semantic/resources/mdtest/attributes.md

crates/red_knot_python_semantic/resources/mdtest/stubs/class.md

crates/red_knot_python_semantic/src/semantic_index/definition.rs

carljm · 2025-02-27T20:30:22Z

The CodSpeed perf results here are really interesting. It's nice to see the win on the incremental benchmark, but I'm not sure I understand where it's coming from! It looks like we also see a 3% regression on the cold-check benchmark. I suspect this is just because we now understand many more name references inside stub files, leading to more complex work than just "consider it unbound and fall back to Unknown". So I don't think we need to do anything about that.

Regarding the "missing key" panic, it looks like the file it panics on has just a single line of code, which we end up parsing as an annotated assignment with no RHS and a BinOp annotation, followed by a number of bare Name expressions. One possibility is that somehow this PR is triggering a query cycle analyzing this code. Per the TODO comment in infer_definition_types_cycle_recovery, a cycle can currently lead to missing expression types. If you can verify that we are newly hitting infer_definition_types_cycle_recovery when checking this example, then I think we can just add this example to the KNOWN_FAILURES list in crates/red_knot_project/tests/check.rs and move on -- I'm working on cycle issues already.

carljm · 2025-02-27T20:33:32Z

On second look, given that the name assigned to is also referenced in the annotation, I'm quite confident that the problem there is a cycle. When looking up bindings for that name before this PR, we'd have found none; after this PR, we find the same binding we are already trying to analyze (because annotations are also "lazy" in stub files, so they don't only look at previous bindings but also subsequent ones.)

move tests around properly wire both semantic index and inference for stubs

mishamsk · 2025-02-28T02:34:20Z

@carljm I've applied all suggestions and consolidated tests in stub subfolder.

I also did further investigation into the cycle related panic, and looks like I'll need your and maybe @MichaReiser help. What I've learned:

I did confirm that it is the cycle recovery by temporarily injecting some debugging code
After adding the first panic to known errors, the second popped up in https://github.com/mishamsk/ruff/blob/16264-treat-declarations-in-stubs-as-bindings/crates/ruff_linter/resources/test/fixtures/pyflakes/F401_0.py#L43

For convenience here is the snippet that causes a cycle:

class X:
    datetime: datetime

what I do not understand is why it happens in this test that uses SemanticModel API's, but if I run compiled knot on this file OR add similar code to mdtests - everything is fine, no panic/cycle. I haven't traced the differences between HasType::inferred_type that's is used in the failing test vs. regular tests/cli. But there has to be something that causes the cycle.

I tried ignoring the F401_0, but then another one panics. I didn't check how many, but this PR seems to be triggering a lot of panics. So I thought it may be more prudent to fix the root cause. Any suggestions?

mishamsk · 2025-02-28T02:39:51Z

The CodSpeed perf results here are really interesting. It's nice to see the win on the incremental benchmark, but I'm not sure I understand where it's coming from!

yes, I've also noticed this. Magic?;-) A random guess - before that a lack of binding caused back-off logic during symbol search (in infer_name_load), which is now unnecessary. If so, that would mean that this back-off mechanism is complicated enough to cause such a negative impact when symbol is not bound locally and needs to be looked up further up the chain.

carljm · 2025-02-28T02:44:17Z

what I do not understand is why it happens in this test that uses SemanticModel API's

It's because only these tests assert (as a bug-catching mechanism, via the PullTypes visitor) that every expression has a type recorded for it. In normal type inference we do not spend the time to do a full AST walk and assert this. The cycle itself does not fail (due to the cycle recovery function) but that cycle recovery function doesn't insert a type for every expression, so it causes these coverage tests to panic due to missing types for some expressions.

This will be fixed by #14029, so it's OK to just add a lot of new KNOWN_FAILURES here. The examples you've shown so far look like legitimate cycles that would be expected with the change of semantics in this PR, so I'm OK with just silencing them for now.

The other option would be to extract just the "fallback type" part of #14029 (not the fixpoint iteration part), where it gives TypeInference the possibility to have a "fallback type" used for all expressions that don't have a specific type inserted. Then we can create this TypeInference with fallback (probably just to Type::Todo(cycle handling)) in the cycle recovery function, and these panics should go away.

If we do the latter, it would probably be better to do it as a separate PR, so we can see the perf impact of just the "fallback type" feature alone. There may be some ways we can improve it if it's a regression.

carljm · 2025-02-28T02:47:54Z

A random guess - before that a lack of binding caused back-off logic during symbol search (in infer_name_load), which is now unnecessary. If so, that would mean that this back-off mechanism is complicated enough to cause such a negative impact when symbol is not bound locally and needs to be looked up further up the chain.

It could be related to this. The improvement shows up in the "incremental" benchmark, not the "cold" benchmark. In the incremental benchmark we make a no-op change to one file (add a comment), so we are not really testing the performance of type inference, we are testing how fast Salsa can re-validate that nothing has changed. So usually improvements here are because we call fewer Salsa queries, or have fewer dependencies between Salsa queries. But it's true that this change could mean many fewer inference queries on scopes in stub files now take on dependencies on a different scope (the global or builtins scopes). This makes sense, because I don't expect that fallback to actually be a lot of extra work (so we don't gain much from this in cold benchmark), but it can generate a lot of extra Salsa query dependencies (so we do gain in the incremental benchmark.)

github-actions · 2025-02-28T03:01:56Z

`ruff-ecosystem` results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

mishamsk · 2025-02-28T03:03:14Z

It's because only these tests assert (as a bug-catching mechanism, via the PullTypes visitor) that every expression has a type recorded for it. In normal type inference we do not spend the time to do a full AST walk and assert this. The cycle itself does not fail (due to the cycle recovery function) but that cycle recovery function doesn't insert a type for every expression, so it causes these coverage tests to panic due to missing types for some expressions.

Got it, thanks. Now it makes sense. I used bare println! to see where the cycle is being triggered, and didn't see the output when running the cli - that's why I thought the cycle wasn't even triggered. But I may just missed it among verbose logs or stdout is just getting captured somewhere along the way.

so it's OK to just add a lot of new KNOWN_FAILURES here

well it turned out to be just 3 more files. Checked all of them, seems legit, so added to KNOWN_FAILURES

The other option would be to extract just the "fallback type" part of #14029

From what I can see, you've added the fallback type on the struct and it's usage in getter API's that access hasmaps, but if I am not mistaken it is not yet populated with actual fallbacks depending on the context. So I thought it would require more work to finalize. So I think this can be merged as-is. Especially since it is only 4 new exceptions to tests.

AlexWaygood

Thanks, this is great!!

crates/red_knot_python_semantic/resources/mdtest/stubs/class.md

crates/red_knot_python_semantic/src/semantic_index/definition.rs

carljm · 2025-02-28T14:28:06Z

From what I can see, you've added the fallback type on the struct and it's usage in getter API's that access hasmaps, but if I am not mistaken it is not yet populated with actual fallbacks depending on the context.

The only place we want to populate a fallback is when we are creating a TypeInference from scratch in cycle recovery, and my PR already does this. It would look different in main than in my PR because in main there is only one cycle recovery function, so only one place we'd want to use a fallback type (right at that TODO comment about missing expression types).

But since there are so few failures it's fine to just put them in known failures instead!

minor changes to comments in tests Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>

mishamsk · 2025-02-28T15:41:04Z

The only place we want to populate a fallback is when we are creating a TypeInference from scratch in cycle recovery

ah, I think I missed that usage. I was just skimming over on GitHub.

I've applied two comment changes suggested by @AlexWaygood . I think it is good to be merged.

AlexWaygood · 2025-02-28T16:45:14Z

Thanks again!

AlexWaygood added the red-knot Multi-file analysis & type inference label Feb 27, 2025

carljm reviewed Feb 27, 2025

View reviewed changes

mishamsk added 5 commits February 27, 2025 20:42

mark all augmented assignments in stubs as bound

87a993e

add module global variable test case

fa8e365

move tests around properly wire both semantic index and inference for stubs

remove benchmark false positive added in astral-sh#16036

4fa8ae4

improve tests

d15be3c

skip one test failing due to cycle

cb674b8

mishamsk force-pushed the 16264-treat-declarations-in-stubs-as-bindings branch from 8b46529 to cb674b8 Compare February 28, 2025 02:21

mishamsk marked this pull request as ready for review February 28, 2025 02:24

mishamsk requested review from MichaReiser, AlexWaygood and sharkdp as code owners February 28, 2025 02:24

add all annotation cycle related failurs to know

1000fd9

MichaReiser approved these changes Feb 28, 2025

View reviewed changes

AlexWaygood reviewed Feb 28, 2025

View reviewed changes

crates/red_knot_python_semantic/resources/mdtest/stubs/class.md Outdated Show resolved Hide resolved

crates/red_knot_python_semantic/resources/mdtest/stubs/class.md Outdated Show resolved Hide resolved

AlexWaygood reviewed Feb 28, 2025

View reviewed changes

crates/red_knot_python_semantic/src/semantic_index/definition.rs Show resolved Hide resolved

AlexWaygood approved these changes Feb 28, 2025

View reviewed changes

mishamsk and others added 2 commits February 28, 2025 10:34

Apply suggestions from code review

d0b4245

minor changes to comments in tests Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>

pre-commit fixes

f8e7136

AlexWaygood merged commit fdf0915 into astral-sh:main Feb 28, 2025
21 checks passed

mishamsk deleted the 16264-treat-declarations-in-stubs-as-bindings branch February 28, 2025 19:47

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[red-knot] treat annotated assignments without RHS in stubs as bindings #16409

[red-knot] treat annotated assignments without RHS in stubs as bindings #16409

mishamsk commented Feb 27, 2025 •

edited

Loading

mishamsk commented Feb 27, 2025

codspeed-hq bot commented Feb 27, 2025 •

edited

Loading

carljm left a comment

carljm commented Feb 27, 2025

carljm commented Feb 27, 2025

mishamsk commented Feb 28, 2025

mishamsk commented Feb 28, 2025

carljm commented Feb 28, 2025 •

edited

Loading

carljm commented Feb 28, 2025

github-actions bot commented Feb 28, 2025 •

edited

Loading

mishamsk commented Feb 28, 2025 •

edited

Loading

AlexWaygood left a comment

carljm commented Feb 28, 2025

mishamsk commented Feb 28, 2025

AlexWaygood commented Feb 28, 2025

[red-knot] treat annotated assignments without RHS in stubs as bindings #16409

[red-knot] treat annotated assignments without RHS in stubs as bindings #16409

Conversation

mishamsk commented Feb 27, 2025 • edited Loading

Summary

Test Plan

mishamsk commented Feb 27, 2025

codspeed-hq bot commented Feb 27, 2025 • edited Loading

CodSpeed Performance Report

Merging #16409 will improve performances by 5.09%

Summary

Benchmarks breakdown

carljm left a comment

Choose a reason for hiding this comment

carljm commented Feb 27, 2025

carljm commented Feb 27, 2025

mishamsk commented Feb 28, 2025

mishamsk commented Feb 28, 2025

carljm commented Feb 28, 2025 • edited Loading

carljm commented Feb 28, 2025

github-actions bot commented Feb 28, 2025 • edited Loading

ruff-ecosystem results

Linter (stable)

Linter (preview)

mishamsk commented Feb 28, 2025 • edited Loading

AlexWaygood left a comment

Choose a reason for hiding this comment

carljm commented Feb 28, 2025

mishamsk commented Feb 28, 2025

AlexWaygood commented Feb 28, 2025

mishamsk commented Feb 27, 2025 •

edited

Loading

codspeed-hq bot commented Feb 27, 2025 •

edited

Loading

carljm commented Feb 28, 2025 •

edited

Loading

github-actions bot commented Feb 28, 2025 •

edited

Loading

`ruff-ecosystem` results

mishamsk commented Feb 28, 2025 •

edited

Loading