-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[red-knot] treat annotated assignments without RHS in stubs as bindings #16409
[red-knot] treat annotated assignments without RHS in stubs as bindings #16409
Conversation
@carljm need your help here before I mark this for review. I've implemented the change, however I wanted to make sure that I fully understand what you meant in the original issue. 1. The mention of Protocol case You mentioned Running mypy on a subset of
and knot reports roughly equivalent diagnostic. Which seems legit to me. What am I missing? What was the case you were thinking about? Check out the new test I've added, that's the only thing I could come up with. 2. Class/Instance variables The result of this change will be invisible unless/until access to unbound class/instance variables will start emitting diagnostics. I assume it wasn't the goal of this issue/PR to tackle the re-design of symbol boundness classification. Am I right? |
CodSpeed Performance ReportMerging #16409 will improve performances by 5.09%Comparing Summary
Benchmarks breakdown
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@carljm need your help here before I mark this for review.
Oops, once I was here I kinda forgot it wasn't marked ready for review, and I went ahead and looked it over and commented; sorry if some of my comments were things you already knew about and planned to change.
FWIW the code changes here look great to me! My inline comments are only about the tests. So all in all I think this PR is very close to ready for merge.
1. The mention of Protocol case
You mentioned
Protocol
as an instance of a special form in non-annotation position that will not be properly handled by knot. I was very surprised to see how it is typed in the typeshed. It doesn't match the actual implementation. In the implementationSpecialForm
is not used as a base class, moreover it explicitly defines__mro_entires__
that will raise if one attempts to subclass it. It is used to mark special forms used only in annotation position.Protocol
on the other hand is a proper subclass ofGeneric
and can be used as a base class.
_SpecialForm
is used to annotate symbols in typeshed that type checkers cannot handle according to the "normal" rules, and will need to special case. Protocol
is one of these symbols. The annotation of _SpecialForm
is not intended to suggest that type-checkers should handle Protocol
as if it were literally an instance of _SpecialForm
at runtime. Instead it just means "you're on your own, this is going to need special-casing directly in the type checker."
You can look at our KnownClass
struct (and its usage) for how we handle special-casing of many of these special forms (as well as other builtin types) already. We don't yet recognize or support typing.Protocol
in KnownClass
, so we currently just infer it as an instance of _SpecialForm
, but this is something that will have to change for us to add any support for protocols.
It's kind of confusing that I mentioned Protocol
in the issue description even though we don't have support for it yet; sorry about that. I think what happened is @sharkdp tried adding minimal recognition of Protocol
in an earlier PR, and ran into the bug you are fixing in this PR, and because of that bug we never landed the Protocol
recognition.
You can already see this bug with Protocol
:
from typing import SupportsInt
reveal_type(SupportsInt.__mro__)
On main this reveals tuple[Literal[SupportsInt], Unknown, Literal[object]]
. The Unknown
in the middle of that MRO is the bug you are fixing in this PR. It's currently Unknown
because local uses of a name in the same scope consider only bindings, not declarations, and (until this PR) Protocol: _SpecialForm
is not considered a binding, even in a stub file, so we just consider uses of Protocol
in the global scope of typing.pyi
as uses of an unbound name, resulting in the type Unknown
. Of course, if we fix that bug without also adding recognition of Protocol
and support for using it as a base class, then we'd instead just have an error about an instance of _SpecialForm
not being a valid base class, and we'd still end up with Unknown
in the MRO.
All that said, I don't actually think we should add support for Protocol
and a test for this in this PR. Your simpler test with uses_constant = CONSTANT
is testing the same behavior fix in a simpler way, and is IMO sufficient for this PR.
Running mypy on a subset of
typing.pyi
yields something along these lines:ttt.pyi:7: error: Variable "ttt.Protocol" is not valid as a type [valid-type] ttt.pyi:7: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases ttt.pyi:7: error: Invalid base class "Protocol" [misc]
and knot reports roughly equivalent diagnostic. Which seems legit to me.
This kind of an experiment is I think perhaps more confusing than useful, because type checkers (including both us and mypy) special-case many names in typing.pyi
, and do so based on those names occurring in a module named typing
. So the behavior seen when you check that same code in a module not named typing
bears little resemblance to how the same type checker treats the actual typing
module. (I guess this is relevant if you want to know how a type checker would treat the same code without any special-casing, but I don't think that tells us much here.)
2. Class/Instance variables
The result of this change will be invisible unless/until access to unbound class/instance variables will start emitting diagnostics.
No, I think the change is already visible in how we interpret x: int
in a class body in a stub file. Before this PR, we would consider x
to be an instance-only variable (because no binding in class body) and would issue an error if its accessed directly on the class. After this PR, we should consider it a class-or-instance var, and allow access on the class. This change is visible in the removed __module__
false positive in the benchmark, and I've also suggested in my inline comments that we test it explicitly.
I assume it wasn't the goal of this issue/PR to tackle the re-design of symbol boundness classification. Am I right?
Correct, that should be a separate PR.
crates/red_knot_python_semantic/resources/mdtest/stubs/class.md
Outdated
Show resolved
Hide resolved
crates/red_knot_python_semantic/resources/mdtest/stubs/class.md
Outdated
Show resolved
Hide resolved
crates/red_knot_python_semantic/resources/mdtest/stubs/class.md
Outdated
Show resolved
Hide resolved
crates/red_knot_python_semantic/src/semantic_index/definition.rs
Outdated
Show resolved
Hide resolved
The CodSpeed perf results here are really interesting. It's nice to see the win on the incremental benchmark, but I'm not sure I understand where it's coming from! It looks like we also see a 3% regression on the cold-check benchmark. I suspect this is just because we now understand many more name references inside stub files, leading to more complex work than just "consider it unbound and fall back to Unknown". So I don't think we need to do anything about that. Regarding the "missing key" panic, it looks like the file it panics on has just a single line of code, which we end up parsing as an annotated assignment with no RHS and a BinOp annotation, followed by a number of bare Name expressions. One possibility is that somehow this PR is triggering a query cycle analyzing this code. Per the TODO comment in |
On second look, given that the name assigned to is also referenced in the annotation, I'm quite confident that the problem there is a cycle. When looking up bindings for that name before this PR, we'd have found none; after this PR, we find the same binding we are already trying to analyze (because annotations are also "lazy" in stub files, so they don't only look at previous bindings but also subsequent ones.) |
move tests around properly wire both semantic index and inference for stubs
8b46529
to
cb674b8
Compare
@carljm I've applied all suggestions and consolidated tests in stub subfolder. I also did further investigation into the cycle related panic, and looks like I'll need your and maybe @MichaReiser help. What I've learned:
For convenience here is the snippet that causes a cycle: class X:
datetime: datetime what I do not understand is why it happens in this test that uses I tried ignoring the |
yes, I've also noticed this. Magic?;-) A random guess - before that a lack of binding caused back-off logic during symbol search (in |
It's because only these tests assert (as a bug-catching mechanism, via the PullTypes visitor) that every expression has a type recorded for it. In normal type inference we do not spend the time to do a full AST walk and assert this. The cycle itself does not fail (due to the cycle recovery function) but that cycle recovery function doesn't insert a type for every expression, so it causes these coverage tests to panic due to missing types for some expressions. This will be fixed by #14029, so it's OK to just add a lot of new The other option would be to extract just the "fallback type" part of #14029 (not the fixpoint iteration part), where it gives If we do the latter, it would probably be better to do it as a separate PR, so we can see the perf impact of just the "fallback type" feature alone. There may be some ways we can improve it if it's a regression. |
It could be related to this. The improvement shows up in the "incremental" benchmark, not the "cold" benchmark. In the incremental benchmark we make a no-op change to one file (add a comment), so we are not really testing the performance of type inference, we are testing how fast Salsa can re-validate that nothing has changed. So usually improvements here are because we call fewer Salsa queries, or have fewer dependencies between Salsa queries. But it's true that this change could mean many fewer inference queries on scopes in stub files now take on dependencies on a different scope (the global or builtins scopes). This makes sense, because I don't expect that fallback to actually be a lot of extra work (so we don't gain much from this in cold benchmark), but it can generate a lot of extra Salsa query dependencies (so we do gain in the incremental benchmark.) |
|
Got it, thanks. Now it makes sense. I used bare
well it turned out to be just 3 more files. Checked all of them, seems legit, so added to
From what I can see, you've added the fallback type on the struct and it's usage in getter API's that access hasmaps, but if I am not mistaken it is not yet populated with actual fallbacks depending on the context. So I thought it would require more work to finalize. So I think this can be merged as-is. Especially since it is only 4 new exceptions to tests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this is great!!
crates/red_knot_python_semantic/resources/mdtest/stubs/class.md
Outdated
Show resolved
Hide resolved
crates/red_knot_python_semantic/resources/mdtest/stubs/class.md
Outdated
Show resolved
Hide resolved
The only place we want to populate a fallback is when we are creating a But since there are so few failures it's fine to just put them in known failures instead! |
minor changes to comments in tests Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
ah, I think I missed that usage. I was just skimming over on GitHub. I've applied two comment changes suggested by @AlexWaygood . I think it is good to be merged. |
Thanks again! |
Summary
Make a small-ish change, marking all declarations in stubs from annotated assignments as bound, even if there is no RHS. This is done by changing the implementation of
category
, that now takes whether it is in a stub context as an argument. I think this reflects logic encapsulation better then checking for stub context in downstream code, combining the category return with the stub flag.However, current state of red knot would only reveal one behavior change: avoid a very rare (maybe even non-existent) false negative when a stub uses a global variable defined without a default value in the same stub.
With the current symbol boundness inference logic is not changed, unbound class and instance attributes do not emit diagnostics => there is no apparent difference between stubs & regular modules.
Fixes #16264
Test Plan
A couple of new mdtest sections.