-
-
Notifications
You must be signed in to change notification settings - Fork 18.9k
DOC: definition of a scalar in rich comparison methods #62191
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
DOC: definition of a scalar in rich comparison methods #62191
Conversation
Anything can be an element of a series. |
Hi @jbrockmendel, thank you for the quick review. The original docs can potentially imply that only scalars recognised by Pandas can be accepted as arguments in This is not true. An Enum, for example, fails the This PR aims to remove the potential misinterpretation of the original documentation. |
pandas/core/series.py
Outdated
@@ -6072,7 +6072,7 @@ def eq( | |||
|
|||
Parameters | |||
---------- | |||
other : Series or scalar value | |||
other : Series or scalar value (that can be the element of a Series) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The parentheses added here do not help the issue since any Python object can be an element of a Series (even a Series itself can be an element of a Series!). We want to indicate that this method treats lists, tuples, ndarrays, and Series differently from any other Python object. Namely, these types will be evaluated element-by-element whereas every other Python object will be evaluated as a whole against each element of the caller.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
288f06e does it look better?
@cmp0xff - see #62063 (comment) |
Hi, I adapted the new idea and added a test. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good!
|
||
|
||
def test_eq_objects(): | ||
"""Test eq with Enum and List elements""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you make this a comment; our tests (mostly) do not utilize docstrings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also add a reference to the PR as a comment here:
# GH#62191
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SECOND = auto() | ||
|
||
left = pd.Series([Thing.FIRST, Thing.SECOND]) | ||
tm.assert_series_equal(left.eq(Thing.FIRST), left == Thing.FIRST) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you setup the asserts in the following pattern:
expected = ...
result = ...
tm.assert_series_equal(result, expected)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
left_non_scalar = pd.Series([[1, 2], [3, 4]]) | ||
with pytest.raises(AssertionError): | ||
tm.assert_series_equal(left_non_scalar.eq([1, 2]), pd.Series([True, False])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the purpose of checking that the result does not equal?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was discussed that eq
does not work properly when the elements of a Series
is a list
or other array-like objects. I wanted to verify this point, and show that the result is not as one would expected.
If this is not necessary, I can remove it.
match = r"Can only compare identically-labeled Series objects" | ||
with pytest.raises(ValueError, match=match): | ||
_ = left == pd_s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test does not belong here and is already tested for elsewhere; can you remove.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tm.assert_series_equal(left_non_scalar.eq([1, 2]), pd.Series([True, False])) | ||
|
||
|
||
def test_eq_with_index(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same requests with this test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean removing the whole test test_eq_with_index
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the requests in the test above also to be done in this test as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @rhshadrach , from the context I am not too sure which lines to be remove. Could you be more specific and quote the line numbers? Thanks.
I have checked
left_non_scalar = pd.Series([[1, 2], [3, 4]])
result_non_scalar = left_non_scalar.eq([1, 2])
expected_would_be = pd.Series([True, False])
with pytest.raises(AssertionError):
tm.assert_series_equal(result_non_scalar, expected_would_be)
which seems to have analogy with the lines I removed earlier. That seems to be a new test to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the review 😄
tm.assert_series_equal(left_non_scalar.eq([1, 2]), pd.Series([True, False])) | ||
|
||
|
||
def test_eq_with_index(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean removing the whole test test_eq_with_index
?
left_non_scalar = pd.Series([[1, 2], [3, 4]]) | ||
with pytest.raises(AssertionError): | ||
tm.assert_series_equal(left_non_scalar.eq([1, 2]), pd.Series([True, False])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was discussed that eq
does not work properly when the elements of a Series
is a list
or other array-like objects. I wanted to verify this point, and show that the result is not as one would expected.
If this is not necessary, I can remove it.
|
||
|
||
def test_eq_objects(): | ||
"""Test eq with Enum and List elements""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SECOND = auto() | ||
|
||
left = pd.Series([Thing.FIRST, Thing.SECOND]) | ||
tm.assert_series_equal(left.eq(Thing.FIRST), left == Thing.FIRST) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
match = r"Can only compare identically-labeled Series objects" | ||
with pytest.raises(ValueError, match=match): | ||
_ = left == pd_s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was inspired by #62063 (comment) and will relax the philosophical tension that caused the closure of pandas-dev/pandas-stubs#1288.