When an entity in a Doc lies in the second-to-last sentence and the last sentence consists of only one token, entity.sents includes that final one-token sentence, even though the entity is fully contained in the previous sentence.
How to reproduce the behaviour
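import spacy

nlp = spacy.blank("en")  # assumption: the report does not say which pipeline was used; only the tokenizer is needed
text = "This is a sentence. This is another sentence. Third"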
doc = nlp.tokenizer(text)
doc[0].is_sent_start = True
doc[5].is_sent_start = True
doc[10].is_sent_start = True
doc.ents = [('ENTITY', 7, 9)] # "another sentence" phrase in the second sentence
entity = doc.ents[0]
print(f"Entity: {entity}. Sentence: {entity.sent} Sentences: {list(entity.sents)}")
Output:
Entity: another sentence. Sentence: This is another sentence. Sentences: [This is another sentence., Third]
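For comparison, the expected result can be sketched without spaCy as a plain interval-overlap check over the sentence boundaries. This is only an illustration of the behaviour the report expects, not spaCy's implementation; `sentences_covering` and its parameters are hypothetical names introduced here.

```python
def sentences_covering(sent_starts, n_tokens, span_start, span_end):
    """Return (start, end) token ranges of the sentences that overlap
    the half-open span [span_start, span_end)."""
    # Each sentence runs from its start index up to the next start (or the doc end).
    bounds = sorted(sent_starts) + [n_tokens]
    result = []
    for start, end in zip(bounds, bounds[1:]):
        # Two half-open intervals overlap only if each starts before the other ends.
        if start < span_end and end > span_start:
            result.append((start, end))
    return result


# Token layout from the report: sentences start at tokens 0, 5 and 10; 11 tokens in total.
expected = sentences_covering([0, 5, 10], 11, 7, 9)
print(expected)
```

Under this check, the entity at tokens 7 to 9 overlaps only the second sentence (tokens 5 to 10), so the trailing one-token sentence would not be returned.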
Your Environment
Operating System:
Python Version Used:
spaCy Version Used:
Environment Information: