Environment info
datasets version: 3.4.1
Platform: macOS-14.6.1-arm64-arm-64bit
Python version: 3.12.9
huggingface_hub version: 0.29.3
PyArrow version: 19.0.1
Pandas version: 2.2.3
fsspec version: 2024.12.0
bruno-hays changed the title from "IterableDataset's state_dict shard_example_idx is always the same" to "IterableDataset's state_dict shard_example_idx is always equal to the number of samples in a shard" on Mar 25, 2025
Describe the bug
I've noticed a strange behaviour with IterableDataset's state_dict: the value of shard_example_idx is always equal to the number of samples in a shard, regardless of how many examples have actually been consumed.
Steps to reproduce the bug
I am reusing the example from the doc
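The original snippet did not survive on this page, so here is a minimal sketch of what such a reproduction might look like. It is not the code from the report: the 6-sample dataset, `num_shards=1`, and stopping at `idx == 2` are assumptions inferred from the numbers quoted under "Expected behavior", and the helper names `find_key` and `checkpoint_state` are hypothetical.

```python
from typing import Any, Iterator


def find_key(state: Any, key: str) -> Iterator[Any]:
    """Yield every value stored under `key` anywhere in a nested dict/list."""
    if isinstance(state, dict):
        for k, v in state.items():
            if k == key:
                yield v
            yield from find_key(v, key)
    elif isinstance(state, list):
        for item in state:
            yield from find_key(item, key)


def checkpoint_state(num_samples: int = 6, num_shards: int = 1, stop_at: int = 2) -> dict:
    """Iterate until the enumerate index reaches `stop_at`, then return the state_dict.

    Assumption: sizes and stop point are guessed from the issue text.
    Requires the `datasets` package, imported lazily so that `find_key`
    stays usable without it.
    """
    from datasets import Dataset

    ds = Dataset.from_dict({"a": range(num_samples)}).to_iterable_dataset(
        num_shards=num_shards
    )
    for idx, _example in enumerate(ds):
        if idx == stop_at:
            break
    # Snapshot taken mid-iteration (or at the end if stop_at was never reached).
    return ds.state_dict()
```

With `datasets` installed, `list(find_key(checkpoint_state(num_shards=1), "shard_example_idx"))` pulls out the reported counter wherever it is nested in the state, and changing `num_shards` to 2 lets you compare against the second case described in this report.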
Returns:
Expected behavior
shard_example_idx should be 2 instead of 6
If we run with num_shards=2, then shard_example_idx is 3 instead of 2 and so on.