doc/datasets_and_dataloaders.md (+8 -53)
@@ -8,6 +8,7 @@ They are implemented in `torch_em.data.datasets`. See `scripts/datasets` for exa

 All datasets in `torch_em.data.datasets` are implemented according to the following logic:
 - The function `get_..._data` downloads the respective datasets. Note that some datasets cannot be downloaded automatically. In these cases the function will raise an error with a message that explains how to download the data.
+- The function `get_..._paths` returns the filepaths to the downloaded inputs.
 - The function `get_..._dataset` returns the PyTorch Dataset for the corresponding dataset.
 - The function `get_..._dataloader` returns the PyTorch DataLoader for the corresponding dataset.

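The four-function layering listed in this hunk can be illustrated with a minimal, self-contained sketch. It deliberately does not import `torch_em`: the `toy` dataset, its file layout, and every `get_toy_*` helper are hypothetical stand-ins, a plain Python class stands in for a PyTorch `Dataset`, and a generator stands in for a `DataLoader`. The chaining, where each helper calls the previous one, is also an assumption of this sketch rather than something the diff states.

```python
import os
import tempfile


def get_toy_data(path):
    """Stand-in for `get_..._data`: 'downloads' the data (here: writes three tiny files)."""
    os.makedirs(path, exist_ok=True)
    for i in range(3):
        with open(os.path.join(path, f"image_{i}.txt"), "w") as f:
            f.write(str(i))
    return path


def get_toy_paths(path):
    """Stand-in for `get_..._paths`: returns the sorted file paths of the downloaded inputs."""
    data_dir = get_toy_data(path)
    return sorted(os.path.join(data_dir, name) for name in os.listdir(data_dir))


class ToyDataset:
    """Map-style dataset exposing __len__ and __getitem__, like a PyTorch Dataset."""

    def __init__(self, paths):
        self.paths = paths

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, index):
        with open(self.paths[index]) as f:
            return int(f.read())


def get_toy_dataset(path):
    """Stand-in for `get_..._dataset`: wraps the file paths in a dataset object."""
    return ToyDataset(get_toy_paths(path))


def get_toy_dataloader(path, batch_size):
    """Stand-in for `get_..._dataloader`: yields lists of up to `batch_size` samples."""
    dataset = get_toy_dataset(path)

    def iterate_batches():
        for start in range(0, len(dataset), batch_size):
            stop = min(start + batch_size, len(dataset))
            yield [dataset[i] for i in range(start, stop)]

    return iterate_batches()


# Consume the loader while the temporary directory still exists.
with tempfile.TemporaryDirectory() as tmp:
    batches = list(get_toy_dataloader(os.path.join(tmp, "toy"), batch_size=2))
```

In this sketch a caller only needs the last helper; the download, path lookup, and dataset construction happen underneath it.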
@@ -21,11 +22,11 @@ We provide several electron microscopy datasets. See `torch_em.data.datasets.ele

 ### Histopathology

-`torch_em.data.datasets.histopathology`
+We provide several histopathology datasets. See `torch_em.data.datasets.histopathology` for an overview.

 ### Medical Imaging

-`torch_em.data.datasets.medical`
+We provide several medical imaging datasets. See `torch_em.data.datasets.medical` for an overview.


 ## How to create your own dataloader?
@@ -48,9 +49,11 @@ Let's say you have a specific dataset of interest and would want to create a PyT
 - ✅ different sizes (i.e. images have shapes like (256, 256), (378, 378), (512, 512), etc., for example)
 - use `ImageCollectionDataset`
 - Multi-channel inputs of:
--> NOTE: It's important to convert the images to be channels first (see above for the expected format)
-- ✅ same size (i.e. all images have shape (3, 256, 256), for example)
-- use `SegmentationDataset` (recommended) or `ImageCollectionDataset`
+-> The ideal expectation of inputs with channels is to have channels first (e.g. RGB format -> (256, 256, 3) to channels-first format -> (3, 256, 256))
+- CASE 1: I would like to keep the inputs as RGB format (you must stick to `ImageCollectionDataset` or `is_seg_dataset=False`)
+- CASE 2: I would like to convert the inputs to channels-first (you can be flexible and follow the instructions below)
+- ✅ same size (i.e. all images have shape)
+- use `SegmentationDataset` (recommended for inputs with channels first) or `ImageCollectionDataset` (for inputs in RGB format)
 - ✅ different sizes (i.e. images have shapes like (3, 256, 256), (3, 378, 378), (3, 512, 512), etc., for example)
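The channels-first conversion described in this hunk (CASE 2) can be sketched with NumPy. This is a minimal illustration, not part of the diff: the array `rgb_image` is a hypothetical example input, and `np.moveaxis` is one of several ways to reorder the axes.

```python
import numpy as np

# Hypothetical RGB image in channels-last layout: (height, width, channels).
rgb_image = np.zeros((256, 256, 3), dtype=np.uint8)
rgb_image[..., 0] = 255  # fill the red channel so the reordering can be checked

# Move the channel axis to the front: (256, 256, 3) -> (3, 256, 256).
channels_first = np.moveaxis(rgb_image, -1, 0)
```

After the conversion, `channels_first[0]` holds the red channel as a (256, 256) plane. `np.moveaxis` returns a view where possible, so no pixel data is copied.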
 # there are other optional parameters, see `torch_em.data.raw_image_collection_dataset.py` for details.
 )
 ```
-
-
-<!-- I would rather auto-generate this in a proper pdoc documentation.
-- ASEM (`asem.py`): Segmentation of organelles in FIB-SEM cells.
-- AxonDeepSeg (`axondeepseg.py`): Segmentation of myelinated axons in electron microscopy.
-- MitoLab* (`cem.py`):
-- CEM MitoLab: Segmentation of mitochondria in electron microscopy.
-- CEM Mito Benchmark: Segmentation of mitochondria in 7 benchmark electron microscopy datasets.
-- Covid IF (`covidif.py`): Segmentation of cells and nuclei in immunofluorescence.
-- CREMI (`cremi.py`): Segmentation of neurons in electron microscopy.
-- Cell Tracking Challenge (`ctc.py`): Segmentation data for cell tracking challenge (consists of 10 datasets).
-- DeepBacs (`deepbacs.py`): Segmentation of bacteria in light microscopy.
-- DSB (`dsb.py`): Segmentation of nuclei in light microscopy.
-- DynamicNuclearNet* (`dynamicnuclearnet.py`): Segmentation of nuclei in fluorescence microscopy.
-- HPA (`hpa.py`): Segmentation of cells in light microscopy.
-- ISBI (`isbi2012.py`): Segmentation of neurons in electron microscopy.
-- Kasthuri (`kasthuri.py`): Segmentation of mitochondria in electron microscopy.
-- LIVECell (`livecell.py`): Segmentation of cells in phase-contrast microscopy.
-- Lucchi (`lucchi.py`): Segmentation of mitochondria in electron microscopy.
-- MitoEM (`mitoem.py`): Segmentation of mitochondria in electron microscopy.
-- Mouse Embryo (`mouse_embryo.py`): Segmentation of nuclei in confocal microscopy.
-- NeurIPS CellSeg (`neurips_cell_seg.py`): Segmentation of cells in multi-modality light microscopy datasets.
-- NucMM (`nuc_mm.py`): Segmentation of nuclei in electron microscopy and micro-CT.
-- PlantSeg (`plantseg.py`): Segmentation of cells in confocal and light-sheet microscopy.
-- Platynereis (`platynereis.py`): Segmentation of nuclei in electron microscopy.
-- PNAS* (`pnas_arabidopsis.py`): TODO
-- SNEMI (`snemi.py`): Segmentation of neurons in electron microscopy.
-- Sponge EM (`sponge_em.py`): Segmentation of sponge cells and organelles in electron microscopy.
-- TissueNet* (`tissuenet.py`): Segmentation of cells in tissue imaged with light microscopy.
-- UroCell (`uro_cell.py`): Segmentation of mitochondria and other organelles in electron microscopy.
-- VNC (`vnc.py`): Segmentation of mitochondria in electron microscopy.
-
-### Histopathology
-
-- BCSS (`bcss.py`): Segmentation of breast cancer tissue in histopathology.
-- Lizard* (`lizard.py`): Segmentation of nuclei in histopathology.
-- MoNuSaC (`monusac.py`): Segmentation of multi-organ nuclei in histopathology.
-- MoNuSeg (`monuseg.py`): Segmentation of multi-organ nuclei in histopathology.
-- PanNuke (`pannuke.py`): Segmentation of nuclei in histopathology.
-
-### Medical Imaging
-
-- AutoPET* (`medical/autopet.py`): Segmentation of lesions in whole-body FDG-PET/CT.
-- BTCV* (`medical/btcv.py`): Segmentation of multiple organs in CT.
-
-### NOTE:
-- \* - These datasets cannot be used out of the box (mostly because of missing automatic downloading). Please take a look at the scripts and the dataset object for details.