Commit 35eb5fb

Update datasets_and_dataloaders.md (#387)
1 parent 2e34c3e commit 35eb5fb

1 file changed

+8
-53
lines changed

doc/datasets_and_dataloaders.md

@@ -8,6 +8,7 @@ They are implemented in `torch_em.data.datasets`. See `scripts/datasets` for exa

 All datasets in `torch_em.data.datasets` are implemented according to the following logic:
 - The function `get_..._data` downloads the respective datasets. Note that some datasets cannot be downloaded automatically. In these cases the function will raise an error with a message that explains how to download the data.
+- The function `get_..._paths` returns the filepaths to the downloaded inputs.
 - The function `get_..._dataset` returns the PyTorch Dataset for the corresponding dataset.
 - The function `get_..._dataloader` returns the PyTorch DataLoader for the corresponding dataset.

@@ -21,11 +22,11 @@ We provide several electron microscopy datasets. See `torch_em.data.datasets.ele

 ### Histopathology

-`torch_em.data.datasets.histopathology`
+We provide several histopathology datasets. See `torch_em.data.datasets.histopathology` for an overview.

 ### Medical Imaging

-`torch_em.data.datasets.medical`
+We provide several medical imaging datasets. See `torch_em.data.datasets.medical` for an overview.


 ## How to create your own dataloader?
@@ -48,9 +49,11 @@ Let's say you have a specific dataset of interest and would want to create a PyT
 - ✅ different sizes (i.e. images have shapes like (256, 256), (378, 378), (512, 512), etc., for example)
 - use `ImageCollectionDataset`
 - Multi-channel inputs of:
-- > NOTE: It's important to convert the images to be channels first (see above for the expected format)
-- ✅ same size (i.e. all images have shape (3, 256, 256), for example)
-- use `SegmentationDataset` (recommended) or `ImageCollectionDataset`
+- > Inputs with channels are ideally channels-first (e.g. RGB format (256, 256, 3) converted to channels-first format (3, 256, 256))
+- CASE 1: I would like to keep the inputs in RGB format (you must stick to `ImageCollectionDataset` or `is_seg_dataset=False`)
+- CASE 2: I would like to convert the inputs to channels-first (you can be flexible and follow the instructions below)
+- ✅ same size (i.e. all images have the same shape)
+- use `SegmentationDataset` (recommended for inputs with channels first) or `ImageCollectionDataset` (for inputs in RGB format)
 - ✅ different sizes (i.e. images have shapes like (3, 256, 256), (3, 378, 378), (3, 512, 512), etc., for example)
 - use `ImageCollectionDataset`
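
The selection rules for multi-channel inputs in the hunk above can be sketched as a small decision helper. This is only an illustration: `channels_first_shape` and `recommend_dataset_class` are hypothetical names, not part of `torch_em`, and the helper returns class names as plain strings rather than importing the library.

```python
from typing import Sequence, Tuple

Shape = Tuple[int, ...]


def channels_first_shape(shape: Shape) -> Shape:
    """Hypothetical helper: move the last (channel) axis to the front.

    For example, an RGB shape (256, 256, 3) becomes (3, 256, 256).
    """
    return (shape[-1],) + tuple(shape[:-1])


def recommend_dataset_class(shapes: Sequence[Shape], channels_first: bool = True) -> str:
    """Hypothetical helper encoding the rules for multi-channel inputs.

    - Inputs kept in RGB (channels-last) format: `ImageCollectionDataset`.
    - Channels-first inputs that all share one shape: `SegmentationDataset`.
    - Channels-first inputs with different shapes: `ImageCollectionDataset`.
    """
    if not shapes:
        raise ValueError("Expected at least one image shape.")
    if not channels_first:
        # CASE 1: RGB inputs must stay with ImageCollectionDataset.
        return "ImageCollectionDataset"
    # CASE 2: channels-first inputs, decide by whether the shapes agree.
    all_same_size = len({tuple(shape) for shape in shapes}) == 1
    return "SegmentationDataset" if all_same_size else "ImageCollectionDataset"
```

For actual image arrays (as opposed to shapes), the same axis move is commonly done with `numpy.transpose(image, (2, 0, 1))`.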

@@ -135,51 +138,3 @@ dataset = RawImageCollectionDataset(
 # there are other optional parameters, see `torch_em.data.raw_image_collection_dataset.py` for details.
 )
 ```
-
-
-<!-- I would rather auto-generate this in a proper pdoc documentation.
-- ASEM (`asem.py`): Segmentation of organelles in FIB-SEM cells.
-- AxonDeepSeg (`axondeepseg.py`): Segmentation of myelinated axons in electron microscopy.
-- MitoLab* (`cem.py`):
-- CEM MitoLab: Segmentation of mitochondria in electron microscopy.
-- CEM Mito Benchmark: Segmentation of mitochondria in 7 benchmark electron microscopy datasets.
-- Covid IF (`covidif.py`): Segmentation of cells and nuclei in immunofluorescence.
-- CREMI (`cremi.py`): Segmentation of neurons in electron microscopy.
-- Cell Tracking Challenge (`ctc.py`): Segmentation data for cell tracking challenge (consists of 10 datasets).
-- DeepBacs (`deepbacs.py`): Segmentation of bacteria in light microscopy.
-- DSB (`dsb.py`): Segmentation of nuclei in light microscopy.
-- DynamicNuclearNet* (`dynamicnuclearnet.py`): Segmentation of nuclei in fluorescence microscopy.
-- HPA (`hpa.py`): Segmentation of cells in light microscopy.
-- ISBI (`isbi2012.py`): Segmentation of neurons in electron microscopy.
-- Kasthuri (`kasthuri.py`): Segmentation of mitochondria in electron microscopy.
-- LIVECell (`livecell.py`): Segmentation of cells in phase-contrast microscopy.
-- Lucchi (`lucchi.py`): Segmentation of mitochondria in electron microscopy.
-- MitoEM (`mitoem.py`): Segmentation of mitochondria in electron microscopy.
-- Mouse Embryo (`mouse_embryo.py`): Segmentation of nuclei in confocal microscopy.
-- NeurIPS CellSeg (`neurips_cell_seg.py`): Segmentation of cells in multi-modality light microscopy datasets.
-- NucMM (`nuc_mm.py`): Segmentation of nuclei in electron microscopy and micro-CT.
-- PlantSeg (`plantseg.py`): Segmentation of cells in confocal and light-sheet microscopy.
-- Platynereis (`platynereis.py`): Segmentation of nuclei in electron microscopy.
-- PNAS* (`pnas_arabidopsis.py`): TODO
-- SNEMI (`snemi.py`): Segmentation of neurons in electron microscopy.
-- Sponge EM (`sponge_em.py`): Segmentation of sponge cells and organelles in electron microscopy.
-- TissueNet* (`tissuenet.py`): Segmentation of cells in tissue imaged with light microscopy.
-- UroCell (`uro_cell.py`): Segmentation of mitochondria and other organelles in electron microscopy.
-- VNC (`vnc.py`): Segmentation of mitochondria in electron microscopy
-
-### Histopathology
-
-- BCSS (`bcss.py`): Segmentation of breast cancer tissue in histopathology.
-- Lizard* (`lizard.py`): Segmentation of nuclei in histopathology.
-- MoNuSaC (`monusac.py`): Segmentation of multi-organ nuclei in histopathology.
-- MoNuSeg (`monuseg.py`): Segmentation of multi-organ nuclei in histopathology.
-- PanNuke (`pannuke.py`): Segmentation of nuclei in histopathology.
-
-### Medical Imaging
-
-- AutoPET* (`medical/autopet.py`): Segmentation of lesions in whole-body FDG-PET/CT.
-- BTCV* (`medical/btcv.py`): Segmentation of multiple organs in CT.
-
-### NOTE:
-- \* - These datasets cannot be used out of the box (mostly because of missing automatic downloading). Please take a look at the scripts and the dataset object for details.
--->
