Datasets
This folder contains the following files
batch.py
'Batch' pydantic class, to hold batch data in. An 'Example' is one item in the batch. 'BatchML' pydantic class, holds data for a batch, ready for ML models.
datamodule.py
Contains a class NowcastingDataModule - pl.LightningDataModule This handles the - amalgamation of all different data sources, - making valid datetimes across all the sources, - splitting into train and validation datasets
datasets.py
This file contains the following classes
NetCDFDataset - torch.utils.data.Dataset: Use for loading pre-made batches NowcastingDataset - torch.utils.data.IterableDataset: Dataset for making batches
subset.py
Function to subset the 'Batch'
fake.py
A fake dataset, perhaps useful outside this repo.