config
Configuration of the dataset
config.load
Loading configuration functions
load_yaml_configuration
def load_yaml_configuration(filename: Union[str, Pathy]) -> Configuration
Load a yaml file which has a configuration in it
Arguments:
filename
- the file name that you want to load. Will load from local, AWS, or GCP depending on the protocol prefix (e.g. 's3://bucket/config.yaml').
Returns: pydantic Configuration object
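The prefix-based dispatch described above can be illustrated with a minimal sketch. The helper detect_storage_backend is hypothetical and not part of this package; it only shows how a filename's protocol prefix might be mapped to a storage backend, assuming that anything without a recognised prefix is treated as local.

```python
def detect_storage_backend(filename: str) -> str:
    """Hypothetical helper: classify a path by its protocol prefix."""
    if filename.startswith("s3://"):
        return "aws"
    if filename.startswith("gs://"):
        return "gcp"
    # Assumption: any path without a recognised prefix is a local path.
    return "local"


for name in ["s3://bucket/config.yaml", "gs://bucket/config.yaml", "/data/config.yaml"]:
    print(name, "->", detect_storage_backend(name))
```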
config.save
Save functions for the configuration model
save_yaml_configuration
def save_yaml_configuration(configuration: Configuration, filename: Optional[Union[str, Pathy]] = None)
Save a yaml file which has a configuration in it.
If filename is None, saves to configuration.output_data.filepath / configuration.yaml.
Will save to GCP, AWS, or local, depending on the protocol prefix of the path.
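The default-filename rule can be sketched as follows. resolve_save_path is a hypothetical helper, not the package's own code; it works on plain strings so that a protocol prefix such as 'gs://' is preserved unchanged.

```python
from typing import Optional


def resolve_save_path(output_filepath: str, filename: Optional[str] = None) -> str:
    """Hypothetical helper: pick the path that the configuration is saved to."""
    if filename is not None:
        # An explicit filename always wins.
        return filename
    # Documented default: <output_data.filepath>/configuration.yaml
    return output_filepath.rstrip("/") + "/configuration.yaml"


print(resolve_save_path("gs://bucket/dataset"))
print(resolve_save_path("/data/dataset", "/tmp/custom.yaml"))
```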
config.model
Configuration model for the dataset.
All paths must include the protocol prefix. For local files, it's sufficient to just start with a '/'. For AWS, start with 's3://'; for GCP, start with 'gs://'.
This file is mostly about configuring the DataSources.
Separate Pydantic models in
nowcasting_dataset/data_sources/<data_source_name>/<data_source_name>_model.py
are used to validate the values of the data itself.
General Objects
class General(BaseModel)
General pydantic model
Git Objects
class Git(BaseModel)
Git model
DataSourceMixin Objects
class DataSourceMixin(BaseModel)
Mixin class, to add forecast and history minutes
seq_length_30_minutes
@property
def seq_length_30_minutes()
The number of steps in 30-minute datasets
seq_length_5_minutes
@property
def seq_length_5_minutes()
The number of steps in 5-minute datasets
seq_length_60_minutes
@property
def seq_length_60_minutes()
The number of steps in 60-minute datasets
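As an illustration of how such a step count might be derived, here is a minimal sketch. The class WindowSketch and the formula are assumptions, not taken from this reference: the sketch counts every step in the combined history and forecast window, plus one for the current time.

```python
from dataclasses import dataclass


@dataclass
class WindowSketch:
    """Hypothetical stand-in for a data source with a history/forecast window."""

    history_minutes: int
    forecast_minutes: int

    def seq_length(self, step_minutes: int) -> int:
        # Assumption: steps across the whole window, plus one for t0 itself.
        return (self.history_minutes + self.forecast_minutes) // step_minutes + 1


window = WindowSketch(history_minutes=30, forecast_minutes=60)
print(window.seq_length(5), window.seq_length(30))
```

Under this assumption, a window of 30 minutes of history and 60 minutes of forecast gives 19 steps at 5-minute resolution and 4 steps at 30-minute resolution.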
PV Objects
class PV(DataSourceMixin)
PV configuration model
Satellite Objects
class Satellite(DataSourceMixin)
Satellite configuration model
HRVSatellite Objects
class HRVSatellite(DataSourceMixin)
Satellite configuration model for HRV data
NWP Objects
class NWP(DataSourceMixin)
NWP configuration model
GSP Objects
class GSP(DataSourceMixin)
GSP configuration model
history_minutes_divide_by_30
@validator("history_minutes")
def history_minutes_divide_by_30(cls, v)
Validate 'history_minutes'
forecast_minutes_divide_by_30
@validator("forecast_minutes")
def forecast_minutes_divide_by_30(cls, v)
Validate 'forecast_minutes'
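Judging by the validator names, both checks require the value to be a multiple of 30. The sketch below shows that check on its own, outside pydantic; check_divisible_by_30 and its error message are hypothetical.

```python
def check_divisible_by_30(value: int, field_name: str) -> int:
    """Hypothetical check: reject values that are not a multiple of 30."""
    if value % 30 != 0:
        raise ValueError(f"{field_name} must be divisible by 30, got {value}")
    # Like a validator, return the (unchanged) value when it is acceptable.
    return value


print(check_divisible_by_30(60, "history_minutes"))
```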
Topographic Objects
class Topographic(DataSourceMixin)
Topographic configuration model
Sun Objects
class Sun(DataSourceMixin)
Sun configuration model
InputData Objects
class InputData(BaseModel)
Input data model.
default_seq_length_5_minutes
@property
def default_seq_length_5_minutes()
The number of steps in 5-minute datasets
set_forecast_and_history_minutes
@root_validator
def set_forecast_and_history_minutes(cls, values)
Set default history and forecast values, if needed.
Run through the different data sources and, if the forecast or history minutes are not set, set them to the default values.
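The default-filling step can be sketched with plain dictionaries. fill_window_defaults, the dictionary layout, and the parameter names are hypothetical; the actual root validator operates on the pydantic model's values.

```python
from typing import Dict, Optional

Settings = Dict[str, Optional[int]]


def fill_window_defaults(
    sources: Dict[str, Settings],
    default_history_minutes: int,
    default_forecast_minutes: int,
) -> Dict[str, Settings]:
    """Hypothetical helper: fill unset history/forecast minutes per data source."""
    filled = {}
    for name, settings in sources.items():
        updated = dict(settings)  # copy, so the input is left untouched
        if updated.get("history_minutes") is None:
            updated["history_minutes"] = default_history_minutes
        if updated.get("forecast_minutes") is None:
            updated["forecast_minutes"] = default_forecast_minutes
        filled[name] = updated
    return filled


sources = {"pv": {"history_minutes": 60}, "nwp": {}}
print(fill_window_defaults(sources, 30, 120))
```

Values that a data source sets explicitly are kept; only missing ones fall back to the defaults.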
set_all_to_defaults
@classmethod
def set_all_to_defaults(cls)
Returns an InputData instance with all fields set to their default values.
Used for unittests.
OutputData Objects
class OutputData(BaseModel)
Output data model
filepath_pathy
@validator("filepath")
def filepath_pathy(cls, v)
Make sure filepath is a Pathy object
Process Objects
class Process(BaseModel)
Pydantic model of how the data is processed
local_temp_path_to_path_object_expanduser
@validator("local_temp_path")
def local_temp_path_to_path_object_expanduser(cls, v)
Convert the path in string format to a pathlib.PosixPath object and call expanduser on the latter.
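The conversion can be sketched with the standard library alone. to_expanded_path is a hypothetical name; pathlib.Path and Path.expanduser are standard-library calls.

```python
from pathlib import Path


def to_expanded_path(value: str) -> Path:
    """Hypothetical helper: string -> Path, with a leading '~' expanded."""
    # Path(...) yields a PosixPath on POSIX systems (a WindowsPath on Windows).
    return Path(value).expanduser()


print(to_expanded_path("~/temp"))
print(to_expanded_path("/tmp/temp"))
```

Paths that do not start with '~' are returned unchanged apart from the conversion to a Path object.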
Configuration Objects
class Configuration(BaseModel)
Configuration model for the dataset
set_base_path
def set_base_path(base_path: str)
Append base_path to all paths. Mostly used for testing.
set_git_commit
def set_git_commit(configuration: Configuration)
Set the git information in the configuration file
Arguments:
configuration
- configuration object
Returns: configuration object with git information