
Satellite data sources and functions


Model for output of satellite data

Satellite Objects

class Satellite(DataSourceOutput)

Class to store satellite data as an xr.Dataset, with some validation


def model_validation(cls, v)

Check that all values are non-negative
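
A minimal sketch of what such a check could look like, assuming the values can be viewed as a NumPy array. The helper name `check_non_negative` is hypothetical and not part of the library; the real validator operates on the model's own data.

```python
import numpy as np


def check_non_negative(values, name="data"):
    """Raise ValueError if any element of `values` is negative (hypothetical helper)."""
    arr = np.asarray(values)
    # Comparisons with NaN evaluate to False, so NaNs are not flagged here.
    if (arr < 0).any():
        raise ValueError(f"{name} contains negative values (min={arr.min()})")
    return values
```

Returning the input unchanged lets the function be used as a pass-through validator.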

SatelliteML Objects

class SatelliteML(DataSourceOutputML)

Model for output of satellite data


def fake(batch_size=32, seq_length_5=19, satellite_image_size_pixels=64, number_sat_channels=7, time_5=None)

Create fake data


def get_datetime_index() -> Array

Get the datetime index of this data


def from_xr_dataset(xr_dataset: xr.Dataset)

Convert an xr.Dataset to this model.


Satellite Data Source

SatelliteDataSource Objects

class SatelliteDataSource(ZarrDataSource)

Satellite Data Source

filename: Must start with 'gs://' if on GCP.


def __post_init__(image_size_pixels: int, meters_per_pixel: int)

Post Init


def open() -> None

Open Satellite data

We don't want to call open_sat_data in __init__. If we did, we couldn't copy SatelliteDataSource instances into separate processes. Instead, call open() after creating the separate processes.
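
The deferred-open pattern described above can be sketched as follows. This is an illustration only: `LazySource` and the path are hypothetical stand-ins, and `copy.deepcopy` stands in for the copying that happens when instances are handed to worker processes.

```python
import copy
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class LazySource:
    """Hypothetical stand-in for a data source that defers opening its store."""

    filename: str
    # Not opened at construction, so the instance stays cheap and safe to copy.
    _data: Optional[Any] = field(default=None, init=False, repr=False)

    def open(self) -> None:
        # Placeholder for the real (lazy) open call; here we only record the path.
        self._data = {"opened_from": self.filename}

    @property
    def data(self) -> Any:
        if self._data is None:
            raise RuntimeError("Call open() before accessing data")
        return self._data


source = LazySource(filename="gs://bucket/satellite.zarr")  # hypothetical path
# The unopened instance is copied first (as it would be for each worker)...
worker_copy = copy.deepcopy(source)
# ...and each copy then opens its own handle.
worker_copy.open()
```

Keeping the handle out of the constructor means the original instance never holds an open store that would have to be shared or serialised.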


def get_batch(t0_datetimes: pd.DatetimeIndex, x_locations: Iterable[Number], y_locations: Iterable[Number]) -> Satellite

Get batch data

Load the first _n_timesteps_per_batch concurrently. This loads the timesteps from disk concurrently, and fills the cache. If we try loading all examples concurrently, then SatelliteDataSource will try reading from empty caches, and things are much slower!


  • t0_datetimes - the t0 timestamps of the examples in the batch. The batch also includes historic and future data around each t0, as set by 'history_minutes' and 'future_minutes'.
  • x_locations - x center batch locations
  • y_locations - y center batch locations

  • Returns - Batch data


def datetime_index(remove_night: bool = True) -> pd.DatetimeIndex

Returns a complete list of all available datetimes


  • remove_night - If True, remove datetimes at night.


def open_sat_data(filename: str, consolidated: bool) -> xr.DataArray

Lazily opens the Zarr store.

Adds 1 minute to the 'time' coordinates, so the timestamps are at 00, 05, ..., 55 past the hour.


  • filename - Cloud URL or local path. If GCP URL, must start with 'gs://'
  • consolidated - Whether or not the Zarr metadata is consolidated.
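
The one-minute shift can be illustrated with plain `datetime` objects. The raw timestamps below are made up for the example, and the real function applies the shift to the 'time' coordinate of the opened data rather than to a Python list.

```python
from datetime import datetime, timedelta

# Hypothetical raw timestamps, each one minute before a 5-minute mark.
raw_times = [
    datetime(2021, 1, 1, 11, 59),
    datetime(2021, 1, 1, 12, 4),
    datetime(2021, 1, 1, 12, 9),
]

# Adding one minute aligns them to 00, 05, 10, ... past the hour.
shifted = [t + timedelta(minutes=1) for t in raw_times]
```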