data_sources.satellite

Satellite data sources and functions

data_sources.satellite.satellite_model

Model for output of satellite data

Satellite Objects

class Satellite(DataSourceOutput)

Class to store satellite data as an xr.Dataset, with some validation

model_validation

@classmethod
def model_validation(cls, v)

Check that all values are non-negative
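The check described above can be illustrated with a minimal sketch. It uses plain nested lists rather than an xr.Dataset, and the class name NonNegativeGrid and the error message are hypothetical, so treat it as an illustration of the idea rather than the actual validator.

```python
from typing import Sequence


class NonNegativeGrid:
    """Hypothetical stand-in for a validated data container."""

    def __init__(self, rows: Sequence[Sequence[float]]):
        # Validate on construction and keep the data only if it passes.
        self.rows = self.model_validation(rows)

    @classmethod
    def model_validation(cls, v):
        # Reject the data if any single value is below zero.
        for row in v:
            for value in row:
                if value < 0:
                    raise ValueError(f"found negative value: {value}")
        return v


grid = NonNegativeGrid([[0.0, 1.5], [2.0, 3.0]])
```

Passing data that contains a negative value, for example NonNegativeGrid([[1.0, -0.1]]), raises a ValueError in this sketch.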

SatelliteML Objects

class SatelliteML(DataSourceOutputML)

Model for output of satellite data

fake

@staticmethod
def fake(batch_size=32, seq_length_5=19, satellite_image_size_pixels=64, number_sat_channels=7, time_5=None)

Create fake data

get_datetime_index

def get_datetime_index() -> Array

Get the datetime index of this data

from_xr_dataset

@staticmethod
def from_xr_dataset(xr_dataset: xr.Dataset)

Convert an xr.Dataset into this model.

data_sources.satellite.satellite_data_source

Satellite Data Source

SatelliteDataSource Objects

@dataclass
class SatelliteDataSource(ZarrDataSource)

Satellite Data Source

filename: Must start with 'gs://' if on GCP.

__post_init__

def __post_init__(image_size_pixels: int, meters_per_pixel: int)

Post Init

open

def open() -> None

Open Satellite data

open_sat_data is deliberately not called during initialisation: if it were, SatelliteDataSource instances could not be copied into separate processes. Instead, call open() after the separate processes have been created.
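The deferred-open pattern described above might look roughly like the following sketch. LazySource, its _data attribute and the filename "satellite.zarr" are hypothetical, and copy.deepcopy stands in for the copy that is made when an instance is handed to a worker process.

```python
import copy
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LazySource:
    """Hypothetical data source that defers opening its store."""

    filename: str
    # The handle is created by open(), never during initialisation.
    _data: Optional[dict] = field(default=None, init=False, repr=False)

    def open(self) -> None:
        # Stand-in for an expensive resource such as an open store handle.
        self._data = {"source": self.filename}

    @property
    def is_open(self) -> bool:
        return self._data is not None


source = LazySource(filename="satellite.zarr")

# Copying is cheap and safe here because nothing has been opened yet.
worker_copy = copy.deepcopy(source)

# Each worker opens its own handle after it has received its copy.
worker_copy.open()
```

In this sketch the original instance stays unopened, so it can be copied again for further workers.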

get_batch

def get_batch(t0_datetimes: pd.DatetimeIndex, x_locations: Iterable[Number], y_locations: Iterable[Number]) -> Satellite

Get batch data

Loads the first _n_timesteps_per_batch concurrently, which reads those timesteps from disk and fills the cache. Loading all examples concurrently would make SatelliteDataSource read from empty caches, which is much slower.

Arguments:

  • t0_datetimes - list of t0 timestamps for the batch. The batch also includes historic and future data, depending on 'history_minutes' and 'future_minutes'.
  • x_locations - x center batch locations
  • y_locations - y center batch locations

  • Returns - Batch data, as a Satellite object
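To make the effect of 'history_minutes' and 'future_minutes' concrete, here is a minimal sketch of the time window around a single t0. The helper example_time_window is hypothetical, and both the 5-minute step and the inclusive endpoints are assumptions rather than documented behaviour.

```python
from datetime import datetime, timedelta
from typing import List


def example_time_window(
    t0: datetime,
    history_minutes: int,
    future_minutes: int,
    step_minutes: int = 5,
) -> List[datetime]:
    """Hypothetical helper: timestamps from t0 - history to t0 + future."""
    start = t0 - timedelta(minutes=history_minutes)
    # Count both endpoints, hence the + 1.
    n_steps = (history_minutes + future_minutes) // step_minutes + 1
    return [start + timedelta(minutes=step_minutes * i) for i in range(n_steps)]


window = example_time_window(
    datetime(2021, 6, 1, 12, 0), history_minutes=30, future_minutes=60
)
print(window[0], window[-1], len(window))
```

Under these assumptions, a t0 of 12:00 with 30 minutes of history and 60 minutes of future spans 11:30 to 13:00.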

datetime_index

def datetime_index(remove_night: bool = True) -> pd.DatetimeIndex

Returns a complete list of all available datetimes

Arguments:

  • remove_night - If True then remove datetimes at night.
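The source does not say how night is determined. Purely as an illustration of the filtering step, the sketch below assumes a fixed window of daytime hours; the helper example_remove_night and its hour thresholds are hypothetical and the real criterion may be quite different.

```python
from datetime import datetime
from typing import List


def example_remove_night(
    datetimes: List[datetime],
    first_day_hour: int = 6,
    last_day_hour: int = 18,
) -> List[datetime]:
    """Hypothetical filter: keep first_day_hour <= hour < last_day_hour."""
    return [dt for dt in datetimes if first_day_hour <= dt.hour < last_day_hour]


all_times = [
    datetime(2021, 6, 1, 3, 0),   # assumed night
    datetime(2021, 6, 1, 12, 0),  # assumed day
    datetime(2021, 6, 1, 22, 0),  # assumed night
]
day_times = example_remove_night(all_times)
```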

open_sat_data

def open_sat_data(filename: str, consolidated: bool) -> xr.DataArray

Lazily opens the Zarr store.

Adds 1 minute to the 'time' coordinates, so the timestamps are at 00, 05, ..., 55 past the hour.

Arguments:

  • filename - Cloud URL or local path. If GCP URL, must start with 'gs://'
  • consolidated - Whether or not the Zarr metadata is consolidated.
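The one-minute shift described above can be sketched with the standard library alone. The raw timestamps below are hypothetical and are assumed to sit one minute before each 5-minute mark; the real function applies the shift to the 'time' coordinates of the opened data, not to a Python list.

```python
from datetime import datetime, timedelta

# Hypothetical raw timestamps, one minute before each 5-minute mark.
raw_times = [datetime(2021, 6, 1, 11, 59) + timedelta(minutes=5 * i) for i in range(3)]

# Add one minute so every timestamp lands on a multiple of five minutes.
shifted_times = [t + timedelta(minutes=1) for t in raw_times]

print([t.strftime("%H:%M") for t in shifted_times])
```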