data_sources.satellite.satellite_data_source

Satellite Data Source

SatelliteDataSource Objects

@dataclass
class SatelliteDataSource(ZarrDataSource)

Satellite Data Source.

__post_init__

def __post_init__(image_size_pixels_height: int, image_size_pixels_width: int,
                  meters_per_pixel: int)

Post Init

sample_period_minutes

@property
def sample_period_minutes() -> int

Override the default sample period, in minutes.

open

def open() -> None

Open Satellite data

We don't want to call open_sat_data in init. If we did, we couldn't copy SatelliteDataSource instances into separate processes. Instead, call open() after creating the separate processes.
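The deferred-open pattern described above can be sketched as follows. This is a minimal illustration, not the library's implementation: `LazySource` and its `_data` field are hypothetical names, a plain dict stands in for the lazily opened Zarr store, and `copy.deepcopy` stands in for the copy that happens when an instance is handed to a worker process.

```python
import copy
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class LazySource:
    """Hypothetical data source that defers opening its data."""

    path: str
    # Not populated in __init__, so a fresh instance stays cheap to copy.
    _data: Optional[Any] = field(default=None, init=False, repr=False)

    def open(self) -> None:
        # Intended to be called inside each worker, after the copy was made.
        # A dict stands in for the opened store (assumption for illustration).
        self._data = {"path": self.path, "opened": True}

    @property
    def is_open(self) -> bool:
        return self._data is not None


source = LazySource(path="example.zarr")
# Copy the unopened instance first (stand-in for sending it to a worker) ...
clone = copy.deepcopy(source)
# ... then open the data only on the copy.
clone.open()
```

The original instance never holds an open handle, so only the copies that call open() carry one.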

get_data_model_for_batch

@staticmethod
def get_data_model_for_batch()

Get the model that is used in the batch

get_spatial_region_of_interest

def get_spatial_region_of_interest(data_array: xr.DataArray,
                                   x_center_osgb: Number,
                                   y_center_osgb: Number) -> xr.DataArray

Gets the satellite image as a square around the center

Ignores the x and y coordinates because, in the original satellite projection, each pixel varies in both its x and y distance from other pixels. See Issue 401 for more details.

As a result, in 'real' spatial terms, each image covers about 2x as much distance in the x direction as in the y direction.

Arguments:

  • data_array - DataArray to subselect from
  • x_center_osgb - Center of the image x coordinate in OSGB coordinates
  • y_center_osgb - Center of image y coordinate in OSGB coordinates

Returns:

The selected data around the center
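A minimal sketch of selecting a square by pixel count rather than by distance, assuming plain nested lists instead of an xr.DataArray and 1-D coordinate axes for locating the center pixel (a simplification). The names `nearest_index`, `select_square` and `size_pixels` are hypothetical and do not come from the library.

```python
from typing import List, Sequence


def nearest_index(coords: Sequence[float], target: float) -> int:
    """Return the index of the coordinate closest to target."""
    return min(range(len(coords)), key=lambda i: abs(coords[i] - target))


def select_square(
    image: List[List[float]],
    x_coords: Sequence[float],
    y_coords: Sequence[float],
    x_center: float,
    y_center: float,
    size_pixels: int,
) -> List[List[float]]:
    """Return a size_pixels x size_pixels window around the nearest pixel."""
    half = size_pixels // 2
    col = nearest_index(x_coords, x_center)
    row = nearest_index(y_coords, y_center)
    # The window is measured in pixels, not in coordinate units.
    left, top = col - half, row - half
    right, bottom = left + size_pixels, top + size_pixels
    if left < 0 or top < 0 or right > len(x_coords) or bottom > len(y_coords):
        raise ValueError("Requested window falls outside the image")
    return [r[left:right] for r in image[top:bottom]]


# Toy 6x6 image whose values encode their own position (row * 10 + col).
image = [[row * 10 + col for col in range(6)] for row in range(6)]
x_coords = [100.0 * i for i in range(6)]          # increasing left to right
y_coords = [500.0 - 100.0 * i for i in range(6)]  # decreasing top to bottom
window = select_square(image, x_coords, y_coords,
                       x_center=230.0, y_center=310.0, size_pixels=2)
```

Because the window is counted in pixels, its real-world extent depends on the pixel spacing in each direction, which is the effect noted above.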

get_example

def get_example(location: SpaceTimeLocation) -> xr.Dataset

Get Example data

Arguments:

  • location - A location object for the example, which contains:
      • a timestamp of the example (t0_datetime_utc),
      • the x center location of the example (x_location_osgb),
      • the y center location of the example (y_location_osgb)

Returns:

Example data

datetime_index

def datetime_index(remove_night: bool = True) -> pd.DatetimeIndex

Returns a complete list of all available datetimes

Arguments:

  • remove_night - If True then remove datetimes at night. We're interested in forecasting solar power generation, so we don't care about nighttime data :)

In the UK in summer, the sun rises first in the north east, and sets last in the north west [1]. In summer, the north gets more hours of sunshine per day.

In the UK in winter, the sun rises first in the south east, and sets last in the south west [2]. In winter, the south gets more hours of sunshine per day.

                         Summer    Winter
Sun rises first in       N.E.      S.E.
Sun sets last in         N.W.      S.W.
Most hours of sunlight   North     South

Before training, we select timesteps which have at least some sunlight. We do this by computing the clearsky global horizontal irradiance (GHI) for the four corners of the satellite imagery, and for all the timesteps in the dataset. We only use timesteps where the maximum global horizontal irradiance across all four corners is above some threshold.

The 'clearsky solar irradiance' is the amount of sunlight we'd expect on a clear day at a specific time and location. The SI unit of irradiance is watt per square meter. The 'global horizontal irradiance' (GHI) is the total sunlight that would hit a horizontal surface on the surface of the Earth. The GHI is the sum of the direct irradiance (sunlight which takes a direct path from the Sun to the Earth's surface) and the diffuse horizontal irradiance (the sunlight scattered from the atmosphere). For more info, see: https://en.wikipedia.org/wiki/Solar_irradiance
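The timestep selection described above can be sketched as a simple filter. This is an illustration under stated assumptions: the function name `select_daylight_timesteps`, the threshold of 10 W/m², and the GHI values are all made up, and computing the clearsky GHI itself is out of scope here.

```python
from datetime import datetime
from typing import Dict, List, Sequence


def select_daylight_timesteps(
    ghi_by_time: Dict[datetime, Sequence[float]],
    threshold_w_per_m2: float,
) -> List[datetime]:
    """Keep timesteps whose maximum GHI across the corners exceeds the threshold."""
    return sorted(
        t
        for t, corner_ghi in ghi_by_time.items()
        if max(corner_ghi) > threshold_w_per_m2
    )


# Made-up clearsky GHI (W/m^2) at the four corners for three timesteps.
ghi_by_time = {
    datetime(2019, 6, 1, 2, 0): [0.0, 0.0, 0.0, 0.0],
    datetime(2019, 6, 1, 4, 0): [0.0, 12.5, 0.0, 3.0],
    datetime(2019, 6, 1, 12, 0): [810.0, 795.0, 850.0, 840.0],
}
daylight = select_daylight_timesteps(ghi_by_time, threshold_w_per_m2=10.0)
```

Taking the maximum over the corners means a timestep is kept as soon as any one corner has enough sunlight, which matches the "at least some sunlight" criterion.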

References:

  1. Video of June 2019
  2. Video of Jan 2019

geospatial_border

def geospatial_border() -> list[tuple[Number, Number]]

Get 'corner' coordinates for a rectangle within the boundary of the data.

Returns:

List of 2-tuples of the x and y coordinates of each corner, in OSGB projection.

HRVSatelliteDataSource Objects

class HRVSatelliteDataSource(SatelliteDataSource)

Satellite Data Source for HRV data.

dedupe_time_coords

def dedupe_time_coords(dataset: xr.Dataset,
                       logger: logging.Logger) -> xr.Dataset

Preprocess datasets by de-duplicating the time coordinates.

Arguments:

  • dataset - xr.Dataset to preprocess
  • logger - logger object to write to

Returns:

dataset with time coords de-duped.
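A plain-Python stand-in for the de-duplication step, assuming records are (timestamp, value) pairs rather than an xr.Dataset. The name `dedupe_by_time` is hypothetical, and keeping the first occurrence of each timestamp is an assumption; the documentation above does not say which duplicate is kept.

```python
import logging
from datetime import datetime
from typing import Any, List, Tuple

Record = Tuple[datetime, Any]


def dedupe_by_time(records: List[Record], logger: logging.Logger) -> List[Record]:
    """Drop records whose timestamp was already seen, keeping the first one."""
    seen = set()
    unique: List[Record] = []
    for timestamp, value in records:
        if timestamp in seen:
            continue
        seen.add(timestamp)
        unique.append((timestamp, value))
    n_dropped = len(records) - len(unique)
    if n_dropped:
        logger.warning("Dropped %d records with duplicate timestamps", n_dropped)
    return unique


records = [
    (datetime(2020, 1, 1, 12, 0), "a"),
    (datetime(2020, 1, 1, 12, 5), "b"),
    (datetime(2020, 1, 1, 12, 5), "b-duplicate"),
    (datetime(2020, 1, 1, 12, 10), "c"),
]
deduped = dedupe_by_time(records, logging.getLogger(__name__))
```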

open_sat_data

def open_sat_data(zarr_path: str,
                  consolidated: bool,
                  logger: logging.Logger,
                  sample_period_minutes: int = 15) -> xr.DataArray

Lazily opens the Zarr store.

Adds 1 minute to the 'time' coordinates, so the timestamps are at 00, 05, ..., 55 past the hour.

Arguments:

  • zarr_path - Cloud URL or local path pattern. A GCP URL must start with 'gs://'.
  • consolidated - Whether or not the Zarr metadata is consolidated.
  • logger - logger object to write to
  • sample_period_minutes - The sample period, in minutes, that the data should be reduced to.
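A minimal sketch of the time handling described above, using plain datetime objects instead of the 'time' coordinate of a DataArray. The name `shift_and_subsample` is hypothetical, and reducing the data by keeping timestamps whose minute is a multiple of the sample period is an assumption about how the reduction is done.

```python
from datetime import datetime, timedelta
from typing import List


def shift_and_subsample(
    times: List[datetime], sample_period_minutes: int = 15
) -> List[datetime]:
    """Add 1 minute to each timestamp, then keep those on the sample period."""
    shifted = [t + timedelta(minutes=1) for t in times]
    # Assumption: "reduced to" means keeping minutes divisible by the period.
    return [t for t in shifted if t.minute % sample_period_minutes == 0]


# Raw timestamps one minute before each 5-minute mark.
raw = [
    datetime(2020, 1, 1, 11, 59),
    datetime(2020, 1, 1, 12, 4),
    datetime(2020, 1, 1, 12, 9),
    datetime(2020, 1, 1, 12, 14),
]
every_15 = shift_and_subsample(raw, sample_period_minutes=15)
every_5 = shift_and_subsample(raw, sample_period_minutes=5)
```

With a 5-minute period all four shifted timestamps are kept; with the 15-minute default only those at 00 and 15 past the hour remain.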

data_sources.satellite

Satellite data sources and functions

data_sources.satellite.satellite_model

Model for output of satellite data

Satellite Objects

class Satellite(DataSourceOutput)

Class to store satellite data as an xr.Dataset with some validation

model_validation

@classmethod
def model_validation(cls, v)

Check that all values are non-negative

HRVSatellite Objects

class HRVSatellite(Satellite)

Class to store HRV satellite data as an xr.Dataset with some validation