data_sources.gsp
GSP data sources and functions
data_sources.gsp.eso
This file has a few functions that are used to get GSP (Grid Supply Point) information from National Grid ESO.
ESO - Electricity System Operator. General information can be found here - https://data.nationalgrideso.com/system/gis-boundaries-for-gb-grid-supply-points
get_gsp_metadata_from_eso: gets the GSP metadata
get_gsp_shape_from_eso: gets the shape of the GSP regions
get_list_of_gsp_ids: gets a list of gsp_ids, using 'get_gsp_metadata_from_eso'
Peter Dudfield 2021-09-13
get_gsp_metadata_from_eso
def get_gsp_metadata_from_eso(calculate_centroid: bool = True) -> pd.DataFrame
Get the metadata for the gsp, from ESO.
Arguments:
calculate_centroid
- Also load the shape file and calculate the centroid
Returns
- DataFrame of ESO metadata
get_gsp_shape_from_eso
def get_gsp_shape_from_eso(join_duplicates: bool = True, load_local_file: bool = True, save_local_file: bool = False) -> gpd.GeoDataFrame
Get the GSP shape file from ESO (or a local file)
Arguments:
join_duplicates
- If True, any RegionIDs that have multiple entries are joined together to give one entry
load_local_file
- Load from a local file rather than from ESO
save_local_file
- Save to a local file; only needed when the data has been updated
Returns
- Geo Pandas dataframe of GSP shape data
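The effect of join_duplicates can be sketched without any geospatial libraries. This is only an illustration under stated assumptions: join_duplicate_regions is a hypothetical helper, each row is a plain dict, and the "parts" field stands in for the geometry. How the real function combines geometries is not stated here, so the sketch simply collects the parts recorded for each RegionID.

```python
from typing import Dict, List


def join_duplicate_regions(rows: List[dict]) -> List[dict]:
    """Hypothetical helper: merge rows that share a RegionID into one row."""
    merged: Dict[int, dict] = {}
    for row in rows:
        region_id = row["RegionID"]
        if region_id not in merged:
            # First time this region is seen: start an empty entry for it.
            merged[region_id] = {"RegionID": region_id, "parts": []}
        # Collect every geometry part recorded for this region.
        merged[region_id]["parts"].extend(row["parts"])
    # dicts preserve insertion order, so regions keep their first-seen order.
    return list(merged.values())


# Made-up rows: RegionID 1 appears twice and should end up as one entry.
rows = [
    {"RegionID": 1, "parts": ["polygon_a"]},
    {"RegionID": 2, "parts": ["polygon_b"]},
    {"RegionID": 1, "parts": ["polygon_c"]},
]
joined = join_duplicate_regions(rows)
```

After joining, each RegionID appears exactly once, which is what lets the shape data be indexed by region.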
get_list_of_gsp_ids
def get_list_of_gsp_ids(maximum_number_of_gsp: Optional[int] = None) -> List[int]
Get list of gsp ids from ESO metadata
Arguments:
maximum_number_of_gsp
- Truncate the list of GSPs to be no larger than this number. Set to None to disable truncation.
Returns
- list of gsp ids
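The truncation behaviour of maximum_number_of_gsp can be sketched as follows. This is a minimal illustration, not the library's implementation: truncate_gsp_ids is a hypothetical helper and the ids are made up.

```python
from typing import List, Optional


def truncate_gsp_ids(
    gsp_ids: List[int], maximum_number_of_gsp: Optional[int] = None
) -> List[int]:
    """Hypothetical helper: keep at most `maximum_number_of_gsp` ids."""
    if maximum_number_of_gsp is None:
        # None disables truncation, so the full list is returned.
        return list(gsp_ids)
    # Slicing never fails when the limit exceeds the list length.
    return list(gsp_ids[:maximum_number_of_gsp])


example_ids = [1, 2, 3, 4, 5]  # made-up ids for illustration
first_three = truncate_gsp_ids(example_ids, maximum_number_of_gsp=3)
all_ids = truncate_gsp_ids(example_ids)
```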
data_sources.gsp.gsp_data_source
GSP Data Source. GSP - Grid Supply Points
Read more https://data.nationalgrideso.com/system/gis-boundaries-for-gb-grid-supply-points
GSPDataSource Objects
@dataclass
class GSPDataSource(ImageDataSource)
Data source for GSP PV Data
30-minute data is taken from 'PV Live' (https://www.solar.sheffield.ac.uk/pvlive/). Metadata is taken from ESO.
__post_init__
def __post_init__(image_size_pixels: int, meters_per_pixel: int)
Set random seed and load data
load
def load()
Load the meta data and load the GSP power data
datetime_index
def datetime_index()
Return the datetimes that are available
get_locations_for_batch
def get_locations_for_batch(t0_datetimes: pd.DatetimeIndex) -> Tuple[List[Number], List[Number]]
Get x and y locations for a batch. Assume that all data is available for all GSP.
GSPs are chosen at random and their locations are returned. This is useful because other data sources need to know which x, y locations to get.
Arguments:
t0_datetimes
- list of datetimes that the batch's locations have data for
Returns
- list of x and y locations
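The random selection described above can be sketched with the standard library. The names here are hypothetical (pick_random_locations, gsp_xy), the coordinates are made up, and sampling with replacement is an assumption rather than something the documentation states. One GSP is drawn per t0 datetime, and its x and y coordinates are returned in two parallel lists.

```python
import random
from datetime import datetime, timezone
from typing import Dict, List, Tuple


def pick_random_locations(
    gsp_xy: Dict[int, Tuple[float, float]],
    t0_datetimes: List[datetime],
    seed: int = 0,
) -> Tuple[List[float], List[float]]:
    """Hypothetical helper: pick one random GSP location per t0 datetime."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    gsp_ids = sorted(gsp_xy)
    # Assumption: GSPs are sampled with replacement, one per datetime.
    chosen = [rng.choice(gsp_ids) for _ in t0_datetimes]
    x_locations = [gsp_xy[gsp_id][0] for gsp_id in chosen]
    y_locations = [gsp_xy[gsp_id][1] for gsp_id in chosen]
    return x_locations, y_locations


# Made-up GSP coordinates in meters.
gsp_xy = {1: (100.0, 200.0), 2: (300.0, 400.0), 3: (500.0, 600.0)}
t0_datetimes = [datetime(2021, 9, 1, hour, tzinfo=timezone.utc) for hour in range(4)]
x_locations, y_locations = pick_random_locations(gsp_xy, t0_datetimes)
```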
get_example
def get_example(t0_dt: pd.Timestamp, x_meters_center: Number, y_meters_center: Number) -> GSP
Get data example from one time point (t0_dt) and for x and y coords (x_meters_center), (y_meters_center).
Get data at the location of x,y and get surrounding GSP power data also.
Arguments:
t0_dt
- datetime of "now". History and forecast are also returned
x_meters_center
- x location of center GSP
y_meters_center
- y location of center GSP
Returns
- Dictionary with GSP data in it
drop_gsp_by_threshold
def drop_gsp_by_threshold(gsp_power: pd.DataFrame, meta_data: pd.DataFrame, threshold_mw: int = 20)
Drop GSP where the max power is below a certain threshold
Arguments:
gsp_power
- GSP power data
meta_data
- the GSP metadata
threshold_mw
- the threshold in MW; only GSPs with a maximum power above this value are kept
Returns
- power data and metadata
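The thresholding rule can be sketched with plain dictionaries in place of DataFrames. drop_below_threshold is a hypothetical helper and the power values are made up. The treatment of a GSP whose maximum equals the threshold is an assumption: here it is dropped, since only values strictly above the threshold are kept.

```python
from typing import Dict, List, Tuple


def drop_below_threshold(
    gsp_power: Dict[int, List[float]],
    meta_data: Dict[int, dict],
    threshold_mw: float = 20,
) -> Tuple[Dict[int, List[float]], Dict[int, dict]]:
    """Hypothetical helper: keep GSPs whose maximum power exceeds the threshold."""
    # Assumption: a maximum exactly equal to the threshold is not kept.
    keep = [gsp_id for gsp_id, series in gsp_power.items() if max(series) > threshold_mw]
    filtered_power = {gsp_id: gsp_power[gsp_id] for gsp_id in keep}
    # Filter the metadata with the same ids so both outputs stay aligned.
    filtered_meta = {gsp_id: meta_data[gsp_id] for gsp_id in keep}
    return filtered_power, filtered_meta


# Made-up power series in MW, keyed by gsp_id.
gsp_power = {1: [0.0, 15.0, 30.0], 2: [0.0, 5.0, 10.0], 3: [0.0, 20.0, 20.0]}
meta_data = {1: {"name": "a"}, 2: {"name": "b"}, 3: {"name": "c"}}
kept_power, kept_meta = drop_below_threshold(gsp_power, meta_data, threshold_mw=20)
```

Filtering the power data and metadata together matters because later steps look up each GSP in both.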
load_solar_gsp_data
def load_solar_gsp_data(filename: Union[str, Path], start_dt: Optional[datetime] = None, end_dt: Optional[datetime] = None) -> pd.DataFrame
Load solar PV GSP data
Arguments:
filename
- filename of the file to be loaded; 'gs://' paths can also be used
start_dt
- the start datetime to trim the data to
end_dt
- the end datetime to trim the data to
Returns
- dataframe of pv data
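Trimming to start_dt and end_dt can be sketched on a list of (timestamp, value) rows. trim_to_range is a hypothetical helper, and treating both bounds as inclusive is an assumption; the documentation does not say how the boundaries are handled. A bound of None leaves that side untrimmed, matching the Optional defaults in the signature.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Optional, Tuple

Row = Tuple[datetime, float]


def trim_to_range(
    rows: List[Row],
    start_dt: Optional[datetime] = None,
    end_dt: Optional[datetime] = None,
) -> List[Row]:
    """Hypothetical helper: keep rows between start_dt and end_dt."""
    return [
        (timestamp, value)
        for timestamp, value in rows
        # Assumption: both bounds are inclusive; None means no bound.
        if (start_dt is None or timestamp >= start_dt)
        and (end_dt is None or timestamp <= end_dt)
    ]


# Made-up half-hourly rows starting at midnight UTC.
midnight = datetime(2021, 9, 1, tzinfo=timezone.utc)
rows = [(midnight + timedelta(minutes=30 * i), float(i)) for i in range(4)]
trimmed = trim_to_range(
    rows,
    start_dt=midnight + timedelta(minutes=30),
    end_dt=midnight + timedelta(minutes=60),
)
```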
data_sources.gsp.gsp_model
Model for output of GSP data
GSP Objects
class GSP(DataSourceOutput)
Class to store GSP data as a xr.Dataset with some validation
data_sources.gsp.pvlive
Functions used to query the PVlive api
load_pv_gsp_raw_data_from_pvlive
def load_pv_gsp_raw_data_from_pvlive(start: datetime, end: datetime, number_of_gsp: int = None, normalize_data: bool = True) -> pd.DataFrame
Load raw pv gsp data from pvlive. Note that each gsp is loaded separately. Also the data is loaded in 30 day chunks.
Arguments:
start
- the start date for GSP data to load
end
- the end date for GSP data to load
number_of_gsp
- the number of GSPs to load. Note that on 2021-09-01 there were 338 to load.
normalize_data
- option to normalize the generation according to installed capacity
Returns
- Data frame of time series of gsp data. Shows PV data for each GSP from {start} to {end}
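Loading in 30 day chunks amounts to splitting the [start, end] range into consecutive windows and requesting each one in turn. A minimal sketch of the splitting step, assuming a hypothetical helper thirty_day_chunks and leaving out the PVLive request itself:

```python
from datetime import datetime, timedelta, timezone
from typing import Iterator, Tuple


def thirty_day_chunks(
    start: datetime, end: datetime, chunk: timedelta = timedelta(days=30)
) -> Iterator[Tuple[datetime, datetime]]:
    """Hypothetical helper: yield consecutive (chunk_start, chunk_end) windows."""
    chunk_start = start
    while chunk_start < end:
        # The final window is shortened so it never runs past `end`.
        chunk_end = min(chunk_start + chunk, end)
        yield chunk_start, chunk_end
        chunk_start = chunk_end


start = datetime(2021, 1, 1, tzinfo=timezone.utc)
end = datetime(2021, 3, 15, tzinfo=timezone.utc)
chunks = list(thirty_day_chunks(start, end))
```

In this example the range splits into three windows: two full 30 day windows and a shorter final one ending at 2021-03-15.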
get_installed_capacity
def get_installed_capacity(start: Optional[datetime] = datetime(2021, 1, 1, tzinfo=pytz.utc), maximum_number_of_gsp: Optional[int] = None) -> pd.Series
Get the installed capacity of each gsp
Getting the full list can take ~30 seconds.
Arguments:
start
- optional datetime at which the installed capacity is collected
maximum_number_of_gsp
- Truncate the list of GSPs to be no larger than this number. Set to None to disable truncation.
Returns
- pd.Series of installed capacity indexed by gsp_id
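One use of per-GSP installed capacity is normalizing generation, as the normalize_data option of load_pv_gsp_raw_data_from_pvlive describes. The sketch below assumes that normalization means dividing each GSP's generation by its installed capacity. normalize_generation is a hypothetical helper, the values are made up, and returning zeros for a GSP with zero capacity is an assumption.

```python
from typing import Dict, List


def normalize_generation(
    generation_mw: Dict[int, List[float]],
    installed_capacity_mw: Dict[int, float],
) -> Dict[int, List[float]]:
    """Hypothetical helper: divide each GSP's generation by its capacity."""
    normalized: Dict[int, List[float]] = {}
    for gsp_id, series in generation_mw.items():
        capacity = installed_capacity_mw[gsp_id]
        if capacity == 0:
            # Assumption: a GSP with no installed capacity is reported as zeros.
            normalized[gsp_id] = [0.0 for _ in series]
        else:
            normalized[gsp_id] = [value / capacity for value in series]
    return normalized


# Made-up generation (MW) and installed capacity (MW), keyed by gsp_id.
generation = {1: [0.0, 5.0, 10.0], 2: [1.0, 2.0, 4.0]}
capacity = {1: 20.0, 2: 0.0}
normalized = normalize_generation(generation, capacity)
```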