pytesmo.validation_framework.adapters module

Module containing adapters that can be used together with the validation framework.

class pytesmo.validation_framework.adapters.AdvancedMaskingAdapter(cls, filter_list, ignore_nans: bool = False, **kwargs)[source]

Bases: BasicAdapter

Transform the given (reader) class to return a dataset that is masked based on the given list of filters. A filter is a 3-tuple of column_name, operator, and threshold. This class calls the reading method of the given reader instance, applies all filters separately, ANDs all filters together, and masks the whole dataframe with the result.

Parameters:

cls (object) – Reader object, has to have a read_ts or read method or a method name must be specified in the read_name kwarg. The same method will be available for the adapted version of the reader.
filter_list (list[tuple]) – List of filters of the form [(column_name, operator, threshold), …], where:
‘column_name’ (str): name of the column to apply the operator to;
‘operator’ (Callable or str): a string, which needs to be one of ‘<’, ‘<=’, ‘==’, ‘>=’, ‘>’, ‘!=’, or a function that takes data and threshold as arguments;
‘threshold’: value to use as the threshold combined with the operator.
data_property_name (str, optional (default: "data")) – Attribute name under which the pandas DataFrame containing the time series is found in the object returned by the read function of the original reader. Ignored if no attribute of this name is found. Then it is required that the DataFrame is already the return value of the read function.
read_name (str, optional (default: None)) – To enable the adapter for a method other than read or read_ts give the function name here (a function of that name must exist in cls). A method of the same name will be added to the adapted Reader, which takes the same arguments as the base method. The output of this method will be changed by the adapter. If None is passed, only data from read and read_ts of cls will be adapted.
ignore_nans (bool, optional (default: False)) – Set to True if NaNs in the mask field(s) should be ignored, i.e. the main field should not be masked when NaNs are present elsewhere in the row.
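
The filter logic described above can be illustrated with plain pandas. This is a minimal sketch, not the adapter's implementation: the OPS table and the apply_filters helper are hypothetical names, and the sketch assumes that rows are kept where every filter evaluates to True. It does not cover ignore_nans.

```python
import operator

import pandas as pd

# Hypothetical lookup table for string operators (illustration only).
OPS = {
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
    ">=": operator.ge,
    ">": operator.gt,
    "!=": operator.ne,
}


def apply_filters(df, filter_list):
    """Keep the rows for which every (column, operator, threshold) filter holds."""
    mask = pd.Series(True, index=df.index)
    for column_name, op, threshold in filter_list:
        func = OPS[op] if isinstance(op, str) else op
        # AND the result of this filter into the combined mask
        mask &= func(df[column_name], threshold)
    return df[mask]


df = pd.DataFrame(
    {"sm": [0.10, 0.25, 0.40, 0.55], "flag": [0, 0, 1, 0]},
    index=pd.date_range("2020-01-01", periods=4, freq="D"),
)
filtered = apply_filters(df, [("sm", ">", 0.2), ("flag", "==", 0)])
```

Here only the second and fourth rows satisfy both filters, so the whole rows (all columns) are retained for those timestamps.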

class pytesmo.validation_framework.adapters.AnomalyAdapter(cls, window_size=35, columns=None, **kwargs)[source]

Bases: BasicAdapter

Takes the pandas DataFrame that reader returns and calculates the anomaly of the time series based on a moving average.

Parameters:

cls (object) – Reader object, has to have a read_ts or read method or a method name must be specified in the read_name kwarg. The same method will be available for the adapted version of the reader.
window_size (float, optional (default: 35)) – The window-size [days] of the moving-average window to calculate the anomaly reference.
columns (list, optional) – columns in the dataset for which to calculate anomalies.
data_property_name (str, optional (default: "data")) – Attribute name under which the pandas DataFrame containing the time series is found in the object returned by the read function of the original reader. Ignored if no attribute of this name is found. Then it is required that the DataFrame is already the return value of the read function.
read_name (str, optional (default: None)) – To enable the adapter for a method other than read or read_ts give the function name here (a function of that name must exist in cls). A method of the same name will be added to the adapted Reader, which takes the same arguments as the base method. The output of this method will be changed by the adapter. If None is passed, only data from read and read_ts of cls will be adapted.
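
As a rough illustration of a moving-average anomaly, the sketch below subtracts a centered rolling mean from a series. It is not the adapter's implementation: moving_average_anomaly is a hypothetical helper, and it assumes exactly one value per day so that a window of N samples spans N days.

```python
import pandas as pd


def moving_average_anomaly(series, window_size=35):
    """Subtract a centered moving average from a daily series.

    Assumes one value per day, so window_size samples span window_size days.
    """
    reference = series.rolling(
        window=window_size, center=True, min_periods=1
    ).mean()
    return series - reference


index = pd.date_range("2020-01-01", periods=5, freq="D")
series = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0], index=index)
anomaly = moving_average_anomaly(series, window_size=3)
```

For a linear trend like this one, the anomaly is zero in the interior and non-zero only at the edges, where the window is truncated.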

class pytesmo.validation_framework.adapters.AnomalyClimAdapter(cls, columns=None, return_clim=False, **kwargs)[source]

Bases: BasicAdapter

Takes the pandas DataFrame that reader returns and calculates the anomaly of the time series based on the (long-term) average of the series.

Parameters:

cls (object) – Reader object, has to have a read_ts or read method or a method name must be specified in the read_name kwarg. The same method will be available for the adapted version of the reader.
columns (list, optional (default: None)) – Columns in the dataset for which to calculate anomalies. If None is passed, the anomaly is calculated for all columns.
data_property_name (str, optional (default: "data")) – Attribute name under which the pandas DataFrame containing the time series is found in the object returned by the read function of the original reader. Ignored if no attribute of this name is found. Then it is required that the DataFrame is already the return value of the read function.
read_name (str, optional (default: None)) – To enable the adapter for a method other than read or read_ts give the function name here (a function of that name must exist in cls). A method of the same name will be added to the adapted Reader, which takes the same arguments as the base method. The output of this method will be changed by the adapter. If None is passed, only data from read and read_ts of cls will be adapted.
return_clim (bool, optional (default: False)) – If True, then a column for the climatology is added to the DataFrame returned by the read function.
kwargs – Any remaining keyword arguments will be given to pytesmo.time_series.anomaly.calc_climatology()
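
The idea of a climatology-based anomaly can be sketched as follows. This is a simplified stand-in, not what pytesmo.time_series.anomaly.calc_climatology() does: the climatology here is just the mean per day of year, without smoothing or leap-day handling, and both the climatology_anomaly helper and the "_clim" column suffix are hypothetical.

```python
import pandas as pd


def climatology_anomaly(df, columns=None, return_clim=False):
    """Subtract the per-day-of-year mean from each selected column."""
    columns = list(df.columns) if columns is None else columns
    out = df.copy()
    doy = df.index.dayofyear
    for col in columns:
        # mean of all values sharing the same day of year, aligned to df
        clim = df[col].groupby(doy).transform("mean")
        out[col] = df[col] - clim
        if return_clim:
            out[f"{col}_clim"] = clim  # hypothetical column name
    return out


index = pd.to_datetime(
    ["2021-01-01", "2021-01-02", "2022-01-01", "2022-01-02"]
)
df = pd.DataFrame({"sm": [1.0, 2.0, 3.0, 6.0]}, index=index)
result = climatology_anomaly(df, return_clim=True)
```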

class pytesmo.validation_framework.adapters.BasicAdapter(cls, data_property_name='data', read_name=None)[source]

Bases: object

Adapter to modify the return value of reading functions from the base class. It:

picks the data frame from objects that have a data_property_name, e.g. ascat time series objects;
removes unnecessary timezone information in pandas data frames, which pytesmo cannot use;
adds a method with the name given in read_name that calls the same method from cls but modifies the returned data frame.

property grid: Returns grid of wrapped class if it exists, otherwise None.

read(*args, **kwargs)[source]

read_ts(*args, **kwargs)[source]
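
The wrapping behaviour described for BasicAdapter can be pictured with a small stand-alone sketch. All classes below (TimeSeriesObject, DemoReader, SketchAdapter) are hypothetical and do not use pytesmo; they only show how a wrapper might unpack the DataFrame from a returned object and strip the timezone from its index.

```python
import pandas as pd


class TimeSeriesObject:
    """Hypothetical return type that stores its DataFrame in `.data`."""

    def __init__(self, data):
        self.data = data


class DemoReader:
    """Hypothetical reader whose read() returns a wrapper object."""

    def read(self, *args, **kwargs):
        index = pd.date_range("2020-01-01", periods=3, freq="D", tz="UTC")
        frame = pd.DataFrame({"sm": [0.1, 0.2, 0.3]}, index=index)
        return TimeSeriesObject(frame)


class SketchAdapter:
    """Wrap a reader, unpack the DataFrame, and drop timezone information."""

    def __init__(self, reader, data_property_name="data"):
        self.reader = reader
        self.data_property_name = data_property_name

    def read(self, *args, **kwargs):
        result = self.reader.read(*args, **kwargs)
        # fall back to the return value itself if the attribute is missing
        data = getattr(result, self.data_property_name, result)
        if data.index.tz is not None:
            data = data.tz_localize(None)  # new frame with a naive index
        return data


adapted = SketchAdapter(DemoReader())
frame = adapted.read()
```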

class pytesmo.validation_framework.adapters.ColumnCombineAdapter(cls, func, func_kwargs=None, columns=None, new_name='merged', **kwargs)[source]

Bases: BasicAdapter

Takes the pandas DataFrame that the read_ts or read method of the instance returns and applies a function to merge multiple columns into one, e.g. when a dataset contains two soil moisture parameters that should be averaged on reading. Adds one additional column to the input data frame.
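
A minimal sketch of the column merge, assuming that func is called on the selected columns with func_kwargs as keyword arguments and that its result is stored under new_name. The combine_columns helper is hypothetical and only mirrors the constructor's parameter names.

```python
import pandas as pd


def combine_columns(df, func, func_kwargs=None, columns=None, new_name="merged"):
    """Add one column holding func applied across the selected columns."""
    func_kwargs = {} if func_kwargs is None else func_kwargs
    columns = list(df.columns) if columns is None else columns
    out = df.copy()
    out[new_name] = func(df[columns], **func_kwargs)
    return out


df = pd.DataFrame({"sm1": [0.1, 0.3], "sm2": [0.3, 0.5], "flag": [0, 1]})
# Average the two soil moisture columns row by row.
combined = combine_columns(
    df, pd.DataFrame.mean, {"axis": 1}, columns=["sm1", "sm2"]
)
```

The original columns are left in place; only the new 'merged' column is appended.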

class pytesmo.validation_framework.adapters.MaskingAdapter(cls, op, threshold, column_name=None, **kwargs)[source]

Bases: BasicAdapter

Transform the given class to return a boolean dataset given the operator and threshold. This class calls the read_ts and read methods of the given instance and applies boolean masking to the returned data using the given operator and threshold. This adapter does not filter the time series (see the AdvancedMaskingAdapter and SelfMaskingAdapter for that) but only turns it into a boolean dataset.

Parameters:

cls (object) – Reader object, has to have a read_ts or read method or a method name must be specified in the read_name kwarg. The same method will be available for the adapted version of the reader.
op (str or Callable) – Either a string to look up a function from pytesmo/validation_framework/adapters.py._op_lookup or a function that takes data and threshold as arguments.
threshold (Any) – Value to use as the threshold combined with the operator to mask elements in column_name
column_name (str, optional (default: None)) – Name of the column to apply op to. If None is passed, nothing happens.
data_property_name (str, optional (default: "data")) – Attribute name under which the pandas DataFrame containing the time series is found in the object returned by the read function of the original reader. Ignored if no attribute of this name is found. Then it is required that the DataFrame is already the return value of the read function.
read_name (str, optional (default: None)) – To enable the adapter for a method other than read or read_ts give the function name here (a function of that name must exist in cls). A method of the same name will be added to the adapted Reader, which takes the same arguments as the base method. The output of this method will be changed by the adapter. If None is passed, only data from read and read_ts of cls will be adapted.
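
To make the difference to a filtering adapter concrete, the sketch below applies an operator and threshold to one column with plain pandas. It is an illustration of the boolean output only, not the adapter's code; the column names are made up.

```python
import operator

import pandas as pd

df = pd.DataFrame(
    {"sm": [0.10, 0.25, 0.40], "ssf": [1, 2, 1]},
    index=pd.date_range("2020-01-01", periods=3, freq="D"),
)

# Compare one column against the threshold: the result keeps every
# timestamp and holds one boolean per value instead of the data.
boolean = operator.eq(df[["ssf"]], 1)
```

No rows are removed: the result has the same index as the input and can serve as a mask for another dataset.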

class pytesmo.validation_framework.adapters.SelfMaskingAdapter(cls, op, threshold, column_name, **kwargs)[source]

Bases: BasicAdapter

Transform the given (reader) class to return a dataset that is masked based on the given column, operator, and threshold. This class calls the read_ts or read method of the given reader instance, applies the operator/threshold to the specified column, and masks the whole dataframe with the result.

Parameters:

cls (object) – Reader object, has to have a read_ts or read method or a method name must be specified in the read_name kwarg. The same method will be available for the adapted version of the reader.
op (str or Callable) – Either a string to look up a function from pytesmo/validation_framework/adapters.py._op_lookup or a function that takes data and threshold as arguments.
threshold (Any) – Value to use as the threshold combined with the operator to mask elements in column_name
column_name (str) – Name of the column to apply op to
data_property_name (str, optional (default: "data")) – Attribute name under which the pandas DataFrame containing the time series is found in the object returned by the read function of the original reader. Ignored if no attribute of this name is found. Then it is required that the DataFrame is already the return value of the read function.
read_name (str, optional (default: None)) – To enable the adapter for a method other than read or read_ts give the function name here (a function of that name must exist in cls). A method of the same name will be added to the adapted Reader, which takes the same arguments as the base method. The output of this method will be changed by the adapter. If None is passed, only data from read and read_ts of cls will be adapted.
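
The following sketch shows self-masking with a callable operator instead of a string. It assumes that rows are kept where the operator returns True; within_range and self_mask are hypothetical helpers, not part of pytesmo.

```python
import pandas as pd


def within_range(data, threshold):
    """Custom operator: True where the absolute value is below the threshold."""
    return data.abs() < threshold


def self_mask(df, op, threshold, column_name):
    """Keep the rows of df where op(column, threshold) is True."""
    mask = op(df[column_name], threshold)
    return df[mask]


df = pd.DataFrame(
    {"sm": [0.1, 0.2, 0.3, 0.4], "error": [0.01, -0.30, 0.02, 0.25]},
    index=pd.date_range("2020-01-01", periods=4, freq="D"),
)
masked = self_mask(df, within_range, 0.1, "error")
```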

class pytesmo.validation_framework.adapters.TimestampAdapter(cls: object, time_offset_fields: str, time_units: str = 's', base_time_field: str | None = None, base_time_reference: str | None = None, base_time_units: str = 'D', replace_index: bool = True, output_field: str | None = None, drop_original: bool = True, **kwargs)[source]

Bases: BasicAdapter

Class that combines two or more timestamp fields to a single exact measurement time. The fields of interest specify:

A basic observation time (e.g. days at midnight), which can be expressed as a timestamp (YYYY-mm-dd) or with respect to a reference time (days since YYYY-mm-dd)
One or more (minute, s, µs) offset times to be added cumulatively

Example input:

     variable  base_time [w.r.t. 2005-02-01]  offset [min]  offset [sec]
100  0.889751                          100.0          38.0         999.0
101  0.108279                          101.0          40.0        1000.0
102 -1.201708                          102.0          39.0         999.0

Example output:

                     variable
2005-05-12 00:55:42  0.889751
2005-05-13 00:57:39  0.108279
2005-05-14 00:56:38 -1.201708

Parameters:

cls (object) – Reader object, has to have a read_ts or read method or a method name must be specified in the read_name kwarg. The same method will be available for the adapted version of the reader.
time_offset_fields (str, list or None) – Name or list of names of the fields that provide information on the time offset. If a list is given, all values contribute to the offset, assuming that each refers to the previous one. For instance: offset = minutes + seconds in the minute + µs in the second. NOTE: np.nan values are counted as 0 offset. NOTE: if None, no offset is considered.
time_units (str or list, optional (default: "s")) – Time units that the time_offset_fields are specified in. If a list is given, it should have the same size as the ‘time_offset_fields’ parameter. Can be any of the np.datetime[64] units: https://numpy.org/doc/stable/reference/arrays.datetime.html
base_time_field (str, optional (default: None)) – If a name is provided, the generic time field will be searched for in the columns; otherwise, it is assumed to be the index. NOTE: np.nan values in this field are dropped.
base_time_reference (str, optional (default: None)) – String of format ‘YYYY-mm-dd’ that can be specified to transform the ‘base_time_field’ from [units since base_time_reference] to np.datetime[64]. If not provided, it is assumed that the base_time_field is already in np.datetime[64] units.
base_time_units (str, optional (default: "D")) – Units that the base_time_field is specified in. Only applicable with ‘base_time_reference’.
replace_index (bool, optional (default: True)) – If True, the exact timestamp is used as index. Otherwise, it is added to the dataframe in the column ‘output_field’.
output_field (str, optional (default: None)) – If a name is specified, an additional column with the exact timestamp is generated under that name. Only used with ‘replace_index’ == False.
drop_original (bool, optional (default: True)) – Whether the base_time_field and time_offset_fields should be dropped in the final DataFrame.

add_offset_cumulative(data: DataFrame) → array[source]: Return an array of timedelta calculated with all the time_offset_fields

convert_generic(time_arr: array, units: str = 'D') → array[source]: Convert the generic time field to np.datetime[64] dtype
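
The way TimestampAdapter combines a base time with cumulative offsets can be sketched with pandas alone. The exact_timestamps helper and the column names below are hypothetical, and the sketch covers only the case of a reference date plus offsets with the index replaced; it does not drop rows with NaN base times.

```python
import numpy as np
import pandas as pd


def exact_timestamps(df, base_field, reference, base_units,
                     offset_fields, offset_units):
    """Build an exact time index from a base time and cumulative offsets."""
    # base time: [units since reference] -> absolute timestamps
    base = pd.Timestamp(reference) + pd.to_timedelta(
        df[base_field], unit=base_units
    )
    offset = pd.to_timedelta(pd.Series(0, index=df.index), unit="s")
    for field, unit in zip(offset_fields, offset_units):
        # NaN offsets are counted as zero offset
        offset = offset + pd.to_timedelta(df[field].fillna(0), unit=unit)
    out = df.drop(columns=[base_field, *offset_fields])
    out.index = pd.DatetimeIndex(base + offset)
    return out


df = pd.DataFrame(
    {
        "variable": [0.5, 0.7],
        "base_time": [100.0, 101.0],  # days since 2005-02-01
        "offset_min": [38.0, np.nan],
        "offset_sec": [30.0, 15.0],
    }
)
out = exact_timestamps(
    df, "base_time", "2005-02-01", "D",
    ["offset_min", "offset_sec"], ["min", "s"],
)
```

In this made-up input, the first row resolves to 2005-05-12 00:38:30 (100 days after the reference, plus 38 minutes and 30 seconds), and the missing minute offset in the second row contributes nothing.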