pytesmo.time_series.anomaly module

Created on June 20, 2013

pytesmo.time_series.anomaly.calc_anomaly(Ser, window_size=35, climatology=None, respect_leap_years=True, return_clim=False)

Calculates the anomaly of a time series (pandas Series). Either climatology-based or moving-average-based anomalies can be calculated.

Parameters:

Ser (pandas.Series) – Input data (index must be a DatetimeIndex)
window_size (float, optional (default: 35)) – The window-size [days] of the moving-average window to calculate the anomaly reference (only used if climatology is not provided)
climatology (pandas.Series (index: 1-366), optional (default: None)) – If provided, anomalies are calculated relative to this climatology instead of the moving average
timespan ([timespan_from, timespan_to], datetime.datetime(y,m,d), optional) – If set, only a subset
respect_leap_years (boolean, optional (default: True)) – If set, leap years are respected when matching the climatology to the time series
return_clim (boolean, optional (default: False)) – If set to True, the return value is a DataFrame that also contains the climatology time series. Only has an effect if a climatology is used.

Returns:

anomaly – Series containing the calculated anomalies. If return_clim is set to True, a DataFrame is returned instead, in which one column contains the anomalies and another the climatology broadcast over the whole index. If a climatology with a ‘std’ column was passed, this column is also included in the returned DataFrame.

Return type:

pandas.Series or pandas.DataFrame
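
The two anomaly types described above can be illustrated with plain pandas. This is a minimal sketch of the concepts only, not the pytesmo implementation: the helper names moving_average_anomaly and climatology_anomaly are hypothetical, the moving average assumes a regularly sampled daily series (so a window of 35 samples spans 35 days), and the climatology lookup uses the plain calendar day of year without any of the leap-year handling controlled by respect_leap_years.

```python
import numpy as np
import pandas as pd


def moving_average_anomaly(ser: pd.Series, window_size: int = 35) -> pd.Series:
    """Anomaly relative to a centered moving average (hypothetical helper).

    Assumes `ser` has one value per day, so `window_size` samples
    correspond to `window_size` days.
    """
    reference = ser.rolling(window_size, center=True, min_periods=1).mean()
    return ser - reference


def climatology_anomaly(ser: pd.Series, climatology: pd.Series) -> pd.Series:
    """Anomaly relative to a day-of-year climatology (hypothetical helper).

    `climatology` is indexed by day of year (1-366). Each timestamp is
    matched by its calendar day of year; leap years get no special treatment.
    """
    doy = np.asarray(ser.index.dayofyear)
    # Broadcast the climatology over the whole time index.
    reference = pd.Series(climatology.reindex(doy).to_numpy(), index=ser.index)
    return ser - reference
```

In both cases the anomaly is simply the input minus a reference series; the two variants differ only in how that reference is built.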

pytesmo.time_series.anomaly.calc_climatology(Ser, moving_avg_orig=5, moving_avg_clim=None, median=False, std=False, timespan=None, fill=nan, wraparound=True, respect_leap_years=False, interpolate_leapday=False, fillna=True, min_obs_orig=1, min_obs_clim=1, output_freq='day')

Calculates the climatology of a data set.

Parameters:

Ser (pandas.Series) – Time series to compute the climatology for (index must be a DatetimeIndex or julian date)
moving_avg_orig (float, optional (default: 5)) – The size of the moving_average window [days] that will be applied on the input Series (gap filling, short-term rainfall correction)
moving_avg_clim (float, optional (default: None)) –
The size of the moving_average window in days that will be applied on the calculated climatology (long-term event correction). If None is passed, it will be calculated from the ‘output_freq’ value:
- ‘day’: 35
- ‘month’: 3
median (boolean, optional (default: False)) – If set to True, the climatology is based on the median instead of the mean
std (boolean, optional (default: False)) – If set to True, the output has 2 columns, one for the median or mean and one for the standard deviation of the aggregated data points.
timespan ([timespan_from, timespan_to], datetime.datetime(y,m,d), optional) – Set this to calculate the climatology based on a subset of the input Series
fill (float or int, optional (default: np.nan)) – Fill value to use for days on which no climatology exists
wraparound (boolean, optional (default: True)) – If set, the climatology is wrapped around at the edges before the second running average (long-term event correction) is applied
respect_leap_years (boolean, optional (default: False)) – If set, leap years are respected during the calculation of the climatology. Only valid when ‘output_freq’ is set to ‘day’.
interpolate_leapday (boolean, optional (default: False)) – <description>. Only valid with ‘output_freq’ value set to ‘day’. Default: False
fillna (boolean, optional (default: True)) – If set, the moving average used for the calculation of the climatology is also filled in at NaN values
min_obs_orig (int (default: 1)) – Minimum observations required to give a valid output in the first moving average applied on the input series
min_obs_clim (int (default: 1)) – Minimum observations required to give a valid output in the second moving average applied on the calculated climatology
output_freq (str, optional (default: 'day')) – Determines the output frequency (time unit) of the climatology calculation, independently of the frequency of the ‘Ser’ input. Currently supported options are ‘day’ and ‘month’.

Returns:

climatology – Series containing the calculated climatology. The size of the series depends on the type of climatology being calculated, based on the value of ‘output_freq’:

- 366 values for a daily climatology, behaving as a leap year
- 12 values for a monthly climatology

If ‘std’ is set to True, the output is a DataFrame with 2 columns: ‘climatology’ and ‘std’.

Return type:

pandas.Series or pandas.DataFrame
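
The parameter descriptions above suggest a three-step procedure for a daily climatology: smooth the input series, aggregate by day of year, then smooth the result with wraparound at the edges. The following is a rough sketch of that idea in plain pandas, not the pytesmo implementation. The function name daily_climatology is hypothetical, the input is assumed to be a regularly sampled daily series, only the mean is supported, and the plain calendar day of year is used, so day 366 is populated only from leap years and none of the leap-year options are reproduced.

```python
import numpy as np
import pandas as pd


def daily_climatology(ser: pd.Series,
                      moving_avg_orig: int = 5,
                      moving_avg_clim: int = 35) -> pd.Series:
    """Mean daily climatology on a 1-366 index (hypothetical helper)."""
    # Step 1: first moving average on the input series
    # (assumes one value per day, so samples correspond to days).
    smoothed = ser.rolling(moving_avg_orig, center=True, min_periods=1).mean()

    # Step 2: average all values that fall on the same calendar day of year.
    doy = np.asarray(smoothed.index.dayofyear)
    clim = smoothed.groupby(doy).mean().reindex(np.arange(1, 367))

    # Step 3: second moving average, with the climatology wrapped around
    # at both edges so that late December and early January smooth together.
    pad = moving_avg_clim // 2
    if pad > 0:
        wrapped = pd.concat([clim.iloc[-pad:], clim, clim.iloc[:pad]])
    else:
        wrapped = clim
    wrapped = wrapped.rolling(moving_avg_clim, center=True, min_periods=1).mean()

    # Cut the padding off again to get back to 366 values.
    result = wrapped.iloc[pad:pad + 366]
    result.name = "climatology"
    return result
```

Padding the climatology with its own tail and head before the second rolling mean is one simple way to realise the wraparound; without it, the first and last days of the year would be smoothed over a one-sided window.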