pytesmo.time_series package

Submodules

pytesmo.time_series.anomaly module

Created on June 20, 2013

pytesmo.time_series.anomaly.calc_anomaly(Ser, window_size=35, climatology=None, respect_leap_years=True, return_clim=False)

Calculates the anomaly of a time series (pandas Series). Either climatology-based or moving-average-based anomalies can be calculated.

Parameters

Ser (pandas.Series (index must be a DatetimeIndex)) – Input time series.
window_size (float, optional) – Size [days] of the moving-average window used to calculate the anomaly reference. Only used if no climatology is provided. Default: 35
climatology (pandas.Series (index: 1-366), optional) – If provided, anomalies are calculated relative to this climatology.
timespan ([timespan_from, timespan_to], datetime.datetime(y,m,d), optional) – If set, only a subset
respect_leap_years (boolean, optional) – If set, leap years are respected when matching the climatology to the time series. Default: True
return_clim (boolean, optional) – If set to True, the return value is a DataFrame that also contains the climatology time series. Only has an effect if a climatology is used.

Returns

anomaly – Series containing the calculated anomalies

Return type

pandas.Series
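The moving-average variant can be illustrated with plain pandas. This is a minimal sketch of the idea, not the implementation behind calc_anomaly: it assumes daily data without gaps and a centred window, and the synthetic series and variable names are invented for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic daily series: a seasonal cycle plus one short-lived spike.
index = pd.date_range("2010-01-01", periods=365, freq="D")
values = np.sin(2 * np.pi * np.arange(365) / 365.0)
values[100] += 1.0  # artificial event
ser = pd.Series(values, index=index)

# Reference: centred 35-day moving average (mirrors window_size=35).
# Assumption: the window is centred on each observation.
reference = ser.rolling(window=35, center=True, min_periods=1).mean()

# Anomaly: deviation of each observation from the reference.
anomaly = ser - reference
```

In this sketch the largest anomaly falls on the day of the artificial spike, because the slow seasonal cycle is almost entirely absorbed by the reference.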

pytesmo.time_series.anomaly.calc_climatology(Ser, moving_avg_orig=5, moving_avg_clim=30, median=False, timespan=None, fill=nan, wraparound=False, respect_leap_years=False, interpolate_leapday=False, fillna=True, min_obs_orig=1, min_obs_clim=1)

Calculates the climatology of a data set.

Parameters

Ser (pandas.Series (index must be a DatetimeIndex or julian date)) – Input time series.
moving_avg_orig (float, optional) – Size [days] of the moving-average window applied to the input Series (gap filling, short-term rainfall correction). Default: 5
moving_avg_clim (float, optional) – Size [days] of the moving-average window applied to the calculated climatology (long-term event correction). Default: 30
median (boolean, optional) – If set to True, the climatology is based on median conditions.
timespan ([timespan_from, timespan_to], datetime.datetime(y,m,d), optional) – Set this to calculate the climatology based on a subset of the input Series
fill (float or int, optional) – Fill value to use for days on which no climatology exists
wraparound (boolean, optional) – If set, the climatology is wrapped around at the edges before the second moving average (long-term event correction) is applied.
respect_leap_years (boolean, optional) – If set, leap years are respected during the calculation of the climatology. Default: False
fillna (boolean, optional) – If set, NaN values are filled by the moving average used for the calculation of the climatology.
min_obs_orig (int) – Minimum observations required to give a valid output in the first moving average applied on the input series
min_obs_clim (int) – Minimum observations required to give a valid output in the second moving average applied on the calculated climatology

Returns

climatology – Series containing the calculated climatology. Always has 366 values, behaving like a leap year.

Return type

pandas.Series
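The two smoothing steps described by moving_avg_orig and moving_avg_clim can be sketched with pandas as follows. This is only an approximation of the idea under stated assumptions: daily data without gaps (so a window of N observations equals N days), plain day-of-year numbers without any leap-year handling, no wraparound, and the mean rather than the median. It is not the code used by calc_climatology.

```python
import numpy as np
import pandas as pd

# Four years of synthetic daily data (2008 is a leap year).
index = pd.date_range("2008-01-01", "2011-12-31", freq="D")
doy = index.dayofyear.to_numpy()
ser = pd.Series(np.cos(2 * np.pi * doy / 366.0), index=index)

# Step 1: short moving average on the input series (cf. moving_avg_orig=5).
smoothed = ser.rolling(window=5, center=True, min_periods=1).mean()

# Step 2: average all values that share the same day of year.
clim = smoothed.groupby(smoothed.index.dayofyear).mean()

# Step 3: make sure the result always spans days 1..366.
clim = clim.reindex(np.arange(1, 367))

# Step 4: long moving average on the climatology (cf. moving_avg_clim=30).
clim = clim.rolling(window=30, center=True, min_periods=1).mean()
```

The result is indexed by day of year from 1 to 366, which matches the index expected by the climatology argument of calc_anomaly.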

pytesmo.time_series.filtering module

Created on Oct 16, 2013

@author: Christoph Paulik christoph.paulik@geo.tuwien.ac.at

pytesmo.time_series.filtering.moving_average(Ser, window_size=1, fillna=False, min_obs=1)

Applies a moving-average (box) filter to an input time series.

Parameters

Ser (pandas.Series (index must be a DatetimeIndex or julian date)) – Input time series.
window_size (float, optional) – Size [days] of the moving-average window applied to the input Series. Default: 1
fillna (bool, optional) – If set, NaN values at the center of the window are filled with the moving average.
min_obs (int) – Minimum number of observations required for a valid moving average.

Returns

Ser – moving-average filtered time series

Return type

pandas.Series
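Because the window is given in days rather than in number of samples, the filter also works on irregularly sampled data. The following pure-Python sketch shows that idea; box_average is a hypothetical helper written for this illustration, and it assumes a window centred on each timestamp with inclusive bounds. It is quadratic in the number of samples and is not the library's Cython implementation.

```python
import math


def box_average(values, jd, window=1.0, min_obs=1):
    """Hypothetical helper: centred box average over a window given in days."""
    half = window / 2.0
    out = []
    for t in jd:
        # Collect valid observations within +/- half a window of t.
        inside = [v for v, s in zip(values, jd)
                  if abs(s - t) <= half and not math.isnan(v)]
        if len(inside) >= min_obs:
            out.append(sum(inside) / len(inside))
        else:
            # Too few observations for a valid average.
            out.append(float("nan"))
    return out


jd = [0.0, 1.0, 2.0, 5.0]        # irregular sampling, in days
values = [1.0, 2.0, 3.0, 10.0]
smoothed = box_average(values, jd, window=2.0)
# The isolated sample at day 5 has no neighbours inside its window.
strict = box_average(values, jd, window=2.0, min_obs=2)
```

With min_obs=2 the isolated sample at day 5 becomes NaN, since it is the only observation inside its window.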

pytesmo.time_series.filters module

Created on Oct 16, 2013

Fast cython functions for calculating various filters

@author: Christoph Paulik christoph.paulik@geo.tuwien.ac.at

pytesmo.time_series.filters.boxcar_filter(ndarray in_data, ndarray in_jd, float window=1, double nan=-999999.0, bool fillna=0, int min_obs=1)

Calculates a filtered time series using a boxcar filter, i.e. a moving average.

Parameters

in_data (double numpy.array) – input data
in_jd (double numpy.array) – julian dates of input data
window (float) – size of the boxcar window over which values are averaged, in the same units as in_jd (days)
nan (double) – nan values to exclude from calculation

pytesmo.time_series.filters.exp_filter(ndarray in_data, ndarray in_jd, int ctime=10, double nan=-999999.0)

Calculates an exponentially smoothed time series using an iterative algorithm.

Parameters

in_data (double numpy.array) – input data
in_jd (double numpy.array) – julian dates of input data
ctime (int) – characteristic time used for calculating the weight
nan (double) – nan values to exclude from calculation
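An iterative exponential filter for irregularly spaced observations is often written as a recursion with a gain term that depends on the time gap and the characteristic time. The sketch below shows one such commonly used recursion. It is an assumption that exp_filter follows this exact formulation, and exp_smooth is a hypothetical name used only here. NaN handling is omitted.

```python
import math


def exp_smooth(values, jd, ctime=10.0):
    """Hypothetical helper: recursive exponential filter on irregular dates."""
    out = []
    gain = 1.0
    last_val = None
    last_t = None
    for v, t in zip(values, jd):
        if last_val is None:
            # Initialise the filter with the first observation.
            last_val = v
        else:
            # Older information decays with the time gap relative to ctime.
            decay = math.exp(-(t - last_t) / ctime)
            gain = gain / (gain + decay)
            last_val = last_val + gain * (v - last_val)
        last_t = t
        out.append(last_val)
    return out


constant = exp_smooth([1.0, 1.0, 1.0], [0.0, 1.0, 2.0])
step = exp_smooth([0.0, 1.0], [0.0, 10.0], ctime=10.0)
```

A constant input is returned unchanged, while a step in the input is followed only partially; a larger ctime gives a smoother, slower response.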

pytesmo.time_series.grouping module

This module provides grouping functions that can be used together with pandas to create less common time groupings, e.g. dekadal products, where there are three products per month with timestamps on the 10th, the 20th and the last day of the month.

pytesmo.time_series.grouping.group_by_day_bin(df, bins=[1, 11, 21, 32], start=False, dtindex=None)

Calculates time groups for a given date range. With the default bins, the groups cover days 1-10, 11-20 and 21 to the last day of each month.

Parameters

df (pandas.DataFrame) – DataFrame with a DatetimeIndex for which the grouping should be done
bins (list, optional) – bins in day of the month, default is for dekadal grouping
start (boolean, optional) – if set to True, the start of the bin will be the timestamp for each observation
dtindex (pandas.DatetimeIndex, optional) – precomputed DatetimeIndex that should be used for resulting groups, useful for processing of numerous datasets since it does not have to be computed for every call

Returns

grouped (pandas.core.groupby.DataFrameGroupBy) – DataFrame groupby object according to the day bins. Functions like sum() or mean() can be called on this object to get the desired aggregation.
dtindex (pandas.DatetimeIndex) – returned so that it can be reused if possible
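The default bins can be reproduced with numpy.digitize on the day of the month. The sketch below shows the grouping idea with plain pandas and NumPy; it does not call group_by_day_bin, and it groups by (year, month, bin number) instead of building the representative timestamps that the function returns.

```python
import numpy as np
import pandas as pd

index = pd.date_range("2014-01-01", "2014-02-28", freq="D")
df = pd.DataFrame({"value": np.ones(len(index))}, index=index)

bins = [1, 11, 21, 32]
# np.digitize returns 1, 2 or 3 for days 1-10, 11-20 and 21-end of month.
dekad = np.digitize(df.index.day.to_numpy(), bins)

# One group per year, month and dekad.
grouped = df.groupby([df.index.year, df.index.month, dekad])
counts = grouped.sum()
```

Since every day carries the value 1, counts holds the number of days per dekad, which shows that the third bin absorbs the varying month length.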

pytesmo.time_series.grouping.grouped_dates_between(start_date, end_date, bins=[1, 11, 21, 32], start=False)

Returns all dates that represent a bin between a start and an end date. See the tests for an example.

Parameters

start_date (date) – start date
end_date (date) – end date
bins (list, optional) – bin start values as days of the month, e.g. [0,11,21] defines two bins, one with values 0<=x<11 and a second one with 11<=x<21
start (boolean, optional) – if True, the start of each bin is used as the representative date

Returns

tstamps – list of representative dates between start and end date

Return type

list of datetimes
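For the default dekadal bins with start=False, the representative dates are the 10th, the 20th and the last day of each month. The standard-library sketch below lists these dates for a date range. dekad_end_dates is a hypothetical helper, and treating both bounds as inclusive is an assumption about the edge behaviour, not something stated above.

```python
import calendar
from datetime import date


def dekad_end_dates(start_date, end_date):
    """Hypothetical helper: end dates of all dekads within [start, end]."""
    stamps = []
    year, month = start_date.year, start_date.month
    while (year, month) <= (end_date.year, end_date.month):
        # monthrange returns (weekday of first day, number of days).
        last_day = calendar.monthrange(year, month)[1]
        for day in (10, 20, last_day):
            candidate = date(year, month, day)
            if start_date <= candidate <= end_date:
                stamps.append(candidate)
        # Advance to the next month, rolling over the year if needed.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return stamps


stamps = dekad_end_dates(date(2014, 2, 5), date(2014, 3, 15))
```

For this range the helper yields 10, 20 and 28 February and 10 March 2014.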

pytesmo.time_series.grouping.grp_to_datetimeindex(grps, bins, dtindex, start=False)

Makes a DatetimeIndex that holds, for each entry, the timestamp of the beginning or end of the bin this entry belongs to.

Parameters

grps (numpy.array) – group numbers made by np.digitize(data, bins)
bins (list) – bin start values, e.g. [0,11,21] defines two bins, one with values 0<=x<11 and a second one with 11<=x<21
dtindex (pandas.DatetimeIndex) – same length as grps, gives the basis datetime for each group
start (boolean, optional) – if set to True, the start of the bin will be the timestamp for each observation

Returns

grpdt – DatetimeIndex where every date is the end of the bin that the datetime in the input dtindex belongs to

Return type

pandas.DatetimeIndex

pytesmo.time_series.plotting module

Created on Mar 7, 2014

Plot anomalies around climatology using colors

@author: Christoph Paulik christoph.paulik@geo.tuwien.ac.at

pytesmo.time_series.plotting.plot_clim_anom(df, clim=None, axes=None, markersize=0.75, mfc='0.3', mec='0.3', clim_color='0.0', clim_linewidth=0.5, clim_linestyle='-', pos_anom_color='#799ADA', neg_anom_color='#FD8086', anom_linewidth=0.2, add_titles=True)

Takes a pandas DataFrame, calculates the climatology and anomaly of each column, and plots them.

Parameters

df (pandas.DataFrame) –
clim (pandas.DataFrame, optional) – If given, these climatologies are used; otherwise the climatologies are calculated. This DataFrame must have the same number of columns and the same column names as df. Each climatology must have the day of year (doy) as index.
axes (list of matplotlib.Axes, optional) – List of axes on which each column should be plotted. If not given, a standard layout is generated.
markersize (float, optional) – size of the markers for the datapoints
mfc (matplotlib color, optional) – markerfacecolor, color of the marker face
mec (matplotlib color, optional) – markeredgecolor
clim_color (matplotlib color, optional) – color of the climatology
clim_linewidth (float, optional) – linewidth of the climatology
clim_linestyle (string, optional) – linestyle of the climatology
pos_anom_color (matplotlib color, optional) – color of the positive anomaly
neg_anom_color (matplotlib color, optional) – color of the negative anomaly
anom_linewidth (float, optional) – linewidth of the anomaly lines
add_titles (boolean, optional) – If set, each subplot will have its column name as title. Default: True

Returns

Figure (matplotlib.Figure) – returned only if no axes were given
axes (list of matplotlib.Axes) – returned only if no axes were given
