pytesmo.df_metrics module

This module contains wrappers for the functions in pytesmo.metrics that accept a pandas.DataFrame instead of individual numpy.arrays. If the DataFrame has more columns than the function has input parameters, the function is applied to every pair of columns (or to every triple, for functions that take three inputs).
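The pairwise application and the ‘_and_’ naming of the results can be illustrated with a minimal standard-library sketch. The helper pairwise_metric and the stand-in bias function are hypothetical and are not pytesmo's implementation; the columns are modelled as a plain dict of lists rather than a pandas.DataFrame.

```python
from collections import namedtuple
from itertools import combinations
from statistics import fmean


def bias(x, y):
    """Stand-in two-argument metric: difference of the means."""
    return fmean(x) - fmean(y)


def pairwise_metric(columns, metric):
    """Apply a two-argument metric to every pair of columns.

    `columns` maps column names to equally long lists of numbers.
    The result is a namedtuple whose fields are named '<a>_and_<b>'.
    """
    pairs = list(combinations(columns, 2))
    names = [f"{a}_and_{b}" for a, b in pairs]
    Result = namedtuple("Result", names)
    return Result(*(metric(columns[a], columns[b]) for a, b in pairs))


data = {
    "ref": [1.0, 2.0, 3.0],
    "k1": [2.0, 3.0, 4.0],
    "k2": [0.0, 1.0, 2.0],
}
result = pairwise_metric(data, bias)
print(result._fields)
print(result.ref_and_k1, result.ref_and_k2, result.k1_and_k2)
```

With three columns and a two-argument metric, three pairs are evaluated, and each value is read from the result by its combined column name.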

pytesmo.df_metrics.RSS(df)

Wrapper to call pytesmo.metrics.RSS() on a dataframe

Parameters:

df (pd.DataFrame) – DataFrame for whose column combinations the metric is evaluated.

Returns:

  • result (namedtuple) – Metric values for the different combinations. Member names are df’s column names separated by ‘_and_’.

  • See also pytesmo.metrics.RSS() docstring.

pytesmo.df_metrics.bias(df)

Wrapper to call pytesmo.metrics.bias() on a dataframe

Parameters:

df (pd.DataFrame) – DataFrame for whose column combinations the metric is evaluated.

Returns:

  • result (namedtuple) – Metric values for the different combinations. Member names are df’s column names separated by ‘_and_’.

  • See also pytesmo.metrics.bias() docstring.

pytesmo.df_metrics.kendall_tau(df)

Wrapper to call pytesmo.metrics.kendall_tau() on a dataframe

Parameters:

df (pd.DataFrame) – DataFrame for whose column combinations the metric is evaluated.

Returns:

  • result (namedtuple) – Metric values for the different combinations. Member names are df’s column names separated by ‘_and_’.

  • See also pytesmo.metrics.kendall_tau() docstring.

pytesmo.df_metrics.kendalltau(df)

Wrapper for scipy.stats.kendalltau

Returns:

result – Metric values; each element is named after the df columns it was computed from, joined by ‘_and_’.

Return type:

namedtuple

See also

pytesmo.metrics.kendalltau, scipy.stats.kendalltau

pytesmo.df_metrics.msd(df)

Wrapper to call pytesmo.metrics.msd() on a dataframe

Parameters:

df (pd.DataFrame) – DataFrame for whose column combinations the metric is evaluated.

Returns:

  • result (namedtuple) – Metric values for the different combinations. Member names are df’s column names separated by ‘_and_’.

  • See also pytesmo.metrics.msd() docstring.

pytesmo.df_metrics.mse(df)

Deprecated: use pytesmo.df_metrics.msd() and the functions for the individual components instead, or pytesmo.df_metrics.msd_decomposition() for the old functionality with better performance.

Mean square error (MSE) as a decomposition of the RMSD into individual error components

Returns:

result – Metric values; each element is named after the df columns it was computed from, joined by ‘_and_’.

Return type:

namedtuple

See also

pytesmo.metrics.mse

pytesmo.df_metrics.mse_bias(df)

Wrapper to call pytesmo.metrics.mse_bias() on a dataframe

Parameters:

df (pd.DataFrame) – DataFrame for whose column combinations the metric is evaluated.

Returns:

  • result (namedtuple) – Metric values for the different combinations. Member names are df’s column names separated by ‘_and_’.

  • See also pytesmo.metrics.mse_bias() docstring.

pytesmo.df_metrics.mse_corr(df)

Wrapper to call pytesmo.metrics.mse_corr() on a dataframe

Parameters:

df (pd.DataFrame) – DataFrame for whose column combinations the metric is evaluated.

Returns:

  • result (namedtuple) – Metric values for the different combinations. Member names are df’s column names separated by ‘_and_’.

  • See also pytesmo.metrics.mse_corr() docstring.

pytesmo.df_metrics.mse_decomposition(df)

Mean square error (MSE) and decomposition of the MSE into individual error components.

Returns:

result – Metric values; each element is named after the df columns it was computed from, joined by ‘_and_’.

Return type:

namedtuple
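
As a rough illustration of what such a decomposition looks like for a single pair of columns, the sketch below splits the MSE into a correlation, a bias and a variance term. It assumes the widely used decomposition with population (ddof=0) statistics; the formulas are not taken from the pytesmo source, and mse_decomposition_sketch is a hypothetical helper.

```python
from math import sqrt
from statistics import fmean


def mse_decomposition_sketch(x, y):
    """Return (mse, mse_corr, mse_bias, mse_var) for two sequences."""
    n = len(x)
    mx, my = fmean(x), fmean(y)
    # Population (ddof=0) standard deviations and covariance.
    sx = sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = sqrt(sum((b - my) ** 2 for b in y) / n)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    r = cov / (sx * sy)

    mse = sum((a - b) ** 2 for a, b in zip(x, y)) / n
    mse_corr = 2 * sx * sy * (1 - r)  # share caused by imperfect correlation
    mse_bias = (mx - my) ** 2         # share caused by the mean difference
    mse_var = (sx - sy) ** 2          # share caused by differing variability
    return mse, mse_corr, mse_bias, mse_var


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.5, 1.0, 3.5, 5.0, 4.0]
mse, mse_corr, mse_bias, mse_var = mse_decomposition_sketch(x, y)
print(mse, mse_corr + mse_bias + mse_var)
```

Under these assumptions the three components add up to the MSE, up to floating-point rounding.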

pytesmo.df_metrics.mse_var(df)

Wrapper to call pytesmo.metrics.mse_var() on a dataframe

Parameters:

df (pd.DataFrame) – DataFrame for whose column combinations the metric is evaluated.

Returns:

  • result (namedtuple) – Metric values for the different combinations. Member names are df’s column names separated by ‘_and_’.

  • See also pytesmo.metrics.mse_var() docstring.

pytesmo.df_metrics.n_combinations(iterable, n, must_include=None, permutations=False)

Create possible combinations of an input iterable.

Parameters:
  • iterable (Iterable) – Elements from this iterable are combined.

  • n (int) – Number of elements per combination.

  • must_include (Iterable, optional (default: None)) – One or more element(s) of iterable that MUST be in each combination.

  • permutations (bool, optional (default: False)) – If True, create permutations of n elements, i.e. the order matters: e.g. AB -> AB, BA. If False, the output combinations will be sorted.

Returns:

combs (iterable) – The possible combinations of n elements.
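
The documented parameters can be mimicked with itertools, as in the sketch below. n_combinations_sketch is a hypothetical re-creation and not pytesmo's code. In particular, how several must_include elements are handled (all required versus any one of them) is an assumption here, so the example passes a single element only.

```python
from itertools import combinations, permutations as iter_permutations


def n_combinations_sketch(iterable, n, must_include=None, permutations=False):
    """Hypothetical re-creation of the documented behaviour."""
    items = list(iterable)
    if permutations:
        # Order matters: AB and BA are both returned.
        combs = list(iter_permutations(items, n))
    else:
        # Sorting the input first yields sorted combinations.
        combs = list(combinations(sorted(items), n))
    if must_include is not None:
        required = set(must_include)
        # Assumption: every required element has to appear in a combination.
        combs = [c for c in combs if required.issubset(c)]
    return combs


pairs = n_combinations_sketch(["A", "B", "C"], 2)
ordered_pairs = n_combinations_sketch(["A", "B"], 2, permutations=True)
with_a = n_combinations_sketch(["A", "B", "C"], 2, must_include=["A"])
print(pairs, ordered_pairs, with_a)
```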

pytesmo.df_metrics.nash_sutcliffe(df)

Wrapper to call pytesmo.metrics.nash_sutcliffe() on a dataframe

Parameters:

df (pd.DataFrame) – DataFrame for whose column combinations the metric is evaluated.

Returns:

  • result (namedtuple) – Metric values for the different combinations. Member names are df’s column names separated by ‘_and_’.

  • See also pytesmo.metrics.nash_sutcliffe() docstring.

pytesmo.df_metrics.nrmsd(df)

Wrapper to call pytesmo.metrics.nrmsd() on a dataframe

Parameters:

df (pd.DataFrame) – DataFrame for whose column combinations the metric is evaluated.

Returns:

  • result (namedtuple) – Metric values for the different combinations. Member names are df’s column names separated by ‘_and_’.

  • See also pytesmo.metrics.nrmsd() docstring.

pytesmo.df_metrics.nwise_apply(df, method, n=2, comm=False, as_df=False, ds_names=True, must_include=None, **method_kwargs)

Compute given method for column combinations of a data frame, excluding NA/null values.

Parameters:
  • df (pd.DataFrame) – Input data, method will be applied to combinations of columns of this df.

  • method (function) – method to apply to each column pair. Has to take 2 input arguments of type numpy.array and return one value or tuple of values

  • n (int, optional (default: 2)) – Number of columns that are combined. The default n=2 is the same as the previous pairwise_apply() function.

  • comm (bool, optional (default: False)) – Set to True if the metric does NOT depend on the order of its input values. In that case unnecessary calculations are skipped and the results are simply copied where necessary (faster).

  • as_df (bool, optional (default: False)) – Return matrix structure, same as for previous pairwise_apply(), only available for n=2. By default, the return value will be a list of ordered dicts.

  • ds_names (bool, optional (default: True)) – Use the column names of df to identify the dataset instead of using their index.

  • must_include (list, optional (default: None)) – The index of one or multiple columns in df that MUST be part of each combination that is processed.

  • method_kwargs – Keyword arguments that are passed to method.

Returns:

results

Return type:

pd.DataFrame or dict or tuple
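
The idea of applying a method to column combinations while excluding NA values can be sketched with the standard library. This is an illustration under assumptions, not the library's implementation: nwise_sketch, drop_incomplete_rows and mean_abs_diff are hypothetical names, NaN rows are assumed to be dropped separately for each combination, and the result is a plain dict rather than one of the documented return types.

```python
import math
from itertools import combinations


def drop_incomplete_rows(*columns):
    """Keep only the rows in which no column holds NaN."""
    rows = [row for row in zip(*columns) if not any(math.isnan(v) for v in row)]
    # Transpose back into one list per column.
    return [list(col) for col in zip(*rows)]


def nwise_sketch(columns, method, n=2):
    """Apply `method` to every n-column combination after dropping NaN rows.

    Returns a dict keyed by tuples of column names. Combinations without
    any complete row are not handled in this sketch.
    """
    results = {}
    for names in combinations(columns, n):
        clean = drop_incomplete_rows(*(columns[name] for name in names))
        results[names] = method(*clean)
    return results


def mean_abs_diff(x, y):
    """Stand-in two-argument metric."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


nan = float("nan")
data = {
    "a": [1.0, 2.0, nan, 4.0],
    "b": [1.5, nan, 3.0, 5.0],
    "c": [1.0, 2.0, 3.0, 4.0],
}
out = nwise_sketch(data, mean_abs_diff)
print(out)
```

Each pair uses only the rows that are complete for that pair, so the pair ("a", "b") is evaluated on two rows while ("a", "c") is evaluated on three.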

pytesmo.df_metrics.pairwise_apply(df, method, comm=False)

Compute given method pairwise for all columns, excluding NA/null values

Parameters:
  • df (pd.DataFrame) – input data, method will be applied to each column pair

  • method (function) – method to apply to each column pair. has to take 2 input arguments of type np.array and return one value or tuple of values

  • comm (bool, optional (default: False)) – If True, also fills the lower part of the results matrix.

Returns:

results

Return type:

pd.DataFrame
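
A small sketch of how a commutative metric allows the lower part of a results matrix to be filled by copying. It assumes that the diagonal stays empty (NaN) and that, without comm, both orderings of a pair are computed separately. pairwise_matrix_sketch and sum_sq_diff are hypothetical names, and the matrix is a nested dict instead of a pd.DataFrame.

```python
import math


def pairwise_matrix_sketch(columns, method, comm=False):
    """Return (matrix, number_of_method_calls) for all column pairs."""
    names = list(columns)
    # Start with a matrix of NaN; the diagonal is never filled in this sketch.
    matrix = {a: {b: math.nan for b in names} for a in names}
    calls = 0
    for i, a in enumerate(names):
        for j, b in enumerate(names):
            if i == j:
                continue
            if comm and j < i:
                # Already copied from the mirrored entry.
                continue
            value = method(columns[a], columns[b])
            calls += 1
            matrix[a][b] = value
            if comm:
                matrix[b][a] = value
    return matrix, calls


def sum_sq_diff(x, y):
    """Stand-in metric that does not depend on the argument order."""
    return sum((p - q) ** 2 for p, q in zip(x, y))


data = {"a": [1.0, 2.0], "b": [2.0, 4.0], "c": [0.0, 0.0]}
full, n_full = pairwise_matrix_sketch(data, sum_sq_diff)
mirrored, n_mirrored = pairwise_matrix_sketch(data, sum_sq_diff, comm=True)
print(n_full, n_mirrored)
```

For an order-independent metric both calls produce the same matrix, but the mirrored variant needs only half as many evaluations.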

pytesmo.df_metrics.pearson_r(df)

Wrapper to call pytesmo.metrics.pearson_r() on a dataframe

Parameters:

df (pd.DataFrame) – DataFrame for whose column combinations the metric is evaluated.

Returns:

  • result (namedtuple) – Metric values for the different combinations. Member names are df’s column names separated by ‘_and_’.

  • See also pytesmo.metrics.pearson_r() docstring.

pytesmo.df_metrics.pearsonr(df)

Wrapper for scipy.stats.pearsonr

Returns:

result – Metric values; each element is named after the df columns it was computed from, joined by ‘_and_’.

Return type:

namedtuple

See also

pytesmo.metrics.pearsonr, scipy.stats.pearsonr

pytesmo.df_metrics.rmsd(df)

Wrapper to call pytesmo.metrics.rmsd() on a dataframe

Parameters:

df (pd.DataFrame) – DataFrame for whose column combinations the metric is evaluated.

Returns:

  • result (namedtuple) – Metric values for the different combinations. Member names are df’s column names separated by ‘_and_’.

  • See also pytesmo.metrics.rmsd() docstring.

pytesmo.df_metrics.spearman_r(df)

Wrapper to call pytesmo.metrics.spearman_r() on a dataframe

Parameters:

df (pd.DataFrame) – DataFrame for whose column combinations the metric is evaluated.

Returns:

  • result (namedtuple) – Metric values for the different combinations. Member names are df’s column names separated by ‘_and_’.

  • See also pytesmo.metrics.spearman_r() docstring.

pytesmo.df_metrics.spearmanr(df)

Wrapper for scipy.stats.spearmanr

Returns:

result – Metric values; each element is named after the df columns it was computed from, joined by ‘_and_’.

Return type:

namedtuple

See also

pytesmo.metrics.spearmanr, scipy.stats.spearmanr

pytesmo.df_metrics.tcol_error(df)

Deprecated: use pytesmo.df_metrics.tcol_metrics() instead.

Triple collocation error estimate, applied to triples of columns of the passed data frame.

Returns:

  • triple_collocation_error_x (namedtuple) – Error for the first dataset

  • triple_collocation_error_y (namedtuple) – Error for the second dataset

  • triple_collocation_error_z (namedtuple) – Error for the third dataset

See also

pytesmo.metrics.tcol_error

pytesmo.df_metrics.tcol_metrics(df, ref_ind=0)

Triple Collocation metrics applied to triples of dataframe columns.

Parameters:
  • df (pd.DataFrame) – Contains the input values as time series in the df columns

  • ref_ind (int or None, optional (default: 0)) – The index of the column in df that contains the reference data set. If None is passed, we use the first column of each triple as the reference, otherwise only triples that contain the reference dataset are considered during processing.

Returns:

  • snr (namedtuple) – signal-to-noise (variance) ratio [dB] from the named columns.

  • err_std_dev (namedtuple) – SCALED error standard deviation from the named columns

  • beta (namedtuple) – Scaling coefficients (i_scaled = i * beta_i)
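
To give an idea of what is estimated for one triple of columns, the sketch below applies triple collocation in covariance notation to synthetic data. It is a simplified illustration and not pytesmo's implementation: tcol_sketch and cov are hypothetical helpers, no rescaling to a reference dataset is performed (so beta is not computed), and the exact estimators used by the library may differ.

```python
from math import log10, sqrt
from statistics import fmean


def cov(a, b):
    """Population covariance of two equally long sequences."""
    ma, mb = fmean(a), fmean(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / len(a)


def tcol_sketch(x, y, z):
    """Covariance-notation triple collocation for one triple.

    Returns (snr_db, err_std), each a tuple with one entry for x, y and z.
    Errors are expressed in each dataset's own data space.
    """
    # Signal variance seen by each dataset, from the cross-covariances.
    sig = (
        cov(x, y) * cov(x, z) / cov(y, z),
        cov(y, x) * cov(y, z) / cov(x, z),
        cov(z, x) * cov(z, y) / cov(x, y),
    )
    tot = (cov(x, x), cov(y, y), cov(z, z))
    err_var = tuple(t - s for t, s in zip(tot, sig))
    snr_db = tuple(10 * log10(s / e) for s, e in zip(sig, err_var))
    err_std = tuple(sqrt(e) for e in err_var)
    return snr_db, err_std


# Synthetic data: a common signal plus mutually orthogonal "errors",
# built from Walsh-type patterns so that all cross terms vanish exactly.
t = [1, 1, 1, 1, -1, -1, -1, -1]
ex = [1, 1, -1, -1, 1, 1, -1, -1]
ey = [1, -1, 1, -1, 1, -1, 1, -1]
ez = [1, -1, -1, 1, 1, -1, -1, 1]
x = [s + 0.5 * e for s, e in zip(t, ex)]
y = [s + 0.25 * e for s, e in zip(t, ey)]
z = [s + 0.125 * e for s, e in zip(t, ez)]

snr_db, err_std = tcol_sketch(x, y, z)
print(snr_db, err_std)
```

Because the synthetic errors are exactly uncorrelated with the signal and with each other, the estimated error standard deviations recover the error amplitudes 0.5, 0.25 and 0.125 used to build the data.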

pytesmo.df_metrics.tcol_snr(df, ref_ind=0)

Deprecated: use pytesmo.df_metrics.tcol_metrics() instead.

pytesmo.df_metrics.ubrmsd(df)

Wrapper to call pytesmo.metrics.ubrmsd() on a dataframe

Parameters:

df (pd.DataFrame) – DataFrame for whose column combinations the metric is evaluated.

Returns:

  • result (namedtuple) – Metric values for the different combinations. Member names are df’s column names separated by ‘_and_’.

  • See also pytesmo.metrics.ubrmsd() docstring.