pytesmo.io.ismn package

Submodules

pytesmo.io.ismn.interface module

Created on Aug 5, 2013

@author: Christoph Paulik Christoph.Paulik@geo.tuwien.ac.at

exception pytesmo.io.ismn.interface.ISMNError[source]

Bases: Exception

args
with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

class pytesmo.io.ismn.interface.ISMN_Interface(path_to_data, network=None)[source]

Bases: object

Provides an interface to ISMN data downloaded from the ISMN website.

Upon initialization, the class collects metadata from all files in path_to_data and saves it as a numpy file in the folder path_to_data/python_metadata/. The first initialization can take a minute or so if all ISMN data is present in path_to_data.

Parameters:
  • path_to_data (string) – filepath to unzipped ISMN data containing the Network folders
  • network (string or list, optional) – provide name of network to only load the given network
Raises:

ISMNError – if given network was not found in path_to_data
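
The caching behaviour described above (scan once, store the result under path_to_data/python_metadata/, reuse it on later initializations) can be sketched with the standard library alone. This is a minimal illustration of the pattern, not pytesmo's implementation: the helper names load_or_collect and scan_files, the cache file name metadata.json and the JSON format are assumptions (the class itself stores a numpy file).

```python
import json
import os


def scan_files(path_to_data):
    """Hypothetical stand-in for the slow metadata scan: list all files."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(path_to_data):
        # Skip the cache folder so the cache never indexes its own file.
        if os.path.basename(dirpath) == "python_metadata":
            continue
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            found.append(os.path.relpath(full_path, path_to_data))
    return sorted(found)


def load_or_collect(path_to_data):
    """Return (metadata, from_cache); scan only when no cache file exists."""
    cache_dir = os.path.join(path_to_data, "python_metadata")
    cache_file = os.path.join(cache_dir, "metadata.json")  # assumed name
    if os.path.exists(cache_file):
        with open(cache_file) as fh:
            return json.load(fh), True
    metadata = scan_files(path_to_data)
    os.makedirs(cache_dir, exist_ok=True)
    with open(cache_file, "w") as fh:
        json.dump(metadata, fh)
    return metadata, False
```

In this sketch, deleting the python_metadata folder forces a fresh scan on the next call. Whether the real class behaves the same way after new data is added to path_to_data is not stated here.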

metadata

numpy.array – metadata array for all stations contained in the path given during initialization

grid

pygeogrids.grid.BasicGrid – Grid object used for finding nearest insitu station for given lon lat

find_nearest_station(lon, lat)[source]

find nearest station for given coordinates

find_nearest_station(lon, lat, return_distance=False)[source]

finds the nearest station available in downloaded data

Parameters:
  • lon (float) – Longitude of point
  • lat (float) – Latitude of point
  • return_distance (boolean, optional) – if True also distance is returned
Returns:

  • station (ISMN_station) – ISMN_station object
  • distance (float, optional) – distance to station in meters, measured in cartesian coordinates and not on a great circle. Should be OK for small distances
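
To see why a cartesian distance "should be OK for small distances", the sketch below compares the straight-line (chord) distance between two points with the great-circle distance along the surface. It assumes a spherical Earth with a mean radius of 6371 km, which may differ from the model used by the grid object; all function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000.0  # assumed mean radius of a spherical Earth


def to_cartesian(lon, lat, radius=EARTH_RADIUS_M):
    """Convert lon/lat in degrees to x, y, z on a sphere."""
    lon_r, lat_r = math.radians(lon), math.radians(lat)
    return (
        radius * math.cos(lat_r) * math.cos(lon_r),
        radius * math.cos(lat_r) * math.sin(lon_r),
        radius * math.sin(lat_r),
    )


def cartesian_distance(lon1, lat1, lon2, lat2):
    """Straight-line (chord) distance through the sphere, in meters."""
    return math.dist(to_cartesian(lon1, lat1), to_cartesian(lon2, lat2))


def great_circle_distance(lon1, lat1, lon2, lat2, radius=EARTH_RADIUS_M):
    """Distance along the surface (haversine formula), in meters."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp against rounding so asin never sees a value above 1.
    return 2 * radius * math.asin(min(1.0, math.sqrt(a)))
```

For two points a tenth of a degree apart the two distances agree to well below one part in a million. For points on opposite sides of the globe the chord is only twice the radius, while the surface distance is pi times the radius.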

get_dataset_ids(variable, min_depth=0, max_depth=0.1)[source]

returns a list of dataset ids that can be used to read a dataset directly through the read_ts function

get_min_max_obs_timestamps(variable='soil moisture', min_depth=None, max_depth=None)[source]

get minimum and maximum timestamps per station

Parameters:
  • variable (string, optional) –
    one of
    • 'soil moisture',
    • 'soil temperature',
    • 'soil suction',
    • 'precipitation',
    • 'air temperature',
    • 'field capacity',
    • 'permanent wilting point',
    • 'plant available water',
    • 'potential plant available water',
    • 'saturation',
    • 'silt fraction',
    • 'snow depth',
    • 'sand fraction',
    • 'clay fraction',
    • 'organic carbon',
    • 'snow water equivalent',
    • 'surface temperature',
    • 'surface temperature quality flag original'
  • min_depth (float, optional) – depth_from of variable has to be >= min_depth in order to be included.
  • max_depth (float, optional) – depth_to of variable has to be <= max_depth in order to be included.
Returns:

data – dataframe with multiindex Network Station and columns start_date and end_date

Return type:

pd.DataFrame

get_station(stationname, network=None)[source]

get ISMN_station object by station name

Parameters:
  • stationname (string) – name of station
  • network (string, optional) – network name, has to be used if stations belonging to different networks have the same name
Returns:

ISMN_station

Return type:

ISMN_station object

Raises:

ISMNError – if stationname was not found

list_networks()[source]

returns numpy.array of networks available through the interface

Returns: networks – unique network names available
Return type: numpy.array

list_stations(network=None)[source]

returns numpy.array of station names available through the interface

Parameters: network (string, optional) – if a network name is given, only stations belonging to that network are returned
Returns: stations – unique station names available
Return type: numpy.array

plot_station_locations(axes=None)[source]

plots available stations on a world map in Robinson projection. Only available if basemap is installed.

Parameters: axes (matplotlib.Axes, optional) – if given, the plot is drawn on these axes.
Returns:
  • fig (matplotlib.Figure) – created figure instance. If axes was given this will be None.
  • axes (matplotlib.Axes) – used axes instance.
Raises: ISMNError – if basemap is not installed

read_ts(idx)[source]

read a time series directly by the id

Parameters: idx (int) – index into self.metadata, ideally one of those returned by get_dataset_ids()
Returns: timeseries – the data that was read
Return type: pandas.DataFrame
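
get_dataset_ids() and read_ts() are meant to be used together. The helper below is a minimal sketch of that pattern. It accepts any object that offers the two methods with the signatures documented here. FakeInterface is a hypothetical stand-in so the example runs without ISMN data, and read_matching_series is not part of pytesmo.

```python
def read_matching_series(interface, variable, min_depth=0, max_depth=0.1):
    """Read every dataset of `variable` within the depth range.

    `interface` is any object offering get_dataset_ids() and read_ts()
    as documented above, for example an ISMN_Interface instance.
    """
    ids = interface.get_dataset_ids(
        variable, min_depth=min_depth, max_depth=max_depth)
    # Map each dataset id to the time series read for it.
    return {idx: interface.read_ts(idx) for idx in ids}


class FakeInterface:
    """Hypothetical stand-in so the sketch runs without ISMN data."""

    def get_dataset_ids(self, variable, min_depth=0, max_depth=0.1):
        return [3, 7] if variable == "soil moisture" else []

    def read_ts(self, idx):
        return "time series %d" % idx


series = read_matching_series(FakeInterface(), "soil moisture")
```

With a real interface object the dictionary values would be the pandas.DataFrame objects returned by read_ts().
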
stations_that_measure(variable)[source]

Goes through all stations and returns those that measure the specified variable

Parameters: variable (string) –

variable name, one of

  • 'soil moisture',
  • 'soil temperature',
  • 'soil suction',
  • 'precipitation',
  • 'air temperature',
  • 'field capacity',
  • 'permanent wilting point',
  • 'plant available water',
  • 'potential plant available water',
  • 'saturation',
  • 'silt fraction',
  • 'snow depth',
  • 'sand fraction',
  • 'clay fraction',
  • 'organic carbon',
  • 'snow water equivalent',
  • 'surface temperature',
  • 'surface temperature quality flag original'
Returns: ISMN_station
Return type: ISMN_station object

class pytesmo.io.ismn.interface.ISMN_station(metadata)[source]

Bases: object

Knows everything about the station, like which variables are measured there in which depths and in which files the data is stored. This is not completely true for the CEOP format since depth_from and depth_to are not easily knowable without parsing the whole file. For CEOP format depth_from and depth_to will only contain the phrase ‘multiple’ instead of the actual depth

Parameters: metadata (numpy.array) – part of the structured array from metadata_collector.collect_from_folder() which contains only fields for one station

network

string – network the time series belongs to

station

string – station name the time series belongs to

latitude

float – latitude of station

longitude

float – longitude of station

elevation

float – elevation of station

variables

numpy.array – variables measured at this station, each one of

  • ‘soil moisture’,
  • ‘soil temperature’,
  • ‘soil suction’,
  • ‘precipitation’,
  • ‘air temperature’,
  • ‘field capacity’,
  • ‘permanent wilting point’,
  • ‘plant available water’,
  • ‘potential plant available water’,
  • ‘saturation’,
  • ‘silt fraction’,
  • ‘snow depth’,
  • ‘sand fraction’,
  • ‘clay fraction’,
  • ‘organic carbon’,
  • ‘snow water equivalent’,
  • ‘surface temperature’,
  • ‘surface temperature quality flag original’

depth_from

numpy.array – shallower depth of layer the variable with same index was measured at

depth_to

numpy.array – deeper depth of layer the variable with same index was measured at

sensors

numpy.array – sensor names of variables

filenames

numpy.array – filenames in which the data is stored

get_variables()[source]

returns the variables measured at this station

get_depths(variable)[source]

get the depths in which a variable was measured at this station

get_sensors(variable, depth_from, depth_to)[source]

get the sensors for the given variable, depth combination

read_variable(variable, depth_from=None, depth_to=None, sensor=None)[source]

read the data for the given parameter combination

data_for_variable(variable, min_depth=None, max_depth=None)[source]

goes through all depth_from, depth_to, sensor combinations for the given variable and yields an ISMNTimeSeries for each match. If min_depth and/or max_depth were given, it only yields an ISMNTimeSeries if depth_from >= min_depth and/or depth_to <= max_depth.

Parameters:
  • variable (string) –

    variable to read one of

    • 'soil moisture',
    • 'soil temperature',
    • 'soil suction',
    • 'precipitation',
    • 'air temperature',
    • 'field capacity',
    • 'permanent wilting point',
    • 'plant available water',
    • 'potential plant available water',
    • 'saturation',
    • 'silt fraction',
    • 'snow depth',
    • 'sand fraction',
    • 'clay fraction',
    • 'organic carbon',
    • 'snow water equivalent',
    • 'surface temperature',
    • 'surface temperature quality flag original'
  • min_depth (float, optional) – depth_from of variable has to be >= min_depth in order to be included.
  • max_depth (float, optional) – depth_to of variable has to be <= max_depth in order to be included.
Returns:

time_series – ISMNTimeSeries object containing data and metadata

Return type:

iterator(pytesmo.io.ismn.readers.ISMNTimeSeries)
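
The depth conditions stated for min_depth and max_depth can be sketched as a small generator. This illustrates the documented rule only. The name matching_layers and the plain (depth_from, depth_to) tuples are assumptions of the sketch, whereas the real method yields ISMNTimeSeries objects.

```python
def matching_layers(layers, min_depth=None, max_depth=None):
    """Yield (depth_from, depth_to) pairs that pass the depth limits.

    A limit of None is not checked, mirroring the optional parameters.
    """
    for depth_from, depth_to in layers:
        if min_depth is not None and depth_from < min_depth:
            continue  # layer starts above the requested range
        if max_depth is not None and depth_to > max_depth:
            continue  # layer ends below the requested range
        yield depth_from, depth_to


layers = [(0.0, 0.05), (0.05, 0.1), (0.2, 0.5)]
shallow = list(matching_layers(layers, min_depth=0.0, max_depth=0.1))
```

Here shallow keeps the first two layers and drops (0.2, 0.5), because its depth_to exceeds max_depth.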

get_depths(variable)[source]

get depths at which the given variable was measured at this station

Parameters: variable (string) –

variable name, ideally one of those returned by get_variables(), or one of

  • 'soil moisture',
  • 'soil temperature',
  • 'soil suction',
  • 'precipitation',
  • 'air temperature',
  • 'field capacity',
  • 'permanent wilting point',
  • 'plant available water',
  • 'potential plant available water',
  • 'saturation',
  • 'silt fraction',
  • 'snow depth',
  • 'sand fraction',
  • 'clay fraction',
  • 'organic carbon',
  • 'snow water equivalent',
  • 'surface temperature',
  • 'surface temperature quality flag original'
Returns:
  • depth_from (numpy.array)
  • depth_to (numpy.array)

get_min_max_obs_timestamp(variable='soil moisture', min_depth=None, max_depth=None)[source]

goes through the filenames associated with a station and reads the date of the first and last observation to get an approximate time coverage of the station. This is just an overview. If gaps have to be detected, the complete file must be read.

Parameters:
  • variable (string, optional) –
    one of
    • 'soil moisture',
    • 'soil temperature',
    • 'soil suction',
    • 'precipitation',
    • 'air temperature',
    • 'field capacity',
    • 'permanent wilting point',
    • 'plant available water',
    • 'potential plant available water',
    • 'saturation',
    • 'silt fraction',
    • 'snow depth',
    • 'sand fraction',
    • 'clay fraction',
    • 'organic carbon',
    • 'snow water equivalent',
    • 'surface temperature',
    • 'surface temperature quality flag original'
  • min_depth (float, optional) – depth_from of variable has to be >= min_depth in order to be included.
  • max_depth (float, optional) – depth_to of variable has to be <= max_depth in order to be included.
Returns:

  • start_date (datetime)
  • end_date (datetime)

get_sensors(variable, depth_from, depth_to)[source]

get the sensors with which the variable was measured at the given depth

Parameters:
  • variable (string) –

    variable name, one of

    • 'soil moisture',
    • 'soil temperature',
    • 'soil suction',
    • 'precipitation',
    • 'air temperature',
    • 'field capacity',
    • 'permanent wilting point',
    • 'plant available water',
    • 'potential plant available water',
    • 'saturation',
    • 'silt fraction',
    • 'snow depth',
    • 'sand fraction',
    • 'clay fraction',
    • 'organic carbon',
    • 'snow water equivalent',
    • 'surface temperature',
    • 'surface temperature quality flag original'
  • depth_from (float) – shallower depth of layer the variable was measured at
  • depth_to (float) – deeper depth of layer the variable was measured at
Returns:

sensors – array of sensors found for the given combination of variable and depths

Return type:

numpy.array

Raises:

ISMNError – if no sensor was found for the given combination of variable and depths

get_variables()[source]

get a list of variables measured at this station

Returns: variables – array of variables measured at this station
Return type: numpy.array

read_variable(variable, depth_from=None, depth_to=None, sensor=None)[source]

reads the given variable from the file. Parameters are required only until all ambiguity is resolved. If there is only one depth for the given variable, only variable is required. If there are multiple depths, at least depth_from is required. If there are multiple depth_to possibilities for one variable-depth_from combination, depth_to also has to be specified. If two sensors measure the same variable at the same depth, the sensor also has to be specified.

Parameters:
  • variable (string) –

    variable to read one of

    • 'soil moisture',
    • 'soil temperature',
    • 'soil suction',
    • 'precipitation',
    • 'air temperature',
    • 'field capacity',
    • 'permanent wilting point',
    • 'plant available water',
    • 'potential plant available water',
    • 'saturation',
    • 'silt fraction',
    • 'snow depth',
    • 'sand fraction',
    • 'clay fraction',
    • 'organic carbon',
    • 'snow water equivalent',
    • 'surface temperature',
    • 'surface temperature quality flag original'
  • depth_from (float, optional) – shallower depth of layer the variable was measured at
  • depth_to (float, optional) – deeper depth of layer the variable was measured at
  • sensor (string, optional) – name of the sensor
Returns:

data – ISMNTimeSeries object containing the relevant metadata for the time series as well as a .data pointing to a pandas.DataFrame

Return type:

readers.ISMNTimeSeries

Raises:

ISMNError – if not all ambiguity was resolved by the given input parameters or if no data was found for the given input parameters
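
The rule that parameters are needed only until the request is unambiguous can be sketched as a filter over the available (depth_from, depth_to, sensor) combinations of one variable. This is an illustration under stated assumptions and not the library's code: resolve, AmbiguityError (standing in for ISMNError) and the sensor names are all hypothetical.

```python
class AmbiguityError(Exception):
    """Hypothetical stand-in for ISMNError in this sketch."""


def resolve(records, depth_from=None, depth_to=None, sensor=None):
    """Return the single (depth_from, depth_to, sensor) record that matches.

    Each parameter left as None places no constraint on the search.
    """
    wanted = (depth_from, depth_to, sensor)
    matches = [
        rec for rec in records
        if all(want is None or want == have
               for want, have in zip(wanted, rec))
    ]
    if not matches:
        raise AmbiguityError("no data found for the given parameters")
    if len(matches) > 1:
        raise AmbiguityError(
            "ambiguous request, %d candidates remain" % len(matches))
    return matches[0]


# Two sensors share the top layer, one sensor measures the second layer.
records = [
    (0.0, 0.05, "ThetaProbe-A"),
    (0.0, 0.05, "ThetaProbe-B"),
    (0.05, 0.1, "ThetaProbe-A"),
]
```

With these records, depth_from=0.05 alone identifies one series, while depth_from=0.0 needs the sensor name as well before a single record remains.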

pytesmo.io.ismn.metadata_collector module

Created on Aug 1, 2013

@author: Christoph Paulik christoph.paulik@geo.tuwien.ac.at

pytesmo.io.ismn.metadata_collector.collect_from_folder(rootdir)[source]

function walks the rootdir directory and looks for network folders and ISMN datafiles. It collects metadata for every file found and returns a numpy.ndarray of metadata

Parameters: rootdir (string) – root directory on the filesystem where the ISMN data was unzipped to
Returns: metadata – structured numpy array which contains the metadata for one file per row
Return type: numpy.ndarray
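
The directory walk can be sketched with os.walk. The sketch assumes that every top-level folder in rootdir is one network and that data files end in .stm. Both are assumptions made for illustration, as is the function name find_data_files. The real function additionally extracts metadata from each file.

```python
import os


def find_data_files(rootdir, extension=".stm"):
    """Return (network, filepath) pairs for every data file below rootdir.

    Assumes each top-level folder of rootdir is one network and that
    data files end in `extension`; both are assumptions of this sketch.
    """
    found = []
    for network in sorted(os.listdir(rootdir)):
        network_dir = os.path.join(rootdir, network)
        if not os.path.isdir(network_dir):
            continue  # ignore loose files next to the network folders
        for dirpath, _dirnames, filenames in os.walk(network_dir):
            for name in sorted(filenames):
                if name.endswith(extension):
                    found.append((network, os.path.join(dirpath, name)))
    return found
```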

pytesmo.io.ismn.readers module

Created on Jul 31, 2013

@author: Christoph Paulik christoph.paulik@geo.tuwien.ac.at

exception pytesmo.io.ismn.readers.ISMNTSError[source]

Bases: Exception

args
with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

class pytesmo.io.ismn.readers.ISMNTimeSeries(data)[source]

Bases: object

class that contains a time series of ISMN data read from one text file

network

string – network the time series belongs to

station

string – station name the time series belongs to

latitude

float – latitude of station

longitude

float – longitude of station

elevation

float – elevation of station

variable

list – variable measured

depth_from

list – shallower depth of layer the variable was measured at

depth_to

list – deeper depth of layer the variable was measured at

sensor

string – sensor name

data

pandas.DataFrame – data of the time series

plot(*args, **kwargs)[source]

wrapper for pandas.DataFrame.plot which adds a title to the plot and drops NaN values for plotting

Returns: ax – matplotlib axes of the plot
Return type: axes
Raises: ISMNTSError – if data attribute is not a pandas.DataFrame

exception pytesmo.io.ismn.readers.ReaderException[source]

Bases: Exception

args
with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

pytesmo.io.ismn.readers.get_format(filename)[source]

gets the file format from the length of the header and filename information

Parameters: filename (string) – path and name of file
Returns: methodname – name of method used to read the detected format
Return type: string
Raises: ReaderException – if filename or header parts do not fit one of the formats

pytesmo.io.ismn.readers.get_info_from_file(filename)[source]

reads the first line of the file and splits the filename. This can be used to construct the necessary metadata information for all ISMN formats.

Parameters: filename (string) – filename including path
Returns:
  • header_elements (list) – first line of file split into list
  • filename_elements (list) – filename without path split by _
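
A minimal sketch of the two splits described above, using only the standard library. The function name split_header_and_filename is hypothetical, and splitting the header on whitespace is an assumption, since the separator is not stated here.

```python
import os


def split_header_and_filename(filename):
    """Return (header_elements, filename_elements) for one file."""
    with open(filename) as fh:
        # Assumed: header fields are separated by whitespace.
        header_elements = fh.readline().split()
    # Drop the directory part, then split the bare name on underscores.
    filename_elements = os.path.basename(filename).split("_")
    return header_elements, filename_elements
```

Note that the last filename element still carries the file extension in this sketch.
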
pytesmo.io.ismn.readers.get_metadata(filename)[source]

reads ISMN metadata from any format

Parameters: filename (string) – path and name of file
Returns: metadata
Return type: dict

pytesmo.io.ismn.readers.get_metadata_ceop(filename)[source]

get metadata from ISMN textfiles in the format called CEOP Reference Data Format

Parameters: filename (string) – path and name of file
Returns: metadata – dictionary of metadata information
Return type: dict

pytesmo.io.ismn.readers.get_metadata_ceop_sep(filename)[source]

get metadata from ISMN textfiles in the format called Variables stored in separate files (CEOP formatted)

Parameters: filename (string) – path and name of file
Returns: metadata – dictionary of metadata information
Return type: dict

pytesmo.io.ismn.readers.get_metadata_header_values(filename)[source]

get metadata from ISMN textfiles in the format called Variables stored in separate files (Header + values)

Parameters: filename (string) – path and name of file
Returns: metadata – dictionary of metadata information
Return type: dict

pytesmo.io.ismn.readers.get_min_max_timestamp(filename)[source]

Determine the file type and get the minimum and maximum observation timestamp

pytesmo.io.ismn.readers.get_min_max_timestamp_ceop(filename)[source]

Get minimum and maximum observation timestamp from ceop format.

pytesmo.io.ismn.readers.get_min_max_timestamp_ceop_sep(filename)[source]

Get minimum and maximum observation timestamp from ceop_sep format.

pytesmo.io.ismn.readers.get_min_max_timestamp_header_values(filename)[source]

Get minimum and maximum observation timestamp from header values format.

pytesmo.io.ismn.readers.read_data(filename)[source]

reads ISMN data in any format

Parameters: filename (string) – path and name of file
Returns: timeseries
Return type: ISMNTimeSeries

pytesmo.io.ismn.readers.read_format_ceop(filename)[source]

Reads ISMN textfiles in the format called CEOP Reference Data Format

Parameters: filename (string) – path and name of file
Returns: time_series – ISMNTimeSeries object initialized with metadata and data from file
Return type: ISMNTimeSeries

pytesmo.io.ismn.readers.read_format_ceop_sep(filename)[source]

Reads ISMN textfiles in the format called Variables stored in separate files (CEOP formatted)

Parameters: filename (string) – path and name of file
Returns: time_series – ISMNTimeSeries object initialized with metadata and data from file
Return type: ISMNTimeSeries

pytesmo.io.ismn.readers.read_format_header_values(filename)[source]

Reads ISMN textfiles in the format called Variables stored in separate files (Header + values)

Parameters: filename (string) – path and name of file
Returns: time_series – ISMNTimeSeries object initialized with metadata and data from file
Return type: ISMNTimeSeries

pytesmo.io.ismn.readers.tail(f, lines=1, _buffer=4098)[source]

Tail a file and get X lines from the end

Parameters:
  • f (file like object) –
  • lines (int) – lines from the end of the file to read
  • _buffer (int) – buffer to use to step backwards in the file.

References

Found at http://stackoverflow.com/a/13790289/1314882
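
The idea of stepping backwards through a file in fixed-size chunks can be sketched as follows. This is an independent illustration of the technique, not the code of tail() or of the referenced answer. The name tail_lines and its buffer_size parameter are hypothetical, and the file is assumed to be opened in binary mode with lines of at least 1.

```python
import os


def tail_lines(f, lines=1, buffer_size=4096):
    """Return the last `lines` lines of a file opened in binary mode."""
    f.seek(0, os.SEEK_END)
    position = f.tell()
    data = b""
    # Step backwards one chunk at a time until enough line breaks are seen.
    # The count must exceed `lines` so the first returned line is complete.
    while position > 0 and data.count(b"\n") <= lines:
        step = min(buffer_size, position)
        position -= step
        f.seek(position)
        data = f.read(step) + data
    return data.splitlines()[-lines:]
```

Reading from the end avoids loading a long time series file completely when only the last observation is needed, for example to find the maximum timestamp.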

Module contents