# pytesmo.io package

## pytesmo.io.dataset_base module

Created on Mar 19, 2014

@author: Christoph Paulik christoph.paulik@geo.tuwien.ac.at

class pytesmo.io.dataset_base.DatasetImgBase(path, filename_templ='', sub_path=None, grid=None, exact_templ=True)

Bases: object

Dataset base class that implements basic functions and declares abstract methods that child classes have to implement.

Parameters
• path (string) – Path to dataset.

• filename_templ (string) – Template of how datetimes fit into the filename, e.g. “ASCAT_%Y%m%d_image.nc” is translated into the filename ASCAT_20070101_image.nc for the date 2007-01-01.

• sub_path (string or list, optional) – If given, it is used to generate a sub path from the given timestamp. This is useful if files are sorted by year or month. If a list is given, one subfolder per item is assumed. For example, if the files for May 2007 are in the folder 2007/05/, the list ['%Y', '%m'] works.

• grid (pytesmo.grid.grids.BasicGrid or CellGrid instance, optional) – Grid on which all the images of the dataset are stored. This is not relevant for datasets that are stored e.g. in orbit geometry.

• exact_templ (boolean, optional) – if True then the filename_templ matches the filename exactly. If False then the filename_templ will be used in glob to find the file.
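The way `filename_templ` and `sub_path` combine can be sketched as follows. This is a minimal illustration that assumes plain `strftime` expansion; the helper `build_filename` and the path `/data/ascat` are hypothetical and not part of pytesmo.

```python
import os
from datetime import datetime


def build_filename(path, filename_templ, timestamp, sub_path=None):
    """Hypothetical helper: expand the templates for one timestamp."""
    if sub_path is None:
        sub_parts = []
    elif isinstance(sub_path, str):
        sub_parts = [sub_path]
    else:
        sub_parts = list(sub_path)
    # Each sub_path item becomes one subfolder, e.g. ['%Y', '%m'] -> 2007/05
    folders = [timestamp.strftime(part) for part in sub_parts]
    return os.path.join(path, *folders, timestamp.strftime(filename_templ))


example = build_filename(
    "/data/ascat",                 # assumed dataset root
    "ASCAT_%Y%m%d_image.nc",
    datetime(2007, 5, 1),
    sub_path=["%Y", "%m"],
)
```

With `exact_templ=False`, the expanded name would be treated as a glob pattern rather than as the exact filename.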

daily_images(day, **kwargs)

Yield all images for a day.

Parameters

day (datetime.date) – Day for which all images are yielded.

Returns

• data (dict) – dictionary of numpy arrays that hold the image data for each variable of the dataset

• timestamp (datetime.datetime) – exact timestamp of the image

• lon (numpy.array or None) – array of longitudes, if None self.grid will be assumed

• lat (numpy.array or None) – array of latitudes, if None self.grid will be assumed

• jd (string or None) – name of the field in the data array representing the observation dates

iter_images(start_date, end_date, **kwargs)

Yield all images for a given date range.

Parameters
• start_date – start of the date range

• end_date – end of the date range

Returns

• data (dict) – dictionary of numpy arrays that hold the image data for each variable of the dataset

• metadata (dict) – dictionary of numpy arrays that hold the metadata

• timestamp (datetime.datetime) – exact timestamp of the image

• lon (numpy.array or None) – array of longitudes, if None self.grid will be assumed

• lat (numpy.array or None) – array of latitudes, if None self.grid will be assumed

• time_var (string or None) – variable name of observation times in the data dict, if None all observations have the same timestamp

read_img(timestamp, **kwargs)

Return the image for a specific datetime.

Parameters

timestamp (datetime.datetime) – Time stamp.

Returns

• data (dict) – dictionary of numpy arrays that hold the image data for each variable of the dataset

• metadata (dict) – dictionary of numpy arrays that hold the metadata

• timestamp (datetime.datetime) – exact timestamp of the image

• lon (numpy.array or None) – array of longitudes, if None self.grid will be assumed

• lat (numpy.array or None) – array of latitudes, if None self.grid will be assumed

• time_var (string or None) – variable name of observation times in the data dict, if None all observations have the same timestamp

tstamps_for_daterange(start_date, end_date)

Return all valid timestamps in a given date range. Child classes must implement this method to make iteration over images possible.

Parameters
• start_date – start of the date range

• end_date – end of the date range

Returns

dates – list of datetimes

Return type

list
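As an illustration of what a child class might provide, here is a minimal sketch for a dataset with exactly one image per day. The function names are hypothetical, the inclusive end date is an assumption, and the loop only illustrates how iteration could build on the timestamp list; it is not the pytesmo implementation.

```python
from datetime import datetime, timedelta


def daily_tstamps_for_daterange(start_date, end_date):
    """Hypothetical implementation: one timestamp per day, end inclusive."""
    n_days = (end_date - start_date).days
    return [start_date + timedelta(days=i) for i in range(n_days + 1)]


def iter_images_sketch(read_img, start_date, end_date):
    """Illustrative loop: read one image per valid timestamp."""
    for tstamp in daily_tstamps_for_daterange(start_date, end_date):
        yield read_img(tstamp)


stamps = daily_tstamps_for_daterange(datetime(2007, 1, 1), datetime(2007, 1, 3))
```

Datasets with several images per day, or with gaps, would return a different list here, which is why the base class leaves this method to the child class.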

class pytesmo.io.dataset_base.DatasetStaticBase(filename, grid)

Bases: object

Dataset base class for arrays that have a grid associated with them but are not image time series.

Parameters
• filename (string) – path and filename of file to load

• grid (pytesmo.BasicGrid or similar grid definition class) – defines the grid on which the dataset is stored

abstract read_data()

Reads the data and returns it as a dictionary of numpy arrays.

Returns

data – dictionary of numpy arrays

Return type

dict

read_gp(gpi, **kwargs)

Reads the data record for a given grid point index (gpi).

Parameters

gpi (int) – grid point index

Returns

data – data record.

Return type

dict of values

read_pos(*args, **kwargs)

Takes either 1 or 2 arguments and calls the matching reader: with 1 argument the grid point index (gpi) is read directly; with 2 arguments the nearest gpi to the given lat, lon coordinates is found first and then read.
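The dispatch on the number of positional arguments can be sketched with a toy class. `StaticSketch` is a hypothetical stand-in, not the pytesmo class; the `(lon, lat)` argument order and the flat squared-degree distance are assumptions made only for this example.

```python
class StaticSketch:
    """Toy stand-in illustrating the 1-or-2 argument dispatch."""

    def __init__(self, points):
        # points maps gpi -> (lon, lat, record); layout is hypothetical
        self.points = points

    def read_gp(self, gpi, **kwargs):
        return self.points[gpi][2]

    def _nearest_gpi(self, lon, lat):
        # Squared distance in degrees is enough for a toy lookup.
        def dist2(gpi):
            gp_lon, gp_lat, _ = self.points[gpi]
            return (gp_lon - lon) ** 2 + (gp_lat - lat) ** 2

        return min(self.points, key=dist2)

    def read_pos(self, *args, **kwargs):
        if len(args) == 1:
            # One argument: treat it as a grid point index.
            return self.read_gp(args[0], **kwargs)
        if len(args) == 2:
            # Two arguments: treat them as coordinates (assumed lon, lat).
            return self.read_gp(self._nearest_gpi(*args), **kwargs)
        raise TypeError("expected a gpi or two coordinates")


ds = StaticSketch({0: (0.0, 0.0, {"value": 1}), 1: (10.0, 10.0, {"value": 2})})
```

Check the actual implementation for the coordinate order it expects before relying on the two-argument form.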

class pytesmo.io.dataset_base.DatasetTSBase(path, grid)

Bases: object

Dataset base class that implements basic functions and declares abstract methods that child classes have to implement.

Parameters
• path (string) – Path to dataset.

• grid (pytesmo.grid.grids.BasicGrid or CellGrid instance) – Grid on which the time series data is stored.

get_nearest_gp_info(lon, lat)

Get info for the nearest grid point.

Parameters
• lon (float) – Longitude coordinate.

• lat (float) – Latitude coordinate.

Returns

• gpi (int) – Grid point index of nearest grid point.

• gp_lon (float) – Longitude coordinate of nearest grid point.

• gp_lat (float) – Latitude coordinate of nearest grid point.

• gp_dist (float) – Geodetic distance to nearest grid point.
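The shape of the returned tuple can be sketched with a brute-force search. This example swaps in a great-circle (haversine) distance on a sphere in place of a true geodetic distance, and the `grid_points` layout and function names are hypothetical; a real grid class would typically use a faster spatial lookup.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, spherical approximation


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two lon/lat points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest_gp_info(grid_points, lon, lat):
    """grid_points: list of (gpi, gp_lon, gp_lat) tuples (hypothetical layout)."""
    gpi, gp_lon, gp_lat = min(
        grid_points, key=lambda p: haversine_m(lon, lat, p[1], p[2])
    )
    return gpi, gp_lon, gp_lat, haversine_m(lon, lat, gp_lon, gp_lat)


grid_points = [(0, 0.0, 0.0), (1, 10.0, 10.0)]
info = nearest_gp_info(grid_points, 9.0, 9.5)
```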

iter_ts(ll_bbox=None)

Yield all time series for a grid or for grid points in a given lon/lat bounding box (ll_bbox).

Parameters

ll_bbox (tuple of floats (latmin, latmax, lonmin, lonmax)) – Set to lon/lat bounding box to yield only points in that area.

Returns

data – pandas.DataFrame with DatetimeIndex

Return type

pandas.DataFrame
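How the bounding box restricts the iteration can be sketched in plain Python. The helper names and the `(gpi, lon, lat)` layout are hypothetical, the bounds are assumed to be inclusive, and boxes crossing the date line are not handled.

```python
def gpis_in_bbox(grid_points, ll_bbox=None):
    """grid_points: iterable of (gpi, lon, lat); layout is hypothetical."""
    if ll_bbox is None:
        # No box given: every grid point is selected.
        return [gpi for gpi, _, _ in grid_points]
    latmin, latmax, lonmin, lonmax = ll_bbox
    return [
        gpi
        for gpi, lon, lat in grid_points
        if latmin <= lat <= latmax and lonmin <= lon <= lonmax
    ]


def iter_ts_sketch(read_gp, grid_points, ll_bbox=None):
    """Illustrative loop: yield one time series per selected grid point."""
    for gpi in gpis_in_bbox(grid_points, ll_bbox):
        yield read_gp(gpi)


points = [(0, 0.0, 0.0), (1, 15.0, 48.0), (2, 16.0, 47.0)]
selected = gpis_in_bbox(points, ll_bbox=(46.0, 49.0, 14.0, 17.0))
```

Note the tuple order `(latmin, latmax, lonmin, lonmax)`: latitudes come first even though the method description speaks of a lon/lat box.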

abstract read_gp(gpi, **kwargs)

Reads the time series for a given grid point index (gpi).

Parameters

gpi (int) – grid point index

Returns

data – pandas.DataFrame with DatetimeIndex

Return type

pandas.DataFrame

read_ts(*args, **kwargs)

Takes either 1 or 2 arguments and calls the matching reader: with 1 argument the grid point index (gpi) is read directly; with 2 arguments the nearest gpi to the given lat, lon coordinates is found first and then read.