Functions reference API
dimarray functions are listed below by topic, along with examples. DimArray methods are documented on a separate page, DimArray API.
Join
dimarray.stack(arrays, axis=None, keys=None, align=False, **kwargs)
Stack arrays along a new dimension (raises an error if the dimension already exists).
Parameters:
- arrays : sequence or dict of arrays
- axis : str, optional
  new dimension along which to stack the arrays
- keys : array-like, optional
  stack axis values, useful if arrays is a sequence or a non-ordered dictionary
- align : bool, optional
  if True, align axes prior to stacking (defaults to False)
- **kwargs : optional keyword arguments passed to align, if align is True
Returns:
- DimArray : joint array
See also
- concatenate : join arrays along an existing dimension
- swapaxes : to modify the position of the newly inserted axis
Examples
>>> from dimarray import DimArray, stack
>>> a = DimArray([1,2,3])
>>> b = DimArray([11,22,33])
>>> stack([a, b], axis='stackdim', keys=['a','b'])
dimarray: 6 non-null elements (0 null)
0 / stackdim (2): 'a' to 'b'
1 / x0 (3): 0 to 2
array([[ 1,  2,  3],
       [11, 22, 33]])
dimarray.concatenate(arrays, axis=0, _no_check=False, align=False, **kwargs)
Concatenate several DimArrays along an existing dimension.
Parameters:
- arrays : list of DimArrays
  arrays to concatenate
- axis : int or str
  axis along which to concatenate (must exist)
- align : bool, optional
  align secondary axes before joining along the primary axis (axis). Defaults to False.
- **kwargs : optional keyword arguments passed to align, if align is True
Returns:
- concatenated DimArray
See also
- stack : join arrays along a new dimension
- align : align arrays
Examples
1-D
>>> from dimarray import DimArray, concatenate
>>> a = DimArray([1,2,3], axes=[['a','b','c']])
>>> b = DimArray([4,5,6], axes=[['d','e','f']])
>>> concatenate((a, b))
dimarray: 6 non-null elements (0 null)
0 / x0 (6): 'a' to 'f'
array([1, 2, 3, 4, 5, 6])
2-D
>>> a = DimArray([[1,2,3],[11,22,33]])
>>> b = DimArray([[4,5,6],[44,55,66]])
>>> concatenate((a, b), axis=0)
dimarray: 12 non-null elements (0 null)
0 / x0 (4): 0 to 1
1 / x1 (3): 0 to 2
array([[ 1,  2,  3],
       [11, 22, 33],
       [ 4,  5,  6],
       [44, 55, 66]])
>>> concatenate((a, b), axis='x1')
dimarray: 12 non-null elements (0 null)
0 / x0 (2): 0 to 1
1 / x1 (6): 0 to 2
array([[ 1,  2,  3,  4,  5,  6],
       [11, 22, 33, 44, 55, 66]])
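The difference between stack and concatenate shows up in the result shapes alone. The following is a minimal pure-Python sketch, not dimarray code: the helper names stacked_shape and concatenated_shape are hypothetical, axes are given by non-negative position only, and the new axis is assumed to come first, as in the stack example above.

```python
def stacked_shape(shapes):
    """Shape after stacking: a new leading axis with one entry per array."""
    first = tuple(shapes[0])
    if any(tuple(s) != first for s in shapes):
        raise ValueError("all arrays must have the same shape to be stacked")
    return (len(shapes),) + first


def concatenated_shape(shapes, axis=0):
    """Shape after joining along an existing axis (non-negative position)."""
    first = tuple(shapes[0])
    for s in shapes:
        s = tuple(s)
        if len(s) != len(first):
            raise ValueError("all arrays must have the same number of dimensions")
        # Every axis except the concatenation axis must match exactly.
        if s[:axis] != first[:axis] or s[axis + 1:] != first[axis + 1:]:
            raise ValueError("secondary axes must have the same size")
    total = sum(s[axis] for s in shapes)
    return first[:axis] + (total,) + first[axis + 1:]


print(stacked_shape([(3,), (3,)]))              # (2, 3)
print(concatenated_shape([(2, 3), (2, 3)], 0))  # (4, 3)
print(concatenated_shape([(2, 3), (2, 3)], 1))  # (2, 6)
```

These three shapes match the stack and concatenate examples above: stacking adds a dimension, while concatenating only lengthens an existing one.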
dimarray.stack_ds(datasets, axis, keys=None, align=False, **kwargs)
Stack datasets along a new dimension.
Parameters:
- datasets : sequence or dict of datasets
- axis : str
  new dimension along which to stack the datasets
- keys : optional
  stack axis values, useful if datasets is a sequence or a non-ordered dictionary
- align : optional
  if True, align axes (via reindexing) prior to stacking
- **kwargs : optional keyword arguments passed to align, if align is True
Returns:
- stacked dataset
See also
concatenate_ds, stack, sort_axis
Examples
>>> from dimarray import DimArray, Dataset, stack_ds
>>> a = DimArray([1,2,3], dims=('dima',))
>>> b = DimArray([11,22], dims=('dimb',))
>>> ds = Dataset(a=a,b=b) # dataset of 2 variables from an experiment
>>> ds2 = Dataset(a=a*2,b=b*2) # dataset of 2 variables from a second experiment
>>> stack_ds([ds, ds2], axis='stackdim', keys=['exp1','exp2'])
Dataset of 2 variables
0 / stackdim (2): 'exp1' to 'exp2'
1 / dima (3): 0 to 2
2 / dimb (2): 0 to 1
a: ('stackdim', 'dima')
b: ('stackdim', 'dimb')
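What "align axes (via reindexing)" means can be illustrated on plain labeled series. The sketch below is pure Python and not dimarray's implementation: align_outer is a hypothetical helper that assumes an outer join (the union of labels, named as the default for join in the read_nc documentation) and fills missing entries with NaN. The label order (first appearance, unless sort is True) is an assumption of this sketch.

```python
def align_outer(series, sort=False):
    """Reindex label -> value mappings onto the union of their labels."""
    labels = []
    for s in series:
        for label in s:
            if label not in labels:
                labels.append(label)
    if sort:
        labels.sort()
    nan = float("nan")
    # Each series gets one value per common label; gaps become NaN.
    rows = [[s.get(label, nan) for label in labels] for s in series]
    return labels, rows


exp1 = {1850: 0.1, 1851: 0.2}   # first experiment
exp2 = {1851: 0.3, 1852: 0.4}   # second experiment, shifted in time
labels, rows = align_outer([exp1, exp2])
print(labels)
print(rows)
```

Once every series shares the same labels, the rows have equal length and can be stacked along a new dimension.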
dimarray.concatenate_ds(datasets, axis=0, align=False, **kwargs)
Concatenate datasets along an existing dimension.
Parameters:
- datasets : sequence of datasets
- axis : axis along which to concatenate
- align : optional
  if True, align secondary axes (via reindexing) prior to concatenating
- **kwargs : optional keyword arguments passed to align, if align is True
Returns:
- joint Dataset along axis
Note: raises an error if any variable does not contain the required dimension.
See also
stack_ds, concatenate, sort_axis
Examples
>>> import dimarray as da
>>> from dimarray import Dataset, concatenate_ds
>>> a = da.zeros(axes=[list('abc')], dims=('x0',)) # 1-D DimArray
>>> b = da.zeros(axes=[list('abc'), [1,2]], dims=('x0','x1')) # 2-D DimArray
>>> ds = Dataset(a=a,b=b) # dataset of 2 variables from an experiment
>>> a2 = da.ones(axes=[list('def')], dims=('x0',))
>>> b2 = da.ones(axes=[list('def'), [1,2]], dims=('x0','x1')) # 2-D DimArray
>>> ds2 = Dataset(a=a2,b=b2) # dataset of 2 variables from a second experiment
>>> concatenate_ds([ds, ds2])
Dataset of 2 variables
0 / x0 (6): 'a' to 'f'
1 / x1 (2): 1 to 2
a: ('x0',)
b: ('x0', 'x1')
Align
dimarray.align_axes(*args, **kwargs)
Deprecated. Now renamed to align.
dimarray.align_dims(*arrays)
Align dimensions of a list of arrays so that they are ready for broadcasting.
Method: insert singleton axes at the right place and transpose where needed.
Note: not part of the public API, but used in other dimarray modules.
Examples
>>> import dimarray as da
>>> import numpy as np
>>> x = da.DimArray(np.arange(2), dims=('x0',))
>>> y = da.DimArray(np.arange(3), dims=('x1',))
>>> align_dims(x, y)
[dimarray: 2 non-null elements (0 null)
0 / x0 (2): 0 to 1
1 / x1 (1): None to None
array([[0],
       [1]]), dimarray: 3 non-null elements (0 null)
0 / x0 (1): None to None
1 / x1 (3): 0 to 2
array([[0, 1, 2]])]
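The "insert singleton axes" step can be sketched on dimension names and shapes alone. This is an illustration in pure Python, not dimarray's code: aligned_shapes is a hypothetical helper, and ordering the common dimensions by first appearance is an assumption of the sketch.

```python
def aligned_shapes(arrays):
    """Give every array all dimensions, using size 1 where one is missing.

    arrays is a list of (dims, shape) pairs, e.g. (('x0',), (2,)).
    """
    all_dims = []
    for dims, _ in arrays:
        for d in dims:
            if d not in all_dims:
                all_dims.append(d)
    shapes = []
    for dims, shape in arrays:
        size = dict(zip(dims, shape))
        # Missing dimensions become singleton axes; existing ones are
        # reordered to follow all_dims (the "transpose" part).
        shapes.append(tuple(size.get(d, 1) for d in all_dims))
    return tuple(all_dims), shapes


dims, shapes = aligned_shapes([(('x0',), (2,)), (('x1',), (3,))])
print(dims)    # ('x0', 'x1')
print(shapes)  # [(2, 1), (1, 3)]
```

The resulting shapes (2, 1) and (1, 3) are those of the two arrays returned in the align_dims example above.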
dimarray.broadcast_arrays(*arrays)
Analogous to numpy.broadcast_arrays, but with looser requirements on input shape, and returns copies instead of views.
Parameters:
- arrays : variable list of DimArrays
Returns:
- list of DimArrays
Examples
Just like numpy's broadcast_arrays:
>>> import dimarray as da
>>> x = da.DimArray([[1,2,3]])
>>> y = da.DimArray([[1],[2],[3]])
>>> da.broadcast_arrays(x, y)
[dimarray: 9 non-null elements (0 null)
0 / x0 (3): 0 to 2
1 / x1 (3): 0 to 2
array([[1, 2, 3],
       [1, 2, 3],
       [1, 2, 3]]), dimarray: 9 non-null elements (0 null)
0 / x0 (3): 0 to 2
1 / x1 (3): 0 to 2
array([[1, 1, 1],
       [2, 2, 2],
       [3, 3, 3]])]
Interpolate
dimarray.interp2d(dim_array, newaxes, dims=(-2, -1), **kwargs)
Two-dimensional interpolation
Parameters:
- dim_array : DimArray instance
- newaxes : sequence of two array-like, or dict
  axes on which to interpolate
- dims : sequence of two axis names or integer ranks, optional
  dimensions that match newaxes. Defaults to (-2, -1) (the last two dimensions).
- **kwargs : passed to scipy.interpolate.RegularGridInterpolator
  - method : 'nearest' or 'linear' (default)
  - bounds_error : True by default
  - fill_value : np.nan by default; set to None to extrapolate outside bounds
Returns:
- dim_array_int : DimArray instance
  interpolated array
Examples
>>> import numpy as np
>>> from dimarray import DimArray, interp2d
>>> x = np.array([0, 1, 2])
>>> y = np.array([0, 10])
>>> a = DimArray([[0,0,1],[1,0.,0.]], [('y',y),('x',x)])
>>> a
dimarray: 6 non-null elements (0 null)
0 / y (2): 0 to 10
1 / x (3): 0 to 2
array([[0., 0., 1.],
       [1., 0., 0.]])
>>> newx = [0.5, 1.5]
>>> newy = np.linspace(0,10,5)
>>> ai = interp2d(a, [newy, newx])
>>> ai
dimarray: 10 non-null elements (0 null)
0 / y (5): 0.0 to 10.0
1 / x (2): 0.5 to 1.5
array([[0.   , 0.5  ],
       [0.125, 0.375],
       [0.25 , 0.25 ],
       [0.375, 0.125],
       [0.5  , 0.   ]])
Use the dims keyword argument if the order of the new axes does not match the array dimensions:
>>> (ai == interp2d(a, [newx, newy], dims=('x','y'))).all()
True
Out-of-bounds values are filled with NaN:
>>> newx = [-1, 1]
>>> newy = [-5, 0, 10]
>>> interp2d(a, [newy, newx], bounds_error=False)
dimarray: 2 non-null elements (4 null)
0 / y (3): -5 to 10
1 / x (2): -1 to 1
array([[nan, nan],
       [nan,  0.],
       [nan,  0.]])
Nearest-neighbor interpolation and out-of-bounds extrapolation:
>>> interp2d(a, [newy, newx], method='nearest', bounds_error=False, fill_value=None)
dimarray: 6 non-null elements (0 null)
0 / y (3): -5 to 10
1 / x (2): -1 to 1
array([[0., 0.],
       [0., 0.],
       [1., 0.]])
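With method='linear', interpolation on a regular two-dimensional grid amounts to bilinear interpolation. The pure-Python sketch below reproduces the values of the first example by hand. It is an illustration only: interp_bilinear is a hypothetical helper, it does not call scipy or dimarray, and it assumes each grid axis is sorted and has at least two points.

```python
from bisect import bisect_right


def interp_bilinear(values, ys, xs, y, x):
    """Bilinear interpolation of values[i][j], defined at (ys[i], xs[j])."""

    def bracket(grid, p):
        # Find the cell [grid[i], grid[i+1]] containing p and the
        # fractional position t of p inside that cell.
        if not grid[0] <= p <= grid[-1]:
            raise ValueError("point outside grid bounds")
        i = min(bisect_right(grid, p) - 1, len(grid) - 2)
        t = (p - grid[i]) / (grid[i + 1] - grid[i])
        return i, t

    i, ty = bracket(ys, y)
    j, tx = bracket(xs, x)
    # Interpolate along x on the two bracketing rows, then along y.
    row_lo = values[i][j] * (1 - tx) + values[i][j + 1] * tx
    row_hi = values[i + 1][j] * (1 - tx) + values[i + 1][j + 1] * tx
    return row_lo * (1 - ty) + row_hi * ty


values = [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]   # same data as `a` above
ys, xs = [0, 10], [0, 1, 2]
new_y, new_x = [0.0, 2.5, 5.0, 7.5, 10.0], [0.5, 1.5]
table = [[interp_bilinear(values, ys, xs, y, x) for x in new_x] for y in new_y]
for row in table:
    print(row)
```

The printed rows agree with the interpolated array shown for ai in the first example.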
Stats
dimarray.percentile(a, pct, axis=0, newaxis=None, out=None, overwrite_input=False)
Calculate percentiles along an axis.
Parameters:
- pct : float or sequence of floats
  percentile or sequence of percentiles (0 < pct < 100)
- axis : optional, default 0
  axis along which to compute percentiles
- newaxis : optional
  name of the new percentile axis, if more than one pct is given. By default, "_percentile" is appended to the name of the axis on which the transformation is applied.
- out, overwrite_input : passed to numpy's percentile function (see its documentation)
Returns:
- pctiles : DimArray or scalar whose required axis has been reduced or replaced by percentiles
Examples
>>> import numpy as np
>>> from dimarray import DimArray, percentile
>>> np.random.seed(0) # for reproducibility of results
>>> a = DimArray(np.random.randn(1000), dims=['sample'])
>>> percentile(a, 50)
-0.058028034799627745
>>> percentile(a, [50, 95])
dimarray: 2 non-null elements (0 null)
0 / sample_percentile (2): 50 to 95
array([-0.05802803,  1.66012041])
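By default, numpy's percentile interpolates linearly between the two nearest ranks of the sorted data. The sketch below shows that calculation in pure Python for a 1-D sequence, together with the default naming rule for the new axis described above. Both helpers are hypothetical illustrations, not dimarray or numpy functions.

```python
import math


def percentile_linear(data, pct):
    """Percentile of a 1-D sequence with linear interpolation between ranks."""
    if not 0 <= pct <= 100:
        raise ValueError("pct must lie between 0 and 100")
    ordered = sorted(data)
    # Fractional rank of the requested percentile in the sorted data.
    pos = (len(ordered) - 1) * pct / 100.0
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def default_newaxis(axis_name):
    """Default name of the percentile axis, as described for newaxis."""
    return axis_name + "_percentile"


print(percentile_linear([1, 2, 3, 4], 50))  # 2.5
print(default_newaxis("sample"))            # sample_percentile
```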
Read netCDF data
dimarray.read_nc(f, names=None, *args, **kwargs)
Wrapper around DatasetOnDisk.read.
Read one or several variables from one or several netCDF files.
Parameters:
- f : str or netCDF handle
  netCDF file to read from, or regular expression
- names : None or list or str, optional
  variable name(s) to read; default is None
- indices : int or list or slice (single-dimensional indices), or a tuple of those (multi-dimensional), or dict of {axis name : axis indices}
  Indices refer to Dataset axes. Any item that does not possess one of the dimensions will not be indexed along that dimension. For example, scalar items will be left unchanged whatever indices are provided.
- indexing : {'label', 'position'}, optional
  Indexing mode.
  - "label": indexing on axis labels (default)
  - "position": use numpy-like position index
  The default value can be changed in dimarray.rcParams['indexing.by'].
- tol : float, optional
  tolerance when looking for numerical values, e.g. to use nearest-neighbor search; default None.
- keepdims : bool, optional
  keep singleton dimensions (default False)
- axis : str, optional
  When reading multiple files, axis along which to join the dimarrays or datasets. If the axis already exists, the resulting arrays will be concatenated; otherwise they will be stacked along a new axis (in the sense of the numpy functions concatenate and stack).
- keys : sequence, optional
  When reading multiple files, keys for the join axis. If the axis already exists in the dataset, the concatenated dataset/dimarray will be re-indexed along the provided keys; otherwise the keys will be used to create a new axis for stacking. In the latter case, the length of keys must exactly match the number of input files, and if keys is not provided, file names are used instead. Note that you may manually rename the axes later, or use the set_axis method.
- align : bool, optional
  When reading multiple files, passed to stack (new axis) or concatenate (existing axis) to reindex all arrays onto common axes. In concatenate mode, only the secondary axes are re-indexed, not the concatenation axis. Defaults to False.
- **kwargs : optional keyword arguments passed to align, if align is True
  When reading multiple files, passed to stack (new axis) or concatenate (existing axis). This includes sort (False by default) and join ('outer' by default).
Returns:
- obj : DimArray or Dataset
  depending on whether a (single) variable name is passed as argument (names) or not
See also
DatasetOnDisk.read, stack, concatenate, stack_ds, concatenate_ds, align, DimArray.write_nc, Dataset.write_nc
Examples
>>> import os
>>> from dimarray import read_nc, get_datadir
Single netCDF file
>>> ncfile = os.path.join(get_datadir(), 'cmip5.CSIRO-Mk3-6-0.nc')
>>> data = read_nc(ncfile) # load full file
>>> data
Dataset of 2 variables
0 / time (451): 1850 to 2300
1 / scenario (5): u'historical' to u'rcp85'
tsl: (u'time', u'scenario')
temp: (u'time', u'scenario')
>>> data = read_nc(ncfile,'temp') # only one variable
>>> data = read_nc(ncfile,'temp', indices={"time":slice(2000,2100), "scenario":"rcp45"}) # load only a chunk of the data
>>> data = read_nc(ncfile,'temp', indices={"time":1950.3}, tol=0.5) # approximate matching, adjust tolerance
>>> data = read_nc(ncfile,'temp', indices={"time":-1}, indexing='position') # integer position indexing
Multiple files
Read variable 'temp' across multiple files (representing various climate models). In this case the variable is a time series whose length may vary across experiments, so align=True is passed to reindex axes before stacking.
>>> direc = get_datadir()
>>> temp = read_nc(direc+'/cmip5.*.nc', 'temp', align=True, axis='model')
A new ‘model’ axis is created labeled with file names. It is then possible to rename it more appropriately, e.g. keeping only the part directly relevant to identify the experiment:
>>> getmodel = lambda x: os.path.basename(x).split('.')[1] # extract model name from path
>>> temp.set_axis(getmodel, axis='model') # would return a copy if inplace is not specified
>>> temp
dimarray: 9114 non-null elements (6671 null)
0 / model (7): 'CSIRO-Mk3-6-0' to 'MPI-ESM-MR'
1 / time (451): 1850 to 2300
2 / scenario (5): u'historical' to u'rcp85'
array(...)
This works on datasets as well:
>>> ds = read_nc(direc+'/cmip5.*.nc', align=True, axis='model')
>>> ds.set_axis(getmodel, axis='model')
>>> ds
Dataset of 2 variables
0 / model (7): 'CSIRO-Mk3-6-0' to 'MPI-ESM-MR'
1 / time (451): 1850 to 2300
2 / scenario (5): u'historical' to u'rcp85'
tsl: ('model', u'time', u'scenario')
temp: ('model', u'time', u'scenario')
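The rule given for the axis and keys parameters when reading multiple files can be summarized as a small decision function. This is a sketch of the documented behavior in pure Python, not read_nc's actual code; plan_join and its arguments are hypothetical names.

```python
def plan_join(existing_dims, axis, files, keys=None):
    """Decide how per-file results would be joined, per the rules above."""
    if axis in existing_dims:
        # Existing axis: concatenate; keys, if given, re-index the joined axis.
        return "concatenate", keys
    # New axis: stack, labeled with keys or, by default, the file names.
    if keys is None:
        keys = list(files)
    elif len(keys) != len(files):
        raise ValueError("keys must match the number of input files")
    return "stack", list(keys)


files = ["cmip5.A.nc", "cmip5.B.nc"]   # illustrative file names
print(plan_join(("time", "scenario"), "model", files))
print(plan_join(("time", "scenario"), "time", files))
```

In the multiple-files example above, 'model' is not a dimension of the files, so the results are stacked and the new axis is labeled with file names until set_axis renames it.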
dimarray.summary_nc(fname, name=None, metadata=False)
Print summary information about the content of a netCDF file.
Deprecated, see dimarray.open_nc.