Covariance
- class astropy.nddata.Covariance(array=None, data_shape=None, assume_symmetric=False, unit=None)
Bases: NDUncertainty
A general utility for storing, manipulating, and reading and writing covariance matrices.
Covariance matrices are symmetric by definition, \(\Sigma_{ij} = \Sigma_{ji}\). The object therefore only stores the upper triangle of the matrix using a scipy.sparse.csr_matrix. By default, instantiation will check for symmetry and issue a warning if the matrix is not symmetric. This check can be skipped using the assume_symmetric keyword. However, by virtue of how the data is stored, symmetry is always imposed on the matrix. That is, if a non-symmetric matrix is used to instantiate a Covariance object, the stored data will yield a matrix that is different from the original input.
Covariance matrices of higher dimensional arrays are always assumed to be stored following row-major indexing. For example, the covariance value \(\Sigma_{ij}\) for an image of size \((N_x,N_y)\) is the covariance between image pixels \(I_{x_i,y_i}\) and \(I_{x_j,y_j}\), where \(i = x_i N_y + y_i\) and, similarly, \(j = x_j N_y + y_j\).
See Covariance for additional documentation and examples.
- Parameters:
- array : array_like, csr_matrix
Covariance matrix to store. If the array is not a csr_matrix instance, it must be convertible to one. To match the calling sequence for NDUncertainty, array has a default value of None, but the array must be provided for this Covariance object.
- data_shape : tuple, optional
The covariance data is for a higher dimensional array with this shape. For example, if the covariance data is for a 2D image with shape (nx,ny), set data_shape=(nx,ny); the shape of the covariance array must then be (nx*ny, nx*ny). If None, any higher dimensionality is ignored.
- assume_symmetric : bool, optional
Assume the matrix is symmetric. This means that a check for symmetry is not performed, and the user is not warned if the matrix is not symmetric.
- unit : unit-like, optional
Unit for the covariance values.
- Raises:
- TypeError
Raised if the input array is not a csr_matrix object and cannot be converted to one.
- ValueError
Raised if data_shape is provided and the input covariance matrix array does not have the expected shape, or if array is None.
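The statement that symmetry is always imposed by the upper-triangle storage can be illustrated with a minimal SciPy sketch. It does not call Covariance; the input array `a` and the mirroring step used to rebuild the full matrix are assumptions made for illustration, not necessarily how the class is implemented.

```python
import numpy as np
from scipy import sparse

# A deliberately non-symmetric input matrix (illustrative values).
a = np.array([[1.0, 0.3],
              [0.1, 2.0]])

# Keep only the upper triangle, including the diagonal, as a CSR matrix.
upper = sparse.triu(sparse.csr_matrix(a), format="csr")

# Rebuild a full matrix by mirroring the strict upper triangle (k=1)
# into the lower triangle.
full = (upper + sparse.triu(upper, k=1, format="csr").T).toarray()

print(full)
```

The rebuilt matrix is symmetric, so the lower-triangle value 0.1 from the input is lost and replaced by 0.3.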
Attributes Summary
data_index_map: An array mapping the index along each axis of the covariance matrix to the shape of the associated data array.
data_shape: The expected shape of the data array associated with this covariance array.
nnz: The number of non-zero (NNZ) elements in the full covariance matrix, including both the upper and lower triangles.
quantity: The covariance matrix as a dense Quantity object.
shape: Tuple with the shape of the covariance matrix.
stored_nnz: The number of non-zero elements stored by the object, which only counts the non-zero elements in the upper triangle.
uncertainty_type: "cov": Covariance implements a covariance matrix.
variance: The diagonal of the covariance matrix.
Methods Summary
apply_new_variance(var): Using the same correlation coefficients, return a new Covariance object with the provided variance.
coordinate_data([reshape]): Construct data arrays with the non-zero covariance components in coordinate format.
copy(): Return a copy of this Covariance object.
covariance_to_data_indices(i, j): Given indices along the two axes of the covariance matrix, return the relevant indices in the data array.
data_to_covariance_indices(i, j): Given indices of elements in the source data array, return the matrix coordinates with the associated covariance.
find([correlation]): Find the non-zero values in the full covariance matrix (not just the upper triangle).
from_array(covar[, cov_tol, rho_tol]): Define a covariance object from an array.
from_matrix_multiplication(T, covar, **kwargs): Construct the covariance matrix that results from a matrix multiplication.
from_samples(samples[, cov_tol, rho_tol]): Build a covariance object using discrete samples.
from_table(triu_covar): Construct the covariance matrix from a table with the non-zero elements of the upper triangle of the covariance matrix in coordinate format.
from_variance(variance, **kwargs): Construct a diagonal covariance matrix using the provided variance.
match_to_data_slice(data_slice): Return a new Covariance instance that is matched to a slice of its parent data array.
revert_correlation(var, rho[, assume_symmetric]): Revert a variance vector and correlation matrix into a covariance matrix.
to_correlation(cov[, assume_symmetric]): Convert a covariance matrix into a correlation matrix by dividing each element by the product of the corresponding standard deviations.
to_dense([correlation]): Return the full covariance matrix as a numpy.ndarray object (a "dense" array).
to_sparse([correlation]): Return the full covariance matrix as a csr_matrix object.
to_table(): Return the covariance data in a Table using coordinate format.
Attributes Documentation
- data_index_map
An array mapping the index along each axis of the covariance matrix to the shape of the associated data array.
- data_shape
The expected shape of the data array associated with this covariance array.
- nnz
The number of non-zero (NNZ) elements in the full covariance matrix, including both the upper and lower triangles.
- shape
Tuple with the shape of the covariance matrix.
- stored_nnz
The number of non-zero elements stored by the object, which only counts the non-zero elements in the upper triangle.
- uncertainty_type
"cov": Covariance implements a covariance matrix.
- variance
The diagonal of the covariance matrix.
Methods Documentation
- apply_new_variance(var)
Using the same correlation coefficients, return a new Covariance object with the provided variance.
- Parameters:
- var : array_like
Variance vector. Must have a length that matches this Covariance instance; e.g., if this instance is cov, the length of var must be cov.shape[0]. Note that, if the covariance is for higher dimensional data, this variance array must be flattened to 1D.
- Returns:
- Covariance
A covariance matrix with the same shape and correlation coefficients as this object, but with the provided variance.
- Raises:
- ValueError
Raised if the length of the variance vector is incorrect.
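The operation described above amounts to keeping the correlation coefficients fixed while swapping the variances. A minimal dense NumPy sketch of that arithmetic follows; the helper `rescale_covariance` is hypothetical and is not part of the Covariance API.

```python
import numpy as np

def rescale_covariance(cov, new_var):
    """Hypothetical helper: keep the correlations of `cov`, use variances `new_var`."""
    cov = np.asarray(cov, dtype=float)
    new_var = np.asarray(new_var, dtype=float)
    if new_var.shape != (cov.shape[0],):
        raise ValueError("new_var must be 1D with length cov.shape[0]")
    # Correlation coefficients: rho_ij = C_ij / (sigma_i * sigma_j)
    old_sigma = np.sqrt(np.diag(cov))
    rho = cov / np.outer(old_sigma, old_sigma)
    # New covariance: C'_ij = rho_ij * sigma'_i * sigma'_j
    new_sigma = np.sqrt(new_var)
    return rho * np.outer(new_sigma, new_sigma)

cov = np.array([[4.0, 1.0],
                [1.0, 9.0]])
new_cov = rescale_covariance(cov, [1.0, 1.0])
print(new_cov)
```

With unit variances, the result is the correlation matrix itself, with off-diagonal value 1/6.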
- coordinate_data(reshape=False)
Construct data arrays with the non-zero covariance components in coordinate format.
Coordinate format means that the covariance matrix data is provided in three columns providing \(\Sigma_{ij}\) and the (0-indexed) matrix coordinates \(i,j\).
This procedure is primarily used when constructing the data arrays for storage. Matching the class convention, the returned data only includes the upper triangle.
- Parameters:
- reshape : bool, optional
If reshape is True and data_shape is defined, the \(i,j\) indices are converted to the expected data-array indices; see covariance_to_data_indices(). These can be reverted to the coordinates in the covariance matrix using data_to_covariance_indices().
- Returns:
- i, j : tuple, numpy.ndarray
The row and column indices, \(i,j\), of the covariance matrix. If reshaping, these are tuples with the index arrays along each of the reshaped axes.
- cij : numpy.ndarray
The covariance, \(\Sigma_{ij}\), between array elements at indices \(i\) and \(j\).
- Raises:
- ValueError
Raised if reshape is True but data_shape is undefined.
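Coordinate format for the upper triangle can be illustrated directly with SciPy. This is a sketch of the data layout only, using an illustrative matrix; it does not call coordinate_data(), and the ordering of the returned entries is not guaranteed to match the method's.

```python
import numpy as np
from scipy import sparse

cov = np.array([[1.0, 0.2, 0.0],
                [0.2, 2.0, 0.5],
                [0.0, 0.5, 3.0]])

# Keep only the upper triangle (including the diagonal).
upper = sparse.triu(sparse.csr_matrix(cov), format="csr")

# Row indices, column indices, and values of the non-zero entries.
i, j, cij = sparse.find(upper)

for row, col, value in zip(i, j, cij):
    print(row, col, value)
```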
- copy()
Return a copy of this Covariance object.
- Returns:
- Covariance
A copy of the current covariance matrix.
- covariance_to_data_indices(i, j)
Given indices along the two axes of the covariance matrix, return the relevant indices in the data array. This is the inverse of data_to_covariance_indices().
- Parameters:
- i : ndarray
1D array with the index along the first axis of the covariance matrix. Must be in the range \(0...n-1\), where \(n\) is the length of the covariance-matrix axes.
- j : ndarray
1D array with the index along the second axis of the covariance matrix. Must be in the range \(0...n-1\), where \(n\) is the length of the covariance-matrix axes.
- Returns:
- i_data, j_data : tuple, numpy.ndarray
If data_shape is not defined, the input arrays are simply returned (and not copied). Otherwise, the code uses unravel_index to calculate the relevant data-array indices; each element in the two-tuple is itself a tuple of \(N_{\rm dim}\) arrays, one array per dimension of the data array.
- Raises:
- ValueError
Raised if the provided indices fall outside the range of the covariance matrix.
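The index conversion described above can be sketched with numpy.unravel_index alone. The data shape and index values here are illustrative assumptions; the Covariance method itself is not called.

```python
import numpy as np

data_shape = (2, 3)        # illustrative 2D data array shape
i = np.array([0, 4])       # indices along the first covariance axis
j = np.array([5, 4])       # indices along the second covariance axis

# Each result is a tuple with one index array per data dimension.
i_data = np.unravel_index(i, data_shape)
j_data = np.unravel_index(j, data_shape)

print(i_data)
print(j_data)
```

For this shape, covariance index 4 corresponds to data element (1, 1) and index 5 to data element (1, 2).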
- data_to_covariance_indices(i, j)
Given indices of elements in the source data array, return the matrix coordinates with the associated covariance. This is the inverse of covariance_to_data_indices().
- Parameters:
- i : array_like, tuple
A tuple of \(N_{\rm dim}\) array-like objects providing the indices of elements in the N-dimensional data array. This can be an array-like object if data_shape is undefined, in which case the values must be in the range \(0...n-1\), where \(n\) is the length of the data array.
- j : array_like, tuple
The same as i, but providing a second set of coordinates at which to access the covariance.
- Returns:
- i_covar, j_covar : numpy.ndarray
Arrays providing the indices in the covariance matrix associated with the provided data array coordinates. If data_shape is not defined, the input arrays are simply returned (and not copied). Otherwise, the code uses ravel_multi_index to calculate the relevant covariance indices.
- Raises:
- ValueError
Raised if the provided indices fall outside the range of the data array, or if the length of the i or j tuples is not \(N_{\rm dim}\).
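The reverse mapping, from data-array coordinates to covariance-matrix indices, can be sketched with numpy.ravel_multi_index. The shape and coordinates are illustrative assumptions, and the round trip through numpy.unravel_index is included only to show that the two conversions invert each other.

```python
import numpy as np

data_shape = (2, 3)  # illustrative 2D data array shape

# Data-array coordinates of two elements: (0, 1) and (1, 2).
i_data = (np.array([0, 1]), np.array([1, 2]))

# Row-major flattening gives the index along a covariance-matrix axis.
i_covar = np.ravel_multi_index(i_data, data_shape)

# Converting back recovers the original data-array coordinates.
back = np.unravel_index(i_covar, data_shape)

print(i_covar)
```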
- find(correlation=False)
Find the non-zero values in the full covariance matrix (not just the upper triangle).
This is a simple wrapper for to_sparse and scipy.sparse.find.
- Parameters:
- correlation : bool, optional
Flag to return the correlation data, instead of the covariance data. Note that setting this to True does not also return the variance vector.
- Returns:
- i, j : numpy.ndarray
Arrays containing the index coordinates of the non-zero values in the covariance (or correlation) matrix.
- c : numpy.ndarray
The non-zero covariance (or correlation) matrix values located at the provided i,j coordinates.
- classmethod from_array(covar, cov_tol=None, rho_tol=None, **kwargs)
Define a covariance object from an array.
Note
The only difference between this method and the direct instantiation method (i.e., Covariance(array=covar)) is that it can be used to impose tolerances on the covariance value and/or correlation coefficients.
- Parameters:
- covar : array_like
Array with the covariance data. The object must be either a csr_matrix or an object that can be converted to one. It must also be 2-dimensional and square.
- cov_tol : float, optional
Any covariance matrix entry with an absolute value less than this is assumed to be equivalent to (and set to) 0.
- rho_tol : float, optional
Any correlation coefficient with an absolute value less than this is assumed to be equivalent to (and set to) 0.
- **kwargs : dict, optional
Passed directly to main instantiation method.
- Returns:
- Covariance
The covariance matrix built using the provided array.
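The effect of the two tolerances can be sketched on a dense array. The helper `apply_tolerances` is hypothetical, and the order in which it applies the two thresholds is an assumption; from_array() may differ in its details.

```python
import numpy as np

def apply_tolerances(cov, cov_tol=None, rho_tol=None):
    """Hypothetical helper: zero small covariance and correlation entries."""
    cov = np.array(cov, dtype=float)  # work on a copy
    if rho_tol is not None:
        sigma = np.sqrt(np.diag(cov))
        rho = cov / np.outer(sigma, sigma)
        # Zero entries whose correlation coefficient is below the tolerance.
        cov[np.abs(rho) < rho_tol] = 0.0
    if cov_tol is not None:
        # Zero entries whose covariance is below the tolerance.
        cov[np.abs(cov) < cov_tol] = 0.0
    return cov

cov = np.array([[1.0, 1e-4, 0.5],
                [1e-4, 4.0, 0.02],
                [0.5, 0.02, 1.0]])
trimmed = apply_tolerances(cov, cov_tol=1e-3, rho_tol=0.05)
print(trimmed)
```

Here the (0, 1) and (1, 2) entries have correlation coefficients of 5e-5 and 0.01, both below rho_tol, so they are set to zero, while the (0, 2) entry survives.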
- classmethod from_matrix_multiplication(T, covar, **kwargs)
Construct the covariance matrix that results from a matrix multiplication.
Linear operations on a dataset (e.g., binning or smoothing) can be written as matrix multiplications of the form
\[{\mathbf y} = {\mathbf T}\ {\mathbf x},\]
where \({\mathbf T}\) is a transfer matrix of size \(N_y\times N_x\), \({\mathbf x}\) is a vector of length \(N_x\), and \({\mathbf y}\) is a vector of length \(N_y\) that results from the multiplication. If \({\mathbf \Sigma}_x\) is the covariance matrix for \({\mathbf x}\), then the covariance matrix for \({\mathbf y}\) is
\[{\mathbf \Sigma}_y = {\mathbf T}\ {\mathbf \Sigma}_x\ {\mathbf T}^\top.\]
If covar is provided as a vector of length \(N_x\), it is assumed that the elements of \({\mathbf x}\) are independent and the provided vector gives the variance in each element; i.e., the provided data represent the diagonal of \({\mathbf \Sigma}_x\).
- Parameters:
- T : csr_matrix, ndarray
Transfer matrix. See above.
- covar : csr_matrix, ndarray
Covariance matrix. See above.
- **kwargs : dict, optional
Passed directly to main instantiation method.
- Returns:
- Covariance
The covariance matrix resulting from the matrix multiplication.
- Raises:
- ValueError
Raised if the provided arrays are not two dimensional or if there is a shape mismatch.
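The propagation formula above can be checked numerically with plain NumPy. The transfer matrix below, a two-point running average, and the unit input variances are illustrative assumptions; the sketch does not call from_matrix_multiplication().

```python
import numpy as np

# Transfer matrix averaging adjacent elements: N_y = 2 outputs from N_x = 3 inputs.
T = np.array([[0.5, 0.5, 0.0],
              [0.0, 0.5, 0.5]])

# Independent inputs: a variance vector is the diagonal of Sigma_x.
var_x = np.array([1.0, 1.0, 1.0])
sigma_x = np.diag(var_x)

# Sigma_y = T Sigma_x T^T
sigma_y = T @ sigma_x @ T.T
print(sigma_y)
```

Because both outputs share the middle input element, the result has a non-zero off-diagonal term (0.25) even though the inputs are independent.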
- classmethod from_samples(samples, cov_tol=None, rho_tol=None, **kwargs)
Build a covariance object using discrete samples.
The covariance is generated using numpy.cov for a set of discretely sampled data for an \(N\)-dimensional parameter space.
- Parameters:
- samples : ndarray
Array with samples drawn from an \(N\)-dimensional parameter space. The shape of the input array must be \(N_{\rm par}\times N_{\rm samples}\).
- cov_tol : float, optional
Any covariance matrix entry with an absolute value less than this is assumed to be equivalent to (and set to) 0.
- rho_tol : float, optional
Any correlation coefficient with an absolute value less than this is assumed to be equivalent to (and set to) 0.
- **kwargs : dict, optional
Passed directly to main instantiation method.
- Returns:
- Covariance
An \(N_{\rm par}\times N_{\rm par}\) covariance matrix built using the provided samples.
- Raises:
- ValueError
Raised if the input array is not 2D or if the number of samples (length of the second axis) is less than 2.
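The expected input layout, one row per parameter and one column per sample, matches the default convention of numpy.cov. A minimal sketch with made-up sample values follows; it does not call from_samples().

```python
import numpy as np

# Two parameters (rows), four samples each (columns); illustrative values.
samples = np.array([[1.0, 2.0, 3.0, 4.0],
                    [2.0, 4.0, 6.0, 8.0]])

# With the default rowvar=True, each row is treated as one variable.
cov = np.cov(samples)
print(cov.shape)
print(cov)
```

The second parameter is exactly twice the first, so its variance is four times larger and the off-diagonal covariance is twice the first variance.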
- classmethod from_table(triu_covar)
Construct the covariance matrix from a table with the non-zero elements of the upper triangle of the covariance matrix in coordinate format.
This is the inverse operation of to_table(). The class can read covariance data written by other programs as long as they have a commensurate format; see to_table().
- Parameters:
- triu_covar : Table
The non-zero elements of the upper triangle of the covariance matrix in coordinate format; see to_table().
- Returns:
- Covariance
The covariance matrix constructed from the tabulated data.
- Raises:
- ValueError
Raised if triu_covar.meta is None, if the provided variance array does not have the correct size, or if the data is multidimensional and the table columns do not have the right shape.
- classmethod from_variance(variance, **kwargs)
Construct a diagonal covariance matrix using the provided variance.
- Parameters:
- variance : ndarray
The variance vector.
- **kwargs : dict, optional
Passed directly to main instantiation method.
- Returns:
- Covariance
The diagonal covariance matrix.
- match_to_data_slice(data_slice)
Return a new Covariance instance that is matched to a slice of its parent data array.
- Parameters:
- data_slice : slice, array_like
Anything that can be used to slice a numpy.ndarray. To generate a slice using syntax that mimics accessing numpy array elements, use numpy.s_; see examples here.
- Returns:
- Covariance
A new covariance object for the sliced data array.
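One way to picture matching a covariance matrix to a data slice is to find which flattened (row-major) data elements survive the slice and keep only those rows and columns. The sketch below does this with dense NumPy arrays and made-up values; it is an assumption about the idea, not the implementation used by match_to_data_slice().

```python
import numpy as np

data_shape = (2, 3)
n = int(np.prod(data_shape))

# Illustrative symmetric matrix standing in for an (n, n) covariance.
cov = np.arange(n * n, dtype=float).reshape(n, n)
cov = (cov + cov.T) / 2

# Slice of the parent data array: all rows, last two columns.
data_slice = np.s_[:, 1:]

# Flattened indices of the data elements selected by the slice.
flat_index = np.arange(n).reshape(data_shape)[data_slice].ravel()

# Covariance restricted to the selected elements, and the sliced data shape.
sub_cov = cov[np.ix_(flat_index, flat_index)]
sub_shape = np.empty(data_shape)[data_slice].shape

print(flat_index, sub_cov.shape, sub_shape)
```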
- static revert_correlation(var, rho, assume_symmetric=False)
Revert a variance vector and correlation matrix into a covariance matrix.
This is the reverse operation of to_correlation.
- Parameters:
- var : ndarray
Variance vector. Length must match the diagonal of rho.
- rho : ndarray, csr_matrix
Correlation matrix. Diagonal must have the same length as var.
- assume_symmetric : bool, optional
Assume the matrix is symmetric. This means that a check for symmetry is not performed, and the user is not warned if the matrix is not symmetric.
- Returns:
- csr_matrix
Covariance matrix.
- static to_correlation(cov, assume_symmetric=False)
Convert a covariance matrix into a correlation matrix by dividing each element by the product of the corresponding standard deviations.
Specifically, extract var (\(V_i = C_{ii} \equiv \sigma^2_i\)) and convert cov from a covariance matrix with elements \(C_{ij}\) to a correlation matrix with elements \(\rho_{ij}\) such that
\[C_{ij} \equiv \rho_{ij} \sigma_i \sigma_j.\]
To revert a variance vector and correlation matrix back to a covariance matrix, use revert_correlation().
- Parameters:
- cov : array_like
Covariance matrix to convert. Must be a csr_matrix instance or convertible to one.
- assume_symmetric : bool, optional
Assume the matrix is symmetric. This means that a check for symmetry is not performed, and the user is not warned if the matrix is not symmetric.
- Returns:
- var : numpy.ndarray
Variance vector.
- rho : csr_matrix
Correlation matrix.
- Raises:
- ValueError
Raised if the input array is not 2D and square.
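The relation \(C_{ij} = \rho_{ij} \sigma_i \sigma_j\) and its inverse can be sketched with dense NumPy arrays. The two helpers below are hypothetical stand-ins written for illustration; they work on dense arrays and are not the static methods documented here, which operate on sparse matrices.

```python
import numpy as np

def dense_to_correlation(cov):
    """Hypothetical helper: split a dense covariance into (var, rho)."""
    cov = np.asarray(cov, dtype=float)
    if cov.ndim != 2 or cov.shape[0] != cov.shape[1]:
        raise ValueError("cov must be 2D and square")
    var = np.diag(cov).copy()
    sigma = np.sqrt(var)
    # rho_ij = C_ij / (sigma_i * sigma_j)
    return var, cov / np.outer(sigma, sigma)

def dense_revert_correlation(var, rho):
    """Hypothetical helper: rebuild a dense covariance from (var, rho)."""
    sigma = np.sqrt(np.asarray(var, dtype=float))
    # C_ij = rho_ij * sigma_i * sigma_j
    return np.asarray(rho, dtype=float) * np.outer(sigma, sigma)

cov = np.array([[4.0, 1.0],
                [1.0, 9.0]])
var, rho = dense_to_correlation(cov)
restored = dense_revert_correlation(var, rho)
print(var, rho[0, 1])
```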
- to_dense(correlation=False)
Return the full covariance matrix as a numpy.ndarray object (a "dense" array).
- Parameters:
- correlation : bool, optional
Flag to return the correlation matrix, instead of the covariance matrix. Note that setting this to True does not also return the variance vector.
- Returns:
- ndarray
Dense array with the full covariance matrix.
- to_sparse(correlation=False)
Return the full covariance matrix as a csr_matrix object.
This method is essentially equivalent to to_dense except that it returns a sparse array.
- Parameters:
- correlation : bool, optional
Return the correlation matrix. If False, return the covariance matrix.
- Returns:
- csr_matrix
The sparse matrix with both the upper and lower triangles filled (with symmetric information).
- to_table()
Return the covariance data in a Table using coordinate format.
Coordinate format means that the covariance matrix data is provided in three columns providing \(\Sigma_{ij}\) and the (0-indexed) matrix coordinates \(i,j\).
The output table has three columns:
- 'INDXI': The row index in the covariance matrix.
- 'INDXJ': The column index in the covariance matrix.
- 'COVARIJ': The covariance at the relevant \(i,j\) coordinate.
The table also contains the following metadata:
- 'COVSHAPE': The shape of the covariance matrix.
- 'BUNIT': (If unit is defined) The string representation of the covariance units.
- 'COVDSHP': (If data_shape is defined) The shape of the associated data array.
If data_shape is set, the covariance matrix indices are reformatted to match the coordinates in the N-dimensional array.
Warning
Recall that the storage of covariance matrices for higher dimensional data always assumes a row-major storage order.
Tables returned by this method can be used to re-instantiate the Covariance object using from_table.
- Returns:
- Table
Table with the covariance matrix in coordinate format and the relevant metadata.