Functions for the analysis of correlated time series.
Note that if the inherent timescales of the system are long compared to the duration of the time series being analyzed, then the results will be inaccurate and unreliable.
If a time series has initial transients, they should be detected (with ‘detect_equilibration’) and removed before further analysis.
[1] Shirts MR and Chodera JD. Statistically optimal analysis of samples from multiple equilibrium states. J. Chem. Phys. 129:124105, 2008 http://dx.doi.org/10.1063/1.2978177
[2] J. D. Chodera, W. C. Swope, J. W. Pitera, C. Seok, and K. A. Dill. Use of the weighted histogram analysis method for the analysis of simulated and parallel tempering simulations. JCTC 3(1):26-41, 2007.
Much of this module is a re-implementation (in JAX) of the timeseries module in pymbar: https://github.com/choderalab/pymbar/blob/master/pymbar/timeseries.py
Extract uncorrelated samples from correlated timeseries data.
time_series – A jax array of shape [T]
transient – If True, initial transients will be detected and removed using the detect_equilibration function.
An array of uncorrelated subsamples.
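One common way to extract approximately uncorrelated samples, once the statistical inefficiency g is known, is to keep every ceil(g)-th sample. The sketch below assumes that approach. The helper name subsample_uncorrelated and the stride rule are illustrative assumptions, not necessarily what this module does internally.

```python
import math

import jax.numpy as jnp


def subsample_uncorrelated(time_series, g):
    """Keep every ceil(g)-th sample (hypothetical helper, not the module's API)."""
    stride = max(1, math.ceil(float(g)))
    return time_series[::stride]


series = jnp.arange(20.0)
# With g = 4.2 the stride is 5, so 4 of the 20 samples are kept.
sub = subsample_uncorrelated(series, g=4.2)
```

Rounding the stride up rather than down errs on the side of fewer, less correlated samples.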
Detect initial transient region of an equilibrating time series using a heuristic that maximizes the number of effectively uncorrelated samples.
We evaluate the statistical inefficiency on a sequence of exponentially spaced time points, and search for the time point that maximizes the effective number of uncorrelated samples after that time. We iterate with finer grids until a local maximum is located. Since the data is noisy we may not locate the global maximum.
A_t (Float[Array, "T"]) – time series
nodes (int) – Number of search nodes at each iteration.
Start index of the equilibrated data, statistical inefficiency of the equilibrated data, and effective number of uncorrelated samples after the start index.
[1] J. D. Chodera, A Simple Method for Automated Equilibration Detection in Molecular Simulations, J. Chem. Theory Comput. 12:1799 (2016) http://dx.doi.org/10.1021/acs.jctc.5b00784
Adapted from pymbar/timeseries.py::detect_equilibration_binary_search
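The heuristic described above can be sketched roughly as follows. This is a simplified, single-pass illustration: it evaluates the effective sample count (T - t0) / g on one exponentially spaced grid and does not iterate with finer grids. The names statistical_inefficiency and detect_equilibration_sketch, the simple estimator of g, and the choice to search only the first half of the series are all assumptions made for this example.

```python
import jax
import jax.numpy as jnp


def statistical_inefficiency(x):
    """Hypothetical stand-in estimator: g = 1 + 2 * tau, floored at 1."""
    x = x - jnp.mean(x)
    n = x.shape[0]
    f = jnp.fft.rfft(x, n=2 * n)  # zero-pad so the correlation is not circular
    acov = jnp.fft.irfft(f * jnp.conj(f), n=2 * n)[:n] / jnp.arange(n, 0, -1)
    rho = acov[1:] / acov[0]
    # Sum lags only up to the first non-positive autocorrelation value.
    keep = jnp.cumprod((rho > 0).astype(rho.dtype))
    return jnp.maximum(1.0, 1.0 + 2.0 * jnp.sum(rho * keep))


def detect_equilibration_sketch(x, nodes=10):
    """Single-pass search over exponentially spaced candidate start indices."""
    total = x.shape[0]
    half = total // 2  # assumption: never discard more than half the data
    candidates = sorted(
        {0} | {int(round(half ** (k / (nodes - 1)))) for k in range(nodes)}
    )
    best = None
    for t0 in candidates:
        g = float(statistical_inefficiency(x[t0:]))
        n_eff = (total - t0) / g  # effective number of uncorrelated samples
        if best is None or n_eff > best[2]:
            best = (t0, g, n_eff)
    return best


key = jax.random.PRNGKey(0)
steps = jnp.arange(2000)
# White noise plus a decaying initial transient.
series = jax.random.normal(key, (2000,)) + 5.0 * jnp.exp(-steps / 50.0)
t0, g, n_eff = detect_equilibration_sketch(series)
```

Because the start index 0 is always among the candidates, the returned effective sample count is never lower than that of the full, untrimmed series.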
We provide two methods for calculating autocorrelation times: ‘ips’ (default) and ‘batchmean’.
(TODO: Explain methods)
Compute the autocorrelation time of a correlated time series.
time_series – Array of shape [T]
method – Either ‘ips’ (default) or ‘batchmean’
tau, autocorrelation time
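For orientation, a textbook batch-means estimate of the correlation time can be sketched as below. This is not confirmed to be how the ‘batchmean’ method is implemented here. The function name batch_means_tau, the default batch size of roughly sqrt(T), and the conversion tau = (g - 1) / 2 are assumptions for illustration.

```python
import jax.numpy as jnp


def batch_means_tau(x, batch_size=None):
    """Textbook batch-means estimate of tau (hypothetical helper)."""
    n = x.shape[0]
    b = batch_size or max(1, int(n ** 0.5))  # assumed default: about sqrt(T)
    num_batches = n // b  # needs at least 2 batches for a sample variance
    batches = x[: num_batches * b].reshape(num_batches, b)
    batch_means = batches.mean(axis=1)
    # If batches are long compared to tau, var(batch mean) ~ g * var(x) / b.
    g = b * jnp.var(batch_means, ddof=1) / jnp.var(x, ddof=1)
    g = jnp.maximum(g, 1.0)  # enforce g >= 1
    return (g - 1.0) / 2.0


# Blocks of four identical values: strongly correlated within each batch.
blocky = jnp.repeat(jnp.array([1.0, -1.0, 1.0, -1.0]), 4)
tau_blocky = batch_means_tau(blocky)

# Alternating values: every batch mean is identical, so g is floored at 1.
alternating = jnp.tile(jnp.array([0.0, 1.0]), 8)
tau_alternating = batch_means_tau(alternating)
```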
Compute the standard error of the estimated autocorrelation time of a correlated time series.
time_series – Array of shape [T]
method – Either ‘ips’ (default) or ‘batchmean’
stderr
Compute the cross-correlation times of a collection of correlated time series.
multiple_time_series – Array of shape [N, T]
method – Either ‘ips’ (default) or ‘batchmean’
tau, array of shape [N, N]
Compute the cross-correlation functions for a sequence of correlated time series, using the fast Fourier transform.
multiple_time_series – Array of shape [N, T]
Array of shape [N, T]
Compute the autocorrelation function for a correlated time series, using the fast Fourier transform.
time_series – Array of shape [T]
Array of shape [T]
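A minimal sketch of computing an autocorrelation function with the fast Fourier transform is shown below. The function name autocorrelation_fft and the normalization (dividing each lag by its number of contributing pairs, then by the lag-0 value) are assumptions; the module may normalize differently.

```python
import jax.numpy as jnp


def autocorrelation_fft(x):
    """Normalized autocorrelation function via FFT (hypothetical helper)."""
    x = x - jnp.mean(x)
    n = x.shape[0]
    # Zero-pad to length 2n so the circular correlation equals the linear one.
    f = jnp.fft.rfft(x, n=2 * n)
    acov = jnp.fft.irfft(f * jnp.conj(f), n=2 * n)[:n]
    # Lag k has n - k contributing pairs.
    acov = acov / jnp.arange(n, 0, -1)
    return acov / acov[0]


x = jnp.array([1.0, 3.0, 2.0, 5.0, 4.0, 6.0])
rho = autocorrelation_fft(x)
```

The FFT route costs O(T log T) rather than the O(T^2) of summing every lag directly.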
The statistical inefficiency of a correlated time series is defined as g = 1 + 2 tau, where tau is the correlation time (measured in time steps). We enforce a minimum of g >= 1.
[2] J. D. Chodera, W. C. Swope, J. W. Pitera, C. Seok, and K. A. Dill. Use of the weighted histogram analysis method for the analysis of simulated and parallel tempering simulations. JCTC 3(1):26-41, 2007.
Compute the statistical inefficiency of a correlated time series.
time_series – Array of shape [T]
method – Either ‘ips’ (default) or ‘batchmean’
g, the estimated statistical inefficiency
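To illustrate how g = 1 + 2 tau is typically used, the sketch below converts a correlation time into a statistical inefficiency and then into a standard error of the mean, using an effective sample count of T / g. The function names are hypothetical and tau is assumed to be supplied by some other estimator.

```python
import jax.numpy as jnp


def statistical_inefficiency_from_tau(tau):
    """g = 1 + 2 * tau, with the minimum g >= 1 enforced."""
    return jnp.maximum(1.0, 1.0 + 2.0 * tau)


def standard_error_of_mean(x, tau):
    """Standard error of the mean using T / g effective samples."""
    g = statistical_inefficiency_from_tau(tau)
    n_eff = x.shape[0] / g
    return jnp.sqrt(jnp.var(x, ddof=1) / n_eff)


g_example = statistical_inefficiency_from_tau(4.5)   # 1 + 2 * 4.5 = 10
g_floored = statistical_inefficiency_from_tau(-0.7)  # floored at 1
sem = standard_error_of_mean(jnp.array([1.0, 2.0, 3.0, 4.0]), tau=0.0)
```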
Compute the standard error for the estimated statistical inefficiency of a correlated time series.
time_series – Array of shape [T]
method – Either ‘ips’ (default) or ‘batchmean’
Standard error
Compute the cross statistical inefficiency of a collection of correlated time series.
multiple_time_series – Array of shape [N, T]
method – Either ‘ips’ (default) or ‘batchmean’
g, the estimated statistical inefficiency
Compute the standard error for the estimated statistical inefficiency of a correlated time series.
time_series – Array of shape [N, T]
method – Either ‘ips’ (default) or ‘batchmean’
Standard error, array of shape [N]
Compute the Kirkwood coefficient for a correlated time series.
The Kirkwood coefficient is the integrated correlation function, or equivalently the variance times the correlation time.
time_series – An array of shape [T]
method – Method for estimating correlation times, ‘ips’ (default) or ‘batchmean’
Kirkwood coefficient
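The equivalence between the integrated correlation function and the variance times the correlation time can be checked numerically for a simple model. The sketch below assumes an exponentially decaying covariance C(t) = variance * phi**t and the convention that tau sums the normalized correlation over lags t >= 1 (consistent with g = 1 + 2 tau). Whether the module includes the lag-0 term or a time-step factor is not stated, so treat the convention as an assumption.

```python
import jax.numpy as jnp

variance = 2.0
phi = 0.8  # assumed lag-1 correlation of the model process

# Correlation time under the assumed convention: sum of phi**t for t >= 1.
tau = phi / (1.0 - phi)

# Integrated covariance, summed directly over lags 1..999.
lags = jnp.arange(1, 1000)
integrated = jnp.sum(variance * phi ** lags)

# Variance times correlation time.
kirkwood = variance * tau
```

Both routes give the same value up to floating-point error, because the geometric sum has the closed form phi / (1 - phi).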
Estimate the standard error of the Kirkwood coefficient for a correlated time series.
time_series – An array of shape [T]
method – Method for estimating correlation times, ‘ips’ (default) or ‘batchmean’
stderr
Compute the Kirkwood tensor for a sequence of correlated time series.
The elements of the Kirkwood tensor are the Kirkwood coefficients (the integrated correlation functions, or the variances times the correlation times). Within the thermodynamic geometry of linear response, the Kirkwood tensor is the friction tensor and acts as the metric tensor.
This tensor should be symmetric and positive semi-definite, but the estimate may not be, owing to statistical error. We return the nearest symmetric positive semi-definite matrix in the Frobenius norm, with eigenvalues of at least min_eigenvalue. See https://nhigham.com/2021/01/26/what-is-the-nearest-positive-semidefinite-matrix/
multiple_time_series – An array of shape [N, T]
method – Method for estimating correlation times, ‘ips’ (default) or ‘batchmean’
min_eigenvalue – Minimum eigenvalue of the Kirkwood tensor, default zero.
An array of shape [N, N]
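The projection onto the nearest symmetric positive semi-definite matrix in the Frobenius norm can be sketched as follows: symmetrize the matrix, take an eigendecomposition, and clip the eigenvalues from below. The function name nearest_psd is hypothetical; only the general technique is taken from the description above.

```python
import jax.numpy as jnp


def nearest_psd(matrix, min_eigenvalue=0.0):
    """Nearest symmetric matrix with eigenvalues >= min_eigenvalue."""
    # The nearest symmetric matrix is the symmetric part.
    sym = 0.5 * (matrix + matrix.T)
    eigvals, eigvecs = jnp.linalg.eigh(sym)
    # Clip eigenvalues from below, then rebuild V diag(w) V^T.
    clipped = jnp.maximum(eigvals, min_eigenvalue)
    return (eigvecs * clipped) @ eigvecs.T


# Symmetric part is [[1, 2], [2, 1]], with eigenvalues 3 and -1.
noisy = jnp.array([[1.0, 3.0], [1.0, 1.0]])
projected = nearest_psd(noisy)
```

In this example the negative eigenvalue is replaced by zero, leaving a rank-one matrix with every entry equal to 1.5.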
TODO
Estimated standard errors for the coefficients in the Kirkwood tensor.
multiple_time_series – An array of shape [N, T]
method – Method for estimating correlation times, ‘ips’ (default) or ‘batchmean’
min_eigenvalue – Minimum eigenvalue of the Kirkwood tensor, default zero.
subseries – TODO
An array of shape [N, N]
Generate time series data with given correlation time, drawn from an autoregressive model of order 1.
Note that if you generate multiple series with the same random noise (same key), then those series are correlated, with a cross-correlation time equal to the mean of the correlation times of the individual series.
key – A jax PRNG key
tau – Correlation time of the generated time series
steps – Length of the generated time series
initial – Initial value for the auto-regression model. Provide the last value of a previously generated time series to extend the series.
Correlated time series, array of shape [steps]
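An order-1 autoregressive generator with a chosen correlation time can be sketched as below. The mapping phi = tau / (1 + tau) assumes that tau is the sum of the normalized autocorrelation over lags t >= 1 (so that g = 1 + 2 tau). The module may use a different convention, and the function name ar1_series and the unit stationary variance are illustrative assumptions.

```python
import jax
import jax.numpy as jnp


def ar1_series(key, tau, steps, initial=0.0):
    """AR(1) series x[t] = phi * x[t-1] + sigma * noise[t] (hypothetical helper)."""
    phi = tau / (1.0 + tau)          # assumed convention: tau = phi / (1 - phi)
    sigma = jnp.sqrt(1.0 - phi**2)   # gives unit stationary variance
    noise = jax.random.normal(key, (steps,))

    def step(prev, eps):
        nxt = phi * prev + sigma * eps
        return nxt, nxt

    start = jnp.asarray(initial, dtype=noise.dtype)
    _, series = jax.lax.scan(step, start, noise)
    return series


key = jax.random.PRNGKey(0)
series = ar1_series(key, tau=10.0, steps=1000)

# Extend the series: use a fresh key and the last value as the initial value.
key_next = jax.random.PRNGKey(1)
extension = ar1_series(key_next, tau=10.0, steps=500, initial=series[-1])
```

Passing the last value as initial keeps the extension continuous with the earlier series, while a fresh key keeps its noise independent.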