The EDM Framework

The EDM framework is predicated on the availability of a multidimensional representation of the system dynamics. For example, the well-known Rössler attractor is a 3-dimensional (3-D) system that can express chaotic dynamics. The set of 3 equations that define the Rössler system define an attractor or manifold defining the state-space or phase-space of the system. Often, one does not have complete information regarding the system dynamics, in which case we can invoke Takens embedding theorem.

Takens Theorem

Takens theorem is a remarkable mathematical result that allows one to reconstruct a representation of the system dynamics (state-space manifold) from a single (univariate, 1-D) timeseries observed from the system. The embedding results in generation of a higher-dimensional representation of the system.


Embedding

The process of creating this representation is termed embedding. In the EDM packages we can use the Embed() function to create an embedding. This function creates successively time-lagged, if τ < 0, or, time-advanced if τ > 0, observation vectors from the input vectors.

Embedding is performed implicitly in EDM functions unless the embedded argument is set True indicating that the data are already embedded. Default embeddings are time-delay (lagged) with τ = -1.

Finding the optimal embedding dimension

In the case of the Rössler attractor we know that 3 dimensions will best represent the system. If Takens theorem is used to create an embedding, there will exist an optimal number of dimensions that best represent the dynamics. We can estimate an optimal embedding dimension with the EmbedDimension() function. This function evaluates simplex prediction accuracy over a range of embedding dimensions, the embedding dimension E with the highest predictive accuracy is selected for system analysis. This embedding is presumed to best represent and "disentangle" the manifold.

Nearerst Neighbor Forecasting: Simplex and S-map

EDM implements two timeseries prediction algorithms: Simplex(), and SMap(). Both operate in the embedding state-space, using nearest neighbors of a query point (location in the state-space from which a prediction is desired) to project a new estimate along the manifold.

Simplex() uses the centroid of the k-nearest neighbors (knn) of the query point projected Tp time steps ahead as the estimate. The number of neighbors, knn, is conventionally set as the number of state-space dimensions plus one: knn = E + 1.

SMap() uses a localized linear regression of query point neighbors to project a new estimate along the manifold. By default, the number of neighbors in the regression are set to the total number of state-space observation points. An exponential localization function F(θ) = exp(-θd/D) is used to selectively ignore neighbors beyond the localization radius where θ is the localization parameter, d a neighbor distance, and D the mean distance to all neighbors. This allows one to vary the extent to which local neighbors are considered in the linear projection, effectively modeling different local "resolutions" on the attractor. Since the scale of these local resolutions reflect the degree to which the dynamics are state-dependent, evaluating S-map predictive skill over a range of localizations θ can reveal the degree of state-dependence, and thus nonlinearity of the dynamics. The PredictNonlinear() function can be used for this purpose.

Variable interactions

EDM also provides methods to assess interactions between state-space variables, as well as inference of causal relationships.

SMap() returns the regression coefficients between variables, which have been shown to approximate the gradient (directional derivative) of variables along the manifold (Deyle et al. 2016).

CCM() applies convergent cross mapping to pairs of variables to infer possible causal links between variables.


Details on these algorithms are provided in the section EDM Algorithms in Depth.