S-map - First Principles
S-map can predict nonlinear systems with high accuracy by using a linear predictor localized to state-space points (nearest neighbors) such that a linear prediction in the state-space is appropriate.
For example, consider the Lorenz 3-D butterfly system and its state-space (attractor). The relationship between x and z changes with time (positively correlated in one lobe and negatively in the other). Using a global linear predictor in the state-space for the entire system would predict poorly since the relationship between x and z is not globally linear. However, if we limit the scope of the library of state-space points to a neighborhood local to the query point, the relationship between x and z becomes approximately linear.
S-map localization is implemented by attributing each state-space vector a weight w = exp(-θd/D), where d is the distance from the query point, D the average distance of all state-space points from the query point, and θ the localization parameter. As a result, state-space points far from the query point are attributed a vanishing weight, while data close to the query point are given high weight such that the linear predictor is more "informed" by data in the same state-space locale as the query point.
If the system dynamics are state-dependent, and therefore nonlinear, θ is related to the degree of nonlinearity, that is, as theta increases and the linear predictor is localized to increasingly smaller regions of the state-space, predictability will increase if the system is state-dependent. We can use this property to evaluate the degree of nonlinearity of the system by varying θ and comparing prediction accuracy at each localized scale in the state-space. Such evaluations can be peformed with the PredictNonlinear()
function.
Let's visualize S-map's weighting/localization function on the Lorenz system. Here, we have the entire system dynamics available, all three variables [x,y,z] that define the Lorenz system. In most cases one instead has a single time series observed from the system and creates a state-space from time-delay embedding. This embedding is done by default in the EDM code.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# load the lorenz system data and assign the query point (arbitrary)
lorenz = pd.read_csv("lorenz_full_system.csv")
N = lorenz.shape[0]
query_idx = 300
query_pt = lorenz.iloc[query_idx]
truth_pt = lorenz.iloc[query_idx + 1] # Tp = 1
# add columns for distance and weight
lorenz.insert(3,"dist", np.zeros(N))
lorenz.insert(4,"weight",np.zeros(N))
# for every vector in the library (all points), set dist from the query point
for idx, row in lorenz.iterrows():
# Manhatten distance, could be Euclidean or other
lorenz['dist'].iloc[idx-1] = np.sum(np.abs( row - query_pt ))
dist_scale = np.sum( lorenz['dist'].iloc[:] ) / N
thetas = [0,1,2,4]
fig = plt.figure(figsize=(10,10))
for i in range( len(thetas) ):
# set the weight
for idx, row in lorenz.iterrows():
dist = lorenz['dist'].iloc[idx-1]
weighted_dist = np.exp( -thetas[i]*dist/dist_scale )
lorenz['weight'].iloc[idx-1] = weighted_dist
# plot weighted data - dark points have higher weight
ax = plt.subplot(2,2,(i+1),projection="3d")
ax.scatter(lorenz.iloc[:,0],lorenz.iloc[:,1],lorenz.iloc[:,2],
c=1-lorenz.iloc[:,4],cmap="gray")
# plot the query point as a red filled circle
ax.scatter( query_pt[0],query_pt[1],query_pt[2], c="red", s=100)
plt.title( 'Theta = {}'.format(thetas[i]) )
plt.show()
# each of the plots correspond to a theta value which defines
# the exponential weighting scale for state-space projection
We now have a state-space of points from which to predict the next state-space point in time, a library, weighted such that points in a local, presumed similar state (lobe) have higher weight than those in other state regimes.
The linear model can be viewed as an autoregressive model. An autoregressive (AR) model applied to time series is a linear function for predicting the next point in a time series given the past p previous values. The AR model seeks to construct a linear transformation from point x(t) to point x(t+1) based on the p values preceding x(t): [x(t),x(t-1),..,x(t-p)]. Here, the state-space lagged embedding vectors correspond to previous values. In comparison to Simplex that projects from knn = E+1 nearest neighbors at the state-space query point, S-Map uses the entire state-space library, but with states weighted corresponding to how close (similar) in state-space they are to the query point. Also note that the response vector is the next point for every library point, since we are trying to compute a function that outputs the next point given the current point and its p prior values.
response_vec = lorenz[['x']][1:]
model = lorenz[['x','y','z']][:-1]
# insert column of 1's to compute the error/const of the AR model (sub 1 since last pt has no next pt)
model.insert(0,"err_const",np.ones(lorenz.shape[0]-1))
print(model,response_vec)
# weight our library and response vec
weight_matrix = np.identity(N-1) * lorenz[['weight']][:-1].to_numpy()
model = np.dot(weight_matrix,model.to_numpy())
response_vec = np.dot(weight_matrix,response_vec.to_numpy())
# solve for our weights (least squares since non-unique solution)
solution = np.linalg.lstsq( model, response_vec, rcond=None )[0]
model_const, model_func = solution[0], solution[1:]
# apply our computed model function to our query point to get our next point
pred_pt = model_const + np.dot(query_pt, model_func)
error = pred_pt - truth_pt
print("true point {}, predict {}, error {}".format(
round(pred_pt[0],4), round(truth_pt[0],4), round(error[0],5)))
The prediction is accurate.
Note that SMap()
repeats this process on every query point since every query point is potentially in a different state in state-space, hence sequential locally weighted global linear maps.