SMap
Description :
SMap projection of the input data file or DataFrame.
Python :
SMap(dataFrame=None, columns='', target='',
lib='', pred='', E=0, Tp=1, knn=0, tau=-1,
theta=0, exclusionRadius=0, solver=None, embedded=False,
validLib=[], noTime=False, ignoreNan = True,
showPlot=False, verbose=False, returnObject=False)
R :
SMap(pathIn="./", dataFile="", dataFrame=NULL,
lib="", pred="", E=0, Tp=1, knn=0, tau=-1,
theta=0, exclusionRadius=0, columns="", target="",
embedded=FALSE, verbose=FALSE, validLib=vector(),
ignoreNan=TRUE, generateSteps=0, parameterList=FALSE,
showPlot=FALSE, noTime=FALSE)
Parameter | Type | Default | Purpose |
---|---|---|---|
dataFrame | pyEDM: pandas DataFrame; rEDM: data.frame | None | Input DataFrame |
columns | string or [] | "" | Column names for library |
target | string | "" | Prediction target column name or index |
lib | string or [] | "" | Pairs of library start stop row indices |
pred | string or [] | "" | Pairs of prediction start stop row indices |
E | int | 0 | Data dimension |
Tp | int | 1 | Prediction Interval |
knn | int | 0 | Number nearest neighbors |
tau | int | -1 | Embedding time shift (time series rows) |
theta | float | 0 | SMap localization |
exclusionRadius | int | 0 | Prediction vector exclusion radius |
solver | sklearn.linear_model | None | Linear system solver |
embedded | bool | False | Is data an embedding? If False, embed to E |
validLib | bool [] | [] or None | Conditional embedding |
noTime | bool | False | Do not require first data column of time or index |
ignoreNan | bool | True | Adjust lib to exclude NaN |
showPlot | bool | False | Plot results (pyEDM, rEDM) |
verbose | bool | False | Echo messages |
generateSteps | int | 0 | Number of recursive time step predictions |
generateLibrary | bool | False | Add generated data to library |
parameterList | bool | False | Include parameter dictionary in return |
const_pred | bool | False | Include non projected forecast data |
pathIn | string | "./" | Input data file path |
dataFile | string | "" | Data file name |
pathOut | string | "./" | Output file path |
predictFile | string | "" | Prediction output file |
smapCoefFile | string | "" | SMap coefficient output file |
smapSVFile | string | "" | SMap singular value output file |
Refer to the parameters table for general parameter definitions.
Notes :
If embedded is false, data columns are embedded to dimension E with shift tau.
If knn is not specified, it is set to the full library size.
If knn is specified, it must be greater than E.
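The two rules above can be illustrated with a minimal pure-Python sketch. It is not the pyEDM implementation: the helper names `time_delay_embed` and `resolve_knn` are hypothetical, and dropping the first rows that lack a full set of lags is an assumption made for the illustration.

```python
def time_delay_embed(series, E, tau=-1):
    """Return rows [x(t), x(t+tau), ..., x(t+(E-1)*tau)] for each t with all lags available."""
    if E < 1:
        raise ValueError("E must be at least 1")
    if tau >= 0:
        raise ValueError("this sketch only handles negative tau (lags into the past)")
    lag = -tau
    first = (E - 1) * lag  # first row index that has a complete set of lags
    return [[series[t - j * lag] for j in range(E)]
            for t in range(first, len(series))]


def resolve_knn(knn, E, lib_size):
    """Apply the knn rules from the notes: 0 means 'use the full library'."""
    if knn == 0:
        return lib_size
    if knn <= E:
        raise ValueError("knn must be greater than E")
    return knn


rows = time_delay_embed([1, 2, 3, 4, 5], E=3, tau=-1)
print(rows)  # [[3, 2, 1], [4, 3, 2], [5, 4, 3]]
print(resolve_knn(0, E=3, lib_size=len(rows)))  # 3
```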
nan :
Version 1.x : Any prediction row (pred) with nan will result in an SMap nan prediction. Any library vector with nan that is used as a nearest neighbor, whether the nan is in the observation or arises from time delay embedding, will also result in an SMap nan prediction. By default SMap uses all library vectors as neighbors. To address this, if nan are detected in columns or target and ignoreNan = True (default), the library is automatically redefined to exclude data and embedding vectors containing nan. If ignoreNan = False the library is not changed. The user can then manually specify library (lib) row segments that avoid nan values.
Version 2.x : nan values are removed from the data unless ignoreNan = False.
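When the library is specified manually, the nan-free row segments have to be found first. The sketch below shows one way to do that in plain Python. The helper `nan_free_segments` is hypothetical, row indices are assumed to be 1-based and inclusive, and only nan in the raw observations is considered: rows whose time-delay embedding vectors reach back into a nan are not handled here.

```python
import math


def nan_free_segments(values):
    """Return (start, stop) pairs of 1-based, inclusive row indices for runs without nan."""
    segments = []
    start = None
    for i, v in enumerate(values, start=1):
        if math.isnan(v):
            if start is not None:
                segments.append((start, i - 1))  # close the current run
                start = None
        elif start is None:
            start = i  # open a new run
    if start is not None:
        segments.append((start, len(values)))
    return segments


series = [0.1, 0.4, float("nan"), 0.3, 0.9, 0.2, float("nan"), 0.5]
segments = nan_free_segments(series)
lib = " ".join(f"{a} {b}" for a, b in segments)
print(segments)  # [(1, 2), (4, 6), (8, 8)]
print(lib)       # 1 2 4 6 8 8
```

The resulting string follows the "pairs of library start stop row indices" form described for lib in the parameter table.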
Multivariate Embedding :
SMap should be called with columns explicitly corresponding to dimensions E. In the univariate case (number of columns = 1) with default embedded = false, the time series will be time-delay embedded to dimension E, and the returned SMap coefficients correspond to each dimension.
If multivariate data is used (number of columns > 1) SMap must use embedded = true with E equal to the number of columns.
This prevents the function from internally time-delay embedding the
multiple columns to dimension E. If internal time-delay embedding
were performed, the state-space columns would not correspond to the
intended dimensions in the matrix inversion, coefficient assignment,
and prediction. In this multivariate case, the user can first prepare
the embedding (using Embed() for time-delay embedding if desired, and
adding a first column of time), then pass this embedding
to SMap with appropriately specified columns, E, and embedded = true.
The Embedding.py application can be used to perform the embedding and
insert the time vector for input to SMap.
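As a rough illustration of the multivariate rule, the sketch below assembles a table with time as the first column and derives the SMap arguments so that E always equals the number of columns. The helper `multivariate_smap_args`, the variable names, and the data values are hypothetical; the SMap call itself is left as a comment because it needs pyEDM and a series long enough for a real library.

```python
def multivariate_smap_args(table, target):
    """Build SMap keyword arguments for a pre-built embedding.

    Assumes the first key of `table` is the time column and every
    remaining key is one state-space dimension.
    """
    names = list(table)
    columns = names[1:]  # skip the leading time column
    if target not in columns:
        raise ValueError("target must be one of the embedding columns")
    return dict(columns=columns, target=target, E=len(columns), embedded=True)


# Made-up three-variable data, time first.
table = {
    "Time": [1, 2, 3, 4, 5],
    "x": [0.2, 0.5, 0.1, 0.7, 0.4],
    "y": [1.1, 0.9, 1.3, 0.8, 1.0],
    "z": [3.0, 3.2, 2.9, 3.1, 3.3],
}
smap_args = multivariate_smap_args(table, target="x")
print(smap_args)
# {'columns': ['x', 'y', 'z'], 'target': 'x', 'E': 3, 'embedded': True}

# With pyEDM and pandas installed, the call would look roughly like this
# (untested sketch; lib and pred must be chosen for the actual data):
#   from pyEDM import SMap
#   import pandas as pd
#   result = SMap(dataFrame=pd.DataFrame(table), lib=my_lib, pred=my_pred, **smap_args)
```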
Conditional Embedding :
validLib implements conditional embedding (CE). It is a boolean vector the same length as the number of time series rows. A false entry means that the state-space vector derived from the corresponding time series row will not be included in the state-space library. See examples.
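A validLib vector can be built from any per-row condition. The following sketch uses a hypothetical temperature series and threshold; the helper name `valid_lib_from_condition` is not part of pyEDM.

```python
def valid_lib_from_condition(values, condition):
    """Return one boolean per time series row; False rows are excluded from the library."""
    return [bool(condition(v)) for v in values]


# Hypothetical driver variable: only rows warmer than 15 degrees may enter the library.
temperature = [12.0, 15.5, 18.2, 21.0, 19.4, 14.1]
valid = valid_lib_from_condition(temperature, lambda v: v > 15.0)
print(valid)  # [False, True, True, True, True, False]

# The vector must have exactly one entry per time series row.
assert len(valid) == len(temperature)
```

The resulting list would then be passed as the validLib argument alongside the usual columns, target, lib, and pred.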
Generative Mode :
If generateSteps > 0 SMap operates in feedback generative mode. The values of pred are overridden to start at the end of the data. At each step one prediction is made and added to the columns data, a new time-delay embedding is created, and the cycle is repeated generateSteps times. Feedback generation only operates on a univariate time series that is time-delay embedded. The columns and target variables must be the same.
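The feedback cycle can be sketched independently of the forecasting method. In the example below, `generate` is a hypothetical helper and `linear_extrapolation` is a deliberately trivial stand-in for the one-step SMap forecast; in the real function each step would re-embed the extended series before predicting.

```python
def generate(series, steps, predict_next):
    """Feed each one-step prediction back into the data and repeat `steps` times."""
    data = list(series)  # work on a copy so the caller's data is untouched
    generated = []
    for _ in range(steps):
        value = predict_next(data)  # one-step forecast from the current data
        generated.append(value)
        data.append(value)          # feedback: the prediction becomes new data
    return generated


def linear_extrapolation(data):
    """Stand-in predictor (not S-Map): continue the last observed difference."""
    return data[-1] + (data[-1] - data[-2])


print(generate([1.0, 2.0, 3.0], steps=3, predict_next=linear_extrapolation))
# [4.0, 5.0, 6.0]
```

Because each forecast is built on earlier forecasts, errors can compound as the number of generated steps grows.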
Linear System Solver :
In pyEDM: The default LAPACK SVD solver dgelsd() can be replaced with
a class object instantiated from the sklearn.linear_model class.
Supported solvers include LinearRegression, Ridge, Lasso,
ElasticNet, RidgeCV, LassoCV, ElasticNetCV.
See examples.
Version 1.x Note: Windows does not support community compiler standards, thereby creating binary library compatibility barriers, specifically for the use of OpenBLAS for the SVD solver. As a result, the Windows pyEDM implementation does not use the cppEDM default solver dgelss from BLAS/LAPACK. All other implementations use BLAS/LAPACK dgelss directly.
Returns :
Dict in pyEDM, named List in rEDM, with three DataFrames:
predictions [3 columns: "Time", "Observations", "Predictions"],
coefficients [E+2 columns: "Time", and E+1 SMap coefficients],
singularValues [E+2 columns: "Time", and E+1 SVD singular values], if available from the linear system solver.
Version 1.x : If parameterList = True, a dictionary of parameters is added.
Version 2.x : If returnObject = True, returns the SMap class object with all data and variables.