Perform forecasting using multiview embedding

multiview applies the method described in Ye & Sugihara (2016) for forecasting, wherein multiple attractor reconstructions are tested, and a single nearest neighbor is selected from each of the top k reconstructions to produce final forecasts.

multiview(block, lib = c(1, floor(NROW(block)/2)),
  pred = c(floor(NROW(block)/2) + 1, NROW(block)), norm = 2, E = 3,
  tau = 1, tp = 1, max_lag = 3, num_neighbors = "e+1",
  k = "sqrt", na.rm = FALSE, target_column = 1, stats_only = TRUE,
  save_lagged_block = FALSE, first_column_time = FALSE,
  exclusion_radius = NULL, silent = FALSE)

Arguments

block	either a vector to be used as the time series, or a data.frame or matrix where each column is a time series
lib	a 2-column matrix (or 2-element vector) where each row specifies the first and last rows of the time series to use for attractor reconstruction
pred	(same format as lib), but specifying the sections of the time series to forecast.
norm	the distance measure to use. see 'Details'
E	the embedding dimensions to use for time delay embedding
tau	the lag to use for time delay embedding
tp	the prediction horizon (how far ahead to forecast)
max_lag	the maximum number of lags to use for variable combinations. So if max_lag == 3, a variable X will appear with lags X[t], X[t - tau], X[t - 2*tau]
num_neighbors	the number of nearest neighbors to use. Note that the default value will change depending on the method selected. Any of "e+1", "E+1", "e + 1", or "E + 1" will peg this parameter to E+1 for each run; any value < 1 will use all possible neighbors.
k	the number of embeddings to use ("sqrt" will use k = floor(sqrt(m)), and "all" or values less than 1 will use k = m, where m is the number of candidate embeddings)
na.rm	logical. Should missing values (including `NaN`) be omitted from the calculations?
target_column	the index (or name) of the column to forecast
stats_only	specify whether to output just the forecast statistics or the raw predictions for each run
save_lagged_block	specify whether to output the lagged block that is constructed as part of running `multiview`
first_column_time	indicates whether the first column of the given block is a time column (and therefore excluded when indexing)
exclusion_radius	excludes vectors from the search space of nearest neighbors if their time index is within exclusion_radius (NULL turns this option off)
silent	prevents warning messages from being printed to the R console
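
To make the roles of max_lag and tau concrete, the following base-R sketch builds a lagged block by hand. The helper `make_lagged_block` and its column-naming scheme are hypothetical and not part of the package; the block that `multiview` constructs internally may be laid out differently.

```r
# Hypothetical helper: build lagged copies of every column of a block.
# With max_lag = 3 and tau = 1, column X yields X[t], X[t - 1], X[t - 2].
make_lagged_block <- function(block, max_lag = 3, tau = 1) {
  block <- as.matrix(block)
  n <- nrow(block)
  stopifnot((max_lag - 1) * tau < n)
  if (is.null(colnames(block))) {
    colnames(block) <- paste0("V", seq_len(ncol(block)))
  }
  lagged <- list()
  for (name in colnames(block)) {
    for (j in 0:(max_lag - 1)) {
      shift <- j * tau
      # Pad the first `shift` rows with NA because no earlier values exist.
      lagged[[paste0(name, "_", shift)]] <-
        c(rep(NA, shift), block[seq_len(n - shift), name])
    }
  }
  do.call(cbind, lagged)
}

x <- cbind(a = 1:6, b = 11:16)
lb <- make_lagged_block(x, max_lag = 3, tau = 1)
print(lb)
```

Each original column contributes max_lag columns to the result, so a 3-column block with max_lag = 3 gives 9 candidate coordinates.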

Value

A data.frame with components for the parameters and forecast statistics:

`E`	embedding dimension
`tau`	time lag
`tp`	prediction horizon
`nn`	number of neighbors
`k`	number of embeddings used
`num_pred`	number of predictions
`rho`	correlation coefficient between observations and predictions
`mae`	mean absolute error
`rmse`	root mean square error
`perc`	percent correct sign
`p_val`	p-value that rho is significantly greater than 0 using Fisher's z-transformation
`model_output`	data.frame with columns for the time index, observations, predictions, and estimated prediction variance (if `stats_only == FALSE`)
`embeddings`	list of the columns used in each of the embeddings that comprise the model (if `stats_only == FALSE`)
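
As a rough guide to how the statistics above relate to the raw forecasts, the sketch below computes them from a pair of observation and prediction vectors. The helper `forecast_stats` is hypothetical, and the definition used for `perc` (the fraction of predictions whose sign matches the observation) is an assumption rather than a statement of the package's internals.

```r
# Hypothetical helper mirroring the forecast statistics listed above.
forecast_stats <- function(obs, pred) {
  ok <- is.finite(obs) & is.finite(pred)  # keep complete pairs only
  obs <- obs[ok]
  pred <- pred[ok]
  n <- length(obs)
  rho <- cor(obs, pred)
  err <- obs - pred
  # One-sided test that rho > 0: atanh() is Fisher's z-transformation,
  # with approximate standard error 1 / sqrt(n - 3).
  p_val <- pnorm(atanh(rho) * sqrt(n - 3), lower.tail = FALSE)
  data.frame(
    num_pred = n,
    rho = rho,
    mae = mean(abs(err)),
    rmse = sqrt(mean(err^2)),
    perc = mean(sign(obs) == sign(pred)),  # assumed definition
    p_val = p_val
  )
}

set.seed(1)
obs <- rnorm(50)
pred <- obs + rnorm(50, sd = 0.5)
stats <- forecast_stats(obs, pred)
print(stats)
```

Applying the same p-value formula to rho = 0.8463247 with num_pred = 99 gives a value of roughly 2e-34.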

Details

multiview uses multiple time series given as input to generate attractor reconstructions, and then applies the simplex projection or s-map algorithm to make forecasts. This method generalizes the simplex and s_map routines and allows for "mixed" embeddings, in which multiple time series serve as different dimensions of an attractor reconstruction.
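
The averaging step described in the summary at the top of this page (one nearest neighbor per embedding, combined into a single forecast) can be sketched in base R as follows. This is a simplified illustration, not the package's implementation: the function `multiview_point_forecast` and all of its arguments are hypothetical, the embeddings are supplied by the caller rather than ranked and selected as the top k, and Euclidean distance is assumed.

```r
# Hypothetical sketch: average single-nearest-neighbor forecasts across embeddings.
# lagged:     numeric matrix of lagged coordinates (rows = time points)
# target:     the series to forecast
# embeddings: list of column-index vectors, one per embedding
# lib_idx:    rows that may serve as neighbors
# t0:         row to forecast from; tp: prediction horizon
multiview_point_forecast <- function(lagged, target, embeddings,
                                     lib_idx, t0, tp = 1) {
  # Neighbors need a known target tp steps ahead and must not be t0 itself.
  lib_idx <- lib_idx[lib_idx + tp <= length(target) & lib_idx != t0]
  forecasts <- vapply(embeddings, function(cols) {
    lib_pts <- lagged[lib_idx, cols, drop = FALSE]
    query <- lagged[t0, cols]
    d <- sqrt(rowSums(sweep(lib_pts, 2, query)^2))  # Euclidean distances
    nn <- lib_idx[which.min(d)]                     # single nearest neighbor
    target[nn + tp]                                 # where that neighbor went
  }, numeric(1))
  mean(forecasts)
}

x <- sin(seq(0, 20, by = 0.1))
lagged <- cbind(x0 = x,
                x1 = c(NA, head(x, -1)),
                x2 = c(NA, NA, head(x, -2)))
embeddings <- list(c(1, 2), c(1, 3), c(1, 2, 3))
fc <- multiview_point_forecast(lagged, x, embeddings,
                               lib_idx = 3:100, t0 = 150, tp = 1)
c(forecast = fc, observed = x[151])
```

Because every embedding contributes only its single closest library point, the combined forecast is a plain average over embeddings rather than a distance-weighted average over neighbors.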

The default parameters are set so that, given a matrix of time series, forecasts will be produced for the first column. By default, all possible combinations of the columns are used for the attractor construction, the k = sqrt(m) heuristic is used to choose the number of embeddings, and forecasts are made one time step ahead. Row names will be converted to numeric, if possible, and used as the time index; otherwise 1:NROW will be used instead. By default, lib uses the first half of the data as the "library" and pred covers the second half. Unless otherwise set, the output will be just the forecast statistics.
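
For reference, the default lib and pred ranges can be reproduced directly from the expressions in the usage signature. The matrix `block` below is a made-up 200-row example, not one of the package's data sets.

```r
set.seed(42)
block <- matrix(rnorm(600), ncol = 3)  # hypothetical 200 x 3 block
n <- NROW(block)

lib <- c(1, floor(n / 2))        # first half: rows 1 to 100
pred <- c(floor(n / 2) + 1, n)   # second half: rows 101 to 200

rbind(lib = lib, pred = pred)
```

With an odd number of rows, floor() assigns the extra row to pred.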

norm = 2 (default) uses the "L2 norm", Euclidean distance: $$distance(a,b) := \sqrt{\sum_i{(a_i - b_i)^2}} $$

norm = 1 uses the "L1 norm", Manhattan distance: $$distance(a,b) := \sum_i{|a_i - b_i|} $$

Other values generalize the L1 and L2 norms, using the given argument as the exponent, P: $$distance(a,b) := \left(\sum_i{|a_i - b_i|^P}\right)^{1/P} $$
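
The norms described above can be written as one small function. This is an illustrative sketch only; `minkowski_distance` is a hypothetical name, and absolute differences are used so that odd exponents still yield a valid distance.

```r
# Hypothetical helper: distance between two numeric vectors for exponent P.
minkowski_distance <- function(a, b, P = 2) {
  sum(abs(a - b)^P)^(1 / P)
}

a <- c(0, 0)
b <- c(3, 4)
minkowski_distance(a, b, P = 2)  # Euclidean distance
minkowski_distance(a, b, P = 1)  # Manhattan distance
minkowski_distance(a, b, P = 3)  # general case
```

Base R's `dist()` with `method = "minkowski"` computes the same quantity and can serve as a cross-check.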

Examples

data("block_3sp")
block <- block_3sp[, c(2, 5, 8)]
multiview(block, k = c(1, 3, "sqrt"))
#> Warning: Found overlap between lib and pred. Enabling cross-validation with exclusion radius = 0.
#>   E tau tp nn k num_pred       rho       mae      rmse      perc        p_val
#> 1 3   1  1  4 1       99 0.8463247 0.3514340 0.4589715 0.8484848 2.000945e-34
#> 2 3   1  1  4 3       99 0.9023856 0.2720415 0.3535066 0.8989899 2.955477e-48
#> 3 3   1  1  4 8       99 0.9084469 0.2592693 0.3400605 0.9393939 2.262100e-50
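
The value k = 8 in the third row can be traced back to the k = floor(sqrt(m)) heuristic with a short count. The sketch below assumes that a valid embedding must contain at least one unlagged coordinate; that rule is an assumption here, although it does reproduce the value shown in the output.

```r
n_vars <- 3    # columns in the example block
max_lag <- 3
E <- 3
n_lagged <- n_vars * max_lag  # 9 lagged coordinates in total

# All E-dimensional combinations of the lagged coordinates.
all_combos <- choose(n_lagged, E)
# Combinations that use only coordinates with a non-zero lag.
lagged_only <- choose(n_lagged - n_vars, E)
# Assumption: valid embeddings include at least one unlagged coordinate.
m <- all_combos - lagged_only
k <- floor(sqrt(m))
c(m = m, k = k)
```

Under this assumption there are m = 64 candidate embeddings, so "sqrt" resolves to k = 8.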
