(generalized) Block forecasting using Gaussian Processes

block_gp uses multiple time series given as input to generate an attractor reconstruction, and then applies Gaussian process regression to approximate the dynamics and make forecasts. This method is the generalized version of tde_gp, which constructs the block from lags of a time series to pass into this function.

block_gp(block, lib = c(1, NROW(block)), pred = lib, tp = 1,
  phi = 0, v_e = 0, eta = 0, fit_params = TRUE, columns = NULL,
  target_column = 1, stats_only = TRUE,
  save_covariance_matrix = FALSE, first_column_time = FALSE,
  silent = FALSE, ...)

Arguments

block	either a vector to be used as the time series, or a data.frame or matrix where each column is a time series
lib	a 2-column matrix (or 2-element vector) where each row specifies the first and last rows of the time series to use for attractor reconstruction
pred	(same format as lib), but specifying the sections of the time series to forecast.
tp	the prediction horizon (how far ahead to forecast)
phi	length-scale parameter. see 'Details'
v_e	noise-variance parameter. see 'Details'
eta	signal-variance parameter. see 'Details'
fit_params	specify whether to use MLE to estimate params over the lib
columns	either a vector with the columns to use (indices or names), or a list of such columns
target_column	the index (or name) of the column to forecast
stats_only	specify whether to output just the forecast statistics or the raw predictions for each run
save_covariance_matrix	specifies whether to include the full covariance matrix with the output (and forces the full output as if stats_only were set to FALSE)
first_column_time	indicates whether the first column of the given block is a time column (and therefore excluded when indexing)
silent	prevents warning messages from being printed to the R console
...	other parameters. see 'Details'

Value

If stats_only, then a data.frame with components for the parameters and forecast statistics:

`embedding`	embedding
`tp`	prediction horizon
`phi`	length-scale parameter
`v_e`	noise-variance parameter
`eta`	signal-variance parameter
`fit_params`	whether params were fitted or not
`num_pred`	number of predictions
`rho`	correlation coefficient between observations and predictions
`mae`	mean absolute error
`rmse`	root mean square error
`perc`	percent correct sign
`p_val`	p-value that rho is significantly greater than 0 using Fisher's z-transformation
`model_output`	data.frame with columns for the time index, observations, mean-value for predictions, and independent variance for predictions (if `stats_only == FALSE` or `save_covariance_matrix == TRUE`)
`covariance_matrix`	the full covariance matrix for predictions (if `save_covariance_matrix == TRUE`)

Details

The default parameters are set so that passing a vector as the only argument will use that vector to predict itself one time step ahead. If a matrix or data.frame is given as the only argument, the first column will be predicted (one time step ahead), using the remaining columns as the embedding. Rownames will be converted to numeric if possible to be used as the time index, otherwise 1:NROW will be used instead. The default lib and pred are to perform maximum likelihood estimation of the phi, v_e, and eta parameters over the whole time series, and return just the forecast statistics.

If phi, v_e, and eta parameters are given, all combinations of their values will be tried. If fit_params is also set to TRUE, these values will be the initial values for subsequent optimization of likelihood.

The basic model is: $$y = f(x) + \textnormal{noise} $$ in which the function f(x) is modeled using a Gaussian process prior: $$f \sim \textnormal{GP}(0, C) $$ with mean = 0, and covariance function, C, which is given by the squared-exponential kernel: $$C_{ij} = eta * \exp(-phi^2 * ||x_i - x_j||^2) $$

y is a realization from process f with normally-distributed i.i.d. process noise, $$noise \sim \mathcal(N)(0, v_e) $$ such that the covariance of observations y_i and y_j is $$K_{ij} = C_{ij} + v_e * \delta_{ij}$$ where $\delta_{ij}$ is the kronecker delta (i.e. it is 1 if $i = j$ and 0 otherwise)

From the model definition, the variance in y, after marginalizing over f, is given by eta + v_e. Thus to simplify specification of priors for the hyperparameters eta and v_e, the outputs y are normalized to zero mean and unit variance prior to fitting. This allows us to set (0, 1) bounds on eta and v_e which facilitates parameter estimation. We set Beta(2, 2) priors for both eta and v_e to partition prior uncertainty equally across structural and process uncertainty.

For a scalar input, the length-scale parameter phi controls the expected number of zero crossings on the unit interval as $$E(crossings) = \frac{\sqrt(2)}{\pi} \phi \approx 0.45 \phi $$

Thus to facilitate interpretation and prior specification, the distances in C are scaled by the max distance so that a model with $\phi = 2$ would have roughly one zero crossing over the range of the data. We assign phi a half-Normal prior with variance $\pi / 2$ so that the prior mean phi is 1, which tends to avoid overfitting. To fit the GP we estimate eta, v_e, and phi by maximizing the posterior after marginalizing over f(x). This is given by the multivariate normal likelihood $$logL = -1/2 \log{|K_d|}^{-1/2} y_d^T [K_d]^{-1} y_d $$ where $K_d$ is the matrix obtained by evaluating the covariance function at all pairs of inputs and $y_d$ is the column vector of outputs. Predictions for new values of x are obtained by setting eta, v_e, and phi to the Maximum a Posteriori (MAP) estimates and using the GP conditional on the observed data. Specifically, given $x_d$ and $y_d$, the mean and variance for y evaluated at a new value of x are $$E(y) = C(x, x_d) [K_d]^{-1} y_d $$ $$V(y) = eta + v_e - C(x, x_d)[K_d]^{-1} C(x_d, x) $$ where the vector $C(x, x_d)$ is obtained by evaluating C at x and each of the observed inputs while holding eta, phi, and v_e at the MAP estimates.

Examples

data("two_species_model")
block <- two_species_model[1:200,]
block_gp(block, columns = c("x", "y"), first_column_time = TRUE)
#>   embedding tp       phi       v_e      eta fit_params num_pred       rho
#> 1      1, 2  1 0.5302931 -170.4109 6.904724       TRUE      199 0.9999775
#>           mae        rmse perc p_val
#> 1 0.001004117 0.001284706    1     0

(generalized) Block forecasting using Gaussian Processes

Arguments

Value

Details

Examples

Contents