Data structures
The first step of the workflow is to prepare forecast data. The aim is to store the data in appropriate and convenient formats that facilitate further analysis.
Capabilities
Below we propose data formats allowing forecast data storage with these capabilities:
Allows the implementation of cross-validation analysis based on a rolling forecasting origin
Can be used to store not only point forecasts, but also interval forecasts and other types of forecasting results (such as density forecasts, model parameters found at some specific origin, etc.)
Cross-platform and portable: data can be stored and retrieved with any relational database management system (RDBMS) using standard SQL queries, or with .csv files; the data does not need to be packaged for a specific language (such as R or Python)
Fast data access using RDBMS engines; can be implemented in industrial settings for any frequency (hours, minutes, seconds, etc.) and for any number of time series
Allows actuals to be updated independently from forecasts
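To illustrate the rolling-origin capability listed above, here is a minimal sketch in Python. The toy series, the naive (last-value) forecasting rule, and the field names `origin_timestamp`, `timestamp`, `horizon`, and `forecast` are assumptions made for illustration, not definitions taken from this page. The point is only that each forecast is identified by both its origin and its target period, so the same target period can hold several forecasts.

```python
# Hypothetical toy series of (timestamp, value) pairs; values are illustrative only.
actuals = [
    ("2020-01", 10.0),
    ("2020-02", 12.0),
    ("2020-03", 11.0),
    ("2020-04", 13.0),
    ("2020-05", 14.0),
]


def rolling_origin_naive(series, first_origin_index, max_horizon):
    """Return one row per (origin, horizon) pair using a naive last-value forecast."""
    rows = []
    # Each origin is a point in time at which forecasts are produced.
    for origin_idx in range(first_origin_index, len(series) - 1):
        origin_ts, last_value = series[origin_idx]
        for horizon in range(1, max_horizon + 1):
            target_idx = origin_idx + horizon
            if target_idx >= len(series):
                break  # no known target period beyond the end of the toy series
            target_ts, _ = series[target_idx]
            rows.append(
                {
                    "origin_timestamp": origin_ts,  # when the forecast was made
                    "timestamp": target_ts,         # the period being forecast
                    "horizon": horizon,             # steps ahead of the origin
                    "forecast": last_value,         # naive forecast: last observed value
                }
            )
    return rows


forecast_rows = rolling_origin_naive(actuals, first_origin_index=2, max_horizon=2)
for row in forecast_rows:
    print(row)
```

In this sketch the period 2020-05 is forecast twice, once from origin 2020-03 and once from origin 2020-04, which is why a forecast table needs the origin as part of each row's identity.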
Advantages
Although there are existing packages containing forecast data (e.g., the R package containing M3-Competition data), they share two common problems: they do not allow the storage of rolling-origin forecasts, and they can only be used from specific programming environments. Here we propose a general format based on special table schemas that can be implemented in any environment.
Forvision Data Structures
Our approach to forecast data storage is based on these two major table schemas:
Time Series Table Schema (TSTS) to store time series actuals
Forecast Table Schema (FTS) to store forecasting results including point forecasts, prediction intervals, and other variables of interest
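As a rough sketch of how the two schemas might look in a relational database, the Python example below creates them in an in-memory SQLite database. The table names `tsts` and `fts` and every column name (`series_id`, `method_id`, `origin_timestamp`, `timestamp`, `horizon`, `forecast`, `lo95`, `hi95`, `value`) are assumptions chosen for illustration; this page does not specify the columns.

```python
import sqlite3

# In-memory database, used here only for demonstration.
conn = sqlite3.connect(":memory:")

# Assumed column layout; the real schemas may differ.
conn.executescript(
    """
    CREATE TABLE tsts (
        series_id TEXT NOT NULL,
        timestamp TEXT NOT NULL,
        value     REAL,
        PRIMARY KEY (series_id, timestamp)
    );
    CREATE TABLE fts (
        series_id        TEXT NOT NULL,
        method_id        TEXT NOT NULL,
        origin_timestamp TEXT NOT NULL,
        timestamp        TEXT NOT NULL,
        horizon          INTEGER NOT NULL,
        forecast         REAL,   -- point forecast
        lo95             REAL,   -- lower bound of a 95% prediction interval
        hi95             REAL,   -- upper bound of a 95% prediction interval
        PRIMARY KEY (series_id, method_id, origin_timestamp, timestamp)
    );
    """
)

# Hypothetical actuals for one series.
conn.executemany(
    "INSERT INTO tsts VALUES (?, ?, ?)",
    [("Y1", "2020-03", 11.0), ("Y1", "2020-04", 13.0)],
)

# Hypothetical forecasts produced at origin 2020-03 by a method labelled 'naive'.
conn.executemany(
    "INSERT INTO fts VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        ("Y1", "naive", "2020-03", "2020-04", 1, 11.0, 9.0, 13.0),
        ("Y1", "naive", "2020-03", "2020-05", 2, 11.0, 8.0, 14.0),
    ],
)

# Actuals can be revised without touching any forecast rows.
conn.execute(
    "UPDATE tsts SET value = ? WHERE series_id = ? AND timestamp = ?",
    (13.5, "Y1", "2020-04"),
)

tsts_count = conn.execute("SELECT COUNT(*) FROM tsts").fetchone()[0]
fts_count = conn.execute("SELECT COUNT(*) FROM fts").fetchone()[0]
revised = conn.execute(
    "SELECT value FROM tsts WHERE series_id = 'Y1' AND timestamp = '2020-04'"
).fetchone()[0]
print(tsts_count, fts_count, revised)
```

Keeping actuals and forecasts in separate tables is what allows the revision step: the updated actual is stored once, and all forecasts for that period remain as they were issued.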
To slice and dice forecast data more easily, we may need a single table containing both actuals and forecasts. This is the purpose of the Actual and Forecast Table Schema (AFTS). A table in the AFTS format is obtained from the TSTS and FTS tables using a simple SQL query.
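The kind of query involved might look like the following sketch, again using Python with an in-memory SQLite database. The table and column names are the same illustrative assumptions as before and are not taken from this page; the join condition (matching on series and target timestamp) is likewise an assumed design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Minimal assumed versions of the actuals (tsts) and forecasts (fts) tables.
conn.executescript(
    """
    CREATE TABLE tsts (series_id TEXT, timestamp TEXT, value REAL);
    CREATE TABLE fts (
        series_id TEXT, method_id TEXT, origin_timestamp TEXT,
        timestamp TEXT, horizon INTEGER, forecast REAL
    );
    INSERT INTO tsts VALUES ('Y1', '2020-04', 13.0), ('Y1', '2020-05', 14.0);
    INSERT INTO fts VALUES
        ('Y1', 'naive', '2020-03', '2020-04', 1, 11.0),
        ('Y1', 'naive', '2020-03', '2020-05', 2, 11.0),
        ('Y1', 'naive', '2020-04', '2020-05', 1, 13.0),
        ('Y1', 'naive', '2020-05', '2020-06', 1, 14.0);
    """
)

# A LEFT JOIN keeps forecasts whose actuals are not yet known (value is NULL).
afts_query = """
    SELECT f.series_id, f.method_id, f.origin_timestamp, f.timestamp,
           f.horizon, f.forecast, a.value
    FROM fts AS f
    LEFT JOIN tsts AS a
      ON a.series_id = f.series_id AND a.timestamp = f.timestamp
    ORDER BY f.origin_timestamp, f.horizon
"""
afts_rows = conn.execute(afts_query).fetchall()
for row in afts_rows:
    print(row)

# Example slice: one-step-ahead forecasts that already have an actual.
one_step = [r for r in afts_rows if r[4] == 1 and r[6] is not None]
print(one_step)
```

With forecasts and actuals side by side in each row, slices by horizon, origin, method, or series reduce to ordinary filters or `WHERE` clauses.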
References
Sai, C., Davydenko, A., & Shcherbakov, M. (November 23-24, 2018). Data schemas for forecasting (with examples in R). Seventh International Conference on System Modelling & Advancement on Research Trends (pp. 145-149). Moradabad, India.
Hyndman, R. Mcomp: Data from the M-Competitions. Retrieved from https://pkg.robjhyndman.com/Mcomp/index.html
To cite this website, please use the following reference:
Sai, C., Davydenko, A., & Shcherbakov, M. (date). The Forvision Project. Retrieved from https://forvis.github.io/
© 2020 Sai, C., Davydenko, A., & Shcherbakov, M. All rights reserved. Short sections of text, not to exceed two paragraphs, may be quoted without explicit permission, provided that full acknowledgement is given.