stevengould.org
Forecasting Concepts/Terminology

This section starts by introducing some basic forecasting terminology. If you are already familiar with forecasting, you can skip the first two sections. Beginning in the section called DataPoints and DataSets, we introduce some concepts and terminology that, while still somewhat general, are more specific to OpenForecast.

Observing behaviors

Irrespective of which forecasting model or system is used, forecasts are estimates of the expected behavior of some system outside of the currently observed domain. The forecasts are generally based on previously observed behaviors.

For example, if a company's sales have been increasing at a steady rate month over month for the past couple of years, it would be "reasonable" to expect them to continue to increase the next month.

The same applies to weather forecasting. In an area where the prevailing winds are from the West, meteorologists and climatologists may have observed over many years that the vast majority of the strongest storms in the area move in from the West. The next time a storm is developing in the West, it is very reasonable to expect that the storm will move into the area from the West, rather than missing the area by moving to the North, South or further West.

In both of these cases, we are basing our expectations of future behavior on past observations. If we can quantify these observations (that is, assign numeric values to them), then these values, also referred to as observed values, can be gathered for input to OpenForecast.

Note that weather forecasting models have many input parameters, not all of which are easily quantifiable. While weather forecasting models could, in part, use OpenForecast, they are not directly implemented in OpenForecast.

Introducing variables

When an observation is made, there are a number of other variables present, not all of which are necessarily important.
For example, the size of a salesforce is likely to have an impact on the total sales for a region. For some products, such as heating or cooling systems, even the weather (perhaps measured as the average daytime temperature) could have an impact on sales. Other variables may have no impact at all on a given observation: international exchange rates, for example, have no bearing on the weather.

The value of the variable that we observe - the observed value - is also referred to in forecasting terms as the dependent variable. It is this variable whose values we ultimately want to forecast, based on the values of the other variables. Think of the dependent variable as the "effect" in a cause-and-effect relationship.

In contrast, those variables that we believe influence the value of the dependent variable are referred to as potential independent variables. They should be independent of the observed - or dependent - value and ideally, in general, independent of each other. Think of these independent variables as the "cause" in a cause-and-effect relationship.

DataPoints and DataSets

In OpenForecast, the term DataPoint is used to refer to a single observation made up of one observed - dependent - value, and any number of values of independent variables. All these values should be captured in one instance. For example, in a time-based series, they should all be captured at the same time.

Typically with any forecasting system you'll have several observations on which to base the forecast. In OpenForecast, such a set of DataPoints is referred to as a DataSet.

An understanding of what DataPoints and DataSets are is really all that is necessary in order to use OpenForecast. Any extra understanding of forecasting you have, like that given in the introduction, is a bonus.

Forecasting Models

In the most general sense, a forecasting model is any tool or system that can be used to estimate future events, usually based on past events or observations.
Some people joke about dice being used to forecast future events, but in some cases this is a perfectly valid form of forecasting model.

In terms of OpenForecast, a forecasting model is any mathematical model that implements the ForecastingModel interface. This interface defines the general methods common to any forecasting model implemented in, or supported by, OpenForecast. In most cases, the methods defined in this interface are all you'll need to interact with the forecasting model(s) you use.

Though you may be familiar with some of the models supported by OpenForecast, this is not necessary in order to use OpenForecast effectively. There is a Forecaster factory class to help you select a good forecasting model. Given an initial DataSet, the getBestForecast method of Forecaster returns the forecasting model that, based on the given DataSet, best fits the data.

We will cover the use of the Forecaster to obtain a forecasting model in more detail in the section called Obtain a ForecastingModel. For more details, refer to the OpenForecast JavaDocs.
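
To make the concepts above concrete, here is a minimal, self-contained Java sketch. It does not use the OpenForecast classes: Point, Model, MeanModel, LinearModel and best are hypothetical stand-ins for the ideas of a DataPoint, a forecasting model and a "best fit" selection, and the sales figures are made up. Choosing the candidate with the lowest mean squared error is an assumption made for illustration, not a description of how Forecaster.getBestForecast decides.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for the concepts above; this is NOT the OpenForecast API.
class ForecastSketch {

    /** One observation: a dependent value plus named independent values. */
    static final class Point {
        final double dependent;
        final Map<String, Double> independent = new LinkedHashMap<>();

        Point(double dependent) { this.dependent = dependent; }

        Point with(String name, double value) {
            independent.put(name, value);
            return this;
        }
    }

    /** Minimal model contract: learn from a data set, then estimate. */
    interface Model {
        void init(List<Point> dataSet);
        double forecast(Point p); // reads only p's independent values
    }

    /** Always forecasts the average observed dependent value. */
    static final class MeanModel implements Model {
        private double mean;

        @Override public void init(List<Point> dataSet) {
            double sum = 0;
            for (Point p : dataSet) sum += p.dependent;
            mean = sum / dataSet.size();
        }

        @Override public double forecast(Point p) { return mean; }
    }

    /** Least-squares line y = a + b*x on one independent variable.
     *  Needs at least two points with distinct x values. */
    static final class LinearModel implements Model {
        private final String variable;
        private double intercept, slope;

        LinearModel(String variable) { this.variable = variable; }

        @Override public void init(List<Point> dataSet) {
            double n = dataSet.size(), sx = 0, sy = 0, sxx = 0, sxy = 0;
            for (Point p : dataSet) {
                double x = p.independent.get(variable);
                sx += x; sy += p.dependent;
                sxx += x * x; sxy += x * p.dependent;
            }
            slope = (n * sxy - sx * sy) / (n * sxx - sx * sx);
            intercept = (sy - slope * sx) / n;
        }

        @Override public double forecast(Point p) {
            return intercept + slope * p.independent.get(variable);
        }
    }

    /** Mean squared error of a fitted model over the data set. */
    static double mse(Model m, List<Point> dataSet) {
        double sum = 0;
        for (Point p : dataSet) {
            double err = m.forecast(p) - p.dependent;
            sum += err * err;
        }
        return sum / dataSet.size();
    }

    /** Fits every candidate and returns the one with the lowest error (assumed criterion). */
    static Model best(List<Point> dataSet, List<Model> candidates) {
        Model winner = null;
        double lowest = Double.POSITIVE_INFINITY;
        for (Model m : candidates) {
            m.init(dataSet);
            double e = mse(m, dataSet);
            if (e < lowest) { lowest = e; winner = m; }
        }
        return winner;
    }

    /** Four months of steadily rising sales (made-up numbers). */
    static List<Point> sampleSales() {
        List<Point> data = new ArrayList<>();
        for (int month = 1; month <= 4; month++) {
            data.add(new Point(90 + 10 * month).with("month", month));
        }
        return data;
    }

    public static void main(String[] args) {
        List<Point> data = sampleSales();
        Model model = best(data, Arrays.asList(new MeanModel(), new LinearModel("month")));
        // The dependent value of the query point is unknown, so NaN is a placeholder.
        Point next = new Point(Double.NaN).with("month", 5);
        System.out.println(model.getClass().getSimpleName() + " forecasts " + model.forecast(next));
    }
}
```

In this sketch, sales are the dependent variable and the month number is the single independent variable. Because the made-up sales rise by exactly 10 each month, the straight-line model fits the data with zero error and is selected over the mean model. With real data the candidates would rarely fit perfectly, and the comparison would only rank them relative to each other.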