SOLUTION: BUA Statistics Weather Analysis Question
Make sure that your time series data contains at least 100 observations (or close to it).
You need a variable of interest, the so-called dependent variable, and at least 4 other time series
variables that you can refer to as independent variables. You will model the dependent
variable using the different time series techniques introduced throughout the semester.
The final write-up is due in the last week of the term and will be submitted digitally on Canvas
(only one copy per group). Please use an 11-point font and single spacing. There is no
minimum or maximum number of pages. Write as much as you need to convey
your results. Make sure to use ONLY the relevant R output in your write-up. You can add the
outputs at the end of the document as an appendix or you can include them in the text. In either
case, make sure to label the outputs properly and refer to them in the narrative.
In all your analyses, you can use a significance level of 0.05 whenever you need to conduct a
hypothesis test. For each test, formally write down your null and alternative hypotheses,
rejection criterion, and conclusion. Since you will be assessing the forecasting
(prediction) performance of your models, set aside the last 10 observations (you can use more if
you have more data) of your data as a test sample and use the rest as your training data to estimate
the models.
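The split described above can be sketched in base R as follows. This is only an illustration: the simulated monthly series `y` and the names `train`, `test`, and `n_test` are assumptions standing in for your own data.

```r
# Simulated monthly series standing in for your own data (an assumption).
set.seed(1)
y <- ts(50 + cumsum(rnorm(120)), start = c(2010, 1), frequency = 12)

n      <- length(y)
n_test <- 10                      # hold out the last 10 observations

# Rebuild ts objects so both pieces keep the correct dates and frequency.
train <- ts(y[1:(n - n_test)],     start = start(y), frequency = frequency(y))
test  <- ts(y[(n - n_test + 1):n], end   = end(y),   frequency = frequency(y))

length(train)   # 110
length(test)    # 10
```

All models would then be estimated on `train` only, with `test` reserved for the forecast comparison.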
Your first step is to obtain the data (you will probably spend a lot of time finding the dataset and
getting it ready for analysis). Before proceeding, make sure that the dependent variable is not a white
noise process.
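One common way to check for white noise is the Ljung-Box test, whose null hypothesis is that the autocorrelations up to a chosen lag are all zero. The sketch below uses two simulated series (assumptions, not course data) and an illustrative lag of 12; a p-value below 0.05 suggests the series is not white noise.

```r
set.seed(42)
wn  <- rnorm(200)                                        # white noise by construction
ar1 <- as.numeric(arima.sim(list(ar = 0.7), n = 200))    # AR(1): not white noise

# H0: no autocorrelation up to lag 12. H1: some autocorrelation is nonzero.
lb_wn  <- Box.test(wn,  lag = 12, type = "Ljung-Box")
lb_ar1 <- Box.test(ar1, lag = 12, type = "Ljung-Box")

lb_wn$p.value    # usually large for true white noise (fail to reject H0)
lb_ar1$p.value   # very small here (reject H0)
```

Plotting `acf()` of the series alongside the test gives a visual check of the same question.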
The final report will consist of the following sections:
1. Introduction and Overview:
Present the details about your dataset. Discuss where your data is coming from, why it is
important for you to study this specific variable and who would benefit from it.
2. Regression Model:
Estimate a multiple linear regression model and discuss the significance of the coefficients.
Check for multicollinearity and check whether the residual assumptions (normality, constant variance,
and no autocorrelation) are violated.
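A minimal base-R sketch of these checks is given below, on simulated data in which `x4` is deliberately correlated with `x1`. The variable names, the lag of 12, and the hand-computed variance inflation factors and studentized Breusch-Pagan statistic are illustrative assumptions; add-on packages offer ready-made versions of the last two.

```r
set.seed(7)
n  <- 110
x1 <- rnorm(n); x2 <- rnorm(n); x3 <- rnorm(n)
x4 <- x1 + rnorm(n, sd = 0.3)                 # deliberately collinear with x1
y  <- 2 + 1.5 * x1 - 0.8 * x2 + rnorm(n)
dat <- data.frame(y, x1, x2, x3, x4)

fit <- lm(y ~ x1 + x2 + x3 + x4, data = dat)
summary(fit)$coefficients                     # estimates, t statistics, p-values

# VIF_j = 1 / (1 - R^2_j), with R^2_j from regressing x_j on the other predictors.
preds <- c("x1", "x2", "x3", "x4")
vif <- sapply(preds, function(v) {
  aux <- lm(reformulate(setdiff(preds, v), response = v), data = dat)
  1 / (1 - summary(aux)$r.squared)
})
vif

res <- residuals(fit)
shapiro.test(res)                             # H0: residuals are normal
Box.test(res, lag = 12, type = "Ljung-Box")   # H0: no residual autocorrelation

# Studentized Breusch-Pagan test by hand: regress squared residuals on the
# predictors; n * R^2 is approximately chi-squared under H0 (constant variance).
dat$res2 <- res^2
aux_bp   <- lm(res2 ~ x1 + x2 + x3 + x4, data = dat)
bp_stat  <- n * summary(aux_bp)$r.squared
bp_p     <- pchisq(bp_stat, df = length(preds), lower.tail = FALSE)
bp_p
```

In this simulation the VIFs for `x1` and `x4` come out large because of the built-in collinearity, while those for `x2` and `x3` stay near 1.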
3. Deterministic Time Series Models:
a) Estimate an indicator variable model (for example, s = 12 for monthly data, s = 4 for quarterly data).
Check if the residuals are autocorrelated. Comment on your findings.
b) Estimate a polynomial model. You can start with a high-order polynomial, say order 8, and then
reduce it by removing the insignificant terms. Check if the residuals are autocorrelated.
Comment on your findings.
c) Estimate a cyclical (harmonic) model with sine and cosine terms. In doing so, obtain the
periodogram and briefly explain how you identified the model. Check if the residuals are
autocorrelated. Comment on your findings.
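The three deterministic models above might be set up as in the following base-R sketch. The simulated series (linear trend plus a 12-month cycle), the names `idx`, `month`, and `lb`, and the Ljung-Box lag of 12 are all assumptions for illustration.

```r
set.seed(3)
n   <- 120
idx <- 1:n                                        # time index
y   <- 10 + 0.05 * idx + 3 * sin(2 * pi * idx / 12) + rnorm(n, sd = 0.5)
month <- factor((idx - 1) %% 12 + 1)              # seasonal indicator, s = 12

# Helper: Ljung-Box p-value for the residuals of a fitted model.
lb <- function(fit) Box.test(residuals(fit), lag = 12, type = "Ljung-Box")$p.value

# (a) Indicator variable model: trend plus monthly dummies.
fit_ind <- lm(y ~ idx + month)

# (b) Polynomial model: start at order 8, then drop insignificant terms.
#     poly() builds orthogonal polynomials, which avoids collinear powers.
fit_poly <- lm(y ~ poly(idx, 8))
summary(fit_poly)$coefficients

# (c) Harmonic model: take the dominant frequency from the raw periodogram.
pg     <- spec.pgram(y, taper = 0, detrend = TRUE, plot = FALSE)
f_peak <- pg$freq[which.max(pg$spec)]             # cycles per observation
period <- 1 / f_peak                              # 12 for this simulated series
fit_harm <- lm(y ~ idx + sin(2 * pi * f_peak * idx) + cos(2 * pi * f_peak * idx))

# Residual autocorrelation check for each model (small p-value = autocorrelated).
c(indicator = lb(fit_ind), polynomial = lb(fit_poly), harmonic = lb(fit_harm))
```

Here the polynomial model cannot capture the seasonal cycle, so its residuals remain strongly autocorrelated. With real data several periodogram peaks may matter, each adding its own sine and cosine pair.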
4. Stochastic Time Series Model:
a) By investigating the ACF and the PACF of your series, identify a suitable AR, MA, ARMA,
ARIMA, or SARIMA model. You can fit a few suitable models and compare their in-sample
performance to find the best-fitting model. Can you reduce the series to a white noise process?
b) If the residuals from your regression model (Part 2) and/or the deterministic
time series models (Part 3) were autocorrelated, use an ARIMA-type correction for the correlated
residuals. Re-estimate the corrected models. Comment on any changes in the regression
coefficient estimates.
c) Investigate if you can use an ARCH/GARCH structure to model the conditional variance of
your dependent variable. Estimate the model and comment on your findings.
d) Investigate if you can use an ARCH/GARCH structure to model the conditional variance of the
residuals of your models from Parts 2, 3, 4(a), and 4(b). Re-estimate those models using a suitable
ARCH/GARCH structure and comment on your findings.
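A hedged base-R sketch of steps (a), (b), and the ARCH screening in (c) and (d) follows. It uses a simulated regression with AR(1) errors; the candidate orders, the lag of 12, and all object names are assumptions. Estimating an actual ARCH/GARCH model requires an add-on package and is not shown here.

```r
set.seed(11)
n <- 200
x <- rnorm(n)                                          # one regressor
e <- as.numeric(arima.sim(list(ar = 0.6), n = n))      # AR(1) errors
y <- 1 + 2 * x + e

# Step 1: ordinary regression and Ljung-Box test on its residuals.
ols    <- lm(y ~ x)
lb_ols <- Box.test(residuals(ols), lag = 12, type = "Ljung-Box")
lb_ols$p.value                                         # small: autocorrelated

# Step 2: compare candidate ARMA structures for the residuals by AIC.
#         (Inspect acf() and pacf() of the residuals to choose candidates.)
orders <- list(ar1 = c(1, 0, 0), ma1 = c(0, 0, 1), arma11 = c(1, 0, 1))
aics   <- sapply(orders, function(o) arima(residuals(ols), order = o)$aic)
best   <- orders[[which.min(aics)]]
aics

# Step 3: re-estimate the regression and the error model jointly.
fit_corr <- arima(y, order = best, xreg = cbind(x = x))
rbind(ols = coef(ols)[["x"]], corrected = coef(fit_corr)[["x"]])

# Step 4: are the corrected residuals white noise?
#         fitdf removes the estimated AR and MA terms from the degrees of freedom.
Box.test(residuals(fit_corr), lag = 12, type = "Ljung-Box",
         fitdf = best[1] + best[3])

# Step 5: screen for ARCH effects with a Ljung-Box test on squared residuals.
#         A small p-value would suggest time-varying conditional variance.
lb_sq <- Box.test(residuals(fit_corr)^2, lag = 12, type = "Ljung-Box")
lb_sq$p.value
```

The same pattern applies to the deterministic models from Part 3: pass their regressors through `xreg` and compare the coefficient estimates before and after the correction.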
5. Predictive Performance Comparison:
Compare the forecasting (predictive) performance of all your estimated models from Parts 2, 3, and
4 using the mean absolute percentage error (MAPE) criterion for the test sample.
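MAPE is the average of the absolute forecast errors expressed as a percentage of the actual values. A small helper could look like the sketch below; the function name `mape` and the toy numbers are assumptions for illustration.

```r
# MAPE = 100 * mean(|actual - forecast| / |actual|); undefined if any actual is 0.
mape <- function(actual, forecast) {
  stopifnot(length(actual) == length(forecast), all(actual != 0))
  100 * mean(abs((actual - forecast) / actual))
}

actual  <- c(100, 200, 400)
model_a <- c(110, 180, 400)      # errors of 10%, 10%, 0%
model_b <- c(100, 150, 300)      # errors of 0%, 25%, 25%

c(model_a = mape(actual, model_a), model_b = mape(actual, model_b))
# model_a is about 6.67, model_b is about 16.67, so model_a forecasts better
```

In the report, `actual` would be the held-out test sample and each forecast vector would come from one of the fitted models.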
6. Multivariate Time Series Models:
Treat your dependent variable and the other four independent variables as a 5-dimensional
multivariate time series (or more if you have additional independent variables). Investigate whether
a VARMA and/or a Transfer Function (TF) model is suitable for modeling your multivariate time
series data. Obtain predictions for the test sample and compute the respective MAPE estimates.
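As a starting point, the sketch below fits a VAR(1), the simplest special case of a VARMA model, by least squares in base R and forecasts the test sample iteratively. It is a simplification under stated assumptions: the series is simulated and only 2-dimensional, and the matrix `A`, the mean level `mu`, and the other names are hypothetical. A full VARMA or TF analysis would need an add-on package.

```r
set.seed(5)
n <- 300; k <- 2
A  <- matrix(c(0.5, 0.1, 0.2, 0.4), k, k)      # true VAR(1) coefficient matrix
mu <- c(10, 10)                                # mean level, keeps values away from 0
const <- as.numeric((diag(k) - A) %*% mu)

Y <- matrix(mu, n, k, byrow = TRUE)
for (i in 2:n) Y[i, ] <- const + A %*% Y[i - 1, ] + rnorm(k)

h     <- 10                                    # test sample size
train <- Y[1:(n - h), ]
test  <- Y[(n - h + 1):n, ]

# Least squares VAR(1): regress y_t on y_{t-1}, one equation per column.
resp   <- train[-1, ]
lagged <- train[-nrow(train), ]
fit    <- lm(resp ~ lagged)
B      <- coef(fit)                            # (k + 1) x k, first row = intercepts
A_hat  <- t(B[-1, ])                           # estimated coefficient matrix

# Iterated forecasts: feed each prediction back in as the next lagged value.
fc   <- matrix(NA_real_, h, k)
last <- train[nrow(train), ]
for (j in 1:h) {
  last    <- as.numeric(c(1, last) %*% B)
  fc[j, ] <- last
}

# MAPE per variable on the test sample.
mape_var <- 100 * colMeans(abs((test - fc) / test))
mape_var
```

The first column of `mape_var` would correspond to the dependent variable, which is the figure to compare against the univariate models.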
7. Bayesian Dynamic Linear Models: TBA
8. Conclusion:
Comment on your findings and discuss their implications.