Time Series Analysis Using ARIMA Model For Forecasting In R (Practical)

A time series is a set of observations on a particular variable recorded in time order. The recording interval can be hourly, daily, weekly, monthly, quarterly or yearly.

Why time series analysis
1.     To support decision making.
2.     To fit a mathematical model to the data and then use it to forecast the future.
The tool that we will be using for the practice is RStudio. You can click on the link below to download the RStudio setup and install it:

The dataset that will be used is daily-minimum-temperatures-in-me.csv, which you can download from Kaggle. Below is the link to download the daily minimum temperatures in me dataset:
The libraries that will be used for the time series model are tseries and forecast. In the code, any line that starts with # is a comment.


### Import the necessary libraries for the model
library(tseries)
library(forecast)

### Load the daily minimum temperatures in me data.
#NB: To copy the path on Windows, hold the Shift key, right-click on the file and choose "Copy as path"
temp <- read.csv("C:\\Users\\LENOVO\\Desktop\\R tutorials\\daily-minimum-temperatures-in-me.csv")
# Preview the first rows instead of printing the whole data frame
head(temp)

#NB: For a univariate time series we need only the observations, so open the daily-minimum-temperatures-in-me.csv file
# and delete the date column (the first column) so that only the observation column is left in the csv file.
# When done, save the file as dailytemperatures.csv so that it does not conflict with the first file.
### Load the daily temperatures data.
temperature <- read.csv("C:\\Users\\LENOVO\\Desktop\\R tutorials\\dailytemperatures.csv", header = TRUE)
head(temperature)
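
The manual step of deleting the date column can probably also be done inside R. The sketch below is only an illustration under assumptions: the inline CSV text and the column names Date and Temp are made up here and may not match the Kaggle file.

```r
# Hypothetical two-column CSV standing in for the downloaded file
csv_text <- "Date,Temp
1981-01-01,20.7
1981-01-02,17.9
1981-01-03,18.8"

raw <- read.csv(text = csv_text, header = TRUE)

# Keep only the observation column (assumed here to be the second column)
obs <- raw[[2]]
obs
```

With the real file you would pass the file path to read.csv instead of text = csv_text and check with str() which column holds the temperatures.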

### Convert the data to a time series (daily observations starting in 1981, 365 per year)
temperature <- ts(temperature, start = c(1981,1), frequency = 365)
plot.ts(temperature)
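
To see what start and frequency do in ts(), here is a small sketch on made-up data (the numbers 1 to 730 are placeholders, not temperatures). Note that frequency = 365 treats every year as 365 days, so leap days are not modelled.

```r
# Two "years" of placeholder daily values
x <- ts(seq_len(730), start = c(1981, 1), frequency = 365)

frequency(x)   # observations per yearly cycle
start(x)       # first (year, day) position
end(x)         # last (year, day) position

# Pick out the first five days of the second year
first_days <- window(x, start = c(1982, 1), end = c(1982, 5))
first_days
```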

### Test the data for stationarity using the ADF (Augmented Dickey-Fuller) test
### H0: the series has a unit root (it is non-stationary)
adf.test(temperature)

#### Difference the data to make it stationary
#NB: You only difference when the p-value is greater than 0.01, i.e. when H0 (unit root) cannot be rejected
dtemperature <- diff(temperature)
adf.test(dtemperature)
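
What diff() does can be illustrated on simulated data. This is a minimal sketch using base R only; the random walk below is made up and is not the temperature series.

```r
set.seed(1)                # make the simulation reproducible
noise <- rnorm(200)        # stationary white noise
walk  <- cumsum(noise)     # random walk: non-stationary, has a unit root

d_walk <- diff(walk)       # first difference at lag 1

# Differencing the random walk gives back the noise (without its first value)
all.equal(d_walk, noise[-1])

# Compare the spread of the level series with that of its differences
var(walk)
var(d_walk)
```

One difference is enough here because the random walk was built by summing the noise once.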

### Plotting ACF and PACF of the differenced series
acf(as.numeric(dtemperature), main="ACF OF DIFFERENCED TEMPERATURES")
pacf(as.numeric(dtemperature), main="PACF OF DIFFERENCED TEMPERATURES")

### Plotting both the ACF and PACF together
par(mfcol=c(2,1))
acf(as.numeric(dtemperature), main="ACF OF DIFFERENCED TEMPERATURES")
pacf(as.numeric(dtemperature), main="PACF OF DIFFERENCED TEMPERATURES")
# Reset the plotting layout to one plot per page
par(mfcol=c(1,1))
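
A common way to read the PACF is to look for the last lag whose bar leaves the approximate 95% band of plus or minus 1.96/sqrt(n). The sketch below shows this on a simulated AR(2) series; the coefficients 0.6 and -0.3 are arbitrary choices for illustration, not estimates from the temperature data.

```r
set.seed(42)
# Simulated AR(2) process with made-up coefficients
sim <- arima.sim(model = list(ar = c(0.6, -0.3)), n = 2000)

# Partial autocorrelations for lags 1 to 10, without drawing the plot
p <- as.vector(pacf(sim, lag.max = 10, plot = FALSE)$acf)

# Approximate 95% significance bound used by the plot
bound <- 1.96 / sqrt(length(sim))

# Lags whose partial autocorrelation lies outside the band
significant <- which(abs(p) > bound)
significant
```

For an AR(p) process the PACF should fall inside the band after lag p, although an occasional later lag can cross it by chance.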

##### Fit an integrated AR model: ARIMA(15,1,0), i.e. 15 AR terms on the once-differenced series
model1 <- arima(temperature, order = c(15, 1, 0))
summary(model1)

####### Diagnosing the model for adequacy
### Obtain the residuals of the model
res1 <- residuals(model1)

###Test for zero mean
t.test(res1)


### Plot the PACF of the residuals to check for autocorrelation
pacf(as.vector(res1), lag.max=25, na.action=na.pass)
# PACF of the squared residuals, to check for remaining dependence in the variance
pacf(as.vector(res1^2), lag.max=25, na.action=na.pass)


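### Test for independence
### H0: the residuals are not autocorrelated up to the given lag
Box.test(res1, type="Ljung-Box", lag=15)

# Standard diagnostic plots: standardized residuals, residual ACF and Ljung-Box p-values
tsdiag(model1)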
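
When the Ljung-Box test is applied to residuals of a fitted ARMA model, Box.test() has a fitdf argument that subtracts the number of estimated AR and MA coefficients from the degrees of freedom. The sketch below shows this on a simulated AR(1) series (made-up data, base R only). The lag has to be larger than fitdf, so for a model with 15 AR coefficients a lag above 15 would be needed.

```r
set.seed(7)
# Simulated AR(1) series standing in for real data
y <- arima.sim(model = list(ar = 0.5), n = 500)

fit <- arima(y, order = c(1, 0, 0))
res <- residuals(fit)

# One AR coefficient was estimated, so fitdf = 1
lb <- Box.test(res, type = "Ljung-Box", lag = 10, fitdf = 1)
lb$parameter   # degrees of freedom: lag - fitdf
lb$p.value     # a large p-value gives no evidence of residual autocorrelation
```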



####### Forecasting 10 days ahead from the model
pred <- predict(model1, n.ahead = 10)
pred
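
predict() on an arima fit returns a list with the point forecasts ($pred) and their standard errors ($se). A rough 95% interval can be built from these, assuming approximately normal forecast errors. The sketch below uses a simulated AR(1) series rather than the temperature model.

```r
set.seed(3)
# Simulated AR(1) series standing in for real data
y <- arima.sim(model = list(ar = 0.5), n = 300)
fit <- arima(y, order = c(1, 0, 0))

fc <- predict(fit, n.ahead = 10)

# Approximate 95% prediction interval under a normality assumption
lower <- fc$pred - 1.96 * fc$se
upper <- fc$pred + 1.96 * fc$se

cbind(forecast = fc$pred, lower = lower, upper = upper)
```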


####### Plot the data and fitted values
plot.ts(temperature)
lines(fitted(model1), col="red")

# Plot the 10-day forecast with its prediction intervals
plot(forecast(model1, 10))



