This research investigates gold price forecasting based on the combined multiple sources of risk indices (e.g.,
Since there appears to be a significant and positive correlation between the combined multiple sources of risk indices and gold price fluctuations, the future gold price is forecasted using the Multiple Linear Regression (MLR) model. The results are robust: the MLR model gives results similar to those of the Autoregressive Integrated Moving Average (ARIMA) model and performs better on the error measures, with a lower root mean square error (RMSE), mean absolute error (MAE) and normalized RMSE (nRMSE). Overall, the combined multiple sources of risk indices improve on the traditional models in forecasting the future gold price.
Mining projects are recognized as facing greater risk than projects in other sectors [1]. This greater risk can influence the forecasting of future metal prices. In most recent studies, metal price forecasting has been closely linked to future risks determined from multiple sources of risk. The first systematic study of future risks was proposed by [2], who incorporated multiple sources of financial risk variables, namely cash flow variables (e.g., price, fixed costs, depreciation, and capital costs), to improve the assessment of future risks. A similar variant of Copeland & Antikarov’s (2003) model, using an aggregate cash flow volatility, was introduced by [3]. [4] concluded that the multiple sources of financial risk in mining projects can be determined by metal price and operating cost. This argument is closely linked to the studies of [5], which identified metal price and operating cost as the most significant variables associated with future risks. Similarly, [6] extended the operating cost and metal price variables by adding tax and inflation rate variables to identify future risks.
Other scholars have forecast future risks using financial variables. [7] revealed that independent financial variables play a vital role in forecasting the future risks used to predict the copper price. They examined eight external variables: two energy prices (oil and natural gas), three metal prices (gold, silver and copper), two commodity prices (lean hogs and coffee) and one stock market index (the Dow Jones Industrial Average). [8] used ten financial variables (three exchange rates, from India, China and South Africa; two inflation rates, from the US and China; and five commodity prices, for West Texas Intermediate crude oil, copper, silver, iron and gold) to assess the future gold price. However, that study did not attempt to detect the most relevant variables. [9] argued that selecting the most relevant variables reduces the uncertainty of future risks and thereby yields more accurate results.
Other researchers have attempted to identify the most relevant variables for forecasting future risks from the correlations between variables. [10] examined the joint effect of iron ore price volatility and the Australian dollar to US dollar exchange rate over a 15-year period, and concluded that this correlation could accurately predict future risks. [11] found a strong and positive correlation between the South African currency and the platinum and palladium prices. This is consistent with South Africa being the world’s largest platinum producer and second largest palladium producer. [12] showed that the Chilean exchange rate is one of the most relevant predictor variables of copper prices. Copper is a major component of Chilean exports, which account for approximately 23% of the global copper supply. More recently, [13] found a correlation between metal price volatility and exchange rate volatility. However, these studies focused only on financial variables; no attempt was made to forecast future risks by involving technical and social variables.
Studies such as [14] pointed out that future risks should be assessed using both financial and technical variables rather than financial variables alone. [15] investigated future risks using the relationship between a financial variable (e.g., metal price) and a technical variable (e.g., production rate). [16] examined future risks based on the relationship between the actual project value and the production rate and, based on platinum mines in South Africa, concluded that this relationship appeared to be weak. [17] recommended metal grade, processing recovery, metal price and exchange rate as the most relevant variables for indicating future risks. Similarly, [18] proposed metal price, metal grade and operating cost to identify the long-term future risk. [19] proposed optimizing the future risk using multiple sources of risk variables, namely metal price, production rate and mine lifetime. However, one criticism of much of this literature is that it ignores social risks and natural risks as part of the multiple sources of risk variables that can contribute to future risks.
To date, the existing literature on the multiple sources of risk variables in mining has focused particularly on risk hierarchy. [20] classified mining risks into five categories (political, financial, technical, natural and human resource risk variables) using the analytic hierarchy process (AHP) model. However, that work focused only on the hierarchy of mining risks rather than on investigating future risks. Therefore, this study identifies the multiple sources of risk variables and classifies each risk index in order to forecast the future gold price.
This paper aims to improve future monthly gold price forecasting. To achieve this aim, there are three main points:
The remainder of this paper is organized as follows. Section 2 reviews the existing literature on the relevant political, financial, technical and natural risk indices used to detect future risks. Section 3 classifies the combined multiple sources of risk indices. Section 4 presents the methodology for examining the data using goodness of fit analysis and econometric analysis. Section 5 determines the proposed model based on the results of the goodness of fit and econometric analyses. Section 6 discusses the data used in this study, and Section 7 summarizes the conclusions.
The use of indices to identify future risks has been well established by previous authors [21]; [22]; [23]. However, most studies of index-based risk identification have focused only on stock market performance. For example, the S&P 500 index is one of the stock market indices most commonly used to improve the forecasting of future risks based on investor sentiment [24]; [25]. For the S&P 500 index, future risks are measured by the Chicago Board Options Exchange (CBOE) implied volatility index (VIX), which is computed from future market expectations. The CBOE index is recognized as the leading US stock market indicator and, as a fear index, has also been used to predict global stock market volatility [26]; [27]; [28]; [29]. However, the CBOE index assesses fear based only on the US stock market and does not take global risks into account [21].
[21] and [23] made efforts to extend the fear index to cover the global stock market. [21] developed a fear index to examine stock market performance in the Brazil, Russia, India, China and South Africa (BRICS) emerging markets, and concluded that the fear index constructed from the emerging markets performed well in predicting the future stock market. However, one criticism of [21]’s studies is that they were limited to predicting future emerging stock markets. To address this issue, [25] added the stock markets of developed countries in the Organization for Economic Co-operation and Development (OECD). With these combined markets, they developed the global fear index (GFI), which involves stock market volatility and commodity price volatility. They concluded that this index can perform better than the CBOE index in forecasting future stock markets, especially during the Covid-19 pandemic period. Other scholars [30] have also combined stock market volatility and commodity price volatility to construct the news-based implied volatility (NVIX) index to predict future risks. They revealed that the NVIX index can improve the forecasting of stock market volatility for developed countries. However, this index is constructed only from financial risk variables (e.g., stock market and commodity prices) and therefore fails to capture the real global risks. One widely proposed remedy is that indices should not be constructed from financial variables alone but should also involve policy risk.
[31] pointed out that the policy risk associated with investor sentiment can provide a signal for identifying future risks. However, investor sentiment relies on unstructured data, which can give low accuracy [32]. Therefore, [33] introduced the economic policy uncertainty (EPU) index, which is built from more structured input data: the frequency of political issues in articles from ten leading US-based newspapers. [23] showed that the EPU index has a significant and positive impact on rising metal prices during bullish periods. However, the EPU index focuses only on national economic and policy uncertainty as reported in newspapers, which makes it a subjective reading of the empirical macroeconomic outlook and economic policy. In another major newspaper-based study of macroeconomic and policy uncertainty, [23] developed the equity market volatility (EMV) tracker index. This index is assessed from the frequency with which fundamental keywords appear in newspapers: E for economic, economy and financial; M for stock market, equity, equities and Standard and Poor; and V for volatility, volatile, uncertain, uncertainty, risk and risky. These keywords were collected from eleven major US newspapers. Like the EPU index, the EMV tracker index relies on newspapers rather than on global empirical macroeconomic outlook and economic policy conditions. To address this limitation, [30] introduced the global economic and policy uncertainty (GEPU) index, which covers two-thirds of the international economy and policy and is computed as the gross domestic product (GDP)-weighted average of the EPU indices of 16 countries.
[30] demonstrated that the GEPU index is superior in forecasting the future gold price. However, [30]’s studies are limited because they considered only global economic and political issues [34]. They therefore failed to acknowledge the significance of multiple sources of risk variables that can capture sudden global incidents and times of crisis [35]; [36]; [37]; [38]; [39]; [40]; [41]. To address this issue, this paper develops indices from the combined multiple sources of risk variables, taking into account sudden global incidents arising from political and natural risks. Such research could contribute to improving future gold price forecasting, including the forecasting of gold price shocks, by investigating the combined multiple sources of risk indices (the multiple sources of risk indices and the sudden global incidents indices).
The combined multiple sources of risk indices are used as the input data to forecast the future gold price using the multiple linear regression (MLR) model. This model is selected based on the results of the goodness of fit analysis and the econometric analysis. The results are robust, in that the MLR model yields results very similar to those of the ARIMA model.
This paper proposes combined multiple sources of risk indices (e.g., political, financial, technical and natural risks) that could improve future gold price forecasts. The study also includes sudden global incident indices, as part of the political and natural risks, to detect gold price shocks. The combined multiple sources of risk are illustrated in Figure 1, which modifies the risk classification chart of [20]. That chart did not take into account the variables that can cause gold price shocks. This paper therefore accounts for gold price shocks caused by sudden global incidents such as significant economic policy changes, war and terrorist attacks, and global calamity. Significant economic policy changes are represented by the EPU index and GPR index, while war, terrorist attacks and global calamity are reflected by the worldrisk index.
Figure 1. Proposed risk indices based on global political, financial, technical and natural risks.
Different indices have been proposed based on the political, financial, technical and natural risks that are used in this paper.
The EPU index measures risk based on newspaper articles on monetary and fiscal policy. Three basic components are used to construct the EPU index. The first is computed from the monthly frequency of selected keywords (e.g., economic, policy and uncertainty) in ten major US newspapers (USA Today, the Miami Herald, the Chicago Tribune, the Washington Post, the Los Angeles Times, the Boston Globe, the San Francisco Chronicle, the Dallas Morning News, the New York Times and the Wall Street Journal). For each newspaper, the monthly keyword count is scaled by the total number of articles published that month, and the result is normalized to calculate the EPU index. The second component comes from the federal tax code provisions set to expire in the coming years. The last component measures the disagreement among professional forecasters in the Survey of Professional Forecasters (SPF) about macroeconomic variables: the consumer price index (CPI) and federal, state and local expenditures. This disagreement yields the final component of the EPU index [42].
The GPR index captures tension between two or more countries arising from wars and terrorism, and it serves as a proxy alongside the EPU index. The index is constructed monthly from 11 leading international newspapers (The Boston Globe, Chicago Tribune, The Daily Telegraph, Financial Times, The Globe and Mail, The Guardian, Los Angeles Times, The New York Times, The Times, The Wall Street Journal, and The Washington Post). It is well known as one of the indices that contributes most to extreme price shocks [43]; [44].
The CBOE volatility index is published by the CBOE, the largest US options marketplace. It measures the 30-day implied volatility of the S&P 500 index. Implied volatility plays a critical role as a US stock market benchmark in both the derivative and hedge fund markets. Of particular concern is how implied volatility moves during market turmoil: high implied volatility signals pessimistic stock market expectations, and vice versa [45].
The gold price is a key monetary, macroeconomic and financial parameter that reflects global economic performance [46]. During turmoil in political and economic conditions, the gold price tends to rise because gold is widely used as a hedging and safe haven instrument [47]. In summary, understanding gold price fluctuations reveals broad economic characteristics that investors can use as a benchmark [48].
The gold reserves (e.g., volume and grade) are positively associated in the short term with the US dollar, the global exchange currency, because both are widely used for international transactions [49]. Gold reserves also play a critical role in maximizing economic growth by minimizing speculative attacks [50]. The most common procedure for quantifying the future uncertainty in gold reserves and gold production rates is to compute their indices. These indices are calculated using the Laspeyres index, which can be expressed as follows:
$$I_t = \frac{\sum q_t \, Q_0}{\sum q_0 \, Q_0} \times 100 \qquad (1)$$

where $I_t$ is the index value, $q_t$ represents the monthly gold mine production or gold reserves in the observation year, $Q_0$ describes the total annual production or total remaining gold reserves in the base period, and $q_0$ expresses the monthly gold mine production or gold reserves in the base period.
Recently, global gold reserves are believed to have been depleting, as reflected in production levels. The mined gold grade fell from 6 grams per ton in 1975 to 2 grams per ton in 2020. The depletion of global gold reserves mirrors the production rates, which have decreased in similar proportion to the gold grade [51]; [16]. One of the main reasons for the decline in global gold reserves is the rise in production levels driven by global gold demand.
The worldrisk index is one of the most important parameters for investigating global natural hazards. The index consists of natural hazards and vulnerability. Natural hazards are assessed using five indicators (earthquake, cyclone, drought, flood and high tide), while vulnerability is calculated using 23 indicators relating to socio-economic, societal and environmental hazard conditions [52].
Recently, investigators have introduced the effects of the Covid-19 pandemic into the worldrisk index for 2020. There is a significant positive correlation between the Covid-19 pandemic and the socio-economic crisis, which has contributed to extreme price shocks [53].
The monthly economic policy uncertainty (EPU) index, geopolitical risk (GPR) index and Chicago Board Options Exchange (CBOE) volatility index were obtained from Yahoo Finance. Monthly gold prices were extracted from Indexmundi. The technical uncertainty, represented by gold reserves and gold production, was downloaded from the United States Geological Survey (USGS). Finally, the natural risks were taken from the worldrisk index for China, because China’s gold demand plays a vital role in the international gold price [54]. All datasets run from January 2004 to December 2020, except the worldrisk index, which is available only from January 2011 to December 2020. The worldrisk index was therefore simulated for January 2004 to December 2010 using the MLR model. This model was selected because its R value is close to 1, which indicates that the dependent variable changes in line with the independent variables.
A goodness of fit test is fundamental for checking whether data from random variables follow a normal distribution. The Jarque-Bera (JB) test is one of the techniques available for testing normality. Under the JB test, the null hypothesis of a normal distribution is rejected when the JB statistic is greater than or equal to the chi-square critical value at the chosen significance level [55].
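As a minimal sketch of how the JB statistic relates to skewness and kurtosis, the snippet below uses the standard formula JB = n/6 · (S² + K²/4), where S is the sample skewness and K the excess kurtosis, and compares it with the chi-square critical value for two degrees of freedom. Treating the kurtosis reported in Table 1 as excess kurtosis is an assumption, and the function names are illustrative rather than taken from the study.

```python
import math


def jarque_bera(n, skewness, excess_kurtosis):
    """Jarque-Bera statistic from sample size, skewness and excess kurtosis."""
    return n / 6.0 * (skewness ** 2 + excess_kurtosis ** 2 / 4.0)


def chi2_critical_2df(alpha):
    """Critical value of the chi-square distribution with 2 degrees of freedom.

    With 2 degrees of freedom the survival function is exp(-x / 2),
    so the critical value has the closed form -2 * ln(alpha).
    """
    return -2.0 * math.log(alpha)


def rejects_normality(n, skewness, excess_kurtosis, alpha=0.05):
    """True when the JB statistic reaches the critical value."""
    return jarque_bera(n, skewness, excess_kurtosis) >= chi2_critical_2df(alpha)


# EPU column of Table 1: 204 observations, skewness 1.306, kurtosis 1.651
# (assumed here to be excess kurtosis).
jb_epu = jarque_bera(204, 1.306, 1.651)
critical_5pct = chi2_critical_2df(0.05)
```

With these inputs the statistic lands close to the tabulated JB value for the EPU index, which supports reading the reported kurtosis as excess kurtosis.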
Another goodness of fit analysis is the test for stationarity, which detects the existence of permanent price shocks. Permanent price shocks appear in non-stationary series and imply non-predictability. One of the most popular methods for detecting stationarity is the augmented Dickey-Fuller (ADF) test. A series is considered stationary when the ADF test statistic is less (more negative) than the critical value at the chosen significance level, and non-stationary otherwise [21].
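The idea behind the test can be sketched with a plain Dickey-Fuller regression, Δy_t = α + γ·y_{t-1} + e_t, whose t-statistic for γ is compared with a tabulated critical value. This is a simplification: it omits the lagged differences that the augmented version adds, the critical value of about -2.88 is the approximate 5% Dickey-Fuller value for a regression with a constant and roughly 200 observations, and the simulated series is purely illustrative.

```python
import math
import random


def dickey_fuller_stat(series):
    """t-statistic of gamma in dy_t = alpha + gamma * y_{t-1} + e_t (no lags)."""
    x = series[:-1]                                        # y_{t-1}
    dy = [b - a for a, b in zip(series[:-1], series[1:])]  # y_t - y_{t-1}
    n = len(x)
    mean_x = sum(x) / n
    mean_dy = sum(dy) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (di - mean_dy) for xi, di in zip(x, dy))
    gamma = sxy / sxx
    alpha = mean_dy - gamma * mean_x
    residuals = [di - alpha - gamma * xi for xi, di in zip(x, dy)]
    sigma2 = sum(r * r for r in residuals) / (n - 2)       # residual variance
    return gamma / math.sqrt(sigma2 / sxx)


# Approximate 5% critical value (assumption: constant, about 200 observations).
CRITICAL_5PCT = -2.88

# Illustrative mean-reverting series: y_t = 0.2 * y_{t-1} + noise.
rng = random.Random(0)
y = [0.0]
for _ in range(203):
    y.append(0.2 * y[-1] + rng.gauss(0.0, 1.0))

stat = dickey_fuller_stat(y)
is_stationary = stat < CRITICAL_5PCT
```

A strongly mean-reverting series like this one gives a large negative statistic, whereas a series with a permanent shock component gives a statistic near or above zero, as the gold price does in Table 1.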
The econometric analysis in this study is represented by the correlation between two or more variables. Correlation measures the strength of the relationship between variables. In price forecasting, the behavior of these relationships has long been regarded as a key factor in choosing the prediction model [56]. Several methods currently exist for measuring correlation, such as:
Pearson’s correlation coefficient (PCC) is one of the best-known tools for assessing the correlation between one variable and one or more other variables. This approach has several benefits. First, the PCC identifies significant positive and negative relationships between one variable and another. Second, it is particularly useful for studying the degree of significance of a correlation. Third, it allows the dependent variable (Y) to be calculated from the independent variable (X). Fourth, it can be used to examine the goodness of fit of a linear regression. Finally, it offers an effective way of eliminating overfitting, which may contribute to error when testing the original data [57].
The PCC ranges from -1 to +1, where -1 reflects a perfect negative relationship between the independent variable (X) and the dependent variable (Y), and +1 implies a perfect positive correlation between X and Y. There is no linear relationship between X and Y when the PCC is zero.
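The range described above can be checked with a small sketch of the coefficient itself, computed as the covariance of X and Y divided by the product of their standard deviations. The function name and the toy inputs are illustrative only and are not data from this study.

```python
import math


def pearson_correlation(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Toy inputs covering the three reference cases.
perfect_positive = pearson_correlation([1, 2, 3, 4], [2, 4, 6, 8])
perfect_negative = pearson_correlation([1, 2, 3, 4], [8, 6, 4, 2])
no_linear_relation = pearson_correlation([1, 2, 3, 4], [1, -1, -1, 1])
```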
Multiple linear regression (MLR) is a classical method for investigating the relationship between a dependent variable (Y) and several independent variables ($X_1, X_2, \ldots, X_n$). A major advantage of the MLR is that it can explain the dependent variable using the independent variables [58]. The strength of the relationship between the independent and the dependent variables can be seen from the adjusted regression value (R). When R is close to one, the independent variables perform well in predicting the dependent variable [59].
This section describes how the quality of the proposed gold price forecasting model is determined by comparing its results with those of other relevant forecasting models. The most widely used performance measures are the root mean square error (RMSE) and the mean absolute error (MAE) [60]. The RMSE is defined as follows:
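$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2} \qquad (2)$$

while the MAE is expressed as:

$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|y_i - \hat{y}_i\right| \qquad (3)$$

where $y_i$ is the actual price, $\hat{y}_i$ is the forecasted price and $N$ denotes the number of data points.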
To assess the accuracy of the predicted future prices, it is suggested that the RMSE be normalized as follows [61]:

$$\mathrm{nRMSE} = \frac{\mathrm{RMSE}}{Z} \qquad (4)$$

where $Z$ is the difference between the maximum and minimum of the actual values. The nRMSE ranges from 0 to 1, and values closer to 0 indicate better forecasting performance [61].
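The three measures in equations (2) to (4) can be sketched in a few lines. The price values below are hypothetical and serve only to show the arithmetic; they are not results from this study.

```python
import math


def rmse(actual, forecast):
    """Root mean square error between actual and forecasted values."""
    n = len(actual)
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)


def mae(actual, forecast):
    """Mean absolute error between actual and forecasted values."""
    n = len(actual)
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / n


def nrmse(actual, forecast):
    """RMSE divided by Z, the range (max - min) of the actual values."""
    z = max(actual) - min(actual)
    return rmse(actual, forecast) / z


# Hypothetical monthly prices, for illustration only.
actual_prices = [1200.0, 1300.0, 1250.0, 1400.0]
forecast_prices = [1210.0, 1280.0, 1260.0, 1390.0]

example_rmse = rmse(actual_prices, forecast_prices)
example_mae = mae(actual_prices, forecast_prices)
example_nrmse = nrmse(actual_prices, forecast_prices)
```

Because the RMSE squares each error before averaging, the single 20-unit miss in this example pulls it above the MAE.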
Finally, the data were split into training and test sets for measuring forecasting performance. [8] proposed using 70% of the historical data as in-sample training data and the remaining 30% as out-of-sample test data. In this paper, the training data run from January 2004 to November 2015 (143 samples), while the test data comprise 61 samples from December 2015 to December 2020.
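The split can be reproduced with a short chronological partition of the 204 monthly observations. The sketch below assumes the cut point is obtained by rounding 70% of the sample size to the nearest integer, which gives the 143 and 61 samples quoted above; the helper names are illustrative.

```python
def month_labels(start_year, start_month, count):
    """Return `count` consecutive month labels formatted as YYYY-MM."""
    labels = []
    for i in range(count):
        total = (start_month - 1) + i
        year = start_year + total // 12
        month = total % 12 + 1
        labels.append(f"{year}-{month:02d}")
    return labels


def chronological_split(samples, train_fraction=0.7):
    """Split without shuffling: earliest observations train, latest test."""
    cut = round(len(samples) * train_fraction)  # assumption: nearest integer
    return samples[:cut], samples[cut:]


months = month_labels(2004, 1, 204)  # January 2004 to December 2020
train_months, test_months = chronological_split(months)
```

Keeping the observations in time order matters here, because shuffling would let information from later months leak into the training set.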
Figure 2. The training and test data based on the historical gold price with a ratio of 70% and 30%
This section describes the proposed model to forecast the future gold price using the combined multiple sources of risks.
The MLR is one of the best-known methods for forecasting future prices using the relationship between a dependent variable and one or more independent variables. This model is selected because the goodness of fit analysis confirms that the data are stationary, while the econometric analysis reveals a strong correlation between the independent variables and the dependent variable. In this case, the dependent variable is the predicted gold price, and the independent variables consist of the EPU, GPR, CBOE volatility, gold price, reserve, production and worldrisk indices. The general form of the MLR is defined as [58]:
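$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_n X_n + \varepsilon \qquad (5)$$

where $Y$ is the dependent variable, $X_1, \ldots, X_n$ are the independent variables, $\beta_0$ denotes the intercept, $\beta_1, \ldots, \beta_n$ are the coefficients of the independent variables and $\varepsilon$ represents the error term.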
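A minimal sketch of how the coefficients in equation (5) can be estimated is ordinary least squares via the normal equations. The source does not state the estimation procedure, so this is an assumption, and the toy data (generated from Y = 1 + 2·X1 + 3·X2) are illustrative rather than the study's indices.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: regressors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):                # back substitution
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def fit_mlr(rows, y):
    """Return [beta_0, beta_1, ..., beta_n] minimizing the squared error."""
    design = [[1.0] + list(r) for r in rows]        # leading 1 for intercept
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict_mlr(beta, row):
    """Evaluate beta_0 + beta_1 * x_1 + ... + beta_n * x_n for one row."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], row))


# Toy data generated from Y = 1 + 2*X1 + 3*X2 (no noise).
toy_rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
toy_y = [9.0, 8.0, 19.0, 18.0, 26.0]

beta = fit_mlr(toy_rows, toy_y)
prediction = predict_mlr(beta, (6, 7))
```

In the setting of this paper, each row would hold the risk indices for one month and Y the gold price, with the coefficients fitted on the training period only.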
The autoregressive integrated moving average (ARIMA) model is a traditional time-series model that has been widely used to forecast future risks [62]. The ARIMA model has the following form [63]:
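$$\hat{Y}_t = \mu + \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \cdots + \phi_n Y_{t-n} - \theta_1 e_{t-1} - \cdots - \theta_n e_{t-n} \qquad (6)$$

where $\hat{Y}_t$ is the forecasted future gold price at time $t$, measured from the historical gold price; $\mu$ is a constant; $\phi_1, \phi_2, \ldots, \phi_n$ are the numerical coefficients of the gold price at lags 1, 2 and $n$; $Y_{t-1}$, $Y_{t-2}$ and $Y_{t-n}$ are the gold prices measured at times $t-1$, $t-2$ and $t-n$; $\theta_1, \ldots, \theta_n$ are the numerical coefficients of the random shocks at lags 1 to $n$; and $e_{t-1}, \ldots, e_{t-n}$ are the random shocks at times $t-1$ to $t-n$.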
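To make the structure of such a model concrete, the sketch below fits the simplest non-trivial member of the family, an ARIMA(1,1,0): the prices are differenced once and an AR(1) with a constant is fitted to the differences by least squares. It deliberately omits the moving-average terms, the model order is an assumption because the order used in the study is not stated, and the toy prices are illustrative.

```python
def fit_ar1(values):
    """Least-squares fit of v_t = c + phi * v_{t-1}; returns (c, phi)."""
    x = values[:-1]
    y = values[1:]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    return c, phi


def forecast_arima_110(prices, steps):
    """Forecast `steps` values with AR(1) on first differences (no MA term)."""
    diffs = [b - a for a, b in zip(prices[:-1], prices[1:])]
    c, phi = fit_ar1(diffs)
    forecasts = []
    last_price = prices[-1]
    last_diff = diffs[-1]
    for _ in range(steps):
        last_diff = c + phi * last_diff   # next predicted change
        last_price += last_diff           # undo the differencing
        forecasts.append(last_price)
    return forecasts


# Toy prices whose differences follow d_t = 1 + 0.5 * d_{t-1} exactly.
toy_prices = [100.0, 104.0, 107.0, 109.5, 111.75, 113.875]
toy_forecast = forecast_arima_110(toy_prices, 2)
```

Differencing is what lets the model handle a non-stationary series such as the gold price: the AR part is fitted to the month-to-month changes, and the forecast changes are then accumulated back onto the last observed price.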
The results of the statistical tests of the indices and prices are given in Table 1. The means of the EPU, GPR, CBOE volatility and gold production indices are larger than their medians, which is consistent with their positive skewness. However, the mean of the gold reserves index is smaller than its median despite its positive skewness, which suggests a mode larger than both the mean and the median. The kurtosis values indicate that the GPR, CBOE volatility and gold reserves indices have leptokurtic distributions. The Jarque-Bera tests lead to a similar conclusion. Their significance level is set at 5% using the chi-square distribution. The results reject the null hypothesis of a normal distribution for every series except the world risk index, whose JB statistic (4.670) is below the 5% critical value of 5.991 for two degrees of freedom; the remaining data are therefore non-normally distributed.
The second goodness of fit test checks the data for stationarity using the ADF test statistics at the 5% significance level. All series are stationary except the gold price, whose ADF statistic (0.984) is not below the critical value at the 5% level.
Table 1. Descriptive statistics for EPU, GPR, CBOE Volatility, Gold Reserves, Gold Production and WorldRisk Indices
| Descriptive Statistics | EPU | GPR | CBOE Volatility | Gold Price | Gold Reserves | Gold Production | World Risk |
|---|---|---|---|---|---|---|---|
| Mean | 141.361 | 99.674 | 18.992 | 1,131.730 | 140.925 | 123 | 6.597 |
| Median | 124.281 | 83.440 | 16.255 | 1,229.250 | 146.564 | 117 | 6.668 |
| Minimum | 48.896 | 40.432 | 9.510 | 383.780 | 88.062 | 70 | 4.443 |
| Maximum | 430.018 | 380.597 | 59.890 | 1,968.630 | 289.797 | 208 | 8.285 |
| Std. dev | 72.647 | 50.828 | 8.558 | 404.871 | 35.450 | 35.715 | 0.715 |
| Skewness | 1.306 | 1.781 | 2.114 | -0.287 | 1.100 | 0.382 | -0.324 |
| Kurtosis | 1.651 | 4.714 | 5.396 | -0.748 | 3.328 | -0.906 | -0.361 |
| Observations | 204 | 204 | 204 | 204 | 204 | 204 | 204 |
| Jarque-Bera | 81.202 | 296.799 | 399.476 | 7.553 | 135.291 | 11.925 | 4.670 |
| ADF tests | -2.806 | -5.007 | -4.426 | 0.984 | -5.114 | -3.679 | -4.869 |
Table 2 reports the correlation between the gold price and each individual index. The Pearson’s correlation coefficient (PCC) results suggest that the gold price has a significant positive correlation with the EPU index (0.693).
Table 2. Pearson’s coefficient of correlation (PCC) between each variable
| | EPU | GPR | CBOE Vol | Gold Price | Gold Reserves | Gold Production | World Risk |
|---|---|---|---|---|---|---|---|
| EPU | 1.000 | | | | | | |
| GPR | 0.375 | 1.000 | | | | | |
| CBOE Vol | 0.287 | -0.245 | 1.000 | | | | |
| Gold Price | 0.693 | 0.153 | 0.161 | 1.000 | | | |
| Gold Reserves | 0.292 | 0.105 | 0.010 | 0.549 | 1.000 | | |
| Gold Production | 0.569 | 0.519 | -0.187 | 0.505 | 0.748 | 1.000 | |
| World Risk | -0.467 | -0.291 | -0.358 | -0.361 | -0.167 | -0.286 | 1.000 |
However, the main weakness of the PCC analysis is that it measures only pairwise relationships and does not address the joint relationship between the gold price and the other indices. To measure this relationship, the multiple linear regression (MLR) model is taken into account. The MLR model gives an adjusted regression value of 0.671, which indicates that the gold price correlates significantly and positively with the independent variables.
The findings of this study suggest that the global political, financial, technical and natural risk indices (EPU, GPR, CBOE volatility, gold prices, reserve, production and worldrisk) perform well as predictor variables for the future gold price using the MLR model, because these indices have a strong correlation with the historical gold price based on the MLR analysis.
The forecasting performance is validated by forecasting 30% of the actual gold price series (61 samples, December 2015 to December 2020) based on the other 70% (143 samples, January 2004 to November 2015). The future gold price is predicted using the MLR model. According to [64]; [65], this model is most useful when the dependent variable is strongly correlated with the independent variables, in which case the level of accuracy can reach 89%.
Further validation is provided by the ARIMA model, which confirms that the future gold price forecasts from the MLR model are robust (Figure 3). The ARIMA model is recognized as one of the most popular time-series models for forecasting future metal prices [62]. The forecasting results of the MLR and ARIMA models are compared using RMSE, MAE and nRMSE (Table 3). The MLR model has smaller RMSE, MAE and nRMSE values than the ARIMA model. In addition, the adjusted regression value (R) of the MLR model is higher than that of the ARIMA model. It can therefore be concluded that, in this case, the MLR model forecasts the future gold price accurately because it involves independent variables with a strong correlation to the historical gold price.
Figure 3. The comparison forecasting performance of MLR model and ARIMA model based on training data
Table 3. The performance forecasting results of MLR model and ARIMA model based on RMSE, MAE, nRMSE and R
| | MLR | ARIMA |
|---|---|---|
| RMSE | 0.24 | 0.31 |
| MAE | 0.20 | 0.23 |
| nRMSE | 0.27 | 0.34 |
| R | 0.95 | 0.93 |
Accurate gold price forecasting is very challenging because the price is influenced by multiple sources of risk. This study has developed combined multiple sources of risk variables, represented by indices for the global political, financial, technical and natural risks (EPU, GPR, CBOE volatility, gold prices, gold reserve, gold production and worldrisk), to forecast the future gold price. To forecast the future gold price from these combined sources of risk, this study proposes the MLR model. This model was selected because the dependent variable has a strong correlation with all the independent variables.
The forecasting result of the MLR model with the combined multiple sources of risk is examined against the ARIMA model, one of the well-known time-series models for forecasting future metal prices. The MLR model is robust, with an RMSE of 0.24, MAE of 0.20, nRMSE of 0.27 and R of 0.95, while the ARIMA model gives 0.31, 0.23, 0.34 and 0.93, respectively.
These results confirm that the MLR model using the combined multiple sources of risk can be applied successfully to forecast the future gold price. In this paper, the model has been applied only to the gold price; however, the strong evidence for the MLR model suggests that it could also address prediction problems for other metals.