# Resurrecting the Expectation Hypothesis: How to Extract Additional Information From the Term Structure of Interest Rates

Andrea Carriero, Università Bocconi

First draft: March 2004. This version: February 2005

## Abstract

In this paper we propose a new way of modelling the Expectation Hypothesis (EH) of the term structure of interest rates and provide striking evidence validating it on both statistical and economic grounds. The idea is to model the EH as a noisy relation, allowing for temporary departures from it. We do so using a Bayesian framework in which the EH can be viewed as a prior on a Gaussian VAR. Importantly, our approach is very general and comprises the traditional framework as a special case. Once the EH is modeled as a noisy relation it is strongly supported by the data and is perfectly able to explain the behavior of the U.S. 10-year rate from the seventies to nowadays. Moreover, our evidence explains the common result of rejection and the anomaly found by Campbell and Shiller (1987). Finally, our approach allows us to extract additional information from the term structure and thereby to significantly increase the accuracy of a Taylor-rule based model in predicting future short-term rates.

JEL Classification: C11, E43, E44, E47.

Keywords: Bayesian VARs, Expectations Theory, Macroeconomic Information in Finance, Term Structure, Uncertain Restrictions

Author address: IEP Università Bocconi, Via Gobbi 5, 20136 Milano, Italy. E-mail: andrea.carriero@unibocconi.it. I am indebted to Carlo Favero. I am grateful to Marco Aiolfi, Fabio Canova, Marco Del Negro, Andrea Ghiringhelli, Alfred Haug, Massimiliano Marcellino, Giorgio Primiceri, Luca Sala, Nicola Scalzo and Ulf Söderström. I also thank participants at seminars and at the VI workshop in quantitative finance at Bocconi University, Milan and at the First Italian Congress of Econometric and Empirical Economics at Ca' Foscari University, Venice.
This paper has previously circulated as: "Validating the Expectation Hypothesis as a Set of Uncertain Restrictions".

## 1 Introduction

The Expectations Hypothesis of the term structure of interest rates (EH) states that actual long-term interest rates are determined by the market's expectation of the future short-term rates. This theory, popularized in the writings of Fisher (1930), Keynes (1930), and Hicks (1953), continues to be a way that many economists think about the determination of long-term interest rates. Notwithstanding its important role in macroeconomics and finance, the EH has been widely criticized on theoretical grounds and has received little empirical support.

From a theoretical perspective the EH was not viewed as a viable model of the term structure for several years, due to the result of Cox, Ingersoll and Ross (1981) that the EH is not consistent with the absence of arbitrage. More recently, however, McCulloch (1993) and Fisher and Gilles (1998) have presented counterexamples to the Cox, Ingersoll and Ross (1981) proof, and Longstaff (2000) shows that all traditional forms of the EH can be consistent with the absence of arbitrage if markets are incomplete. In particular he demonstrates that the Cox, Ingersoll and Ross (1981) proof hinges on the complete market hypothesis, which does not necessarily hold, as shown by several studies (see, for example, Daves and Ehrhardt 1993, Amihud and Mendelson 1991, Kamara 1994, Duffee 1996, Longstaff 1992, Boudoukh and Whitelaw 1991, Cornell and Shapiro 1990, Elton and Green 1998). Therefore, the EH cannot be ruled out on theoretical grounds and its validity is purely an empirical issue.

Turning to the empirical validation, the EH has been widely tested, and almost invariably rejected. The bulk of the available literature (see, for example, Mankiw and Miron 1986, Fama and Bliss 1987, Campbell 1995, and Cochrane 2001) testing the EH has two common features.
First, it uses a single-equation, limited information approach. Second, it uses ex-post realized returns as a proxy for ex-ante expected returns. There are many problems related to this testing strategy. First, as noted by McCallum (1994), the limited information approach might cause a bias in the estimates due to simultaneity. Moreover, Elton (1999) asserts that there is ample evidence against the belief that information surprises tend to cancel out over time. Hence realized returns cannot be considered an appropriate proxy for expected returns. Finally, Campbell (2001) finds strong effects of expectation errors on the single-equation tests, which are confirmed by a number of papers relating expectation errors to peso problems. Hence, having a good approximation of expected returns is crucial when testing the theory.

Keeping these latter points in mind, it is not surprising that a single-equation approach proxying expected returns with ex-post realized returns rejects the EH. In many cases expectations that were entirely reasonable ex-ante may turn out to be completely wrong ex-post. In all these instances using ex-post realized returns amounts to using irrational forecasts and this biases the test of the EH.

Campbell and Shiller (1987) circumvent these problems by using a bivariate framework. Their approach allows one to derive model-based proxies for ex-ante expected returns. Still, as noted by Carriero, Favero and Kaminska (2005), their results are biased by the fact that they use information from the whole sample to simulate investors' expectations, while investors can use only historically available information to generate predictions of short-term rates. Campbell and Shiller (1987) implement a Wald test which still rejects the EH, but their analysis of the data leads them to conclude that there is an important element of truth in the EH.
In particular, they find an anomaly: the EH is statistically rejected but the theoretical spread based on its validity has a very high correlation with the actual spread. Building on a similar framework, Carriero, Favero and Kaminska (2005) show that these two spreads are not statistically different, while the high correlation between them leads Campbell and Shiller (1987) to conclude that "...deviations from the present value model for bonds are transitory...".

We take this last point and develop an extended version of the EH in which transitory deviations from the theory may occur. This is the only key assumption made throughout the paper, and it seems reasonable. By definition any economic theory is a simplification of reality, and as such it cannot hold exactly even if the theory is "true". Indeed, in the real world there is always some kind of noise which blurs any equilibrium. The proposed extension leads to a more general framework which comprises the traditional one as a special case.

We adopt the Bayesian approach developed by Jeffreys (1935, 1961) as a major part of his program for scientific inference. In this approach, statistical models are introduced to represent the probability of the data according to several competing theories, and Bayes's theorem is used to compute the posterior probability that one of the theories is correct. Then the theories can be compared using the Bayes factor, which is a summary of the evidence provided by the data in favour of one theory as opposed to another. In particular, we have two competing theories, represented by two different priors on the coefficients of a Gaussian bivariate VAR. The first theory does not impose any restriction, and it is shaped into a loose, proper prior. The second theory imposes the restrictions derived from the EH, and is shaped into a prior on some linear combinations of the coefficients.
To ensure the robustness of our results, we extend the testing framework in two dimensions, by providing statistical results recursively through time, and by adding macroeconomic information to the picture. The EH implies that monetary policy affects long-term rates by influencing expectations about future short-term rates, but as central banks also look at bond markets to extract information about inflation expectations, monetary policy (i.e. the short-term rate) is likely to respond to bond market conditions.

Our main results can be summarized as follows:

- (i) The EH is strongly supported by the data both on statistical and economic grounds (see Section 4);
- (ii) The EH is perfectly consistent with U.S. 10-year rate dynamics from the beginning of the seventies to nowadays (see Section 5);
- (iii) The EH allows us to extract additional information from the term structure and therefore to significantly increase the accuracy of a Taylor-rule based model in predicting future short-term rates (see Section 6).

Moreover, our evidence comprises as a special case the common result of rejection of the EH and explains the anomaly found by Campbell and Shiller (1987).

The paper is organized as follows: Section 2 introduces the basic framework, Section 3 builds on it to derive our extended framework, and Section 4 provides statistical evidence. Section 5 discusses the ability of our model to explain the dynamics of the 10-year rate, and Section 6 evaluates forecast accuracy. Finally, Section 7 concludes. Four appendices derive some results used throughout the paper.

## 2 The Basic Framework

In this section we introduce our basic framework, developed by Shiller (1979) and Campbell and Shiller (1987).

### 2.1 A linearized expectations model

The EH states that actual long-term interest rates are determined by the market's expectation of future short-term rates. Most simple linear term structure models relate long-term interest rates to an unweighted simple average of expected short rates.
Those models are appropriate for pure discount bonds. For coupon-carrying bonds Shiller (1979) proposes a linearized model relating the $T$-period interest rate (the yield to maturity on $T$-period bonds) $R_{t,T}$ to a weighted average of expected future one-period (short-term) interest rates $r_t, r_{t+1}, \dots$:

$$R_{t,T} = \frac{1-\gamma}{1-\gamma^{T}} \sum_{i=0}^{T-1} \gamma^{i} E_t r_{t+i} + TP_T. \tag{1}$$

Here $t$ denotes the time period (month), $\gamma$ is a constant of linearization with $0 < \gamma < 1$, $TP_T$ is a constant term premium (i.e. dependent on maturity only) and $E_t$ denotes expectations given information at time $t$. The parameter $\gamma$ is set equal to $\gamma = 1/(1+\bar{R}_T)$, where $\bar{R}_T$ is the average of $R_{t,T}$. Then (1) relates $R_{t,T}$ to the present value of future short-term interest rates discounted by $\bar{R}_T$. Rearranging (1) gives an expression involving the spread $S_{t,T} = R_{t,T} - r_t$:

$$S_{t,T} = \sum_{i=1}^{T-1} \gamma^{i} E_t \Delta r_{t+i} + \gamma^{T} (R_{t,T} - TP_T) + TP_T. \tag{2}$$

As $T \to \infty$ this simplifies to

$$S_t = \sum_{i=1}^{\infty} \gamma^{i} E_t \Delta r_{t+i} + TP_{\infty}, \tag{3}$$

where $TP_{\infty}$ is the term premium for a bond with an infinite maturity.

### 2.2 Expectation Hypothesis restrictions

Our data set consists of the 1-month certificate of deposit rate in the U.S. secondary market and the 10-year U.S. Treasury bond yield at constant maturity. Data are monthly and go from 1966:1 to 2004:1. Both series are provided by the Federal Reserve Bank of St. Louis. Following Campbell and Shiller (1987) we consider a VAR for $S_t$ and $\Delta r_t$:

$$\begin{aligned}
\Delta r_t &= k_1 + a_1 \Delta r_{t-1} + a_2 \Delta r_{t-2} + a_3 \Delta r_{t-3} + b_1 S_{t-1} + b_2 S_{t-2} + b_3 S_{t-3} + u_{1t}, \\
S_t &= k_2 + c_1 \Delta r_{t-1} + c_2 \Delta r_{t-2} + c_3 \Delta r_{t-3} + d_1 S_{t-1} + d_2 S_{t-2} + d_3 S_{t-3} + u_{2t},
\end{aligned} \tag{4}$$

where the lag length has been chosen via the Bayesian information criterion computed over the whole sample with a maximum lag length of 13. From equation (3) it is possible to derive a set of restrictions implied by the EH on the VAR (4). In Appendix A we show that, provided the VAR is stable, these restrictions are given by:

$$\begin{bmatrix} a_1 + c_1 \\ b_1 + d_1 \\ a_2 + c_2 \\ b_2 + d_2 \\ a_3 + c_3 \\ b_3 + d_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 1/\gamma \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}. \tag{5}$$

Thus the EH implies that the two coefficients attached to a given variable in the two equations must be perfectly negatively correlated.

## 3 The Expectation Hypothesis as a Set of Uncertain Restrictions

By definition any economic theory is a simplification of reality, and as such it cannot hold exactly even if the theory is "true". In this section we develop an extended version of the EH which allows transitory deviations from the theory to occur. This extension leads to a more general framework which comprises the traditional one as a special case.

### 3.1 Adding uncertainty

Suppose the EH does hold, but only on average, i.e. some noise causes temporary departures from the EH restrictions in (5). Formally, let the uncertainty introduced by this noise be measured by the parameter $\lambda$. The resulting set of stochastic constraints is:

$$\begin{bmatrix} a_1 + c_1 \\ b_1 + d_1 \\ a_2 + c_2 \\ b_2 + d_2 \\ a_3 + c_3 \\ b_3 + d_3 \end{bmatrix} \sim N\left( \mu_{EH0} = \begin{bmatrix} 0 \\ 1/\gamma \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix},\; \Sigma_{EH0} = \begin{bmatrix} \lambda & 0 & \cdots & 0 \\ 0 & \lambda & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda \end{bmatrix}_{6 \times 6} \right). \tag{6}$$

It is straightforward to interpret the parameter $\lambda$ as the tightness of the restrictions: a large value of $\lambda$ implies that the EH restrictions hold with a lot of uncertainty, while as $\lambda$ decreases the EH restrictions become more binding and eventually become certain. Therefore it is crucial to calibrate $\lambda$ in an appropriate way. Of course, if we allow for very little variation (or no variation) this essentially implies ruling out any noise and imposing the EH restrictions to hold exactly, while allowing for too large departures from the restrictions would lead to a nearly vacuous version of the EH. Indeed, any theory is likely to be supported by the data if we allow its restrictions to hold with a sufficiently large amount of noise. We will show in subsection 3.3 that our estimate for the parameter $\lambda$ provides a sensible set of uncertain restrictions, i.e. the implied degree of uncertainty is sufficiently high to avoid imposing the theory to hold exactly, and sufficiently low to be effectively binding.
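To make the difference between the exact restrictions (5) and the noisy version (6) concrete, the following is a minimal sketch rather than the paper's code. The function names `eh_gap` and `log_prior_density`, the value `gamma = 0.99`, and the coefficient values are all illustrative assumptions. It measures how far a set of VAR coefficients is from satisfying (5) and scores that departure under the normal distribution in (6).

```python
import numpy as np

def eh_gap(eq_r, eq_s, gamma):
    """Departure of VAR coefficients from the exact EH restrictions (5).

    eq_r = (a1, b1, a2, b2, a3, b3): lag coefficients of the Delta r equation.
    eq_s = (c1, d1, c2, d2, c3, d3): lag coefficients of the spread equation.
    """
    mu = np.array([0.0, 1.0 / gamma, 0.0, 0.0, 0.0, 0.0])
    return np.asarray(eq_r, float) + np.asarray(eq_s, float) - mu

def log_prior_density(gap, lam):
    """Log density of the departure under N(0, lam * I_6), as in (6)."""
    gap = np.asarray(gap, float)
    return -0.5 * (gap.size * np.log(2 * np.pi * lam) + gap @ gap / lam)

gamma = 0.99  # illustrative value only, not an estimate from the paper
eq_r = np.array([0.2, 0.1, -0.1, 0.05, 0.0, 0.02])  # made-up coefficients

# Spread-equation coefficients that satisfy (5) exactly, and a perturbed set.
eq_s_exact = np.array([0.0, 1.0 / gamma, 0.0, 0.0, 0.0, 0.0]) - eq_r
gap_exact = eh_gap(eq_r, eq_s_exact, gamma)
gap_noisy = eh_gap(eq_r, eq_s_exact + 0.1, gamma)

print(log_prior_density(gap_exact, lam=0.13))
print(log_prior_density(gap_noisy, lam=0.13))
```

A small `lam` penalizes the perturbed coefficients heavily, while a large `lam` makes the two cases almost indistinguishable, which is the trade-off discussed above.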
The diagonal, homoskedastic structure of the matrix $\Sigma_{EH0}$ may not seem very general. However, the choice of independence between restrictions is intended as the one which minimizes departures from the certainty case, and there is no a priori economic rationale to think that the restrictions are correlated. Some rationale could be used instead to question the choice of homoskedasticity of the restrictions. For example, it could be argued that restrictions which are linked to variables further away in time could bear more uncertainty than those which are nearer in time. We have tried alternative specifications consistent with this latter argument, each specifying different proportions among the variances of each single restriction, and the overall statistical results were very robust to all such modifications.

### 3.2 The EH in a Bayesian framework

The set of restrictions (6) can be thought of, in a Bayesian perspective, as a prior on the coefficients of the VAR (4). Therefore, we can test the EH by using the approach developed by Jeffreys (1935, 1961). In this approach, statistical models are introduced to represent the probability of the data according to several competing theories, and Bayes's theorem is used to compute the posterior probability that one of the theories is correct. Then the theories can be compared using the Bayes factor, which is a summary of the evidence provided by the data in favour of one theory as opposed to another. In particular, we shall compare two competing theories: the first theory does not impose any restrictions on the coefficients of the VAR (4), while the second theory imposes the restrictions derived from the EH. Rewrite the VAR (4) in the following way:

$$y = Z\beta + \varepsilon, \tag{7}$$

with:

$$y = \begin{bmatrix} \Delta r \\ S \end{bmatrix}, \qquad Z = [I_2 \otimes X], \qquad \varepsilon = \begin{bmatrix} u_1 \\ u_2 \end{bmatrix} \sim N(0, \Sigma_u \otimes I_T),$$

$$X = \begin{bmatrix} \Delta r_{t-1} & S_{t-1} & \cdots & \Delta r_{t-3} & S_{t-3} & 1 \\ \vdots & \vdots & & \vdots & \vdots & \vdots \end{bmatrix}, \qquad \beta = \begin{bmatrix} a_1 & b_1 & a_2 & b_2 & a_3 & b_3 & k_1 & c_1 & d_1 & c_2 & d_2 & c_3 & d_3 & k_2 \end{bmatrix}',$$

where $T$ is the sample size and where $y$, $\varepsilon$ and $\beta$ are $2T \times 1$, $2T \times 1$, and $14 \times 1$ vectors, and $X$, $Z$, $\Sigma_u$ are $T \times 7$, $2T \times 14$, and $2 \times 2$ matrices. The first theory does not impose any restriction on the coefficients and so it is easily shaped into a loose prior:

$$\beta \sim N(\beta_0 = 0_{14 \times 1},\; \Omega_0 = \tau I_{14}). \tag{8}$$

We will refer to the VAR with this loose prior as the unrestricted VAR (UVAR). For $\tau$ sufficiently large, the prior does not add any information to that of the likelihood, and the posterior mean of $\beta$ is numerically identical to $\hat{\beta}$, the OLS estimate. With our data, a value of $\tau = 1$ is large enough to ensure that this is the case. Using larger values of $\tau$ would imply an even looser prior but would also penalize the UVAR too much. The second theory imposes the restriction scheme (6), implied by the EH:

$$H\beta_{EH} \sim N(\mu_{EH0}, \Sigma_{EH0}), \qquad H = \begin{bmatrix} I_6 & 0_{6 \times 1} & I_6 & 0_{6 \times 1} \end{bmatrix}, \tag{9}$$

where the subscript $EH$ denotes the fact that the vector of coefficients does satisfy the restrictions implied by the EH. We will refer to (9) as the EH prior and to the system composed of (7) and (9) as the EH-restricted VAR (RVAR):

$$y = Z\beta_{EH} + \varepsilon, \qquad H\beta_{EH} \sim N(\mu_{EH0}, \Sigma_{EH0}). \tag{10}$$

In Appendix B we derive the following alternative representation of the EH prior, expressed in terms of the vector of coefficients of the VAR rather than in terms of the vector of restrictions:

$$\beta_{EH} \sim N\left( \beta_{EH0} = \begin{bmatrix} 0_{1 \times 8} & 1/\gamma & 0_{1 \times 5} \end{bmatrix}',\; \Omega_{EH0} = \begin{bmatrix} \tau I_6 & 0_{6 \times 1} & -\tau I_6 & 0_{6 \times 1} \\ 0_{1 \times 6} & \tau & 0_{1 \times 6} & 0 \\ -\tau I_6 & 0_{6 \times 1} & (\tau + \lambda) I_6 & 0_{6 \times 1} \\ 0_{1 \times 6} & 0 & 0_{1 \times 6} & \tau \end{bmatrix} \right), \tag{11}$$

where $\tau$ is the variance of the unrestricted coefficients. Provided that $\tau$ is sufficiently large this specification is perfectly equivalent to (9). Again, with our data, a value of $\tau = 1$ is large enough to ensure this is the case. Equations (7) and (11) lead to the following alternative representation of the RVAR:

$$y = Z\beta_{EH} + \varepsilon, \qquad \beta_{EH} \sim N(\beta_{EH0}, \Omega_{EH0}). \tag{12}$$

Thus, the RVAR is simply a linear regression model subject to a set of stochastic linear restrictions on the regression coefficients.
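As a numerical check on the link between the prior on the restrictions (9) and the prior on the coefficients (11), the sketch below assembles the mean and covariance in (11) and verifies that the selection matrix $H$ maps them back to a mean of $(0, 1/\gamma, 0, 0, 0, 0)'$ and a covariance of $\lambda I_6$. The function name `eh_prior` and the value `gamma = 0.99` are illustrative assumptions; `lam = 0.13` and `tau = 1.0` are simply the values quoted in the text.

```python
import numpy as np

def eh_prior(gamma, lam, tau):
    """Mean and covariance of the EH prior on the 14 VAR coefficients, as in (11).

    Coefficient ordering: (a1, b1, a2, b2, a3, b3, k1, c1, d1, c2, d2, c3, d3, k2).
    """
    I6 = np.eye(6)
    z61 = np.zeros((6, 1))
    z16 = np.zeros((1, 6))
    one = np.ones((1, 1))
    mean = np.zeros(14)
    mean[8] = 1.0 / gamma  # prior mean of d1; all other means are zero
    cov = np.block([
        [tau * I6,  z61,       -tau * I6,        z61],
        [z16,       tau * one, z16,              0 * one],
        [-tau * I6, z61,       (tau + lam) * I6, z61],
        [z16,       0 * one,   z16,              tau * one],
    ])
    return mean, cov

# Selection matrix H from (9): picks out a_j + c_j and b_j + d_j.
H = np.hstack([np.eye(6), np.zeros((6, 1)), np.eye(6), np.zeros((6, 1))])

gamma, lam, tau = 0.99, 0.13, 1.0  # gamma is illustrative only
mean, cov = eh_prior(gamma, lam, tau)

restriction_mean = H @ mean      # equals (0, 1/gamma, 0, 0, 0, 0)
restriction_cov = H @ cov @ H.T  # equals lam * I_6

# Prior correlation between a1 and c1 implied by (11).
corr_a1_c1 = cov[0, 7] / np.sqrt(cov[0, 0] * cov[7, 7])
print(round(corr_a1_c1, 4))
```

The last quantity equals $-\sqrt{\tau/(\tau+\lambda)}$, which is how a strictly positive $\lambda$ turns the perfect negative correlation of the exact EH into an imperfect one.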
To estimate such a model, Theil (1971) proposed the method of mixed estimation, which involves using the uncertain restrictions to supplement the data. The added restrictions act as prior information on the coefficients, and GLS is numerically equivalent to Bayesian estimation. Derivations of the posterior and marginal likelihood are contained in Appendix C. The parameter $\lambda$ is estimated to be 0.13 by maximizing the marginal likelihood of the model (see Figure 1).

### 3.3 The EH prior

Before moving on, it is crucial to check that the parametrization $\lambda = 0.13$, $\tau = 1$ is a sensible one. This is ensured by three features. First, neither the loose prior nor the EH prior should impose any restriction on the individual coefficients. Second, only the EH prior should impose restrictions on linear combinations of the coefficients. Indeed, the EH does not say anything about individual coefficients; it only restricts some linear combinations of them. Third, these latter restrictions should be uncertain but effective, namely they should be binding. Our parametrization has all these features, as shown in Figures 2 and 3.

Figure 2 plots prior and posterior distributions of the coefficients for the RVAR and the UVAR. Notice that both priors are very loose with respect to the individual coefficients. This confirms that the adopted value of $\tau = 1$ provides a sufficiently diffuse prior for the individual coefficients, as desired. However, Figure 2 hides the key feature of the EH prior, namely that it puts restrictions on some linear combinations of the coefficients. To see this feature, Figure 3 shows the prior and posterior distributions of the restrictions imposed by the EH: it is clear that while the loose prior does not bind, the EH prior is much tighter. Figure 3 also shows that the EH prior is effectively binding, since the posterior estimates are shrunk toward the prior mean. From (11), the correlation matrix of the coefficients under the EH prior is:

$$\operatorname{Corr}(\beta_{EH}) = \begin{bmatrix} I_6 & 0_{6 \times 1} & -\sqrt{\frac{\tau}{\tau + \lambda}}\, I_6 & 0_{6 \times 1} \\ 0_{1 \times 6} & 1 & 0_{1 \times 6} & 0 \\ -\sqrt{\frac{\tau}{\tau + \lambda}}\, I_6 & 0_{6 \times 1} & I_6 & 0_{6 \times 1} \\ 0_{1 \times 6} & 0 & 0_{1 \times 6} & 1 \end{bmatrix}. \tag{13}$$

If $\lambda = 0$, the EH is in the traditional form and involves perfect negative correlation between six pairs of coefficients of the VAR. Letting $\lambda > 0$ we allow this correlation to be imperfect. Notice that, as $\lambda \to \infty$, the correlation across the relevant pairs of coefficients goes to zero and this matrix approaches the correlation matrix under the loose prior (the identity matrix), so that the two priors become virtually identical. Again, the chosen parametrization prevents this from happening: the estimated value $\lambda = 0.13$ increases the variance of half of the coefficients to $\tau + \lambda = 1.13$, and the correlation between the relevant pairs of coefficients decreases in absolute value from 1 to 0.94, becoming imperfect but remaining very high. To conclude, the chosen parametrization provides a sensible prior for our EH, namely a prior which is loose on single coefficients but tight and binding on the EH restrictions.

## 4 Statistical Evidence

In this section we provide evidence clearly supporting the EH. We do so by computing Bayes factors first as a function of the EH prior tightness using the whole sample, and then recursively using the estimate $\lambda = 0.13$. For derivations of all the formulas used in this section see Appendix C.

### 4.1 Bayes factor

The Bayes factor is a summary of the evidence provided by the data in favour of one theory, represented by a statistical model, as opposed to another. Following Kass and Raftery (1995), consider some data $D$ assumed to have arisen under one of the two theories $H_1$ and $H_2$ according to a probability density $pr(D|H_1)$ or $pr(D|H_2)$. Given a priori probabilities $pr(H_1)$ and $pr(H_2) = 1 - pr(H_1)$, the data produce a posteriori probabilities $pr(H_1|D)$ and $pr(H_2|D) = 1 - pr(H_1|D)$. Because any prior opinion gets transformed to a posterior opinion through consideration of the data, the transformation itself represents the evidence provided by the data. Once we convert to the odds scale (odds = probability/(1 - probability)), the transformation takes a simple form.
Using Bayes's theorem, we obtain

$$\frac{pr(H_2|D)}{pr(H_1|D)} = \frac{pr(D|H_2)}{pr(D|H_1)} \, \frac{pr(H_2)}{pr(H_1)}, \tag{14}$$

so that the transformation is simply multiplication by

$$B_{21} = \frac{pr(D|H_2)}{pr(D|H_1)}, \tag{15}$$

which is the Bayes factor of theory $H_2$ as opposed to theory $H_1$. Kass and Raftery (1995) suggested the following interpretation for the value of $B_{21}$:

Table 1: Interpreting Bayes factors

| $B_{21}$ | Evidence against $H_1$ |
|---|---|
| 1 to 3 | Bare mention |
| 3 to 20 | Positive |
| 20 to 150 | Strong |
| > 150 | Very strong |

We speak here in terms of $B_{21}$, because weighting evidence against a null hypothesis is more familiar, but Bayes factors can equally well provide evidence in favour of a null hypothesis. For example, a $B_{21}$ between 3 and 20 provides both evidence against $H_1$ and evidence in favour of $H_2$. From (15) it is clear that we need $pr(D|H_2)$ and $pr(D|H_1)$ in order to compute the Bayes factor. As shown in Appendix C, when $H_i$ is a Gaussian VAR with fixed variance, $pr(D|H_i)$ is given by:

$$pr(D|H_i) = (2\pi)^{-MT/2} \, |\Sigma|^{-1/2} \, |\Omega_{i0}|^{-1/2} \, |\bar{\Omega}_i|^{1/2} \exp\{-Q_i/2\}, \tag{16}$$

for $i = 1, 2$, where

$$Q_i = y'\Sigma^{-1}y - \bar{\beta}_i' \bar{\Omega}_i^{-1} \bar{\beta}_i + \beta_{i0}' \Omega_{i0}^{-1} \beta_{i0}, \tag{17}$$

and where $\beta_{i0}$, $\bar{\beta}_i$ and $\Omega_{i0}$, $\bar{\Omega}_i$ are prior and posterior means and variances of the vector of coefficients, and $\Sigma$ is the variance-covariance matrix of the residuals. In our case, theory $H_1$ is the UVAR, with $\beta_{10} = \beta_0$, $\bar{\beta}_1 = \bar{\beta}$, $\Omega_{10} = \Omega_0$, $\bar{\Omega}_1 = \bar{\Omega}$, while theory $H_2$ is the RVAR, with $\beta_{20} = \beta_{EH0}$, $\bar{\beta}_2 = \bar{\beta}_{EH}$, $\Omega_{20} = \Omega_{EH0}$, $\bar{\Omega}_2 = \bar{\Omega}_{EH}$. Thus the Bayes factor for the RVAR versus the UVAR is:

$$B_{21} = \left( \frac{|\Omega_0| \, |\bar{\Omega}_{EH}|}{|\Omega_{EH0}| \, |\bar{\Omega}|} \right)^{1/2} \exp\left\{ \frac{Q_1 - Q_2}{2} \right\}. \tag{18}$$

### 4.2 Results

Figure 4 plots the Bayes factor as a function of the EH prior tightness together with its inconclusive region. The inconclusive region ranges from 1/3 to 3; in this region neither $B_{21} > 3$ nor $B_{12} = B_{21}^{-1} > 3$ (i.e. $B_{21} < 1/3$), so neither the evidence in favour of theory $H_2$ as opposed to theory $H_1$ nor that in favour of $H_1$ as opposed to $H_2$ is worth more than a bare mention. If we allow for very little noise, letting $\lambda \to 0$, the Bayes factor supports the UVAR.
This is the very common result of rejection. Indeed, letting the tightness go to zero amounts to imposing the EH without noise. Therefore our general framework nests the traditional one as a special case, and is also consistent with the empirical findings rejecting the EH. On the other hand, allowing for very large departures from the EH restrictions, letting $\lambda \to \infty$, leads the Bayes factor to the inconclusive region. Intuitively, the noise on the constraints becomes too large and the data cannot distinguish between the restricted and the unrestricted VAR. Indeed, if we allow for too large departures from the EH, the RVAR becomes virtually equivalent to the UVAR, and the EH prior is not sensible.

For intermediate values of $\lambda$ the Bayes factor strongly supports the RVAR. Importantly, the Bayes factor computed at our estimate for the EH prior tightness ($\lambda = 0.13$) reaches a value close to 30, denoting strong evidence in favour of the EH.¹ This result confirms formally the informal statement of Carriero, Favero and Kaminska (2005) that once uncertainty is added to the picture the EH cannot be rejected. Moreover, this result shows that the amount of uncertainty needed is considerably smaller than that implied by their simulation experiment performed with an unrestricted VAR. Finally, this evidence explains the anomaly found by Campbell and Shiller (1987), that the EH is statistically rejected but the theoretical spread based on its validity is closely correlated with the actual spread. As stressed above, the theory is strongly supported by the data, which explains the high correlation, but the EH is also perturbed by some noise, which leads the Wald test to reject the exact restrictions.

### 4.3 Robustness

To evaluate the robustness of our results, we next extend the testing framework in two dimensions, providing statistical results recursively through time, and adding macroeconomic information to the picture.
Before doing so, we study the effect of alternative parametrizations of $\tau$ on our results. Finally, we also look at likelihood ratios, which is unusual in the adopted Bayesian framework but still useful to confirm the results derived so far.

#### 4.3.1 Alternative parametrizations

As shown in section 3.3, a value of $\tau = 1$ is sufficiently large to ensure that the prior does not add any information to that of the likelihood, and the posterior mean of $\beta$ is numerically identical to the OLS estimate. Still, we may be interested in what happens if we increase the value of $\tau$ above 1 (i.e. increasing the looseness of the loose prior). Table 2 displays Bayes factors computed at different values of $\lambda$ and $\tau$. An increase in $\tau$ rescales our results, but does not change their qualitative pattern.

Table 2: Bayes factors for different $\lambda$, $\tau$.

| | $\tau = 1$ | $\tau = 2$ | $\tau = 5$ | $\tau = 10$ |
|---|---|---|---|---|
| $\lambda \to 0$ | 1.3e-009 | 6.3e-009 | 7.0e-008 | 5.0e-007 |
| $\lambda = 0.13$ | 29.235 | 179.76 | 2398.7 | 18206 |
| $\lambda = \infty$ | 1.0037 | 0.99691 | 0.99757 | 0.99858 |

The magnitude of the Bayes factor at the peak (i.e. when $\lambda = 0.13$) increases, while its value at the extremes remains the same. In particular, as $\lambda \to 0$ the Bayes factor goes to zero, while as $\lambda = \infty$ it goes to 1. Thus the qualitative results do not depend on the particular choice of $\tau$ we make.

¹ The fact that the Bayes factor is maximized by the same value of $\lambda$ that maximizes the marginal likelihood of the RVAR is obvious, as long as the competing prior (the UVAR) does not depend on that parameter.

#### 4.3.2 Macroeconomic information

Whereas the EH implies that monetary policy affects long-term rates by influencing expectations about future short-term rates, central banks also look at bond markets to extract information about inflation expectations. Therefore, policy is likely to respond to bond market conditions, which may introduce an obvious misspecification into our framework: the omission of macroeconomic variables to which the monetary authority reacts.
Fuhrer (1996) uses a simple Taylor-rule type reaction function, the EH and reduced-form equations for output and inflation to solve for the reaction function coefficients that deliver long-term rates consistent with the EH. He finds that modest and smoothly evolving time-variation in parameters of the reaction function is sufficient to reconcile the expectations model with the long-bond data. Favero (2002) extends Fuhrer's framework to derive standard errors for long-term rates consistent with the EH. Roush (2001) argues that previous work on the EH has failed to sufficiently account for interactions between monetary policy and bond markets in the determination of long- and short-term interest rates and, using a VAR model with macro and financial variables, finds strong evidence supporting a term structure channel for policy that is consistent with the EH. Ang and Piazzesi (2003) find that macro factors explain a significant amount of the variation in bond yields: in particular they explain most of the forecast variance of short-term rates at long forecast horizons, and of long-term rates at short forecast horizons.

Thus we extend our framework to include macroeconomic information. In particular, we add to both the restricted and the unrestricted VAR the CPI inflation rate and the unemployment rate. Both series are provided by the Federal Reserve Bank of St. Louis and are not subject to revision, so they can be used to produce forecasts in real time, which we do in Section 5. Of course, the inclusion of new variables requires some additional restrictions to hold under the EH. Appendix D shows how to derive the additional restrictions implied by the EH on this augmented VAR. Performing the analysis described above in this extended framework does not change the results. In particular, the estimated tightness slightly decreases to a value of $\lambda = 0.11$, and at this value the Bayes factor reaches 44 (see Figure 6), that is, even higher than in the previous case.
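The Bayes factors reported in this section all come from the marginal likelihood (16) evaluated under two different priors. The sketch below illustrates that computation on a deliberately small toy problem, not on the paper's VAR. The two-regressor model, the simulated data, the "theory" that the two coefficients sum to one, and the names `log_marginal_likelihood` and `restricted_prior` are all assumptions made for illustration.

```python
import numpy as np

def log_marginal_likelihood(y, Z, Sigma, b0, Om0):
    """Log of (16) for y = Z b + e, e ~ N(0, Sigma) with Sigma known,
    under the prior b ~ N(b0, Om0)."""
    Sigma_inv = np.linalg.inv(Sigma)
    Om0_inv = np.linalg.inv(Om0)
    post_prec = Om0_inv + Z.T @ Sigma_inv @ Z  # inverse posterior variance
    post_mean = np.linalg.solve(post_prec, Om0_inv @ b0 + Z.T @ Sigma_inv @ y)
    Q = (y @ Sigma_inv @ y
         - post_mean @ post_prec @ post_mean
         + b0 @ Om0_inv @ b0)                  # quadratic form (17)
    logdet = lambda A: np.linalg.slogdet(A)[1]
    # log|posterior variance| = -log|posterior precision|
    return -0.5 * (y.size * np.log(2 * np.pi) + logdet(Sigma)
                   + logdet(Om0) + logdet(post_prec) + Q)

# Toy data: two regressors whose true coefficients sum to one.
rng = np.random.default_rng(0)
T = 120
Z = rng.normal(size=(T, 2))
y = Z @ np.array([0.4, 0.6]) + rng.normal(size=T)
Sigma = np.eye(T)  # error variance treated as known
tau = 1.0

# Loose prior (analogue of the UVAR) and noisy-restriction prior (analogue of the RVAR).
b_u, Om_u = np.zeros(2), tau * np.eye(2)

def restricted_prior(lam):
    """Prior stating b1 + b2 = 1 up to noise of variance lam."""
    return np.array([0.0, 1.0]), np.array([[tau, -tau], [-tau, tau + lam]])

lams = [1e-4, 0.01, 0.13, 1.0, 10.0]
log_bf = []
for lam in lams:
    b_r, Om_r = restricted_prior(lam)
    log_bf.append(log_marginal_likelihood(y, Z, Sigma, b_r, Om_r)
                  - log_marginal_likelihood(y, Z, Sigma, b_u, Om_u))

for lam, v in zip(lams, log_bf):
    print(f"lambda = {lam:g}: log B21 = {v:.3f}")
```

Working in logs avoids overflow in the determinants and the exponential; exponentiating `log_bf` gives the Bayes factor on the scale of Table 1. The numbers produced here depend entirely on the simulated data and say nothing about the paper's results.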
#### 4.3.3 Time

It is worth checking whether our results are stable through time. Figure 5 plots Bayes factors computed recursively from 1984:1 until 2004:8, with the parameter $\lambda$ fixed at its estimated value of 0.13. As is clear, the Bayes factor is always above 20, providing strong evidence in favour of the EH. Figure 7 does the same with the RVAR augmented with macroeconomic information.

#### 4.3.4 Likelihood ratios

Figures 8 and 9 plot twice the log likelihood ratio between the UVAR and the RVAR as a function of the EH prior tightness. Results confirm those obtained with the Bayes factor: low values of $\lambda$ imply a significant loss in the likelihood, and this explains the common result of rejection in the literature. On the other hand, for higher values of $\lambda$ the loss in the likelihood becomes lower and eventually zero, providing evidence in favour of the EH.

## 5 Explaining Long Term Rate Dynamics

The previous section provided clear statistical evidence in favour of the EH. This section complements this result with economic evidence. In particular, we will show that the behaviour of the U.S. 10-year interest rate from the 1970s until 2004 has been perfectly consistent with the statement of the EH. To demonstrate this we first construct a theoretical, EH-consistent long-term rate and then contrast it with the actual, realized long-term rate.

### 5.1 The EH-consistent long term rate

Recall that under the EH the long-term rate $R^*_{t,T}$ is given by

$$R^*_{t,T} = (1 - \gamma) \sum_{i=0}^{T-1} \gamma^{i} E_t r_{t+i} + TP_T, \tag{19}$$

where the star denotes that we are under the null of the EH. In our framework the expectational term $E_t r_{t+i}$ can be obtained by the linear projection of the estimated RVAR. This avoids both problems related to the common strategy when testing the EH: we do not use ex-post data to proxy for expectations, and we avoid the simultaneity problems inherent in the single-equation approach. This approach has been developed by Campbell and Shiller (1987).
However, Campbell and Shiller (1987) use information from the whole sample to simulate investors' expectations, while investors can use only historically available information to generate predictions of short-term rates. Indeed, as long as expectations are formed given the information at time $t$, the estimation window should not contain data beyond that date. Therefore, we compute $R^*_{t,T}$ using a recursive estimation / projection scheme, such that at each point in time only the available information is used to first estimate the RVAR, and then to project it forward. In particular, our procedure works as follows.

- (i) The first estimation is performed over the sample 1966:1 to 1970:12. All the subsequent estimations are performed over the sample 1966:1 to 1970:12+$i$, where $i$ is the number of iterations already executed.
- (ii) Using the posterior of the coefficients obtained at point (i), the RVAR is projected forward and the posteriors of the variables $E_t r_{t+i}$ and $R^*_{t,T}$ are obtained.
- (iii) Then we move forward one period, adding one data point to the estimation window, and go back to point (i).

This recursive estimation / projection scheme provides time series of the posterior distributions of the variables $E_t r_{t+i}$ and $R^*_{t,T}$. As long as we are under the null of the EH, these variables would also yield the posterior distribution of the term premium as $TP_T = R_{t,T} - (1 - \gamma) \sum_{i=0}^{T-1} \gamma^{i} E_t r_{t+i}$.

### 5.2 An economic test of the EH

Our recursive estimation / projection approach provides a simple but very effective test of the Expectation Hypothesis. Indeed, a test for the pure EH is immediately performed simply by checking whether the actual long-term rate could be a plausible draw from the posterior distribution of the EH-consistent long-term rate, i.e. if $R_{t,T}$ lies within some credible bounds of the posterior distribution of $R^*_{t,T}$.
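The recursive estimation / projection scheme and the credible-bounds check can be sketched in a few lines. The example below is a simplified stand-in and not the paper's procedure: it replaces the RVAR with a univariate AR(1) for the short rate, approximates the coefficient posterior by a normal distribution centred on the OLS estimate, sets the term premium to zero, and uses simulated data with `gamma = 0.99`. All function names and numerical values are illustrative assumptions.

```python
import numpy as np

def eh_long_rate(expected_path, gamma):
    """(1 - gamma) * sum_i gamma**i * E_t r_{t+i}, as in (19) with zero term premium."""
    w = gamma ** np.arange(len(expected_path))
    return (1 - gamma) * np.sum(w * expected_path)

def ar1_ols(r):
    """OLS of r_t on a constant and r_{t-1}; returns coefficients and their covariance."""
    X = np.column_stack([np.ones(len(r) - 1), r[:-1]])
    y = r[1:]
    XtX_inv = np.linalg.inv(X.T @ X)
    coef = XtX_inv @ X.T @ y
    resid = y - X @ coef
    s2 = resid @ resid / (len(y) - 2)
    return coef, s2 * XtX_inv

def project(coef, r_t, horizon):
    """Iterate the AR(1) forward to get E_t r_t, E_t r_{t+1}, ..., E_t r_{t+horizon-1}."""
    path = np.empty(horizon)
    path[0] = r_t
    for i in range(1, horizon):
        path[i] = coef[0] + coef[1] * path[i - 1]
    return path

def recursive_bands(r, first_window, horizon, gamma, n_draws=200, seed=0):
    """5th, 50th and 95th percentiles of the EH-consistent rate, window by window."""
    rng = np.random.default_rng(seed)
    out = []
    for end in range(first_window, len(r) + 1):
        coef, cov = ar1_ols(r[:end])  # step (i): only data up to time t
        draws = rng.multivariate_normal(coef, cov, size=n_draws)
        rates = [eh_long_rate(project(c, r[end - 1], horizon), gamma)
                 for c in draws]      # step (ii): project each draw forward
        out.append(np.percentile(rates, [5, 50, 95]))
    return np.array(out)              # step (iii) is the loop itself

# Simulated short rate (illustrative only).
rng = np.random.default_rng(1)
true_coef = np.array([0.25, 0.95])
r = np.empty(120)
r[0] = 5.0
for t in range(1, len(r)):
    r[t] = true_coef[0] + true_coef[1] * r[t - 1] + 0.3 * rng.normal()

bands = recursive_bands(r, first_window=60, horizon=120, gamma=0.99)

# Stand-in for the observed long rate: the EH rate under the true coefficients.
actual = np.array([eh_long_rate(project(true_coef, r[end - 1], 120), 0.99)
                   for end in range(60, len(r) + 1)])
inside = (actual >= bands[:, 0]) & (actual <= bands[:, 2])
print(f"share of periods inside the 5-95% band: {inside.mean():.2f}")
```

Coefficient draws with an autoregressive root close to or above one produce explosive projections, which is one way the posterior of the EH-consistent rate can become strongly asymmetric in short estimation windows.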
This procedure is very similar to that of Carriero, Favero and Kaminska (2005), with the subtle difference that here the uncertainty does not arise from estimation but is modeled within the theory. The implied distribution of $R^{*}_{t,T}$ is plotted in Figure 10. Interestingly, at the beginning of the sample there are many instances in which $R^{*}_{t,T}$ has an asymmetric distribution. This happens because in those periods the draws of the coefficients are such that the VAR is nearly unstable: the forecasts explode and, as a result, the mean moves very far from the median. Notwithstanding this initial instability, the actual long-term rate ($R_{t,T}$) almost always (see footnote 2) lies within the 5th and the 95th percentiles of the posterior distribution of the EH-consistent long-term rate $R^{*}_{t,T}$.

The EH-consistent and the actual long-term rates are highly correlated, but of course they do not perfectly coincide. Again, this explains both the common result of rejection and the Campbell and Shiller (1987) anomaly. The traditional framework for testing the EH understates the amount of uncertainty that individuals face when forecasting future short-term rates up to 120 months ahead, as it proxies ex-ante expected rates with ex-post realized rates. Once we take into account the uncertainty involved in predicting short-term rates, the EH cannot be rejected. This result would hold even if the actual and the EH-consistent long-term rates were less clearly correlated: Figure 10 shows that the gap between the 5th and the 95th percentiles is considerably wider than the difference between the actual and the EH-consistent long-term rates. Therefore, even if the actual long-term rate had behaved much more differently from the EH-consistent one, it could still very likely be consistent with the theory. Thus the dynamics of the 10-year rate over the last 35 years are perfectly consistent with the EH.
This result adds an economic validation to the striking statistical evidence described in Section 4, and suggests that the common result of rejection is due to the understatement of the uncertainty involved in forecasting future short-term rates.

(Footnote 2: Except for some instances, all occurring during the reserves-targeting era between the end of the 1970s and the beginning of the 1980s.)

6 Does the Expectation Hypothesis Help in Forecasting Short Term Interest Rates?

The previous sections provided clear statistical and economic evidence in favour of the EH. The results have proved to be robust through time and to the inclusion of macroeconomic information. In this section we investigate whether the EH can be exploited to improve forecasts of future short-term rates. In particular, we study whether imposing the EH prior restriction on the unrestricted VAR produces significant improvements in forecast accuracy. As long as the theory is supported by the data, we expect that imposing it yields advantages in terms of forecasting. As the EH relates future short-term and actual long-term interest rates, it should provide additional useful information about future short-term rates by extracting it from the actual long-term rate.

It is also worth noting that the unrestricted VAR augmented with macroeconomic variables can be interpreted as the reduced form of a model featuring an interest rate rule, for example a Taylor rule. A large and growing body of empirical literature has established interest rate rules as a convenient way to model and interpret monetary policy (Taylor, 1993; Clarida, Gali and Gertler, 1998, 1999, 2000). When allowing for persistence in the short rate, interest rate rules responding to inflation and the output gap tend to track the data well (Rudebusch, 2002; Söderlind, Söderström and Vredin, 2005). They are also capable of explaining the high inflation of the seventies in terms of an accommodating behaviour towards inflation in the pre-Volcker era.
Thus an interest-rate-rule model is the natural benchmark against which to evaluate the accuracy of short-term rate forecasts. There are many criteria available to compare predictive accuracy. Here we choose the mean absolute forecast error (MAFE) and the mean squared forecast error (MSFE). A first assessment of predictive accuracy can be made by inspecting the following regression equations:

\[ MAFE^{UVAR}_{t,h} - MAFE^{EH}_{t,h} = \alpha + u_{t,h}, \tag{20} \]

\[ MSFE^{UVAR}_{t,h} - MSFE^{EH}_{t,h} = \beta + v_{t,h}, \tag{21} \]

where h indexes the forecast horizon. Figures 11 and 12 plot the rolling estimates of the coefficients $\alpha$ and $\beta$ together with their HAC standard error bounds. The performance of the interest-rate-rule based VAR is significantly worse than that of the EH-restricted VAR up to 12 months ahead for the MAFE loss function and up to 7 months ahead for the MSFE loss function.

In order to interpret this as a valid test of equal predictive accuracy the error terms should be normally distributed, but this is not the case in the data (footnote 3). This issue cannot be solved using asymptotic tests of equal predictive accuracy (footnote 4), because the two models are nested and thus forecast errors would coincide asymptotically under the null. Giacomini and White (2004) develop a new test (GW) which is based on conditional expectations of forecasts rather than on unconditional expectations. This test can handle the comparison of both nested and non-nested models, as well as forecasts obtained by nonparametric and Bayesian estimation, the only requirement being that the size of the estimation window be finite.

(Footnote 3: The Jarque-Bera test rejects the null of normality in almost all instances.)
(Footnote 4: For example, that proposed by Diebold and Mariano (1995).)
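Regressions (20) and (21) are regressions of a loss differential on a constant, so the estimated coefficient is the sample mean of the differential and only the standard error needs care. A minimal sketch follows, assuming a Newey-West (Bartlett kernel) estimator for the HAC standard error; the paper does not say which HAC estimator it uses, and the function names and the `lags` argument are hypothetical.

```python
import numpy as np

def hac_mean(d, lags):
    """Sample mean of d and its Newey-West (Bartlett kernel) standard error."""
    d = np.asarray(d, dtype=float)
    n = d.size
    e = d - d.mean()
    long_run_var = e @ e / n                      # lag-0 autocovariance
    for k in range(1, lags + 1):
        weight = 1.0 - k / (lags + 1.0)           # Bartlett weights keep the estimate non-negative
        long_run_var += 2.0 * weight * (e[k:] @ e[:-k]) / n
    return d.mean(), np.sqrt(long_run_var / n)

def excess_loss(err_uvar, err_eh, loss="abs"):
    """Loss differential UVAR minus EH for absolute ('abs') or squared forecast errors."""
    f = np.abs if loss == "abs" else np.square
    return f(np.asarray(err_uvar, float)) - f(np.asarray(err_eh, float))
```

Applied to a rolling window of forecast errors, `hac_mean(excess_loss(e_uvar, e_eh), lags)` returns the point estimate and the standard error from which two-standard-error bounds can be drawn.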
For the one-step-ahead case, the GW test statistic for the null of equal conditional predictive accuracy is numerically equivalent to T times the uncentered squared multiple correlation coefficient of the regression of a constant on the differences in the loss functions ($MSFE^{UVAR}_{t,h} - MSFE^{EH}_{t,h}$ and $MAFE^{UVAR}_{t,h} - MAFE^{EH}_{t,h}$), and is distributed as $\chi^{2}_{1}$. Results of the rolling GW test for both MSFE and MAFE are plotted in Figure 13. The performance of the interest-rate-rule based VAR is significantly different from that of the EH-restricted VAR up to 6 months ahead for the MSFE loss function and up to 8 months ahead for the MAFE loss function. The sign of the differences in forecast accuracy can be deduced from Figures 11 and 12, and we recall that it is positive, implying a better forecasting performance of the EH-restricted VAR.

A second, natural competitor of the EH prior is the Minnesota prior, which is the workhorse of Bayesian forecasting. Results for the GW test contrasting the EH prior with the Minnesota prior (with standard hyperparameter values) are contained in Figure 14. The forecasting accuracy of the Minnesota and the EH priors is not statistically different at any forecast horizon except one step ahead.

Notice that the pattern of results in Figures 11-14 shows a significant increase in the accuracy of the EH prior after the end of the reserves-targeting period. This provides evidence that, while in the Volcker era interest rates were too volatile for the EH to be useful in forecasting, in the Greenspan era the stabilization of interest rates makes the EH useful in extracting information from long-term rates. The GW test provides clear evidence that the EH-restricted VAR outperforms the unrestricted VAR in forecasting short-term rates.
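The one-step-ahead statistic described above reduces to simple arithmetic on the loss differential. The sketch below assumes the only instrument is a constant, so that the statistic is compared with a chi-squared distribution with one degree of freedom; the function name and the hard-coded 5% critical value are illustrative choices, not taken from the paper.

```python
import numpy as np

CHI2_1_CRIT_5PCT = 3.841  # 5% critical value of a chi-squared with 1 degree of freedom

def gw_statistic(loss_a, loss_b):
    """n times the uncentered R^2 from regressing a constant (1) on the loss differential."""
    d = np.asarray(loss_a, float) - np.asarray(loss_b, float)
    n = d.size
    # Regressing 1 on d gives slope sum(d)/sum(d^2); the fitted sum of squares is
    # sum(d)^2 / sum(d^2), and the uncentered total sum of squares is n.
    r2_uncentered = d.sum() ** 2 / (n * (d @ d))
    return n * r2_uncentered
```

The null of equal accuracy is rejected at the 5% level when `gw_statistic(...)` exceeds `CHI2_1_CRIT_5PCT`. The function is undefined when the two loss series are identical, since the differential is then zero throughout.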
Since the unrestricted VAR augmented with macroeconomic variables can be interpreted as the reduced form of a model featuring an interest rate rule (for example a Taylor rule), this result can be read as evidence that adding the EH restrictions to an interest-rate-rule based VAR makes it possible to extract the additional information about future short-term rates contained in the long-term rate, and to exploit it to increase significantly the accuracy of the forecasts. Indeed, since the EH relates expected future short-term rates and actual long-term rates, it provides additional useful information about future short-term rates by extracting it from the actual long-term rate. Of course, this additional information can be extracted only if the EH does hold, which is what we found in the data.

7 Conclusions

By definition any economic theory is a simplification of reality, and as such it cannot hold exactly even if the theory is "true". Indeed, in the real world there is always some kind of noise which blurs any equilibrium. This study develops an extended version of the EH in which transitory deviations from the theory may occur. To model these deviations we derive a set of restrictions on a VAR and then let them hold with some degree of uncertainty. This amounts to deriving from the EH a prior for the coefficients of the VAR. We can then contrast this prior with an unrestricted, loose prior representing a world in which the EH does not hold. To make the comparison we use the Bayes factor, which summarizes the evidence provided by the data in favour of one theory as opposed to another. To ensure the robustness of our results, we extend the testing framework in two dimensions, providing statistical results recursively through time and including macroeconomic information in the picture.
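Indeed, the EH implies that monetary policy affects long-term rates by influencing expectations of future short-term rates, but central banks also look at bond markets to gauge inflation expectations, so incidental policy reactions to bond market conditions are likely to occur.

Beyond statistical evidence, we also provide economic evidence. In particular, we perform an economic test by checking the consistency of the observed long-term rate with the EH, and we study whether the EH may lead to improvements in forecasting short-term rates.

Our results show that the EH is strongly supported by the data and is perfectly consistent with the dynamics of the U.S. 10-year rate from the 1970s to the present. Moreover, the proposed framework is able to explain the very common result of rejection of the EH, to resolve the anomaly found by Campbell and Shiller (1987), and to significantly improve the forecast accuracy of a Taylor-rule based model in predicting future short-term rates.

Appendix A: derivation of the EH restrictions

Stack the VAR as:

\[
\begin{bmatrix} \Delta r_{t} \\ \Delta r_{t-1} \\ \Delta r_{t-2} \\ S_{t} \\ S_{t-1} \\ S_{t-2} \end{bmatrix}
=
\begin{bmatrix} k_{1} \\ 0 \\ 0 \\ k_{2} \\ 0 \\ 0 \end{bmatrix}
+
\begin{bmatrix}
a_{1} & a_{2} & a_{3} & b_{1} & b_{2} & b_{3} \\
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 \\
c_{1} & c_{2} & c_{3} & d_{1} & d_{2} & d_{3} \\
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0
\end{bmatrix}
\begin{bmatrix} \Delta r_{t-1} \\ \Delta r_{t-2} \\ \Delta r_{t-3} \\ S_{t-1} \\ S_{t-2} \\ S_{t-3} \end{bmatrix}
+
\begin{bmatrix} u_{1t} \\ 0 \\ 0 \\ u_{2t} \\ 0 \\ 0 \end{bmatrix},
\]

or, more succinctly:

\[ z_{t} = C + A z_{t-1} + v_{t}. \]

Recalling (3), the EH would put on the VAR the following set of nonlinear restrictions:

\[
\begin{aligned}
g' z_{t} &= \sum_{i=1}^{\infty} \gamma^{i} h' \Big( \sum_{n=0}^{i-1} A^{n} C + A^{i} z_{t} \Big) + TP \\
&= h' \sum_{i=1}^{\infty} \gamma^{i} \sum_{n=0}^{i-1} A^{n} C + h' \sum_{i=1}^{\infty} \gamma^{i} A^{i} z_{t} + TP \\
&= h' \sum_{i=1}^{\infty} \gamma^{i} (I - A^{i})(I - A)^{-1} C + h' \sum_{i=1}^{\infty} \gamma^{i} A^{i} z_{t} + TP,
\end{aligned}
\]

where g and h are selector vectors with 6 elements, all of which are zero except for the 4th element of g and the first element of h, which are unity. Since the above expression has to hold in general, it holds also for

\[ TP = -h' \sum_{i=1}^{\infty} \gamma^{i} (I - A^{i})(I - A)^{-1} C \]

and (footnote 5) for any $z_{t}$:

\[ g' z_{t} = h' \sum_{i=1}^{\infty} \gamma^{i} A^{i} z_{t}. \]

(Footnote 5: Notice that by adding the restriction $TP_{t} = 0$ we could test the pure EH.)

The trace statistic for the null of no cointegration (with the intercept both in the cointegrating equation and in the VAR) is well above the critical value (207.811, while the 1% critical value is 20.04). Therefore this VAR is stable and we can exploit the properties of geometric series to write:

\[ g' = h' \gamma A (I - \gamma A)^{-1}. \]

Postmultiplying by $(I - \gamma A)$ provides the following set of linear restrictions:

\[ g' (I - \gamma A) = h' \gamma A, \]

i.e.:

\[
g'
\begin{bmatrix}
1 - \gamma a_{1} & -\gamma a_{2} & -\gamma a_{3} & -\gamma b_{1} & -\gamma b_{2} & -\gamma b_{3} \\
-\gamma & 1 & 0 & 0 & 0 & 0 \\
0 & -\gamma & 1 & 0 & 0 & 0 \\
-\gamma c_{1} & -\gamma c_{2} & -\gamma c_{3} & 1 - \gamma d_{1} & -\gamma d_{2} & -\gamma d_{3} \\
0 & 0 & 0 & -\gamma & 1 & 0 \\
0 & 0 & 0 & 0 & -\gamma & 1
\end{bmatrix}
=
h' \gamma
\begin{bmatrix}
a_{1} & a_{2} & a_{3} & b_{1} & b_{2} & b_{3} \\
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 \\
c_{1} & c_{2} & c_{3} & d_{1} & d_{2} & d_{3} \\
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0
\end{bmatrix}.
\]

As g and h select respectively the 4th row of $(I - \gamma A)$ and the 1st row of $\gamma A$, we obtain:

\[
\begin{bmatrix} -\gamma c_{1} & -\gamma c_{2} & -\gamma c_{3} & 1 - \gamma d_{1} & -\gamma d_{2} & -\gamma d_{3} \end{bmatrix}
= \gamma \begin{bmatrix} a_{1} & a_{2} & a_{3} & b_{1} & b_{2} & b_{3} \end{bmatrix}.
\]

Thus the EH imposes the following constraints on the individual coefficients of the VAR:

\[
\begin{bmatrix} a_{1} + c_{1} \\ b_{1} + d_{1} \\ a_{2} + c_{2} \\ b_{2} + d_{2} \\ a_{3} + c_{3} \\ b_{3} + d_{3} \end{bmatrix}
=
\begin{bmatrix} 0 \\ 1/\gamma \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}.
\]

Appendix B: representation of the EH prior in terms of the vector of coefficients rather than in terms of the restrictions

Put the VAR in the SUR notation:

\[ y = \chi \beta + \varepsilon, \]

with

\[
\underset{MT \times 1}{y} = \begin{bmatrix} \Delta r \\ S \end{bmatrix}, \qquad
\underset{MT \times M(pM+1)}{\chi} = I_{M} \otimes \underset{T \times (pM+1)}{X}, \qquad
X = \begin{bmatrix} \Delta r_{-1} & S_{-1} & \cdots & \Delta r_{-3} & S_{-3} & \mathbf{1} \end{bmatrix},
\]

\[
\underset{M(pM+1) \times 1}{\beta} = \begin{bmatrix} a_{1} & b_{1} & a_{2} & b_{2} & a_{3} & b_{3} & k_{1} & c_{1} & d_{1} & c_{2} & d_{2} & c_{3} & d_{3} & k_{2} \end{bmatrix}', \qquad
\underset{MT \times 1}{\varepsilon} = \begin{bmatrix} u_{1} \\ u_{2} \end{bmatrix} \sim N\big(0, \underset{MT \times MT}{\Sigma} = \Sigma_{u} \otimes I_{T}\big),
\]

where M = 2 is the number of equations, p = 3 is the number of lags included, and T is the sample size. The generic form of a normal prior with fixed variance for the vector of coefficients would be:

\[ \beta \sim N(\beta_{0}, \Sigma_{0}). \]

The unrestricted VAR corresponds to the following diffuse prior:

\[ \beta \sim N\big(\beta_{0} = \underset{14 \times 1}{0}, \; \Sigma_{0} = \kappa I_{14}\big), \]

and for $\kappa = \infty$ the posterior mean of $\beta$ is identical to the OLS estimator. Now consider the set of restrictions implied on the unrestricted VAR by the EH:

\[
\begin{bmatrix} a_{1} + c_{1} \\ b_{1} + d_{1} \\ a_{2} + c_{2} \\ b_{2} + d_{2} \\ a_{3} + c_{3} \\ b_{3} + d_{3} \end{bmatrix}
\sim N(\mu_{EH0}, \Omega_{EH0}), \qquad
\mu_{EH0} = \begin{bmatrix} 0 \\ 1/\gamma \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \qquad
\Omega_{EH0} = \lambda I_{6},
\]

where $\lambda$ is the EH prior tightness. Denoting with $\beta_{EH}$ the vector of coefficients of the VAR when it satisfies the EH restrictions, we can write:

\[ H \beta_{EH} \sim N(\mu_{EH0}, \Omega_{EH0}), \qquad \underset{6 \times 14}{H} = \begin{bmatrix} I_{6} & 0_{6 \times 1} & I_{6} & 0_{6 \times 1} \end{bmatrix}. \tag{22} \]

The RVAR consists of the VAR plus the EH restrictions:

\[ y = \chi \beta_{EH} + \varepsilon, \qquad H \beta_{EH} \sim N(\mu_{EH0}, \Omega_{EH0}). \]

There is an alternative way to write the EH restrictions. The generic form of a normal prior satisfying the EH restrictions would be:

\[ \beta_{EH} \sim N(\beta_{EH0}, \Sigma_{EH0}), \]

which implies:

\[ H \beta_{EH} \sim N(H \beta_{EH0}, \; H \Sigma_{EH0} H'). \tag{23} \]

Under the EH both (22) and (23) must hold, so there is the following relation between the prior moments of the vector of restrictions and those of the vector of coefficients:

\[ \mu_{EH0} = H \beta_{EH0}, \qquad \Omega_{EH0} = H \Sigma_{EH0} H'. \]

The above system has no unique solution: since there are 14 coefficients and 6 restrictions, 8 coefficients are not identified and H is not invertible. To solve this problem, simply set a diffuse prior with variance $\kappa$ (the same as for the UVAR coefficients) on the unidentified coefficients. This provides an invertible restriction matrix without affecting the EH restrictions and the analysis. The restriction prior moments become:

\[
\underset{14 \times 1}{\mu_{EH0}} = \begin{bmatrix} 0 & 1/\gamma & 0 & \cdots & 0 \end{bmatrix}', \qquad
\underset{14 \times 14}{\Omega_{EH0}} = \operatorname{diag}(\underbrace{\lambda, \ldots, \lambda}_{6}, \underbrace{\kappa, \ldots, \kappa}_{8}),
\]

while the restriction matrix becomes:

\[
\underset{14 \times 14}{H_{2}} = \begin{bmatrix} I_{7} & J \\ 0_{7 \times 7} & I_{7} \end{bmatrix}, \qquad
\underset{7 \times 7}{J} = \begin{bmatrix} I_{6} & 0_{6 \times 1} \\ 0_{1 \times 6} & 0 \end{bmatrix},
\]

so that the first six rows of $H_{2}$ coincide with H and the remaining eight rows select the unidentified coefficients one by one. With $\kappa$ sufficiently big (which is the case in the paper) the form of the EH restrictions is not affected and this specification is equivalent to the preceding one. Moreover, it is now possible to invert the restriction matrix and to get an explicit prior for $\beta_{EH}$:

\[ \beta_{EH0} = H_{2}^{-1} \mu_{EH0}, \qquad \Sigma_{EH0} = H_{2}^{-1} \Omega_{EH0} H_{2}^{-1\prime}. \]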
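The mapping from the prior on the restrictions to an explicit prior on the coefficients can be checked numerically. The sketch below is an illustration only: it assumes the coefficient ordering (a1, b1, a2, b2, a3, b3, k1, c1, d1, c2, d2, c3, d3, k2), a restriction variance `lam` on the six EH restrictions and a diffuse variance `kappa` on the remaining eight coefficients. The function names and the symbols `gamma`, `lam` and `kappa` are hypothetical labels, since the original symbols are not recoverable from the text.

```python
import numpy as np

def build_h2(n_restr=6, n_per_eq=7):
    """Invertible 14x14 restriction matrix: six EH rows, then one selector row per free coefficient."""
    J = np.zeros((n_per_eq, n_per_eq))
    J[:n_restr, :n_restr] = np.eye(n_restr)       # pairs a_i with c_i and b_i with d_i
    top = np.hstack([np.eye(n_per_eq), J])
    bottom = np.hstack([np.zeros((n_per_eq, n_per_eq)), np.eye(n_per_eq)])
    return np.vstack([top, bottom])

def coefficient_prior(gamma, lam, kappa):
    """Prior mean and covariance of the coefficients implied by the prior on the restrictions."""
    H2 = build_h2()
    mu = np.zeros(14)
    mu[1] = 1.0 / gamma                           # restriction b1 + d1 = 1 / gamma
    omega = np.diag([lam] * 6 + [kappa] * 8)      # tight on restrictions, diffuse elsewhere
    H2_inv = np.linalg.inv(H2)
    beta0 = H2_inv @ mu
    sigma0 = H2_inv @ omega @ H2_inv.T
    return beta0, sigma0
```

Under these assumptions the implied covariance has variance `lam + kappa` for the coefficients of the first equation that enter a restriction, variance `kappa` for the others, and covariance `-kappa` between each restricted pair, so the pairwise correlation tends to -1 as `lam` goes to zero and to 0 as `lam` grows.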
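So the RVAR can be written as:

\[ y = \chi \beta_{EH} + \varepsilon, \qquad \beta_{EH} \sim N(\beta_{EH0}, \Sigma_{EH0}), \]

where, writing $\lambda$ for the prior variance of the EH restrictions, $\kappa$ for the variance of the diffuse prior and $\gamma$ for the discount factor,

\[
\underset{14 \times 1}{\beta_{EH0}} = \begin{bmatrix} 0 & 1/\gamma & 0 & \cdots & 0 \end{bmatrix}', \qquad
\underset{14 \times 14}{\Sigma_{EH0}} =
\begin{bmatrix}
(\lambda + \kappa) I_{6} & 0 & -\kappa I_{6} & 0 \\
0 & \kappa & 0 & 0 \\
-\kappa I_{6} & 0 & \kappa I_{6} & 0 \\
0 & 0 & 0 & \kappa
\end{bmatrix}.
\]

The corresponding correlation matrix is:

\[
\begin{bmatrix}
I_{6} & 0 & \rho I_{6} & 0 \\
0 & 1 & 0 & 0 \\
\rho I_{6} & 0 & I_{6} & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}, \qquad
\rho = -\sqrt{\frac{\kappa}{\lambda + \kappa}}.
\]

This matrix converges to the correlation matrix of the UVAR (i.e. the identity matrix) for $\lambda \to \infty$ and to the correlation matrix of the VAR with the exact EH restrictions for $\lambda \to 0$.

To estimate the model we write this prior in the form suggested by Theil-Goldberger and add it to the VAR. Define:

\[ \mu_{EH0} = H_{2} \beta_{EH} + v_{EH}, \qquad v_{EH} \sim N(0, \Omega_{EH0}), \]

where $H_{2}$ is the invertible restriction matrix and $\mu_{EH0}$, $\Omega_{EH0}$ are the prior mean and variance of the restrictions. Plugging this in gives the EH-restricted VAR:

\[ y_{EH} = \chi_{EH} \beta_{EH} + \varepsilon_{EH}, \]

\[
\underset{(MT + M(pM+1)) \times 1}{y_{EH}} = \begin{bmatrix} y \\ \mu_{EH0} \end{bmatrix}, \qquad
\underset{(MT + M(pM+1)) \times M(pM+1)}{\chi_{EH}} = \begin{bmatrix} I_{M} \otimes X \\ H_{2} \end{bmatrix}, \qquad
\underset{(MT + M(pM+1)) \times 1}{\varepsilon_{EH}} = \begin{bmatrix} u \\ v_{EH} \end{bmatrix}
\sim N\left(0, \begin{bmatrix} \Sigma_{u} \otimes I_{T} & 0 \\ 0 & \Omega_{EH0} \end{bmatrix}\right).
\]

The same procedure is applied to the competing UVAR diffuse prior, $\beta_{F} \sim N(\beta_{0} = 0_{14 \times 1}, \Sigma_{0} = \kappa I_{14})$. First rewrite the prior in the form suggested by Theil-Goldberger:

\[ \beta_{0} = \beta_{F} + v_{F}, \qquad v_{F} \sim N(0, \Sigma_{0}). \]

Then plug it in the VAR:

\[ y_{F} = \chi_{F} \beta_{F} + \varepsilon_{F}, \]

\[
\underset{(MT + M(pM+1)) \times 1}{y_{F}} = \begin{bmatrix} y \\ 0 \end{bmatrix}, \qquad
\underset{(MT + M(pM+1)) \times M(pM+1)}{\chi_{F}} = \begin{bmatrix} I_{M} \otimes X \\ I_{M(pM+1)} \end{bmatrix}, \qquad
\underset{(MT + M(pM+1)) \times 1}{\varepsilon_{F}} = \begin{bmatrix} u \\ v_{F} \end{bmatrix}
\sim N\left(0, \begin{bmatrix} \Sigma_{u} \otimes I_{T} & 0 \\ 0 & \Sigma_{0} \end{bmatrix}\right).
\]

Appendix C: posterior densities and marginal likelihoods

Here we compute the posterior and the marginal likelihood of the vector of coefficients $\beta$. The results of course can be applied to both models (EH prior and flat prior) at hand. The model is:

\[
\underset{MT \times 1}{y} = \big(I_{M} \otimes \underset{T \times k}{X}\big) \underset{Mk \times 1}{\beta} + \underset{MT \times 1}{\varepsilon}, \qquad k = pM + 1,
\]

\[
\beta \sim N(\beta_{0}, \Sigma_{0}), \qquad
\varepsilon = \begin{bmatrix} u_{1} \\ u_{2} \end{bmatrix} \sim N(0, \Sigma), \qquad
\underset{MT \times MT}{\Sigma} = \Sigma_{u} \otimes I_{T},
\]

where M = 2 is the number of equations, p = 3 is the number of lags included, and T is the sample size. Here $\beta_{0}$ and $\Sigma_{0}$ can be either the EH or the flat prior moments. In compact notation:

\[ y = \chi \beta + \varepsilon. \]

The prior density is:

\[ p(\beta) = (2\pi)^{-Mk/2} |\Sigma_{0}|^{-1/2} \exp\left\{ -\tfrac{1}{2} (\beta - \beta_{0})' \Sigma_{0}^{-1} (\beta - \beta_{0}) \right\}, \]

the likelihood is (footnote 6):

\[ p(y|\beta) = (2\pi)^{-MT/2} |\Sigma|^{-1/2} \exp\left\{ -\tfrac{1}{2} (y - \chi\beta)' \Sigma^{-1} (y - \chi\beta) \right\}, \]

and a posterior density kernel is:

\[
p(y|\beta)\, p(\beta) = (2\pi)^{-M(T+k)/2} |\Sigma|^{-1/2} |\Sigma_{0}|^{-1/2}
\exp\left\{ -\tfrac{1}{2} \left[ (y - \chi\beta)' \Sigma^{-1} (y - \chi\beta) + (\beta - \beta_{0})' \Sigma_{0}^{-1} (\beta - \beta_{0}) \right] \right\}.
\]

(Footnote 6: Notice that $|\Sigma|^{-1/2} = |\Sigma_{u} \otimes I_{T}|^{-1/2} = (|\Sigma_{u}|^{T} |I_{T}|^{M})^{-1/2} = |\Sigma_{u}|^{-T/2}$.)

Now define (footnote 7):

\[
\bar{\Sigma} = \left( \Sigma_{0}^{-1} + \chi' \Sigma^{-1} \chi \right)^{-1}, \qquad
\bar{\beta} = \bar{\Sigma} \left( \Sigma_{0}^{-1} \beta_{0} + \chi' \Sigma^{-1} y \right).
\]

(Footnote 7: Notice that $\bar{\Sigma} = [\Sigma_{0}^{-1} + (I_{M} \otimes X)' (\Sigma_{u}^{-1} \otimes I_{T}) (I_{M} \otimes X)]^{-1} = [\Sigma_{0}^{-1} + \Sigma_{u}^{-1} \otimes X'X]^{-1}$ and $\bar{\beta} = \bar{\Sigma} [\Sigma_{0}^{-1} \beta_{0} + (I_{M} \otimes X)' (\Sigma_{u}^{-1} \otimes I_{T}) y] = \bar{\Sigma} [\Sigma_{0}^{-1} \beta_{0} + (\Sigma_{u}^{-1} \otimes X') y]$.)

Using the above definitions and completing the square yields:

\[
\begin{aligned}
& (y - \chi\beta)' \Sigma^{-1} (y - \chi\beta) + (\beta - \beta_{0})' \Sigma_{0}^{-1} (\beta - \beta_{0}) \\
&= y' \Sigma^{-1} y - 2 \beta' \chi' \Sigma^{-1} y + \beta' \chi' \Sigma^{-1} \chi \beta + \beta' \Sigma_{0}^{-1} \beta - 2 \beta' \Sigma_{0}^{-1} \beta_{0} + \beta_{0}' \Sigma_{0}^{-1} \beta_{0} \\
&= y' \Sigma^{-1} y + \beta' \left[ \Sigma_{0}^{-1} + \chi' \Sigma^{-1} \chi \right] \beta - 2 \beta' \left[ \Sigma_{0}^{-1} \beta_{0} + \chi' \Sigma^{-1} y \right] + \beta_{0}' \Sigma_{0}^{-1} \beta_{0} \\
&= y' \Sigma^{-1} y + \beta' \bar{\Sigma}^{-1} \beta - 2 \beta' \bar{\Sigma}^{-1} \bar{\beta} + \beta_{0}' \Sigma_{0}^{-1} \beta_{0}.
\end{aligned}
\]

This can be rewritten as (footnote 8):

\[
(y - \chi\beta)' \Sigma^{-1} (y - \chi\beta) + (\beta - \beta_{0})' \Sigma_{0}^{-1} (\beta - \beta_{0})
= (\bar{\beta} - \beta)' \bar{\Sigma}^{-1} (\bar{\beta} - \beta) + y' \Sigma^{-1} y - \bar{\beta}' \bar{\Sigma}^{-1} \bar{\beta} + \beta_{0}' \Sigma_{0}^{-1} \beta_{0}.
\]

(Footnote 8: This holds since $\beta' \bar{\Sigma}^{-1} \beta - 2 \beta' \bar{\Sigma}^{-1} \bar{\beta} = (\bar{\beta} - \beta)' \bar{\Sigma}^{-1} (\bar{\beta} - \beta) - \bar{\beta}' \bar{\Sigma}^{-1} \bar{\beta}$.)

So a posterior density kernel can also be written as follows:

\[
p(y|\beta)\, p(\beta) = (2\pi)^{-M(T+k)/2} |\Sigma|^{-1/2} |\Sigma_{0}|^{-1/2}
\exp\left\{ -\tfrac{1}{2} \left[ (\bar{\beta} - \beta)' \bar{\Sigma}^{-1} (\bar{\beta} - \beta) + Q \right] \right\},
\]

where:

\[ Q = y' \Sigma^{-1} y - \bar{\beta}' \bar{\Sigma}^{-1} \bar{\beta} + \beta_{0}' \Sigma_{0}^{-1} \beta_{0}. \]

Forgetting constants:

\[
p(y|\beta)\, p(\beta) \propto \exp\left\{ -\tfrac{1}{2} (\bar{\beta} - \beta)' \bar{\Sigma}^{-1} (\bar{\beta} - \beta) \right\}
\quad \Longrightarrow \quad \beta | y \sim N(\bar{\beta}, \bar{\Sigma}),
\]

which shows that $\bar{\beta}$ and $\bar{\Sigma}$ are the moments of the posterior. The properly normalized posterior density is:

\[ p(\beta|y) = (2\pi)^{-Mk/2} |\bar{\Sigma}|^{-1/2} \exp\left\{ -\tfrac{1}{2} (\bar{\beta} - \beta)' \bar{\Sigma}^{-1} (\bar{\beta} - \beta) \right\}. \]

The marginal likelihood is given by the integral, over the $M \times k$ dimensional space, of the product of the properly normalized prior and data densities:

\[
\begin{aligned}
ML &= \int_{\mathbb{R}^{Mk}} p(y|\beta)\, p(\beta)\, d\beta \\
&= \int_{\mathbb{R}^{Mk}} (2\pi)^{-M(T+k)/2} |\Sigma|^{-1/2} |\Sigma_{0}|^{-1/2}
\exp\left\{ -\tfrac{1}{2} \left[ (\bar{\beta} - \beta)' \bar{\Sigma}^{-1} (\bar{\beta} - \beta) + Q \right] \right\} d\beta \\
&= (2\pi)^{-M(T+k)/2} |\Sigma|^{-1/2} |\Sigma_{0}|^{-1/2} \exp\{ -Q/2 \}
\int_{\mathbb{R}^{Mk}} \exp\left\{ -\tfrac{1}{2} (\bar{\beta} - \beta)' \bar{\Sigma}^{-1} (\bar{\beta} - \beta) \right\} d\beta.
\end{aligned}
\]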
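Notice that it is important that the properly normalized prior and the properly normalized likelihood, and not arbitrary kernels of these densities, be used in forming the marginal likelihood. Now recognize a posterior kernel in the above expression and exploit the fact that the properly normalized posterior density integrates to one:

\[
\int_{\mathbb{R}^{Mk}} p(\beta|y)\, d\beta = 1
\quad \Longrightarrow \quad
\int_{\mathbb{R}^{Mk}} \exp\left\{ -\tfrac{1}{2} (\bar{\beta} - \beta)' \bar{\Sigma}^{-1} (\bar{\beta} - \beta) \right\} d\beta = (2\pi)^{Mk/2} |\bar{\Sigma}|^{1/2}.
\]

The marginal likelihood is thus:

\[
\int_{\mathbb{R}^{Mk}} p(y|\beta)\, p(\beta)\, d\beta = (2\pi)^{-MT/2} |\Sigma|^{-1/2} |\Sigma_{0}|^{-1/2} |\bar{\Sigma}|^{1/2} \exp\{ -Q/2 \},
\]

where:

\[ Q = y' \Sigma^{-1} y - \bar{\beta}' \bar{\Sigma}^{-1} \bar{\beta} + \beta_{0}' \Sigma_{0}^{-1} \beta_{0}. \]

From this it is immediate to derive the Bayes factor of the RVAR against the UVAR:

\[
BF = \left( \frac{ |\Sigma^{post}_{RVAR}| \; |\Sigma^{prior}_{UVAR}| }{ |\Sigma^{prior}_{RVAR}| \; |\Sigma^{post}_{UVAR}| } \right)^{1/2}
\exp\left\{ \frac{ Q_{UVAR} - Q_{RVAR} }{2} \right\}. \tag{24}
\]

Appendix D: VAR Augmentation

Our VAR becomes:

\[
\begin{aligned}
\Delta r_{t} &= k_{1} + a(L) \Delta r_{t-1} + b(L) S_{t-1} + \theta(L) \pi_{t-1} + \phi(L) y_{t-1} + u_{1t}, \\
S_{t} &= k_{2} + c(L) \Delta r_{t-1} + d(L) S_{t-1} + \psi(L) \pi_{t-1} + \omega(L) y_{t-1} + u_{2t}.
\end{aligned} \tag{25}
\]

Stack the VAR as $z_{t} = C + A z_{t-1} + v_{t}$, with

\[
z_{t} = \begin{bmatrix} \Delta r_{t} & \Delta r_{t-1} & \Delta r_{t-2} & S_{t} & S_{t-1} & S_{t-2} & \pi_{t} & \pi_{t-1} & \pi_{t-2} & y_{t} & y_{t-1} & y_{t-2} \end{bmatrix}',
\]

\[
C = \begin{bmatrix} k_{1} & 0 & 0 & k_{2} & 0 & 0 & k_{3} & 0 & 0 & k_{4} & 0 & 0 \end{bmatrix}', \qquad
v_{t} = \begin{bmatrix} u_{1t} & 0 & 0 & u_{2t} & 0 & 0 & u_{3t} & 0 & 0 & u_{4t} & 0 & 0 \end{bmatrix}',
\]

\[
A = \begin{bmatrix}
a_{1} & a_{2} & a_{3} & b_{1} & b_{2} & b_{3} & \theta_{1} & \theta_{2} & \theta_{3} & \phi_{1} & \phi_{2} & \phi_{3} \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
c_{1} & c_{2} & c_{3} & d_{1} & d_{2} & d_{3} & \psi_{1} & \psi_{2} & \psi_{3} & \omega_{1} & \omega_{2} & \omega_{3} \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
e_{1} & e_{2} & e_{3} & f_{1} & f_{2} & f_{3} & g_{1} & g_{2} & g_{3} & n_{1} & n_{2} & n_{3} \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
h_{1} & h_{2} & h_{3} & i_{1} & i_{2} & i_{3} & l_{1} & l_{2} & l_{3} & m_{1} & m_{2} & m_{3} \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0
\end{bmatrix}.
\]

The EH requires $g' = h' \sum_{i=1}^{T} \gamma^{i} A^{i}$, where g and h now select the 4th and the 1st element of $z_{t}$. For T going to infinity, since the VAR is cointegrated, this converges to:

\[ g' = h' \sum_{i=1}^{T} \gamma^{i} A^{i} \;\; \xrightarrow{\;T \to \infty\;} \;\; h' \gamma A (I - \gamma A)^{-1}. \]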
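The geometric-series step used in Appendices A and D can be verified numerically for the bivariate case. The sketch below builds a companion matrix that satisfies the EH constraints (c = -a, d1 = 1/gamma - b1, d2 = -b2, d3 = -b3) and checks both the linear restriction and the convergence of the discounted sum. The numerical values are arbitrary illustrations chosen so that the discounted companion matrix is stable, and the function names and the symbol `gamma` for the discount factor are hypothetical.

```python
import numpy as np

def companion(a, b, c, d):
    """6x6 companion matrix for z_t = (dr_t, dr_{t-1}, dr_{t-2}, S_t, S_{t-1}, S_{t-2})."""
    A = np.zeros((6, 6))
    A[0, :3], A[0, 3:] = a, b          # short-rate change equation
    A[3, :3], A[3, 3:] = c, d          # spread equation
    A[1, 0] = A[2, 1] = 1.0            # identities carrying the lags of dr
    A[4, 3] = A[5, 4] = 1.0            # identities carrying the lags of S
    return A

def eh_restricted_companion(a, b, gamma):
    """Impose a_i + c_i = 0, b_1 + d_1 = 1/gamma and b_i + d_i = 0 for i = 2, 3."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    c = -a
    d = -b.copy()
    d[0] += 1.0 / gamma
    return companion(a, b, c, d)

def discounted_sum(A, gamma, n_terms):
    """Truncated sum of (gamma * A)^i for i = 1, ..., n_terms."""
    total = np.zeros_like(A)
    term = np.eye(A.shape[0])
    for _ in range(n_terms):
        term = term @ (gamma * A)
        total += term
    return total

def closed_form(A, gamma):
    """Limit of the discounted sum when the spectral radius of gamma * A is below one."""
    n = A.shape[0]
    return gamma * A @ np.linalg.inv(np.eye(n) - gamma * A)
```

With `g` and `h` the 4th and 1st unit vectors, `h @ closed_form(A, gamma)` reproduces `g` whenever `A` is built by `eh_restricted_companion` and the discounted matrix is stable, which is the restriction stated in the text.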
Postmultiplying provides a set of linear restrictions:

\[ g' (I - \gamma A) = h' \gamma A, \qquad \text{or equivalently} \qquad g' \left( \tfrac{1}{\gamma} I - A \right) = h' A, \]

where A is the 12 × 12 companion matrix of the augmented VAR, and $\theta_{i}$, $\phi_{i}$ ($\psi_{i}$, $\omega_{i}$) denote the coefficients on lagged $\pi$ and $y$ in the $\Delta r_{t}$ ($S_{t}$) equation. As g and h select respectively the 4th and the 1st row, the EH imposes the following constraints on the individual coefficients of the VAR:

\[
\begin{bmatrix} -c_{1} & -c_{2} & -c_{3} & \tfrac{1}{\gamma} - d_{1} & -d_{2} & -d_{3} & -\psi_{1} & -\psi_{2} & -\psi_{3} & -\omega_{1} & -\omega_{2} & -\omega_{3} \end{bmatrix}
=
\begin{bmatrix} a_{1} & a_{2} & a_{3} & b_{1} & b_{2} & b_{3} & \theta_{1} & \theta_{2} & \theta_{3} & \phi_{1} & \phi_{2} & \phi_{3} \end{bmatrix}.
\]

Define:

\[
\begin{aligned}
\beta_{EH} &= \begin{bmatrix} \beta_{1}' & \beta_{2}' & \beta_{3}' & \beta_{4}' \end{bmatrix}', \\
\beta_{1} &= \begin{bmatrix} a_{1} & b_{1} & \theta_{1} & \phi_{1} & a_{2} & b_{2} & \theta_{2} & \phi_{2} & a_{3} & b_{3} & \theta_{3} & \phi_{3} & k_{1} \end{bmatrix}', \\
\beta_{2} &= \begin{bmatrix} c_{1} & d_{1} & \psi_{1} & \omega_{1} & c_{2} & d_{2} & \psi_{2} & \omega_{2} & c_{3} & d_{3} & \psi_{3} & \omega_{3} & k_{2} \end{bmatrix}', \\
\beta_{3} &= \begin{bmatrix} e_{1} & f_{1} & g_{1} & n_{1} & e_{2} & f_{2} & g_{2} & n_{2} & e_{3} & f_{3} & g_{3} & n_{3} & k_{3} \end{bmatrix}', \\
\beta_{4} &= \begin{bmatrix} h_{1} & i_{1} & l_{1} & m_{1} & h_{2} & i_{2} & l_{2} & m_{2} & h_{3} & i_{3} & l_{3} & m_{3} & k_{4} \end{bmatrix}'.
\end{aligned}
\]

The matrices H and $H_{2}$ become:

\[
H = \begin{bmatrix} I_{12} & 0_{12 \times 1} & I_{12} & 0_{12 \times 1} & 0_{12 \times 26} \end{bmatrix}, \qquad
H_{2} = \begin{bmatrix}
I_{13} & J & 0_{13 \times 26} \\
0_{13 \times 13} & I_{13} & 0_{13 \times 26} \\
0_{26 \times 13} & 0_{26 \times 13} & I_{26}
\end{bmatrix}, \qquad
J = \begin{bmatrix} I_{12} & 0_{12 \times 1} \\ 0_{1 \times 12} & 0 \end{bmatrix}.
\]

References

[1] Amihud Y, Mendelson H. 1991. Liquidity, Maturity, and the Yields on U.S. Treasury Securities. Journal of Finance 46: 1411-1425.
[2] Ang A, Piazzesi M. 2003. A No-Arbitrage Vector Autoregression of Term Structure Dynamics with Macroeconomic and Latent Variables. Journal of Monetary Economics 50: 745-787.
[3] Boudoukh J, Whitelaw RF. 1991. The Benchmark Effect in the Japanese Government Bond Market. Journal of Fixed Income September: 52-59.
[4] Campbell J. 1995. Some Lessons from the Yield Curve. Journal of Economic Perspectives 9(3): 129-152.
[5] Campbell J, Shiller R. 1987. Cointegration and Tests of Present Value Models. Journal of Political Economy 95: 1062-1088.
[6] Carriero A, Favero C, Kaminska I. 2005. Financial Factors, Macroeconomic Information and the Expectation Theory of the Term Structure of Interest Rates. Journal of Econometrics, forthcoming.
[7] Clarida R, Gali J, Gertler M. 1998. Monetary Policy Rules in Practice: Some International Evidence. European Economic Review 42.
[8] Clarida R, Gali J, Gertler M. 1999. The Science of Monetary Policy: A New-Keynesian Perspective. Journal of Economic Literature XXXVII(4): 1661-1707.
[9] Clarida R, Gali J, Gertler M. 2000. Monetary Policy Rules and Macroeconomic Stability: Evidence and Some Theory. The Quarterly Journal of Economics 115(1): 147-180.
[10] Cornell B, Shapiro A. 1990. The Mispricing of U.S. Treasury Bonds: a Case Study. Review of Financial Studies 2: 297-310.
[11] Cox J, Ingersoll J, Ross S. 1981. A Reexamination of the Traditional Hypotheses About the Term Structure of Interest Rates. Journal of Finance 36: 321-346.
[12] Daves P, Ehrhardt MC. 1993. Liquidity, Reconstruction, and the Value of U.S. Treasury Strips. Journal of Finance 48: 315-329.
[13] Doan T, Litterman R, Sims C. 1984. Forecasting and Conditional Projection Using Realistic Prior Distributions. Econometric Reviews 3: 1-100.
[14] Diebold FX, Mariano RS. 1995. Comparing Predictive Accuracy. Journal of Business and Economic Statistics 13(3): 253-63.
[15] Diebold FX, Rudebusch GD, Aruoba SB. 2003. The Macroeconomy and the Yield Curve: a Dynamic Latent Factor Approach. NBER Working Paper No. 10616.
[16] Duffee G. 1996. Idiosyncratic Variation of Treasury Bill Yields. Journal of Finance 51: 527-551.
[17] Elton EJ. 1999. Expected Return, Realized Return and Asset Pricing Tests. Journal of Finance.
[18] Elton EJ, Green C. 1998. Tax and Liquidity Effects in Pricing Government Bonds. Journal of Finance 53: 1533-1562.
[19] Fama E, Bliss RR. 1987. The Information in Long-Maturity Forward Rates. American Economic Review 77: 680-692.
[20] Favero CA. 2002. Taylor Rules and the Term Structure. IGIER Working Paper.
[21] Fisher I. 1930. The Theory of Interest. Macmillan, New York.
[22] Fisher M, Gilles C. 1998. Around and Around: the Expectation Hypothesis. Journal of Finance 53: 365-383.
[23] Fuhrer JC. 1996. Monetary Policy Shifts and Long-Term Interest Rates. Quarterly Journal of Economics 111(4): 1183-1209.
[24] Geweke J. 1998. Using Simulation Methods for Bayesian Econometric Models: Inference, Development, and Communication. FED of Minneapolis Research Department Staff Report 249.
[25] Giacomini R, White H. Tests of Conditional Forecast Accuracy. Mimeo, UCSD.
[26] Hicks J. 1953. The Long-Term Dollar Problem. Oxford EP.
[27] Jeffreys H. 1935. Some Tests of Significance, Treated by the Theory of Probability. Proceedings of the Cambridge Philosophical Society 31: 203-222.
[28] Jeffreys H. 1961. Theory of Probability. Oxford University Press, Oxford UK.
[29] Johansen S. 1995. Likelihood-based Inference in Cointegrated Vector Auto-regressive Models. Oxford University Press, Oxford UK.
[30] Kass RE, Raftery A. 1995. Bayes Factors. Journal of the American Statistical Association 90(430): 773-795.
[31] Keynes JM. 1930. Treatise on Money. Harcourt, Brace, New York.
[32] Ingram B, Whiteman C. 1994. Supplanting the Minnesota Prior. Forecasting Macroeconomic Time Series Using Real Business Cycle Priors. Journal of Monetary Economics 34: 497-510.
[33] Longstaff FA. 1992. Are Negative Option Prices Possible? The Callable U.S. Treasury Bond Puzzle. Journal of Business 65: 571-592.
[34] Longstaff FA. 2000. Arbitrage and the Expectation Hypothesis. Journal of Finance 55(2): 989-994.
[35] Mankiw NG, Miron J. 1986. The Changing Behaviour of the Term Structure of Interest Rates. The Quarterly Journal of Economics 101: 211-221.
[36] McCallum B. 1994. Monetary Policy and the Term Structure of Interest Rates. NBER Working Paper No. 4938.
[37] McCulloch JH. 1993. A Reexamination of Traditional Hypotheses About the Term Structure: a Comment. Journal of Finance 49: 186182.
[38] Roush J. 2001. Evidence Uncovered: Long-Term Interest Rates, Monetary Policy and the Expectations Theory. Mimeo, Board of Governors of the Federal Reserve System.
[39] Rudebusch GD. 1995. Federal Reserve Interest Rate Targeting, Rational Expectations, and the Term Structure. Journal of Monetary Economics 35: 245-274.
[40] Rudebusch GD. 2002. Term Structure Evidence on Interest Rate Smoothing and Monetary Policy Inertia. Journal of Monetary Economics 49: 1161-1187.
[41] Shiller R. 1979. The Volatility of Long Term Interest Rates and Expectations Models of the Term Structure. Journal of Political Economy 87: 1190-1219.
[42] Shiller R. 1981. Alternative Tests of Rational Expectations Models: the Case of the Term Structure. Journal of Econometrics 16: 71-87.
[43] Sims C, Zha T. 1998. Bayesian Methods for Dynamic Multivariate Models. International Economic Review 39: 949-968.
[44] Söderlind P, Söderström U, Vredin A. 2005. Dynamic Taylor Rules and the Predictability of Interest Rates. Macroeconomic Dynamics, forthcoming.
[45] Taylor JB. 1993. Discretion Versus Policy Rules in Practice. Carnegie-Rochester Conference Series on Public Policy 39: 195-214.
[46] Theil H. 1971. Principles of Econometrics. Wiley, New York.
[47] Thornton DL. 2004. Predictions of Short-Term Rates and the Expectations Hypothesis of the Term Structure of Interest Rates. St. Louis FED Working Paper No. 2003-021B.
[48] Thornton DL. 2005. Tests of the Expectations Hypothesis: Resolving the Campbell-Shiller Paradox. Journal of Money, Credit, and Banking, forthcoming.
[49] Thornton DL. 2005. Tests of the Expectations Hypothesis: Resolving the Anomalies when the Short-Term Rate Is the Federal Funds Rate. Journal of Banking and Finance, forthcoming.
[50] Zellner A. 1988. Bayesian Analysis in Econometrics. Journal of Econometrics 37: 27-50.
36 -218 Marginal Likelihood 3 x 10 2 1 0 0 0.1 0.2 0.3 0.4 0.5 0.6 EH Tightness 0.7 0.8 0.9 1 Figure 1: Marginal Likelihood of the restricted VAR as a function of the EH prior tightness 0.2 a 1 0.2 0.1 0.1 0 0.6 0.2 a 2 0.8 1 c1 0.1 0 0.2 0.4 0.6 c2 -0.5 -0.4 -0.3 0 0.2 0.1 0.1 0 0.8 -0.4 -0.2 0.2 0 0.8 1 1 0 -1 0.2 3 0.1 0.6 b -0.5 d1 c 0.1 0 0.2 a 3 0.1 0 -0.2 0 0.2 0.2 b 2 0.2 b 3 0.2 0.1 0.1 0.1 0 -0.1 0.2 d 2 0 0.1 0.1 0 -0.2 0 0 0.2 0 -0.2 0.2 d 0 0 0.1 1 0 0.2 0.4 -0.1-0.05 0 0.05 0.2 k2 3 0.1 -0.1 k 0.1 0 -0.1 0 0.1 0.2 0 0 0.05 0.1 RVAR Posterior RVAR Prior 0.2 a 1 0.1 0 0.6 0.2 c1 0.1 0 0.8 1 0.2 a 2 0.2 0.1 0.1 0 0.4 0.2 c 3 0.6 2 0 3 0.1 0.8 1 b 0.2 1 0.1 0 0.8 -0.4 -0.2 0.2 c 0.1 0 -0.5 -0.4 -0.3 0.6 0.2 a 0 -0.2 0.2 2 0.1 0 -1 0.2 d -0.5 1 0.1 0 b 0 -0.1 0.2 d 2 0 0.1 0 0.2 0 0.2 k 1 0.1 0.1 0 -0.2 0.2 0 d3 0.1 0 -0.2 0.2 b 3 0 0.2 0.4 -0.1-0.05 0 0.05 0.2 k 2 0.1 -0.1 0 0.1 0 -0.2 0.1 0 0.2 0 0 0.05 0.1 UVAR Posterior UVAR Prior Figure 2: Prior and posterior distributions for the VAR coeJcients under the Expectation Hypothesis prior (Restricted VAR) and under the loose prior (Unrestricted VAR) 37 0.15 a +c 1 1 0.1 0.05 0 -1 -0.8 -0.6 -0.4 -0.2 0 0.2 0.4 0.6 0.8 1 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2 -0.8 -0.6 -0.4 -0.2 0 0.2 0.4 0.6 0.8 1 -0.6 -0.4 -0.2 0 0.2 0.4 0.6 0.8 1 -0.6 -0.4 -0.2 0 0.2 0.4 0.6 0.8 1 -0.6 -0.4 -0.2 0 0.2 0.4 EH Posterior (RVAR) Loose Posterior (UVAR) EH Prior (RVAR) Loose Prior (UVAR) 0.6 0.8 1 0.15 b +d 1 1 0.1 0.05 0 0 0.15 a +c 2 2 0.1 0.05 0 -1 0.15 b +d 2 2 0.1 0.05 0 -1 -0.8 0.15 a +c 3 3 0.1 0.05 0 -1 -0.8 0.15 b3+d3 0.1 0.05 0 -1 -0.8 Figure 3: Prior and posterior distributions of the Expectation Hypothesis restrictions under the Expectation Hypothesis prior (Restricted VAR) and the loose prior (Unrestricted VAR) 38 Bayes Factor 30 20 10 0 0 0.1 0.2 0.3 0.4 0.5 0.6 EH Tightness 0.7 0.8 0.9 1 Figure 4: Bayes factor for the EH-restricted versus the unrestricted VAR, as a function of the EH prior tightness Bayes Factor 40 
30 20 10 0 1984 1989 1994 Time 1999 2004 Figure 5: Recursive Bayes factor for the EH-restricted versus the unrestricted VAR. The EH prior tightness is 7xed at its estimated value = 0.13 39 Bayes Factor 50 40 30 20 10 0 0 0.1 0.2 0.3 0.4 0.5 0.6 EH Tightness 0.7 0.8 0.9 1 Figure 6: Bayes factor for the EH-restricted versus the unrestricted VAR, when both include macroeconomic information. As a function of the EH prior tightness . Bayes Factor 40 30 20 10 0 1984 1989 1994 Time 1999 2004 Figure 7: Recursive Bayes factor for the EH-restricted versus the unrestricted VAR when both include macroeconomic information. The EH prior tightness is 7xed at its estimated value = 0.11 40 Figure 8: Twice the log-likelihood ratio for the EH-restricted versus the unrestricted VAR, as a function of the EH prior tightness Figure 9: Twice the log-likelihood ratio for the EH-restricted versus the unrestricted VAR when both include macroeconomic information. As a function of the EH prior tightness 41 30 Actual R Median R* 5% R* 95% R* MeanR* 25 20 15 10 5 0 -5 -10 -15 -20 1971:1 1976:1 1981:1 1986:1 1991:1 1996:1 2001:1 2004:1 Figure 10: Economic test of the Expectation Hypothesis. The distribution of the theoretical, EH-consistent long term rate Rt,T is obtained by a recursive estimation/projection scheme, such that at each point in time only the available information is used to estimate the RVAR and then to project it forward.The procedure works as follows: i) The 7rst estimation is performed over the sample 1966:1 1970:12. All the subsequent estimations are performed over the sample 1966:1 1970:12+i where i is the number of iterations already executed. ii) Using the posterior of the coeJcients obtained at point i) the RVAR is projected forward and posterior of the variables Et rt+i and Rt,T are obtained . iii) Then we move forward one period, adding one data point to the estimation window, and go back to point i). 
Figure 11: Excess Mean Absolute Forecast Errors of the UVAR with respect to the EH. Rolling (5-year window) estimates, with bounds of 2 HAC standard errors, of the parameter $\alpha$ in the following regression equation: $MAFE_{h,UVAR} - MAFE_{h,EH} = \alpha + u_t$, where h is the number of steps ahead. [Panels: 1- to 12-month ahead, 18-, 21- and 24-month ahead; horizontal axis, 1980 to 2000. Legend: alfaUVAR-EH, 2se HAC bounds.]

Figure 12: Excess Mean Squared Forecast Errors of the UVAR with respect to the EH. Rolling (5-year window) estimates, with bounds of 2 HAC standard errors, of the parameter $\beta$ in the following regression equation: $MSFE_{h,UVAR} - MSFE_{h,EH} = \beta + v_t$, where h is the number of steps ahead. [Panels: 1- to 12-month ahead, 18-, 21- and 24-month ahead; horizontal axis, 1980 to 2000. Legend: betaUVAR-EH, 2se HAC bounds.]
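The regressions in Figures 11 and 12 have only a constant on the right-hand side, so the estimate is the mean loss differential in each window and the HAC standard error is the square root of its long-run variance divided by the window length. The sketch below is one way to compute this, under assumptions of our own: a Newey-West (Bartlett kernel) estimator with a user-chosen number of lags, a 60-observation window for monthly data, and hypothetical function names. The paper does not state which HAC estimator or lag length it uses.

```python
import numpy as np

def newey_west_mean(d, lags):
    """Mean of d and its Newey-West (Bartlett kernel) standard error."""
    d = np.asarray(d, dtype=float)
    n = d.size
    u = d - d.mean()
    lrv = u @ u / n                            # lag-0 autocovariance
    for k in range(1, lags + 1):
        weight = 1.0 - k / (lags + 1.0)        # Bartlett weights
        lrv += 2.0 * weight * (u[k:] @ u[:-k]) / n
    return d.mean(), np.sqrt(lrv / n)

def rolling_excess_loss(err_a, err_b, window=60, lags=4, loss=np.abs):
    """Rolling estimate of the constant in loss(err_a) - loss(err_b) = const + u_t.

    Returns one row per window: (estimate, lower bound, upper bound),
    where the bounds are the estimate minus/plus 2 HAC standard errors.
    """
    d = loss(np.asarray(err_a, dtype=float)) - loss(np.asarray(err_b, dtype=float))
    rows = []
    for end in range(window, d.size + 1):
        est, se = newey_west_mean(d[end - window:end], lags)
        rows.append((est, est - 2.0 * se, est + 2.0 * se))
    return np.array(rows)
```

Passing `loss=np.abs` gives the absolute-error version (Figure 11) and `loss=np.square` the squared-error version (Figure 12). For h-step-ahead errors, a lag length of at least h - 1 is a common choice because multi-step forecast errors overlap.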
Figure 13: EH prior versus UVAR: rolling (5-year window) Giacomini and White (2004) test statistic for the null of equal conditional predictive accuracy. The statistic is computed both for the MAFE (solid) and the MSFE (dotted) loss function. If the statistic is above the critical value, the null of equal accuracy can be rejected. [Panels: 1- to 12-month ahead, 18-, 21- and 24-month ahead; horizontal axis, 1970 to 2000. Legend: MSFE EH vs UVAR, MAFE EH vs UVAR, Critical Value.]
Figure 14: EH prior versus Minnesota prior: rolling (5-year window) Giacomini and White (2004) test statistic for the null of equal conditional predictive accuracy. The statistic is computed both for the MAFE (solid) and the MSFE (dotted) loss function. If the statistic is above the critical value, the null of equal accuracy can be rejected. [Panels: 1- to 12-month ahead, 18-, 21- and 24-month ahead; horizontal axis, 1970 to 2000. Legend: MSFE EH vs Minnesota, MAFE EH vs Minnesota, Critical Value.]
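For readers unfamiliar with the test statistic plotted in Figures 13 and 14, the following is a minimal sketch of the one-step-ahead Giacomini and White statistic. The choice of instruments (a constant and the lagged loss differential) is an assumption made here for illustration, as is the function name; the paper's captions do not state the instruments, and horizons beyond one step would require a HAC estimate of the covariance matrix, which this sketch omits.

```python
import numpy as np

def gw_statistic_one_step(loss_a, loss_b):
    """One-step Giacomini-White statistic for equal conditional predictive accuracy.

    loss_a, loss_b: per-period losses (e.g. absolute or squared forecast
    errors) of the two competing forecasts.
    Assumed instruments: h_t = (1, d_t), with d_t the loss differential.
    Under the null the statistic is asymptotically chi-square with 2 degrees
    of freedom (5% critical value: 5.991).
    """
    d = np.asarray(loss_a, dtype=float) - np.asarray(loss_b, dtype=float)
    h = np.column_stack([np.ones(d.size - 1), d[:-1]])  # known at time t
    z = h * d[1:, None]                                 # h_t * d_{t+1}
    n = z.shape[0]
    zbar = z.mean(axis=0)
    omega = z.T @ z / n                                 # sample second moment of z
    return float(n * zbar @ np.linalg.solve(omega, zbar))
```

Applying the function to each 5-year window of losses in turn would give a rolling series comparable in spirit to the plotted lines, with the absolute-error losses corresponding to the MAFE line and the squared-error losses to the MSFE line.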
