Estimate a simple model - Schorfheide (2000)

This tutorial shows the workflow for estimating a model using the No-U-Turn sampler (NUTS). It works with a benchmark model for estimation, so the results can be compared to those from other software packages (e.g. Dynare).

Define the model

The first step is always to name the model and write down the equations. For the Schorfheide (2000) model this would go as follows:

julia> using MacroModelling
julia> @model FS2000 begin
           dA[0] = exp(gam + z_e_a * e_a[x])
           log(m[0]) = (1 - rho) * log(mst) + rho * log(m[-1]) + z_e_m * e_m[x]
           - P[0] / (c[1] * P[1] * m[0]) + bet * P[1] * (alp * exp( - alp * (gam + log(e[1]))) * k[0] ^ (alp - 1) * n[1] ^ (1 - alp) + (1 - del) * exp( - (gam + log(e[1])))) / (c[2] * P[2] * m[1]) = 0
           W[0] = l[0] / n[0]
           - (psi / (1 - psi)) * (c[0] * P[0] / (1 - n[0])) + l[0] / n[0] = 0
           R[0] = P[0] * (1 - alp) * exp( - alp * (gam + z_e_a * e_a[x])) * k[-1] ^ alp * n[0] ^ ( - alp) / W[0]
           1 / (c[0] * P[0]) - bet * P[0] * (1 - alp) * exp( - alp * (gam + z_e_a * e_a[x])) * k[-1] ^ alp * n[0] ^ (1 - alp) / (m[0] * l[0] * c[1] * P[1]) = 0
           c[0] + k[0] = exp( - alp * (gam + z_e_a * e_a[x])) * k[-1] ^ alp * n[0] ^ (1 - alp) + (1 - del) * exp( - (gam + z_e_a * e_a[x])) * k[-1]
           P[0] * c[0] = m[0]
           m[0] - 1 + d[0] = l[0]
           e[0] = exp(z_e_a * e_a[x])
           y[0] = k[-1] ^ alp * n[0] ^ (1 - alp) * exp( - alp * (gam + z_e_a * e_a[x]))
           gy_obs[0] = dA[0] * y[0] / y[-1]
           gp_obs[0] = (P[0] / P[-1]) * m[-1] / dA[0]
           log_gy_obs[0] = log(gy_obs[0])
           log_gp_obs[0] = log(gp_obs[0])
       end
Model:        FS2000
Variables
 Total:       18
  Auxiliary:  2
 States:      4
  Auxiliary:  0
 Jumpers:     7
  Auxiliary:  2
Shocks:       2
Parameters:   9

First, we load the package and then use the @model macro to define our model. The first argument after @model is the model name, which becomes the name of the object in the global environment containing all information regarding the model. The second argument to the macro is the set of equations, written between begin and end. Equations can contain an equality sign; otherwise the expression is assumed to equal 0. Equations cannot span multiple lines (unless you wrap the expression in brackets), and the timing of endogenous variables is expressed in the square brackets following the variable name (e.g. [-1] for the previous period). Exogenous variables (shocks) are followed by a keyword in square brackets indicating that they are exogenous (in this case [x]). Note that names can leverage Julia's unicode capabilities (e.g. alpha can be written as α). A minimal sketch illustrating this syntax follows below.
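To make the syntax concrete, here is a minimal sketch of a hypothetical RBC-style model (not part of the Schorfheide (2000) example; output omitted) showing the timing brackets, the [x] shock marker, and unicode parameter names:

julia> @model RBC_sketch begin
           # Euler equation: consumption today vs. expected return tomorrow
           1 / c[0] = (β / c[1]) * (α * exp(z[1]) * k[0] ^ (α - 1) + (1 - δ))
           # resource constraint: k[-1] is last period's capital stock
           c[0] + k[0] = (1 - δ) * k[-1] + q[0]
           # production function
           q[0] = exp(z[0]) * k[-1] ^ α
           # AR(1) productivity; ϵ[x] marks ϵ as an exogenous shock
           z[0] = ρ * z[-1] + std_z * ϵ[x]
       end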

Define the parameters

Next we need to add the parameters of the model. The macro @parameters takes care of this:

julia> @parameters FS2000 begin
           alp     = 0.356
           bet     = 0.993
           gam     = 0.0085
           mst     = 1.0002
           rho     = 0.129
           psi     = 0.65
           del     = 0.01
           z_e_a   = 0.035449
           z_e_m   = 0.008862
       end
Remove redundant variables in non stochastic steady state problem:	1.874 seconds
Set up non stochastic steady state problem:				2.482 seconds
Take symbolic derivatives up to first order:				0.5 seconds
Find non stochastic steady state:					4.049 seconds
Model:        FS2000
Variables
 Total:       18
  Auxiliary:  2
 States:      4
  Auxiliary:  0
 Jumpers:     7
  Auxiliary:  2
Shocks:       2
Parameters:   9

The block defining the parameters above contains only simple parameter definitions, written the same way you would assign a value in Julia (e.g. alp = 0.356).

Note that we have to write one parameter definition per line.
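Continuing the hypothetical RBC sketch from the previous section (output omitted), its parameter block would follow the same pattern, one definition per line:

julia> @parameters RBC_sketch begin
           α     = 0.5
           β     = 0.95
           δ     = 0.02
           ρ     = 0.2
           std_z = 0.01
       end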

Load data

Given the equations and parameters, we only need those entries in the data that correspond to the observables in the model (they must have the exact same names) in order to estimate the model. First, we load the data from a CSV file (using the CSV and DataFrames packages) and convert it to a KeyedArray (using the AxisKeys package). Furthermore, we log-transform the data provided in levels, and last but not least we select only those variables in the data which are observables in the model.

julia> using CSV, DataFrames, AxisKeys
       
       # load data
julia> dat = CSV.read("../assets/FS2000_data.csv", DataFrame)
192×2 DataFrame
 Row │ gy_obs    gp_obs
     │ Float64   Float64
─────┼────────────────────
   1 │ 1.03836   0.99556
   2 │ 1.0267    1.0039
   3 │ 1.03352   1.02277
   4 │ 1.0163    1.01683
   5 │ 1.00469   1.03897
   6 │ 1.01428   1.00462
   7 │ 1.01456   1.00051
   8 │ 0.996854  1.01176
  ⋮  │    ⋮         ⋮
 186 │ 1.0126    1.00303
 187 │ 1.00279   1.00458
 188 │ 1.00813   1.00447
 189 │ 1.00849   1.00699
 190 │ 1.00764   1.00387
 191 │ 1.00797   1.00296
 192 │ 1.00489   1.00286
           177 rows omitted
julia> data = KeyedArray(Array(dat)', Variable = Symbol.("log_" .* names(dat)), Time = 1:size(dat)[1])
2-dimensional KeyedArray(NamedDimsArray(...)) with keys:
↓   Variable ∈ 2-element Vector{Symbol}
→   Time ∈ 192-element UnitRange{Int64}
And data, 2×192 adjoint(::Matrix{Float64}) with eltype Float64:
                 (1)        (2)        (3)       …  (191)      (192)
  (:log_gy_obs)   1.03836    1.0267     1.03352      1.00797    1.00489
  (:log_gp_obs)   0.99556    1.0039     1.02277      1.00296    1.00286
julia> data = log.(data) # log transform data provided in levels
2-dimensional KeyedArray(NamedDimsArray(...)) with keys:
↓   Variable ∈ 2-element Vector{Symbol}
→   Time ∈ 192-element UnitRange{Int64}
And data, 2×192 Matrix{Float64}:
                 (1)           (2)          …  (191)         (192)
  (:log_gy_obs)    0.0376443     0.026348        0.00793817    0.00487735
  (:log_gp_obs)   -0.0044494     0.0038943       0.00295712    0.00285919
julia> observables = sort(Symbol.("log_" .* names(dat))) # declare observables
2-element Vector{Symbol}:
 :log_gp_obs
 :log_gy_obs
julia> data = data(observables, :) # subset observables in data
2-dimensional KeyedArray(NamedDimsArray(...)) with keys:
↓   Variable ∈ 2-element view(::Vector{Symbol},...)
→   Time ∈ 192-element UnitRange{Int64}
And data, 2×192 view(::Matrix{Float64}, [2, 1], :) with eltype Float64:
                 (1)           (2)          …  (191)         (192)
  (:log_gp_obs)   -0.0044494    0.0038943       0.00295712    0.00285919
  (:log_gy_obs)    0.0376443    0.026348        0.00793817    0.00487735

Define Bayesian model

Next we define the parameter priors using the Turing package. The @model macro of the Turing package allows us to define the prior distributions over the parameters and combine them with the (Kalman filter) loglikelihood of the model and parameters given the data, which we obtain with the get_loglikelihood function. We define the prior distributions in an array and pass it on to the arraydist function inside the @model macro from the Turing package. It is also possible to define the prior distributions inside the macro, but especially for reverse-mode automatic differentiation the arraydist function is substantially faster. When defining the prior distributions we can rely on the distributions implemented in the Distributions package. Note that the μσ keyword argument allows us to hand over the moments (μ and σ) of the distribution as parameters in the case of the non-normal distributions (Gamma, Beta, InverseGamma), and we can also define upper and lower bounds truncating the distribution as the third and fourth arguments to the distribution functions (see the sketch after the code below). Last but not least, we define the loglikelihood and add it to the posterior loglikelihood with the help of the @addlogprob! macro.

julia> import Turing
julia> import Turing: NUTS, sample, logpdf
julia> prior_distributions = [
           Beta(0.356, 0.02, μσ = true),           # alp
           Beta(0.993, 0.002, μσ = true),          # bet
           Normal(0.0085, 0.003),                  # gam
           Normal(1.0002, 0.007),                  # mst
           Beta(0.129, 0.223, μσ = true),          # rho
           Beta(0.65, 0.05, μσ = true),            # psi
           Beta(0.01, 0.005, μσ = true),           # del
           InverseGamma(0.035449, Inf, μσ = true), # z_e_a
           InverseGamma(0.008862, Inf, μσ = true)  # z_e_m
       ]
9-element Vector{Distributions.Distribution{Distributions.Univariate, Distributions.Continuous}}:
 Distributions.Beta{Float64}(α=203.68895999999998, β=368.47104)
 Distributions.Beta{Float64}(α=1724.5927500000014, β=12.157249999999973)
 Distributions.Normal{Float64}(μ=0.0085, σ=0.003)
 Distributions.Normal{Float64}(μ=1.0002, σ=0.007)
 Distributions.Beta{Float64}(α=0.16246596553318987, β=1.0969601238713826)
 Distributions.Beta{Float64}(α=58.49999999999999, β=31.49999999999999)
 Distributions.Beta{Float64}(α=3.95, β=391.05)
 Distributions.InverseGamma{Float64}(
 invd: Distributions.Gamma{Float64}(α=2.0, θ=28.2095404665858)
 θ: 0.035449
 )
 Distributions.InverseGamma{Float64}(
 invd: Distributions.Gamma{Float64}(α=2.0, θ=112.84134506883322)
 θ: 0.008862
 )
julia> Turing.@model function FS2000_loglikelihood_function(data, model)
           parameters ~ Turing.arraydist(prior_distributions)
           Turing.@addlogprob! get_loglikelihood(model, data, parameters)
       end
FS2000_loglikelihood_function (generic function with 2 methods)
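To illustrate the truncation mentioned above: bounds are passed as the third and fourth arguments to the distribution functions. The bounds below are purely illustrative, and this sketch assumes the distribution wrappers exported by MacroModelling accept them in this position, as described:

julia> Beta(0.356, 0.02, 0.2, 0.6, μσ = true); # alp prior truncated to [0.2, 0.6] (hypothetical bounds)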

Sample from posterior: No-U-Turn Sampler (NUTS)

We use the NUTS sampler to retrieve the posterior distribution of the parameters. This sampler uses the gradient of the posterior loglikelihood with respect to the model parameters to navigate the parameter space. The NUTS sampler is considered robust, fast, and user-friendly (auto-tuning of hyper-parameters).

First we define the loglikelihood model with the specific data and model. Next, we draw 1000 samples from the model:

julia> FS2000_loglikelihood = FS2000_loglikelihood_function(data, FS2000);
julia> n_samples = 1000
1000
julia> chain_NUTS = sample(FS2000_loglikelihood, NUTS(), n_samples, progress = false);
┌ Info: Found initial step size
└   ϵ = 0.00625
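A quick numerical summary of the draws can be obtained from the chain object; this is a usage sketch relying on the summarystats function from the MCMCChains package (imported explicitly here, and loaded again further below):

julia> import MCMCChains
julia> MCMCChains.summarystats(chain_NUTS);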

Inspect posterior

In order to understand the posterior distribution and the sequence of samples, we plot them:

julia> using StatsPlots
julia> StatsPlots.plot(chain_NUTS);

NUTS chain

Next, we plot the posterior loglikelihood along two parameter dimensions, with the other parameters kept at the posterior mean, and add the samples to the visualisation. This visualisation allows us to understand the curvature of the posterior and puts the samples in context.

julia> using ComponentArrays, MCMCChains, DynamicPPL
WARNING: using DynamicPPL.@model in module Main conflicts with an existing identifier.
julia> parameter_mean = mean(chain_NUTS)
Mean
   parameters      mean
       Symbol   Float64

  parameters[1]   0.4040
  parameters[2]   0.9905
  parameters[3]   0.0046
  parameters[4]   1.0141
  parameters[5]   0.8477
  parameters[6]   0.6822
  parameters[7]   0.0025
  parameters[8]   0.0138
  parameters[9]   0.0033
julia> pars = ComponentArray([parameter_mean.nt[2]], Axis(:parameters));
julia> logjoint(FS2000_loglikelihood, pars)
1343.412656463074
julia> function calculate_log_probability(par1, par2, pars_syms, orig_pars, model)
           orig_pars[1][pars_syms] = [par1, par2]
           logjoint(model, orig_pars)
       end
calculate_log_probability (generic function with 1 method)
julia> granularity = 32;
julia> par1 = :del;
julia> par2 = :gam;
julia> paridx1 = indexin([par1], FS2000.parameters)[1];
julia> paridx2 = indexin([par2], FS2000.parameters)[1];
julia> par_range1 = collect(range(minimum(chain_NUTS[Symbol("parameters[$paridx1]")]), stop = maximum(chain_NUTS[Symbol("parameters[$paridx1]")]), length = granularity));
julia> par_range2 = collect(range(minimum(chain_NUTS[Symbol("parameters[$paridx2]")]), stop = maximum(chain_NUTS[Symbol("parameters[$paridx2]")]), length = granularity));
julia> p = surface(par_range1, par_range2, (x,y) -> calculate_log_probability(x, y, [paridx1, paridx2], pars, FS2000_loglikelihood), camera=(30, 65), colorbar=false, color=:inferno);
julia> joint_loglikelihood = [logjoint(FS2000_loglikelihood, ComponentArray([reduce(hcat, get(chain_NUTS, :parameters)[1])[s,:]], Axis(:parameters))) for s in 1:length(chain_NUTS)];
julia> scatter3d!(vec(collect(chain_NUTS[Symbol("parameters[$paridx1]")])), vec(collect(chain_NUTS[Symbol("parameters[$paridx2]")])), joint_loglikelihood, mc = :viridis, marker_z = collect(1:length(chain_NUTS)), msw = 0, legend = false, colorbar = false, xlabel = string(par1), ylabel = string(par2), zlabel = "Log probability", alpha = 0.5);
julia> p
Plot{Plots.GRBackend() n=2}

Posterior surface

Find posterior mode

Besides the mean and median of the posterior distribution, we can also calculate its mode. To this end we use the L-BFGS optimisation routine from the Optim package.

First, we define the posterior loglikelihood function, similar to how we defined it for the Turing model macro.

julia> function calculate_posterior_loglikelihood(parameters, prior_distributions)
           alp, bet, gam, mst, rho, psi, del, z_e_a, z_e_m = parameters
           log_lik = 0
           log_lik -= get_loglikelihood(FS2000, data, parameters)

           for (dist, val) in zip(prior_distributions, parameters)
               log_lik -= logpdf(dist, val)
           end

           return log_lik
       end
calculate_posterior_loglikelihood (generic function with 1 method)

Next, we set up the optimisation problem, parameter bounds, and use the optimizer L-BFGS.

julia> using Optim, LineSearches
julia> lbs = [0,0,-10,-10,0,0,0,0,0];
julia> ubs = [1,1,10,10,1,1,1,100,100];
julia> sol = optimize(x -> calculate_posterior_loglikelihood(x, prior_distributions), lbs, ubs, FS2000.parameter_values, Fminbox(LBFGS(linesearch = LineSearches.BackTracking(order = 3))); autodiff = :forward)
 * Status: success

 * Candidate solution
    Final objective value:     -1.343749e+03

 * Found with
    Algorithm:     Fminbox with L-BFGS

 * Convergence measures
    |x - x'|               = 1.20e-17 ≰ 0.0e+00
    |x - x'|/|x'|          = 6.52e-18 ≰ 0.0e+00
    |f(x) - f(x')|         = 0.00e+00 ≤ 0.0e+00
    |f(x) - f(x')|/|f(x')| = 0.00e+00 ≤ 0.0e+00
    |g(x)|                 = 1.64e-03 ≰ 1.0e-08

 * Work counters
    Seconds run:   19  (vs limit Inf)
    Iterations:    6
    f(x) calls:    487
    ∇f(x) calls:   83
julia> sol.minimum
-1343.7491257511065
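To read off which value belongs to which parameter, the mode estimates can be paired with the parameter names. This is a small convenience sketch; it relies on FS2000.parameters (used earlier for indexing) holding the parameter symbols in the same order as FS2000.parameter_values:

julia> mode_estimates = Dict(FS2000.parameters .=> sol.minimizer); # e.g. mode_estimates[:alp]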

Model estimates given the data and the model solution

Having found the parameters at the posterior mode we can retrieve model estimates of the shocks which explain the data used to estimate it. This can be done with the get_estimated_shocks function:

julia> get_estimated_shocks(FS2000, data, parameters = sol.minimizer)
2-dimensional KeyedArray(NamedDimsArray(...)) with keys:
↓   Shocks ∈ 2-element Vector{Symbol}
→   Periods ∈ 192-element UnitRange{Int64}
And data, 2×192 Matrix{Float64}:
             (1)         (2)        …  (191)         (192)
  (:e_a₍ₓ₎)    3.07802     2.02956          0.31192       0.0219213
  (:e_m₍ₓ₎)   -0.338786    0.529109        -0.455173     -0.596633

As the first argument we pass the model, followed by the data (in levels), and then we pass the parameters at the posterior mode. The model is solved with this parameterisation and the shocks are calculated using the Kalman smoother.

We estimated the model on two variables but our model allows us to look at all variables given the data. Looking at the estimated variables can be done using the get_estimated_variables function:

julia> get_estimated_variables(FS2000, data)
2-dimensional KeyedArray(NamedDimsArray(...)) with keys:
↓   Variables ∈ 18-element Vector{Symbol}
→   Periods ∈ 192-element UnitRange{Int64}
And data, 18×192 Matrix{Float64}:
                 (1)          (2)         …  (191)           (192)
  (:P)             0.585869     0.596628          0.560846        0.559734
  (:Pᴸ⁽¹⁾)         0.585109     0.595479          0.561547        0.560631
  (:R)             1.02576      1.02692           1.01966         1.01858
  (:W)             2.99871      3.01049           2.96335         2.9599
  (:c)             1.73321      1.7013     …      1.80033         1.80164
  (:cᴸ⁽¹⁾)         1.73494      1.7039            1.79935         1.80039
   ⋮                                       ⋱      ⋮
  (:k)            53.4885      52.0319           56.2707         56.2515
  (:l)             0.724385     0.733136          0.705017        0.704388
  (:log_gp_obs)   -0.0044494    0.0038943         0.00295712      0.00285919
  (:log_gy_obs)    0.0376443    0.026348   …      0.00793817      0.00487735
  (:m)             1.01686      1.01822           1.00971         1.00845
  (:n)             0.241598     0.243599          0.237911        0.237975
  (:y)             2.1224       2.11046           2.14601         2.14608

Since we already solved the model with the parameters at the posterior mode we do not need to do so again. The function returns a KeyedArray with the values of the variables in levels at each point in time.
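Since the result is a KeyedArray, individual series can be pulled out by key; a usage sketch relying on AxisKeys' key-lookup syntax:

julia> estimated_variables = get_estimated_variables(FS2000, data);
julia> c_smoothed = estimated_variables(:c, :); # 192-period smoothed series for consumption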

Another useful tool is a historical shock decomposition. It allows us to understand the contribution of the shocks for each variable. This can be done using the get_shock_decomposition function:

julia> get_shock_decomposition(FS2000, data)
3-dimensional KeyedArray(NamedDimsArray(...)) with keys:
↓   Variables ∈ 18-element Vector{Symbol}
→   Shocks ∈ 3-element Vector{Symbol}
◪   Periods ∈ 192-element UnitRange{Int64}
And data, 18×3×192 Array{Float64, 3}:
[showing 3 of 192 slices]
[:, :, 1] ~ (:, :, 1):
                 (:e_a₍ₓ₎)     (:e_m₍ₓ₎)     (:Initial_values)
  (:P)            0.0148888    -0.00108998    0.00810035
  (:Pᴸ⁽¹⁾)        0.0146127    -0.000913525   0.00743931
  (:R)            1.27548e-15  -0.000955352   0.0031234
   ⋮
  (:log_gy_obs)   0.0328867     8.71844e-5    0.000120402
  (:m)           -0.0          -0.00111942    0.00365979
  (:n)            0.00322234    3.46637e-5    0.000868553
  (:y)           -0.018474      0.000187179  -0.00624119

[:, :, 97] ~ (:, :, 97):
                 (:e_a₍ₓ₎)     (:e_m₍ₓ₎)     (:Initial_values)
  (:P)            0.0288477     0.00602704    0.000719125
  (:Pᴸ⁽¹⁾)        0.0283128     0.00503281    0.00070579
  (:R)            5.82183e-15   0.00539571    3.21952e-10
   ⋮
  (:log_gy_obs)  -0.0168618     0.000336676   7.85287e-6
  (:m)            8.35267e-15   0.00632233    3.77241e-10
  (:n)            0.00624342   -0.000223709   0.000155638
  (:y)           -0.0357942    -0.000897024  -0.000892289

[:, :, 192] ~ (:, :, 192):
                 (:e_a₍ₓ₎)     (:e_m₍ₓ₎)     (:Initial_values)
  (:P)            0.00129291   -0.00565098    0.000121484
  (:Pᴸ⁽¹⁾)        0.00126893   -0.00472692    0.000119231
  (:R)           -9.13394e-16  -0.00500928    6.67869e-17
   ⋮
  (:log_gy_obs)   0.000247176   7.87664e-5    1.32661e-6
  (:m)           -1.05673e-15  -0.00586953    8.1532e-17
  (:n)            0.000279819   0.000195656   2.62924e-5
  (:y)           -0.00160423    0.000901755  -0.000150737

We get a 3-dimensional array with variables, shocks, and time periods as dimensions. The shocks dimension also includes the initial values, i.e. the residual between the actual value and what is explained by the shocks. This computation also relies on the Kalman smoother.
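Individual slices of the decomposition can again be extracted by key. A usage sketch; note that the shock labels carry the ₍ₓ₎ subscript shown in the output above, so we build the key with Symbol:

julia> decomposition = get_shock_decomposition(FS2000, data);
julia> decomposition(:y, Symbol("e_a₍ₓ₎"), :); # contribution of e_a to output y in each period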

Last but not least, we can also plot the model estimates and the shock decomposition. The model estimates are plotted using plot_model_estimates:

julia> plot_model_estimates(FS2000, data)
3-element Vector{Any}:
 Plot{Plots.GRBackend() n=38}
 Plot{Plots.GRBackend() n=36}
 Plot{Plots.GRBackend() n=6}

Model estimates

The plot shows the variables of the model (blue), the estimated shocks (in the last panel), and the data (red) used to estimate the model.

The shock decomposition can be plotted using plot_shock_decomposition:

julia> plot_shock_decomposition(FS2000, data)
3-element Vector{Any}:
 Plot{Plots.GRBackend() n=50}
 Plot{Plots.GRBackend() n=52}
 Plot{Plots.GRBackend() n=9}

Shock decomposition

It shows the contribution of the shocks, as well as the contribution of the initial values, to the deviations of the variables.