import numpy as np
import pandas as pd
import pickle

from ruspy.estimation.criterion_function import get_criterion_function
from ruspy.simulation.simulation import simulate
from ruspy.model_code.fix_point_alg import calc_fixp
from ruspy.model_code.cost_functions import lin_cost
from ruspy.model_code.cost_functions import calc_obs_costs
from ruspy.estimation.estimation_transitions import create_transition_matrix
from auxiliary_iskhakov import check_simulated_data
from estimagic import minimize

Replication of Iskhakov et al. (2016)

In this notebook we seek to replicate Iskhakov et al. (2016) with some notable differences in the setup which we will explain in the course of this notebook. Before making your way through this notebook it is advisable to first have a look at the other replication notebook to get you started on how to use the ruspy package.

To familiarize you with the set up in the paper let us start with a few key points to understand the remainder. Iskhakov et al. (2016) follow up on Su and Judd (2012) in which they had compared the performance of Mathematical Programming with Equlibrium Constraints (MPEC) and the Nested Fixed Point Algorithm (NFXP) for the canonical structural model of bus engine replacement based on Rust (1987). The aim of Iskhakov et al. is mainly to improve the setup of the NFXP of Su and Judd following the explanations of Rust (2000). Their revised comparison between MPEC and NFXP shows that both approaches perform similarly for a Monte Carlo simulation of the bus engine replacement problem.

We replicate parts of this Monte Carlo simulation below using our ruspy package in which we have implemented MPEC as well as the NFXP in pure Python. This is gives us the flexibility to easily experiment with key assumptions such as tolerance or optimization algorithms used in one common interface.

In the boxes just below we obtain all the key results needed to replicate Table I of Iskhakov et al. For that we simulate the data using the ruspy package and based on the data generating process described in Iskhakov et al. (2016). There they use the setup in which we have 120 time periods and 50 buses. The cost function is assumed to be linear taking the following form \(c(x, \theta_1) = 0.001 \theta_{11} x\). The true parameters are the following:

\begin{equation} RC=11.7257, \\ \theta_{11}=2.4569, \\ \theta_3=(0.0937, 0.4475, 0.4459, 0.0127, 0.0002). \end{equation}

The true parameter for \(\beta\) is varied: \(\beta\in \{0.975, 0.985, 0.995, 0.999, 0.9995, 0.9999\}\). For each \(\beta\) 250 data sets are simulated. Those are then estimated using five different starting values for the replacement cost \(RC\) and the linear cost parameter \(\theta_{11}\). The following are the five different starting values:

\begin{align} (RC^0, \theta^0_{11}) \in \{(4,1), (5,2), (6,3), (7,4), (8,5)\} \end{align}

In the case of MPEC also starting values for the discretized expected value function are necessary and they are set to the zero vector every time: \(EV^0_1,...,EV^0_{175}=(0,...,0)\). The subscript of 175 also displays the grid size that Iskhakov et al. choose to discretize the continuous variable of mileage. The starting values of the transition probabilities are frequency based in the paper and we follow this approach in our implementation.

All those ingredients can be found below in the code. Further in our code for the NFXP we specify stopping criteria for the fixed point calculation as well as the tolerance at which we switch from contraction steps to Newton-Kantorovich (N-K) steps. Here, there is a first difference in our implementation of the NFXP as there is no switching back from N-K to contraction steps implemented. Furthermore the switching tolerance is solely an absolute one. For the routine to maximize the likelihood function we use the scipy L-BFGS-B povided by estimagic which uses a combination of relative and absolute stopping tolerance which also does not exactly match the original paper. Further the authors employ the BHHH algorithm instead of the L-BFGS-B which we will follow as well as soon as the algorithm is available on estimagic.

For MPEC we could not rely on KNITRO as in the paper as it is not freely available. For the recreation of the table in Iskhakov et al. we decided to use nlopt here. Again the stopping tolerances cannot exactly be set to those of KNITRO in the paper. Another notable difference is that we only give analytical first order derivatives to nlopt. In the paper, on top of that second order analytical derivatives are provided by using automatic differentiation and also the sparsity patterns of both order derivatives are passed in. Apart from that, though, our setup replicates the paper by using upper and lower bounds as well as a properly recentering the expected value function.

A last difference in our setup that affects both NFXP and MPEC is that we estimate the transition probabilities separately from the cost parameters. In the original paper for MPEC those are estimated jointly and for the NFXP the cost parameters are first estimated partially and then after that together with the transition probabilities.

For the sake of simplicity and comprehensibility, we decide to solely present the output for the discount factor \(0.975\) paired with 100 simulation runs and only \((4, 1)\) as starting values for the cost parameter. This setup is selected below. But we also give the option to run the full simulation as done in Iskhakov et al.

discount_factor = [0.975]
number_runs = 100
starting_cost_params = np.array([[4], [1]])

In order to run the full simulation as described above, please uncomment the cell below and run it. The simulation then might take quite a while.

# # uncomment and run this cell for a whole replication of Iskhakov et al. (2016)
# discount_factor = [0.975, 0.985, 0.995, 0.999, 0.9995, 0.9999]
# number_runs = 250
# starting_cost_params = np.vstack((np.arange(4,9), np.arange(1,6)))

For ruspy to run the estimation using either NFXP or MPEC we need to pass in a dictionairy plus the data to the function get_criterion_function. The cost parameters are estimated by minimizing the resulting criterion function using the minimize function of estimagic. Below we create one dictionairy per approach (NFXP or MPEC) and adapt it accordingly within the nested for-loop below depending on which discount factor \(\beta\) and which starting values are used in the respective run.

# Initialize the simulation
approach = ["NFXP", "MPEC"]
starting_expected_value_fun = np.zeros(175)
number_buses = 50
number_periods = 120
number_states = 175
number_cost_params = 2
scale = 1e-3

# Initialize the set up for the nested fixed point algorithm
stopping_crit_fixed_point = 1e-13
switch_tolerance_fixed_point = 1e-2

init_dict_nfxp = {
    "model_specifications": {
        "num_states": number_states,
        "maint_cost_func": "linear",
        "cost_scale": scale,
    "method": "NFXP",
    "alg_details": {
        "threshold": stopping_crit_fixed_point,
        "switch_tol": switch_tolerance_fixed_point,

init_dict_mpec = {
    "model_specifications": {
        "num_states": number_states,
        "maint_cost_func": "linear",
        "cost_scale": scale,
    "method": "MPEC",
# Initialize DataFrame to store the results of each run of the Monte Carlo simulation
# does not store "CPU Time", "# of Major Iter.", "# of Func. Eval.", "# of Bellm. Iter.", "# of N-K Iter."
index = pd.MultiIndex.from_product(
    names=["Discount Factor", "Run", "Start", "Approach"],

columns = [

results = pd.DataFrame(index=index, columns=columns)
# set seed

saved_data = {}
# Main loop to calculate the results for each run
for factor in discount_factor:
    saved_data[factor] = {}
    # set up simulation of data
    init_dict_simulation = {
        "simulation": {
            "discount_factor": factor,
            "periods": number_periods,
            "buses": number_buses,
    params = np.array([11.7257, 2.4569])
    trans_probs = np.array([0.0937, 0.4475, 0.4459, 0.0127, 0.0002])

    # Calculate objects necessary for the simulation process. See documentation for details.
    costs = calc_obs_costs(number_states, lin_cost, params, scale)
    trans_mat = create_transition_matrix(number_states, trans_probs)
    ev = calc_fixp(trans_mat, costs, factor)[0]

    for run in range(number_runs):
        # simulate the data
        data = simulate(init_dict_simulation["simulation"], ev, costs, trans_mat)
        saved_data[factor][run] = data
        for start in range(starting_cost_params.shape[1]):

            init_params = starting_cost_params[:, start]

            # Adapt the Initiation Dictionairy of NFXP for this run
            init_dict_nfxp["model_specifications"]["discount_factor"] = factor

            # Run NFXP using ruspy and estimagic
            function_dict_nfxp, transition_result_nfxp = get_criterion_function(
                init_dict_nfxp, data

            cost_result_nfxp = minimize(
            # convergence
            status_nfxp = np.array([int(cost_result_nfxp.success)])

            # store the results of this run
            results.loc[factor, run, start, "NFXP"] = np.concatenate(
                (cost_result_nfxp.params, transition_result_nfxp["x"][:4], status_nfxp)

            # Adapt the Initiation Dictionairy of MPEC for this run
            init_dict_mpec["model_specifications"]["discount_factor"] = factor

            # Run MPEC using ruspy and estimagic
            function_dict_mpec, result_transitions_mpec = get_criterion_function(
                init_dict_mpec, data

            cost_result_mpec = minimize(
                params=np.concatenate((np.zeros(number_states), init_params)),
                    "type": "nonlinear",
                    "func": function_dict_mpec["constraint"],
                    "derivative": function_dict_mpec["constraint_derivative"],
                    "value": np.zeros(number_states, dtype=float),
            # convergence
            status_mpec = np.array([int(cost_result_mpec.success)])

            # store the results of this run
            results.loc[factor, run, start, "MPEC"] = np.concatenate(

To give you an idea of what the above loop has just produced, let us have a look at the results.

RC theta_11 theta_30 theta_31 theta_32 theta_33 Converged
Discount Factor Run Start Approach
0.975 0 0 NFXP 15.227656 3.688777 0.091333 0.4475 0.446 0.014667 1.0
MPEC 15.227726 3.6888 0.091333 0.4475 0.446 0.014667 1.0
1 0 NFXP 11.963511 2.49318 0.091 0.4535 0.444833 0.010667 1.0
MPEC 11.963561 2.493201 0.091 0.4535 0.444833 0.010667 1.0
2 0 NFXP 12.225457 2.356033 0.0825 0.460333 0.444833 0.012167 1.0
... ... ... ... ... ... ... ... ... ...
97 0 MPEC 12.320376 2.66534 0.091833 0.447 0.448833 0.012333 1.0
98 0 NFXP 10.955414 2.197645 0.091333 0.4465 0.447333 0.0145 1.0
MPEC 10.954181 2.197365 0.091333 0.4465 0.447333 0.0145 1.0
99 0 NFXP 11.483554 2.334562 0.094167 0.4485 0.444667 0.0125 1.0
MPEC 11.483555 2.334563 0.094167 0.4485 0.444667 0.0125 1.0

200 rows × 7 columns

As we have actually relied on our own simulated data and do not use the one provided by Iskhakov et al., we calculate some some key statistics of theirs and ours to validate our simulated data and ensure that our results are actually comparrable to theirs. Some selected statistics are presented below for our simulated data sets.

check_simulated_data(saved_data, discount_factor, number_runs)
Average State at Replacement Average of all States Average Replacement
Discount Factor
0.975 125.361716 60.184252 0.007158

This is followed by the same statistics for the simulated data of Iskhakov et al.

pd.read_pickle("check_simulated_data_iskhakov.pickle").loc[discount_factor, :]
Average State at Replacement Average of all States Average Replacement
Discount Factor
0.975 125.011564 60.088817 0.007145

Before we proceed to Table I in Iskhakov et al., we give out the means and standard deviations of our parameter estimations like it was done by Su and Judd (2012) in Table I.

# Create Table I from Su & Judd (2012) with the simulated values from Iskahkov et al. (2016)
columns_table_1 = ["RC", "theta_11", "theta_30", "theta_31", "theta_32", "theta_33"]
table_1_temp = results.loc[results["Converged"] == 1, columns_table_1].astype(float).groupby(
    level=["Discount Factor", "Approach"])

statistic = ["Mean", "Standard Deviation"]
index = pd.MultiIndex.from_product([discount_factor, approach, statistic],
                                   names=["Discount Factor", "Approach", "Statistic"])
table_1 = pd.DataFrame(index=index, columns=columns_table_1)
table_1.loc(axis=0)[:,:,"Mean"] = table_1_temp.mean()
table_1.loc(axis=0)[:,:,"Standard Deviation"] = table_1_temp.std()
RC theta_11 theta_30 theta_31 theta_32 theta_33
Discount Factor Approach Statistic
0.975 NFXP Mean 11.983 2.525 0.094 0.447 0.446 0.013
Standard Deviation 1.479 0.434 0.004 0.006 0.006 0.002
MPEC Mean 11.982 2.525 0.094 0.447 0.446 0.013
Standard Deviation 1.479 0.434 0.004 0.006 0.006 0.002

To get an idea of the quality of our estimates, we compare this now to the results obtained by Iskhakov et al. using their original matlab code for the implementation of their NFXP. Those results were not published in the paper.

# Create a Table with the results
index = pd.MultiIndex.from_product([discount_factor, statistic],
                                   names=["Discount Factor", "Statistic"])
NFXP_Iskhakov = pd.DataFrame(index=index, columns=["RC", "theta_11"])
# load stored results
results_iskhakov = pickle.load(open("results_iskhakov.pickle", "rb"))
for factor in discount_factor:
    NFXP_Iskhakov_temp = results_iskhakov[factor]
    NFXP_Iskhakov.loc[factor, "Mean"] = NFXP_Iskhakov_temp.mean(axis=0)
    NFXP_Iskhakov.loc[factor, "Standard Deviation"] = NFXP_Iskhakov_temp.std(axis=0)
RC theta_11
Discount Factor Statistic
0.975 Mean 11.914 2.508
Standard Deviation 1.517 0.468


Gabler, J., “A Python Tool for the Estimation of (Structural) Econometric Models.”, unpublished (2019), https://github.com/OpenSourceEconomics/estimagic

Iskhakov, F., Lee, J., Rust, J., Schjerning, B. and Seo, K. (2016), Comment on “Constrained Optimization Approaches to Estimation of Structural Models”. Econometrica, 84: 365-370. doi:10.3982/ECTA12605

Johnson, Steven G., The NLopt nonlinear-optimization package, http://github.com/stevengj/nlopt

Rust, John. “Optimal Replacement of GMC Bus Engines: An Empirical Model of Harold Zurcher.” Econometrica 55, no. 5 (1987): 999-1033. Accessed June 7, 2020. doi:10.2307/1911259.

Rust, John. “Nested fixed point algorithm documentation manual.” Unpublished Manuscript (6) (2000): 1-43.

Su, C.‐L. and Judd, K.L. (2012), Constrained Optimization Approaches to Estimation of Structural Models. Econometrica, 80: 2213-2230. doi:10.3982/ECTA7925

Wächter, A. and Biegler, L. T., On the Implementation of a Primal-Dual Interior Point Filter Line Search Algorithm for Large-Scale Nonlinear Programming, Mathematical Programming 106(1), pp. 25-57, 2006 (preprint), https://github.com/coin-or/Ipopt