IVaps.estimator

Treatment estimation functions

Functions

covariate_balance_test([aps, X, Z, data, …])

Covariate Balance Test

covariate_balance_test_controls(aps, X, Z, W)

Covariate Balance Test with added controls

estimate_counterfactual_ml([aps, Y, Z, …])

Estimate counterfactual performance of a new algorithm

estimate_treatment_effect([aps, Y, Z, D, …])

Main treatment effect estimation function

estimate_treatment_effect_controls(aps, Y, …)

Main treatment effect estimation function with added controls

IVaps.estimator.covariate_balance_test(aps=None, X=None, Z=None, data=None, X_ind=None, Z_ind=None, aps_ind=None, X_labels=None, cov_type='robust', verbose: bool = True)[source]

Covariate Balance Test

Parameters
  • aps (array-like, default: None) – Array of estimated APS values

  • X (array-like, default: None) – Array of covariates to test

  • Z (array-like, default: None) – Array of treatment recommendations

  • data (array-like, default: None) – 2D array of estimation inputs

  • X_ind (int/array_of_int, default: None) – Index/indices of covariates in data

  • Z_ind (int, default: None) – Index of treatment recommendation variable in data

  • aps_ind (int, default: None) – Index of APS variable in data

  • X_labels (array-like, default: None) – Array of string labels to associate with each covariate

  • cov_type (str, default: “robust”) – Covariance type of SUR. Any value other than “robust” defaults to simple (nonrobust) covariance.

  • verbose (bool, default: True) – Whether to print output for each test

Returns

Tuple containing the fitted SUR model results and a dictionary containing the results of covariate balance estimation for each covariate as well as the joint hypothesis.

Return type

tuple(SystemResults, dict(X_label, dict(stat_label, value)))

Notes

This function estimates a Seemingly Unrelated Regression (SUR) system, as defined in the linearmodels package.

aps, X, Z, and data must never share columns. The code cannot check this, so verify the inputs before passing them in.

For aps, X, and Z, pass either the variable itself or its index in data. If neither is passed, an error is raised.
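The input rule above can be illustrated with a small sketch. The helper `resolve_input` is hypothetical and is not part of IVaps; it only mirrors the documented behaviour (use the array if given, otherwise look up the column(s) in data, otherwise fail). The use of `ValueError` is an assumption, since the exception type raised by the library is not stated here.

```python
def resolve_input(values=None, data=None, ind=None, name="input"):
    """Hypothetical helper: return `values` if given, else column(s) `ind` of `data`."""
    if values is not None:
        return values
    if data is not None and ind is not None:
        if isinstance(ind, int):
            # Single column, e.g. aps_ind or Z_ind.
            return [row[ind] for row in data]
        # Several columns, e.g. X_ind given as a list of indices.
        return [[row[j] for j in ind] for row in data]
    # Assumed error type; the library only documents that "an error is raised".
    raise ValueError(f"Pass either {name} or both data and {name}_ind")


# Toy 2D input: columns are APS, recommendation Z, and two covariates.
data = [
    [0.30, 1, 25.0, 0],
    [0.70, 0, 41.0, 1],
    [0.50, 1, 33.0, 1],
]

aps = resolve_input(data=data, ind=0, name="aps")
Z = resolve_input(data=data, ind=1, name="Z")
X = resolve_input(data=data, ind=[2, 3], name="X")
```

Passing `aps_ind=0, Z_ind=1, X_ind=[2, 3]` together with `data` would then describe the same inputs as passing the three arrays directly.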

IVaps.estimator.covariate_balance_test_controls(aps, X, Z, W, cov_type='robust', verbose: bool = True)[source]

Covariate Balance Test with added controls

Parameters
  • aps (array-like) – Array of estimated APS values

  • X (array-like) – Array of covariates to test

  • Z (array-like) – Array of treatment recommendations

  • W (array-like) – Array of control variables

  • cov_type (str, default: “robust”) – Covariance type of SUR. Any value other than “robust” defaults to simple (nonrobust) covariance.

  • verbose (bool, default: True) – Whether to print output for each test

Returns

Tuple containing the fitted SUR model results and a dictionary containing the results of covariate balance estimation for each covariate as well as the joint hypothesis.

Return type

tuple(SystemResults, dict(X, dict(stat_label, value)))

Notes

This function estimates a Seemingly Unrelated Regression (SUR) system, as defined in the linearmodels package.

IVaps.estimator.estimate_counterfactual_ml(aps=None, Y=None, Z=None, ml_out=None, cf_ml_out=None, data=None, Y_ind=None, Z_ind=None, ml_out_ind=None, cf_ml_out_ind=None, aps_ind=None, cov_type: str = 'unadjusted', single_nondegen: bool = False, verbose: bool = True)[source]

Estimate counterfactual performance of a new algorithm

Parameters
  • aps (array-like, default: None) – Array of estimated APS values

  • Y (array-like, default: None) – Array of outcome variables

  • Z (array-like, default: None) – Array of treatment recommendations

  • ml_out (array-like, default: None) – Original ML function outputs

  • cf_ml_out (array-like, default: None) – Counterfactual ML function outputs

  • data (array-like, default: None) – 2D array of estimation inputs

  • Y_ind (int, default: None) – Index of outcome variable in data

  • Z_ind (int, default: None) – Index of treatment recommendation variable in data

  • ml_out_ind (int, default: None) – Index of original ML output variable in data

  • cf_ml_out_ind (int, default: None) – Index of counterfactual ML output variable in data

  • aps_ind (int, default: None) – Index of APS variable in data

  • cov_type (str, default: “unadjusted”) – Covariance type of the OLS fit

  • single_nondegen (bool, default: False) – Indicator for whether the original ML algorithm takes on a single non-degenerate value in the sample

  • verbose (bool, default: True) – Whether to print output of estimation

Returns

Tuple containing array of predicted float value scores and fitted OLS results.

Return type

tuple(np.ndarray, OLSResults)

Notes

Counterfactual value is estimated in two steps. First, we fit the OLS regression below on the historical recommendations Z and outcomes Y.

\[Y_i = \beta_0 + \beta_1 Z_i + \beta_2 p^s(X_i;\delta) + \epsilon_i\]

\(\beta_1\) is the estimated effect of the treatment recommendation.

Second, we take the original ML output \(ML\) and the counterfactual ML output \(ML'\) and evaluate the value equation below, where \(\hat{\beta}_{ols}\) is the OLS estimate of \(\beta_1\) from the first step.

\[\hat{V}(ML') = \frac{1}{n} \sum_{i = 1}^n \left(Y_i + \hat{\beta}_{ols}(ML'(X_i) - ML(X_i))\right)\]

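
As a worked illustration of the value equation, the sketch below evaluates it on a four-observation toy sample. The function name `counterfactual_value`, the input numbers, and the coefficient `beta_hat` are all made up for the example; in practice the coefficient would come from the fitted OLS regression, and this is not the library's implementation.

```python
def counterfactual_value(Y, ml_out, cf_ml_out, beta_hat):
    """Sample mean of Y_i + beta_hat * (ML'(X_i) - ML(X_i))."""
    n = len(Y)
    return sum(
        y + beta_hat * (cf - ml)
        for y, ml, cf in zip(Y, ml_out, cf_ml_out)
    ) / n


# Toy inputs (hypothetical values).
Y = [1.0, 0.0, 2.0, 1.0]           # observed outcomes
ml_out = [0.2, 0.4, 0.6, 0.8]      # original ML outputs ML(X_i)
cf_ml_out = [0.3, 0.4, 0.9, 0.6]   # counterfactual ML outputs ML'(X_i)
beta_hat = 0.5                     # assumed OLS estimate of beta_1

value = counterfactual_value(Y, ml_out, cf_ml_out, beta_hat)
print(value)  # approximately 1.025
```

When the counterfactual algorithm produces the same outputs as the original one, the correction term vanishes and the estimate reduces to the sample mean of Y.
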
IVaps.estimator.estimate_treatment_effect(aps=None, Y=None, Z=None, D=None, data=None, Y_ind=None, Z_ind=None, D_ind=None, aps_ind=None, estimator: str = '2SLS', verbose: bool = True)[source]

Main treatment effect estimation function

Parameters
  • aps (array-like, default: None) – Array of estimated APS values

  • Y (array-like, default: None) – Array of outcome variables

  • Z (array-like, default: None) – Array of treatment recommendations

  • D (array-like, default: None) – Array of treatment assignments

  • data (array-like, default: None) – 2D array of estimation inputs

  • Y_ind (int, default: None) – Index of outcome variable in data

  • Z_ind (int, default: None) – Index of treatment recommendation variable in data

  • D_ind (int, default: None) – Index of treatment assignment variable in data

  • aps_ind (int, default: None) – Index of APS variable in data

  • estimator (str, default: “2SLS”) – Method of IV estimation

  • verbose (bool, default: True) – Whether to print output of estimation

Returns

Fitted IV model object

Return type

IVResults

Notes

Treatment effect is estimated using IV estimation. The default is to use the 2SLS method of estimation, with the equations illustrated below.

\[\begin{aligned} D_i &= \gamma_0(1-I) + \gamma_1 Z_i + \gamma_2 p^s(X_i;\delta) + v_i \\ Y_i &= \beta_0(1-I) + \beta_1 D_i + \beta_2 p^s(X_i;\delta) + \epsilon_i \end{aligned}\]

\(\beta_1\) is our causal estimate of the treatment effect. \(I\) is an indicator for whether the ML function takes only a single nondegenerate value in the sample.

aps, Y, Z, D, and data must never share columns. The code cannot check this, so verify the inputs before passing them in.
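The two stages can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the IVaps implementation: the helper `ols`, the synthetic data, and the true coefficients (1.0, 2.0, 0.5) are hypothetical, \(I = 0\) so both intercepts are kept, and no standard errors are computed. The outcome is generated without noise so that the second stage recovers \(\beta_1\) up to floating-point error.

```python
import random


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * p for x, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


random.seed(0)
n = 500
aps = [random.random() for _ in range(n)]                 # p^s(X_i; delta)
Z = [1.0 if random.random() < p else 0.0 for p in aps]    # recommendation
# Imperfect compliance: 20% of units take the opposite of the recommendation.
D = [z if random.random() < 0.8 else 1.0 - z for z in Z]  # assignment
Y = [1.0 + 2.0 * d + 0.5 * p for d, p in zip(D, aps)]     # noise-free outcome

# First stage: regress D on a constant, Z, and APS.
X1 = [[1.0, z, p] for z, p in zip(Z, aps)]
gamma = ols(X1, D)
D_hat = [sum(g * x for g, x in zip(gamma, row)) for row in X1]

# Second stage: regress Y on a constant, fitted D, and APS.
X2 = [[1.0, dh, p] for dh, p in zip(D_hat, aps)]
beta = ols(X2, Y)
print(beta[1])  # close to the true effect 2.0
```

Running the two OLS stages by hand reproduces the 2SLS point estimate only; a dedicated IV routine is still needed for valid standard errors.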

IVaps.estimator.estimate_treatment_effect_controls(aps, Y, Z, D, W, estimator: str = '2SLS', verbose: bool = True)[source]

Main treatment effect estimation function with added controls

Parameters
  • aps (array-like) – Array of estimated APS values

  • Y (array-like) – Array of outcome variables

  • Z (array-like) – Array of treatment recommendations

  • D (array-like) – Array of treatment assignments

  • W (array-like) – Array of control variables

  • estimator (str, default: “2SLS”) – Method of IV estimation

  • verbose (bool, default: True) – Whether to print output of estimation

Returns

Fitted IV model object

Return type

IVResults

Notes

Treatment effect is estimated using IV estimation. The default is to use the 2SLS method of estimation, with the equations illustrated below.

\[\begin{aligned} D_i &= \gamma_0(1-I) + \gamma_1 Z_i + \gamma_2 p^s(X_i;\delta) + \gamma_3 W_i + v_i \\ Y_i &= \beta_0(1-I) + \beta_1 D_i + \beta_2 p^s(X_i;\delta) + \beta_3 W_i + \epsilon_i \end{aligned}\]

\(\beta_1\) is our causal estimate of the treatment effect. \(I\) is an indicator for whether the ML function takes only a single nondegenerate value in the sample.