Set identification

In statistics and econometrics, set identification (or partial identification) extends the concept of identifiability (or "point identification") in statistical models to situations where the distribution of observable variables is not informative of the exact value of a parameter, but instead constrains the parameter to lie in a strict subset of the parameter space. Statistical models that are set identified arise in a variety of settings in economics, including game theory and the Rubin causal model.

Though the use of set identification dates to a 1934 article by Ragnar Frisch, the methods were significantly developed and promoted by Charles Manski starting in the 1990s.[1] Manski developed a method of worst-case bounds to account for selection bias. Unlike methods that make additional statistical assumptions, such as the Heckman correction, the worst-case bounds rely only on the data to generate a range of supported parameter values.[2]

Definition

Let $\mathcal{P} = \{P_\theta : \theta \in \Theta\}$ be a statistical model where the parameter space $\Theta$ is either finite- or infinite-dimensional. Suppose $\theta_0$ is the true parameter value. We say that $\theta_0$ is set identified if there exists $\theta \in \Theta$ such that $P_\theta \neq P_{\theta_0}$; that is, some parameter values in $\Theta$ are not observationally equivalent to $\theta_0$. In that case, the identified set is the set of parameter values that are observationally equivalent to $\theta_0$.[1]
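As a rough illustration of the definition, the following Python sketch enumerates an identified set by brute force. The toy model is an assumption made only for this illustration and does not come from the cited sources: the parameter is a pair (a, b) on a small grid, and the only observable is a binary outcome whose success probability is the product a·b, so two parameter values are observationally equivalent exactly when their products coincide. The names `success_prob`, `parameter_space` and `identified_set` are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy model: theta = (a, b), and the observable is a single
# binary outcome with success probability a * b. Exact fractions avoid
# floating-point comparison problems.
def success_prob(theta):
    a, b = theta
    return a * b

# Finite parameter space: a and b each take values in {0, 1/4, 1/2, 3/4, 1}.
grid = [Fraction(k, 4) for k in range(5)]
parameter_space = list(product(grid, grid))

def identified_set(theta0):
    """Return all parameter values observationally equivalent to theta0."""
    target = success_prob(theta0)
    return [theta for theta in parameter_space if success_prob(theta) == target]

theta0 = (Fraction(1, 2), Fraction(1, 2))
ident = identified_set(theta0)

# Some parameter values are ruled out, but more than one remains, so in
# this toy model theta0 is set identified rather than point identified.
print(len(ident), "of", len(parameter_space), "parameter values remain")
```

In this toy model the data rule out most of the grid but cannot separate (1/4, 1), (1/2, 1/2) and (1, 1/4), which all imply the same distribution of the observable.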

Example: missing data

This example is due to Tamer (2010). Suppose there are two binary random variables, $Y$ and $Z$. The econometrician is interested in $\mathrm{P}(Y = 1)$. There is a missing data problem, however: $Y$ can only be observed if $Z = 1$.

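By the law of total probability,

$$\mathrm{P}(Y = 1) = \mathrm{P}(Y = 1 \mid Z = 1)\,\mathrm{P}(Z = 1) + \mathrm{P}(Y = 1 \mid Z = 0)\,\mathrm{P}(Z = 0).$$

The only unknown object is $\mathrm{P}(Y = 1 \mid Z = 0)$, which is constrained to lie between 0 and 1. Therefore, the identified set is

$$\Theta_I = \{\, p \in [0, 1] : p = \mathrm{P}(Y = 1 \mid Z = 1)\,\mathrm{P}(Z = 1) + q\,\mathrm{P}(Z = 0) \text{ for some } q \in [0, 1] \,\}.$$

Given the missing data constraint, the econometrician can only say that $\mathrm{P}(Y = 1) \in \Theta_I$. This makes use of all available information.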
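The identified set in this example is an interval. Its lower endpoint sets the unobserved probability that Y equals 1 when Z equals 0 to zero, and its upper endpoint sets that probability to one. A minimal Python sketch of the calculation follows, assuming the two observable quantities are known exactly; the function name `worst_case_bounds` and the numerical inputs are hypothetical and chosen only for illustration.

```python
def worst_case_bounds(p_y1_given_z1, p_z1):
    """Bounds on P(Y = 1) when Y is observed only if Z = 1.

    p_y1_given_z1: P(Y = 1 | Z = 1), identified from the observed cases.
    p_z1:          P(Z = 1), the probability that Y is observed.
    """
    if not (0.0 <= p_y1_given_z1 <= 1.0 and 0.0 <= p_z1 <= 1.0):
        raise ValueError("probabilities must lie in [0, 1]")
    observed_part = p_y1_given_z1 * p_z1
    lower = observed_part                  # unknown P(Y = 1 | Z = 0) set to 0
    upper = observed_part + (1.0 - p_z1)   # unknown P(Y = 1 | Z = 0) set to 1
    return lower, upper

# Illustrative inputs: Y is observed 75% of the time, and half of the
# observed cases have Y = 1.
lower, upper = worst_case_bounds(0.5, 0.75)
print(lower, upper)  # 0.375 0.625
```

The width of the interval equals P(Z = 0), the probability that Y is missing, so the bounds collapse to a single point when Y is always observed.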

Statistical inference

Set estimation cannot rely on the usual tools for statistical inference developed for point estimation. A literature in statistics and econometrics studies methods for statistical inference in the context of set-identified models, focusing on constructing confidence intervals or confidence regions with appropriate properties. For example, Chernozhukov, Hong & Tamer (2007) developed a method, which Lewbel (2019) describes as complicated, that constructs confidence regions covering the identified set with a given probability.
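To show where the inference problem starts, the sketch below computes a simple plug-in estimate of the identified set for a binary outcome Y that is observed only when a binary indicator Z equals 1, replacing population probabilities with sample frequencies. It is not the method of Chernozhukov, Hong & Tamer (2007) and makes no attempt to account for sampling uncertainty; the function name `estimate_bounds`, the data layout and the sample are assumptions made for illustration.

```python
def estimate_bounds(sample):
    """Plug-in estimate of the bounds on P(Y = 1).

    sample: list of (y, z) pairs, where y is None whenever z == 0
            (the outcome is missing) and y is 0 or 1 otherwise.
    """
    n = len(sample)
    if n == 0:
        raise ValueError("sample must not be empty")
    # Frequency of cases with Y observed and equal to 1.
    n_observed_ones = sum(1 for y, z in sample if z == 1 and y == 1)
    # Frequency of cases where Y is missing.
    n_missing = sum(1 for _, z in sample if z == 0)
    lower = n_observed_ones / n                # every missing Y treated as 0
    upper = (n_observed_ones + n_missing) / n  # every missing Y treated as 1
    return lower, upper

# Hypothetical sample of four units, one of which has a missing outcome.
sample = [(1, 1), (0, 1), (1, 1), (None, 0)]
est_lower, est_upper = estimate_bounds(sample)
print(est_lower, est_upper)  # 0.5 0.75
```

Both estimated endpoints vary from sample to sample, which is why confidence regions for the identified set, or for the parameter within it, need methods beyond those used for point estimates.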

References

  • Chernozhukov, Victor; Hong, Han; Tamer, Elie (2007). "Estimation and Confidence Regions for Parameter Sets in Econometric Models". Econometrica. The Econometric Society. 75 (5): 1243–1284. doi:10.1111/j.1468-0262.2007.00794.x. hdl:1721.1/63545. ISSN 0012-9682.
  • Lewbel, Arthur (2019-12-01). "The Identification Zoo: Meanings of Identification in Econometrics". Journal of Economic Literature. American Economic Association. 57 (4): 835–903. doi:10.1257/jel.20181361. ISSN 0022-0515.
  • Tamer, Elie (2010). "Partial Identification in Econometrics". Annual Review of Economics. 2 (1): 167–195. doi:10.1146/annurev.economics.050708.143401.
