Set identification
In statistics and econometrics, set identification (or partial identification) extends the concept of identifiability (or "point identification") in statistical models to situations where the distribution of observable variables is not informative of the exact value of a parameter, but instead constrains the parameter to lie in a strict subset of the parameter space. Statistical models that are set identified arise in a variety of settings in economics, including game theory and the Rubin causal model.
Though the use of set identification dates to a 1934 article by Ragnar Frisch, the methods were significantly developed and promoted by Charles Manski starting in the 1990s.[1] Manski developed a method of worst-case bounds to account for selection bias. Unlike methods that make additional statistical assumptions, such as the Heckman correction, the worst-case bounds rely only on the data to generate a range of supported parameter values.[2]
Definition
Let $\mathcal{P} = \{P_\theta : \theta \in \Theta\}$ be a statistical model where the parameter space $\Theta$ is either finite- or infinite-dimensional. Suppose $\theta_0$ is the true parameter value. We say that $\theta_0$ is set identified if there exists $\theta \in \Theta$ such that $P_\theta \neq P_{\theta_0}$; that is, some parameter values in $\Theta$ are not observationally equivalent to $\theta_0$. In that case, the identified set is the set of parameter values that are observationally equivalent to $\theta_0$.[1]
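The verbal definition of the identified set can be written compactly in symbols. In the sketch below, $P_\theta$ denotes the distribution of the observables under the parameter value $\theta$, $\theta_0$ is the true value, and the symbol $\Theta_I$ for the identified set is notation assumed here for illustration.

```latex
\Theta_I = \{\, \theta \in \Theta : P_\theta = P_{\theta_0} \,\}
```

Under this notation, point identification is the special case in which $\Theta_I = \{\theta_0\}$, while set identification in the sense above requires only that $\Theta_I$ be a strict subset of $\Theta$.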
Example: missing data
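This example is due to Tamer (2010). Suppose there are two binary random variables, Y and Z. The econometrician is interested in $P(Y = 1)$. There is a missing data problem, however: Y can only be observed if $Z = 1$.
By the law of total probability,
$$P(Y = 1) = P(Y = 1 \mid Z = 1)\,P(Z = 1) + P(Y = 1 \mid Z = 0)\,P(Z = 0).$$
The only unknown object is $P(Y = 1 \mid Z = 0)$, which is constrained to lie between 0 and 1. Therefore, the identified set is
$$\Theta_I = \{\, p \in [0, 1] : p = P(Y = 1 \mid Z = 1)\,P(Z = 1) + q\,P(Z = 0) \text{ for some } q \in [0, 1] \,\}.$$
Given the missing data constraint, the econometrician can only say that $P(Y = 1) \in \Theta_I$. This makes use of all available information.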
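Because the unknown conditional probability P(Y = 1 | Z = 0) can take any value between 0 and 1, the identified set in this example is an interval: its lower endpoint sets that probability to 0 and its upper endpoint sets it to 1. The following Python sketch estimates the two endpoints from a sample by replacing probabilities with sample frequencies. The function name `worst_case_bounds`, the use of `None` to mark unobserved outcomes, and the toy data are assumptions made for illustration and do not come from Tamer (2010).

```python
def worst_case_bounds(y, z):
    """Estimate worst-case bounds on P(Y = 1) when Y is observed only if Z = 1.

    y: sequence of outcomes, 0 or 1 where observed, None where Z = 0
    z: sequence of observation indicators, 0 or 1
    Returns (lower, upper) as sample-frequency estimates.
    """
    n = len(z)
    if n == 0 or len(y) != n:
        raise ValueError("y and z must be non-empty and of equal length")

    # Count units with Y observed and equal to 1: estimates P(Y = 1, Z = 1) * n.
    n_y1_z1 = sum(1 for yi, zi in zip(y, z) if zi == 1 and yi == 1)
    # Count units with Y missing: estimates P(Z = 0) * n.
    n_z0 = sum(1 for zi in z if zi == 0)

    # Lower bound: every missing outcome is assumed to be 0.
    lower = n_y1_z1 / n
    # Upper bound: every missing outcome is assumed to be 1.
    upper = (n_y1_z1 + n_z0) / n
    return lower, upper


# Hypothetical sample of 10 units, 3 of which have a missing outcome.
y = [1, 0, 1, 1, None, None, 0, 1, None, 0]
z = [1, 1, 1, 1, 0, 0, 1, 1, 0, 1]
lower, upper = worst_case_bounds(y, z)
print(f"{lower:.2f} {upper:.2f}")  # prints "0.40 0.70"
```

The width of the estimated interval equals the sample share of missing outcomes, so the bounds collapse to a point estimate only when no data are missing.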
Statistical inference
Set estimation cannot rely on the usual tools for statistical inference developed for point estimation. A literature in statistics and econometrics studies methods for statistical inference in the context of set-identified models, focusing on constructing confidence intervals or confidence regions with appropriate properties. For example, a method developed by Chernozhukov, Hong & Tamer (2007), which Lewbel (2019) describes as complicated, constructs confidence regions that cover the identified set with a given probability.
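To make the idea of a confidence region for an identified set concrete, the sketch below treats the missing-data example, where the identified set is the interval from P(Y = 1, Z = 1) to P(Y = 1, Z = 1) + P(Z = 0). It is not the Chernozhukov, Hong & Tamer (2007) procedure. It is a much simpler construction, assumed here for illustration, that widens each estimated endpoint by a normal-approximation margin and relies on a Bonferroni argument so that the whole interval is covered with asymptotic probability of at least 1 − alpha. It assumes an i.i.d. sample, and the function name `bounds_confidence_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def bounds_confidence_interval(y, z, alpha=0.05):
    """Conservative large-sample interval covering the whole identified set.

    y: sequence of outcomes, 0 or 1 where observed, None where Z = 0
    z: sequence of observation indicators, 0 or 1
    alpha: one minus the desired coverage probability
    """
    n = len(z)
    if n == 0 or len(y) != n:
        raise ValueError("y and z must be non-empty and of equal length")

    n_y1_z1 = sum(1 for yi, zi in zip(y, z) if zi == 1 and yi == 1)
    n_z0 = sum(1 for zi in z if zi == 0)

    # Both endpoints are sample proportions of a binary indicator, so the
    # usual binomial standard error applies to each of them.
    lower_hat = n_y1_z1 / n
    upper_hat = (n_y1_z1 + n_z0) / n
    se_lower = sqrt(lower_hat * (1 - lower_hat) / n)
    se_upper = sqrt(upper_hat * (1 - upper_hat) / n)

    # Bonferroni: each endpoint is allowed an error probability of alpha / 2.
    crit = NormalDist().inv_cdf(1 - alpha / 2)

    # Widen outward and clip to [0, 1], since the target is a probability.
    ci_lower = max(0.0, lower_hat - crit * se_lower)
    ci_upper = min(1.0, upper_hat + crit * se_upper)
    return ci_lower, ci_upper


# Hypothetical sample: 1000 units, 300 with missing outcomes, 400 observed ones.
y = [1] * 400 + [0] * 300 + [None] * 300
z = [1] * 700 + [0] * 300
print(bounds_confidence_interval(y, z))
```

An interval built this way targets coverage of the entire identified set, which is a stronger requirement than covering the true parameter value, so it is typically wider than necessary when only the parameter itself is of interest.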
Notes
References
- Chernozhukov, Victor; Hong, Han; Tamer, Elie (2007). "Estimation and Confidence Regions for Parameter Sets in Econometric Models". Econometrica. The Econometric Society. 75 (5): 1243–1284. doi:10.1111/j.1468-0262.2007.00794.x. hdl:1721.1/63545. ISSN 0012-9682.
- Lewbel, Arthur (2019-12-01). "The Identification Zoo: Meanings of Identification in Econometrics". Journal of Economic Literature. American Economic Association. 57 (4): 835–903. doi:10.1257/jel.20181361. ISSN 0022-0515.
- Tamer, Elie (2010). "Partial Identification in Econometrics". Annual Review of Economics. 2 (1): 167–195. doi:10.1146/annurev.economics.050708.143401.
Further reading
- Ho, Kate; Rosen, Adam M. (2017). "Partial Identification in Applied Research: Benefits and Challenges". In Honore, Bo; Pakes, Ariel; Piazzesi, Monika; Samuelson, Larry (eds.). Advances in Economics and Econometrics. Cambridge: Cambridge University Press. pp. 307–359. doi:10.1017/9781108227223.010. ISBN 978-1-108-22722-3.
- Manski, Charles F. (May 1990). "Nonparametric Bounds on Treatment Effects". The American Economic Review: Papers and Proceedings. 80 (2): 319–323. ISSN 0002-8282. JSTOR 2006592.
- Manski, Charles F.; Pepper, John V. (July 2000). "Monotone Instrumental Variables: With an Application to the Returns to Schooling". Econometrica. 68 (4): 997–1010. doi:10.1111/1468-0262.00144. ISSN 0012-9682. JSTOR 2999533.
- Manski, Charles F. (2003). Partial Identification of Probability Distributions. New York: Springer-Verlag. ISBN 978-0-387-00454-9.