Identification Problems in Psychometrics

Symposium organised by Ernesto San Martín, Department of Statistics & Measurement Center MIDE-UC, Pontificia Universidad Católica de Chile.

Chair: Ernesto San Martín, Friday 24th July, 9.30 - 10.50, Palmerston Lecture Theatre, Fisher Building.

Javier Revuelta, Department of Social Psychology and Methodology, Universidad Autónoma de Madrid, Spain. Identifiability for GLLIRM models that are more general than the NCM.

Alejandro Jara, Department of Statistics, Universidad de Concepción, Chile, and Ernesto San Martín, Department of Statistics & Measurement Center MIDE-UC, Pontificia Universidad Católica de Chile, Chile. Bayesian Semiparametric IRT-type Models.

Timo Bechger and Gunter Maris, CITO, Dutch National Institute for Educational Measurement, The Netherlands. Equivalent Diagnostic Classification Models.

Ernesto San Martín, Department of Statistics & Measurement Center MIDE-UC, Pontificia Universidad Católica de Chile, Chile, Jean-Marie Rolin, Institut de Statistique, Université catholique de Louvain, Belgium, and Paul De Boeck, Faculty of Psychology, K. U. Leuven, Belgium. Identification of Multiple Classification Latent Class Models (MCLCM).

ABSTRACTS

Identification problems in psychometrics: General Description of the symposium
Ernesto San Martín
Identification problems are still of interest in psychometrics. Parameter identification is relevant not only to ensuring coherent inference, but also to ensuring a correct interpretation of the parameters of interest in terms of the sampling process. When a statistical model is indexed by unidentified parameters, the model has no empirical meaning. In this symposium we will focus on the following two problems, each motivated by the questions listed under it:

Identification of semi-parametric IRT models:

  • It seems to be "naturally" accepted that the identification of the 1PL, 2PL and 3PL models is sufficient for identifying the structural Rasch model obtained after integrating out the abilities. Does any proof of this claim exist?
  • There are substantive reasons to doubt the normality of the abilities. This leads to a semi-parametric IRT model, where the parameters of interest are the item parameters and the distribution G generating the abilities. How can the parameters of interest be identified? Is it possible to fix a parameter in order to obtain identifiability? How do the possible identification restrictions affect the estimation of G?

and Identification of Cognitive Diagnosis Models (CDM):

  • Two relevant examples of CDM are the DINA model and the NIDA model (also known as MCLCM). Is it true that these models are generalizations of the standard LCM?
  • What are the parameters of interest in a NIDA model? And in a DINA model? This question needs to be answered through an identification analysis, but what kind of analysis?
  • What about the identifiability of DINA models?

Identifiability for GLLIRM models that are more general than the NCM
Javier Revuelta
The GLLIRM model was originally defined as a linearly constrained Nominal Categories Model (NCM). Because the NCM parameters are identifiable, the GLLIRM parameters are identifiable if the linear equations that relate the two parameterizations have a unique solution. However, there are situations where the NCM cannot be applied in a meaningful way but the GLLIRM is still of interest. One example is the estimation of interaction parameters between items, which implies that the local independence assumption is violated. In that case, the identifiability of the GLLIRM parameters cannot be based on their relation with the NCM and must instead be addressed by investigating the structure of the information matrix obtained from the marginal distribution of the response patterns. The conditions for the identifiability of the GLLIRM parameters with respect to the marginal distribution are presented together with an example from the context of surveys. In the example, the dataset is analyzed with a GLLIRM model that can be understood as a modification of log-linear models in which the parameters are defined as linear functions of latent ability.
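
The idea of checking identifiability through the information matrix of the marginal response-pattern distribution can be sketched on a much simpler model than the GLLIRM. The Python sketch below is an illustration only, under assumptions that are not part of the talk: three Rasch items, abilities on a fixed five-point grid shifted by a mean mu, and hypothetical helper names (pattern_probs, information_matrix, rank). A rank smaller than the number of free parameters signals a lack of local identifiability.

```python
import itertools
import math

# Hypothetical toy setup (not the GLLIRM): 3 Rasch items, abilities mu + z
# on an assumed 5-point grid with fixed weights.
NODES = [-2.0, -1.0, 0.0, 1.0, 2.0]
WEIGHTS = [0.05, 0.25, 0.40, 0.25, 0.05]
PATTERNS = list(itertools.product([0, 1], repeat=3))

def pattern_probs(params):
    """Marginal probability of every response pattern.
    params = [beta_1, beta_2, beta_3, mu]: item difficulties and ability mean."""
    *betas, mu = params
    probs = []
    for y in PATTERNS:
        total = 0.0
        for z, w in zip(NODES, WEIGHTS):
            lik = 1.0
            for y_j, b in zip(y, betas):
                p = 1.0 / (1.0 + math.exp(-(mu + z - b)))
                lik *= p if y_j == 1 else 1.0 - p
            total += w * lik
        probs.append(total)
    return probs

def information_matrix(params, free, h=1e-5):
    """Per-respondent Fisher information for the parameters indexed by `free`,
    using central-difference derivatives of the pattern probabilities."""
    p0 = pattern_probs(params)
    jac = []
    for a in free:
        upper = list(params)
        lower = list(params)
        upper[a] += h
        lower[a] -= h
        pu, pl = pattern_probs(upper), pattern_probs(lower)
        jac.append([(u - l) / (2 * h) for u, l in zip(pu, pl)])
    n = len(p0)
    return [[sum(ja[k] * jb[k] / p0[k] for k in range(n)) for jb in jac]
            for ja in jac]

def rank(matrix, tol=1e-7):
    """Numerical rank by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    r = 0
    for c in range(len(m[0])):
        if r == len(m):
            break
        pivot = max(range(r, len(m)), key=lambda i: abs(m[i][c]))
        if abs(m[pivot][c]) < tol:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [x - f * y for x, y in zip(m[i], m[r])]
        r += 1
    return r

params = [-0.5, 0.3, 1.0, 0.0]
# All four parameters free: the common shift of mu and the betas is not identified.
rank_all = rank(information_matrix(params, free=[0, 1, 2, 3]))
# beta_1 held fixed: the remaining three parameters are locally identified.
rank_fixed = rank(information_matrix(params, free=[1, 2, 3]))
print(rank_all, rank_fixed)
```

In this toy case the pattern probabilities depend on the parameters only through the differences mu - beta_j, so the four-parameter information matrix is rank deficient, while fixing one difficulty restores full rank for the remaining three.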

Bayesian Semiparametric IRT-type Models
Alejandro Jara and Ernesto San Martín
We study the Bayesian identification and consistency of semiparametric extensions of IRT-type models, where the uncertainty about the distribution of the abilities is modeled using a prior distribution on the space of all probability measures. Specifically, we establish conditions for identification and consistency in the Bernoulli, Poisson and Negative Binomial versions of the Rasch model and of a 3PL model with constant discrimination parameters. We show that, under the Rasch model for unbounded count responses, fixing one difficulty parameter or one characteristic of the unspecified ability distribution is a sufficient condition for the identification and Bayesian consistency of the posterior mean. Contrary to common belief, this restriction is not sufficient for the identification of the Bernoulli Rasch model. This is an example where the identification of the conditional model is not sufficient for the identification of the statistical model. For the Rasch model, an infinite number of items is a necessary and sufficient condition for the identification and consistency of Bayesian estimators. The same conclusions are established for semiparametric extensions of a 3PL model with constant discrimination parameters. The implications of the theoretical results are evaluated using simulated data.
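
As a reminder of why some restriction is needed in the first place, the location indeterminacy of the conditional Bernoulli Rasch model can be written out as follows. The notation (abilities \theta_i, difficulties \beta_j, an arbitrary shift c) is assumed here for illustration and is not taken from the talk.

```latex
P(Y_{ij} = 1 \mid \theta_i, \beta_j)
  = \frac{\exp(\theta_i - \beta_j)}{1 + \exp(\theta_i - \beta_j)}
  = \frac{\exp\{(\theta_i + c) - (\beta_j + c)\}}
         {1 + \exp\{(\theta_i + c) - (\beta_j + c)\}}
  \qquad \text{for every } c \in \mathbb{R}.
```

Fixing one difficulty, for instance \beta_1 = 0, removes this indeterminacy in the conditional model. The point made in the abstract is that, in the semiparametric Bernoulli case, such a restriction does not by itself identify the item parameters together with the ability distribution in the statistical model obtained after integrating out the abilities.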

Equivalent Diagnostic Classification Models
Timo Bechger and Gunter Maris
Notwithstanding the current popularity of diagnostic classification models (DCM), it is still unclear whether the parameters of these models are identifiable from data. When a model is not identifiable, there are different parameter values that give an equivalent model. In this contribution we will focus on a related issue: DCMs that represent different substantive theories but are nevertheless equivalent and imply the same distribution for the data. The existence of such equivalent models implies that we cannot substantively interpret any of them. Specifically, we will demonstrate how methods developed to detect equivalent models in the context of the MIRID model can be applied to the NIDO model. An example will illustrate the seriousness of the problem.
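
The statement that different parameter values can give an equivalent model can be seen in its simplest form in an ordinary two-class latent class model, where relabelling the classes leaves the data distribution unchanged. The sketch below is only this elementary case, with made-up parameter values and a hypothetical helper lcm_pattern_probs; it is not the MIRID-based method applied to the NIDO model in the talk.

```python
import itertools

def lcm_pattern_probs(class_weights, item_probs):
    """Distribution of binary response patterns under a latent class model.
    class_weights[c] = P(class c); item_probs[c][j] = P(Y_j = 1 | class c)."""
    n_items = len(item_probs[0])
    dist = {}
    for y in itertools.product([0, 1], repeat=n_items):
        total = 0.0
        for w, probs in zip(class_weights, item_probs):
            lik = w
            for y_j, p in zip(y, probs):
                lik *= p if y_j == 1 else 1.0 - p
            total += lik
        dist[y] = total
    return dist

# Two different parameter values (illustrative numbers): the classes are swapped.
model_a = lcm_pattern_probs([0.3, 0.7], [[0.9, 0.8, 0.7], [0.2, 0.3, 0.4]])
model_b = lcm_pattern_probs([0.7, 0.3], [[0.2, 0.3, 0.4], [0.9, 0.8, 0.7]])

# Largest difference between the two implied data distributions.
largest_gap = max(abs(model_a[y] - model_b[y]) for y in model_a)
print(largest_gap)
```

Because the two parameter values imply the same probability for every response pattern, no amount of data can distinguish them. The equivalences discussed in the abstract are more serious, since they link models that stand for different substantive theories rather than a mere relabelling.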

Identification of Multiple Classification Latent Class Models (MCLCM)
Ernesto San Martín, Jean-Marie Rolin and Paul De Boeck
MCLCM can be considered as a formalization of the hypothesis that the responses come about in a process that involves the application of a number of mental operations. Each of these mental operations corresponds to one latent classification. With binary latent classifications, one of the classes in every classification corresponds to mastery of this mental operation and the other to non-mastery. In this paper, we propose two strategies to identify the parameters of interest of an MCLCM. After carefully defining the parameters of interest (through an identification analysis of the marginal latent model and of the conditional model), we present a first strategy based on the so-called determinantal method. The rationale is to show that the MCLCM is a particular case of an LCM. The limitations of this strategy will be discussed. A second strategy will then be discussed, for which necessary identification conditions will be established. Sufficient conditions will be illustrated with one latent class and at least two mental operations. Extensions of this strategy will be discussed.