Research Reports

3/2012 Bayesian general multivariate latent variable modeling of longitudinal item response data
Caio L. N. Azevedo, Jean-Paul Fox, Dalton F. Andrade

Longitudinal item response data are characterized by examinees who are assessed at different time points or measurement occasions, such that time-specific measurements are nested within examinees. Besides the usual nesting of response observations within examinees, the time-specific latent traits are also nested within examinees. In the well-known hierarchical modeling approach, the complex dependencies due to the nested structure of the data are commonly modeled by introducing random effects such that observations and latent traits are conditionally independently distributed. However, the implied compound symmetry structure is often not sufficient to model the complex time-heterogeneous dependencies. Therefore, a Bayesian general multivariate item response modeling framework is proposed that accounts for the complex within-examinee latent trait dependencies. Flexible parametric covariance structures are considered to model specific within-examinee dependencies. Furthermore, the framework can handle many measurement occasions, different item response functions, and different latent trait population distributions, and it generalizes several existing approaches in the literature. Due to identification rules and restricted parametric covariance structures, a conditional modeling approach is pursued to specify proper priors for the unrestricted parameters and to implement an efficient MCMC algorithm, by conditioning on baseline population parameters. The study is motivated by a large-scale longitudinal research program of the Brazilian federal government to improve the teaching quality and general structure of schools for primary education. It is shown that the growth in math achievement can be accurately measured when accounting for complex dependencies over grades using time-heterogeneous covariance structures.


rp-2012-3.pdf
2/2012 A note on identification and metric issues for skew IRT models
Caio L. N. Azevedo, Heleno Bolfarine, Dalton F. Andrade

The skew-normal distribution (SND) is a flexible family of densities which preserves some useful properties of the original normal distribution. Some stochastic representations for the SND have been proposed in the literature. The Henze (H) and the Sahu, Branco and Dey (SBD) representations are the two most widely used. On the other hand, the centered parametrization is useful for inference purposes. The main goals of this article are to establish a link between the standard H and SBD skew-normal distributions and to use this result to model the latent traits for IRT models. We proved that the standard H and SBD distributions are related to each other through a function of the asymmetry parameter and also that they are exactly the same under the centered parametrization (CP). Using these results, we showed that the common density obtained through the CP is useful to model the latent traits for unidimensional IRT models. This approach allows asymmetric latent trait behavior to be represented and also ensures model identification.
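
As a rough illustration of the objects named in this abstract, the sketch below draws from a standard skew-normal distribution through the Henze representation and then applies the centered parametrization by subtracting the mean and dividing by the standard deviation. It uses the textbook formulas delta = lambda / sqrt(1 + lambda^2), mean delta * sqrt(2/pi) and variance 1 - (2/pi) * delta^2. The function names, the seed and the value lambda = 3 are assumptions made for the example, and it does not reproduce the report's result linking the H and SBD forms.

```python
import math
import random


def henze_skew_normal(lam, rng):
    """One draw from the standard skew-normal with asymmetry lam (Henze form)."""
    delta = lam / math.sqrt(1.0 + lam * lam)
    half_normal = abs(rng.gauss(0.0, 1.0))   # |X0|, X0 ~ N(0, 1)
    normal = rng.gauss(0.0, 1.0)             # X1 ~ N(0, 1), independent of X0
    return delta * half_normal + math.sqrt(1.0 - delta * delta) * normal


def centered_skew_normal(lam, rng):
    """Same draw, rescaled to mean 0 and variance 1 (centered parametrization)."""
    delta = lam / math.sqrt(1.0 + lam * lam)
    mean = delta * math.sqrt(2.0 / math.pi)
    sd = math.sqrt(1.0 - mean * mean)
    return (henze_skew_normal(lam, rng) - mean) / sd


# Hypothetical settings chosen only for the illustration.
rng = random.Random(2012)
draws = [centered_skew_normal(3.0, rng) for _ in range(50_000)]
sample_mean = sum(draws) / len(draws)
sample_var = sum((d - sample_mean) ** 2 for d in draws) / len(draws)
sample_skew = sum((d - sample_mean) ** 3 for d in draws) / (len(draws) * sample_var ** 1.5)
```

Under these assumptions the centered draws should have a sample mean near 0 and a sample variance near 1 while remaining positively skewed, which is the property that makes a fixed latent trait metric compatible with asymmetry.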


rp-2012-2.pdf
1/2012 Likelihood Based Inference for Linear and Nonlinear Mixed-Effects Models with Censored Response Using the Multivariate-t Distribution
Larissa A. Matos, Marcos O. Prates, Ming H. Chen, Víctor H. Lachos

Mixed models are commonly used to represent longitudinal or repeated measures data. An additional complication arises when the response is censored, for example, due to limits of quantification of the assay used. Normal distributions for random effects and residual errors are usually assumed, but such assumptions make inferences vulnerable to the presence of outliers. Motivated by a concern about sensitivity to potential outliers or data with longer-than-normal tails, we aim to develop likelihood-based inference for linear and nonlinear mixed-effects models with censored response (NLMEC/LMEC) based on the multivariate Student-t distribution, which is a flexible alternative to the corresponding normal distribution. We propose an ECM algorithm for computing the maximum likelihood estimates for NLMEC/LMEC, with standard errors of the fixed effects and the likelihood function as a by-product. This algorithm uses closed-form expressions at the E-step, which rely on formulas for the mean and variance of a truncated multivariate-t distribution and can be computed using available software. The proposed algorithm is implemented in the R package tlmec. An appendix which includes further mathematical details, the R code, and datasets for examples and simulations are available as supplements. The newly developed procedures are illustrated with two case studies, involving the analysis of longitudinal HIV viral load in two recent AIDS studies. In addition, a simulation study is conducted to assess the performance of the proposed approach and to compare it with the approach of Vaida and Liu (2009).


rp-2012-1.pdf
11/2011 Generalized Skew-Normal/Independent Fields with Applications
Marcos O. Prates, Dipak K. Dey, Víctor H. Lachos

The last decade has witnessed major developments in Geographical Information Systems (GIS) technology, resulting in the need for statisticians to develop models that account for spatial clustering and variation. The study of spatial patterns is very important in epidemiological and environmental problems. Due to spatial characteristics, it is extremely important to correctly incorporate spatial dependence in modeling. This paper develops a novel spatial process using generalized skew-normal/independent distributions when the usual Gaussian process assumptions are invalid and transformation to a Gaussian random field is not appropriate. Our proposed method incorporates skewness as well as heavy tail behavior of the data while maintaining spatial dependence using a Conditional Auto Regressive (CAR) structure. We use Bayesian hierarchical methods to fit such models. Consequently, we use a Bayesian model selection approach to choose appropriate models for an empirical data set.


rp-2011-11.pdf
10/2011 Influence Diagnostics in Linear and Nonlinear Mixed-Effects Models with Censored Data
Larissa A. Matos, Víctor H. Lachos, N. Balakrishnan, Filidor E. Vilca-Labra

HIV RNA viral load measures are often subject to upper and lower detection limits depending on the quantification assays, and consequently the responses are either left or right censored. Linear and nonlinear mixed-effects models with modifications to accommodate censoring (LMEC and NLMEC) are routinely used to analyze this type of data. Recently, Vaida and Liu (2009) proposed an exact EM-type algorithm for LMEC/NLMEC, called the SAGE algorithm (Meng and Van Dyk, 1997), that uses closed-form expressions at the E-step, as opposed to Monte Carlo simulations. Motivated by this algorithm, we propose here an exact ECM algorithm (Meng and Rubin, 1993) for LMEC/NLMEC, which enables us to develop local influence analysis for mixed-effects models on the basis of the conditional expectation of the complete-data log-likelihood function. This is because the observed-data log-likelihood function associated with the proposed model is rather complex, which makes it difficult to apply the approach of Cook (1977, 1986) directly. Some useful perturbation schemes are discussed. Finally, the results obtained from the analyses of two HIV/AIDS studies on viral loads are presented to illustrate the newly developed methodology.


rp-2011-10.pdf
9/2011 Bayesian Estimation of a Skew-t Stochastic Volatility Model
Carlos A. Abanto-Valle, Víctor H. Lachos, Dipak K. Dey

In this paper we present a stochastic volatility (SV) model assuming that the return shock has a skew-Student-t distribution. This allows a parsimonious, flexible treatment of asymmetry and heavy tails in the conditional distribution of returns. An efficient Markov chain Monte Carlo estimation method is described that exploits a skew-normal mixture representation of the error distribution with a gamma distribution as the mixing distribution. We apply the methodology to the NASDAQ daily index returns.
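
To make the mixture representation concrete, the following sketch simulates returns from a basic stochastic volatility model whose shocks are a skew-normal variate divided by the square root of a gamma variate with shape and rate both equal to nu/2. That is the usual scale-mixture construction of a skew-t distribution. The AR(1) form of the log-volatility, all parameter names and values, and the seed are assumptions made for the illustration; the report's exact specification and its MCMC sampler are not reproduced here.

```python
import math
import random


def skew_t_shock(lam, nu, rng):
    """Skew-t draw as a skew-normal draw scaled by a gamma mixing variable."""
    delta = lam / math.sqrt(1.0 + lam * lam)
    z = delta * abs(rng.gauss(0.0, 1.0)) + math.sqrt(1.0 - delta * delta) * rng.gauss(0.0, 1.0)
    # random.gammavariate takes (shape, scale); rate nu/2 means scale 2/nu.
    u = rng.gammavariate(nu / 2.0, 2.0 / nu)
    return z / math.sqrt(u)


def simulate_sv(n, mu, phi, sigma, lam, nu, seed=0):
    """Simulate n returns y_t = exp(h_t / 2) * eps_t with AR(1) log-volatility h_t."""
    rng = random.Random(seed)
    # Start the log-volatility from its stationary distribution (requires |phi| < 1).
    h = mu + sigma / math.sqrt(1.0 - phi * phi) * rng.gauss(0.0, 1.0)
    returns, log_vols = [], []
    for _ in range(n):
        returns.append(math.exp(h / 2.0) * skew_t_shock(lam, nu, rng))
        log_vols.append(h)
        h = mu + phi * (h - mu) + sigma * rng.gauss(0.0, 1.0)
    return returns, log_vols


# Hypothetical parameter values, not estimates from the report.
returns, log_vols = simulate_sv(n=1000, mu=-9.0, phi=0.97, sigma=0.15, lam=-1.0, nu=8.0, seed=9)
```

Conditional on the gamma variable the shock is skew-normal, which is what lets a sampler treat the mixing variables as extra latent quantities. Note that the shock in this sketch is not recentered to have mean zero.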


rp-2011-9.pdf
8/2011 Skew-Normal/Independent Linear Mixed Models for Censored Responses with Applications to HIV Viral Loads
Víctor H. Lachos, Dipankar Bandyopadhyay, Dipak K. Dey, Luis M. Castro

Often in biomedical studies, the routine use of linear mixed-effects models (based on Gaussian assumptions) can be questionable when the longitudinal responses are skewed in nature. Skew-normal/elliptical models are widely used in those situations. Often, those skewed responses might also be subject to upper and lower quantification limits (viz. longitudinal viral load measures in HIV studies), beyond which they are not measurable. In this paper, we develop a Bayesian analysis of censored linear mixed models, replacing the Gaussian assumptions with skew-normal/independent (SNI) distributions. The SNI is an attractive class of asymmetric heavy-tailed distributions that includes the skew-normal, the skew-t, the skew-slash and the skew-contaminated normal distributions as special cases. The proposed model provides flexibility in capturing the effects of skewness and heavy tails for responses which are either left- or right-censored. For our analysis, we adopt a Bayesian framework and develop an MCMC algorithm to carry out the posterior analyses. The marginal likelihood is tractable and is used to compute not only some Bayesian model selection measures but also case-deletion influence diagnostics based on the Kullback-Leibler divergence. The newly developed procedures are illustrated with a simulation study as well as an HIV case study involving the analysis of longitudinal viral loads.


rp-2011-8.pdf
7/2011 An Improved p Chart for Monitoring High Quality Processes Based on Cornish-Fisher Quantile Correction
Silvia Joekes, Emanuel P. Barbosa

The conventional Shewhart 3-sigma p control chart, constructed from the normal approximation to binomial data, suffers from serious inaccuracy in the modeling process and in the specification of control limits when the true rate of nonconforming items is small. We offer an improved p chart which can provide a large improvement over the usual p chart for attributes. This new chart, based on the Cornish-Fisher expansion, is corrected to order $n^{-3/2}$, where $n$ is the sample size of inspection units. This chart is also better than the modified p chart corrected only to order $n^{-1}$, especially in the sense that it allows monitoring of lower values of p. We compare our improved p chart with both of these charts and show the benefits of including a new correction term in the Cornish-Fisher expansion of quantiles for monitoring high-quality processes.
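
The idea of correcting the 3-sigma limits can be sketched with the generic textbook Cornish-Fisher expansion of a quantile, applied to the sample proportion of a binomial variable. In the sketch below, `order=0` gives the usual Shewhart limit, `order=1` adds the skewness term of size $n^{-1}$, and `order=2` adds the kurtosis and squared-skewness terms of size $n^{-3/2}$. The function name and the values $p = 0.01$ and $n = 500$ are assumptions for the example. The report's exact chart formulas are not given in this abstract, so this should not be read as the authors' implementation.

```python
import math


def p_chart_limit(p, n, z=3.0, order=0):
    """Control limit for a sample proportion using a Cornish-Fisher quantile.

    z = +3 gives an upper limit and z = -3 a lower limit.
    """
    q = 1.0 - p
    sigma = math.sqrt(p * q / n)                 # std. dev. of the sample proportion
    skew = (1.0 - 2.0 * p) / math.sqrt(n * p * q)
    excess_kurt = (1.0 - 6.0 * p * q) / (n * p * q)
    w = z                                        # normal-theory quantile
    if order >= 1:
        w += (z ** 2 - 1.0) * skew / 6.0         # contributes a term of order 1/n
    if order >= 2:
        w += (z ** 3 - 3.0 * z) * excess_kurt / 24.0
        w -= (2.0 * z ** 3 - 5.0 * z) * skew ** 2 / 36.0
    return p + sigma * w


# Hypothetical high-quality process: 1% nonconforming, samples of 500 units.
p, n = 0.01, 500
ucl_shewhart = p_chart_limit(p, n, 3.0, order=0)
ucl_first = p_chart_limit(p, n, 3.0, order=1)
ucl_second = p_chart_limit(p, n, 3.0, order=2)
lcl_shewhart = p_chart_limit(p, n, -3.0, order=0)
lcl_second = p_chart_limit(p, n, -3.0, order=2)
```

With $z = 3$ the first-order term simplifies to $4(1-2p)/(3n)$, so for small $p$ both corrected limits are shifted upward relative to the symmetric Shewhart limits. In this example the Shewhart lower limit is negative, which illustrates why the normal approximation is problematic for small $p$.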


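rp-2011-7.pdf
6/2011 Análise Comparativa para o Problema de Locação-Alocação: Modelo Não Linear Geral Versus Modelo das P-Medianas com Variáveis Inteiras - Um Estudo de Caso [Comparative Analysis for the Location-Allocation Problem: General Nonlinear Model Versus P-Median Model with Integer Variables - A Case Study]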
Marina Lima Morais, Sandra A. Santos

The purpose of this work is to investigate the solution of a location-allocation problem which consists of determining the best locations for a given number of silos that receive all the harvested coffee from a group of farms whose locations are known, together with the allocation of the harvested coffee; this must be done in a way that minimizes the transportation costs. The computational study uses real values obtained from a community cooperative called Cooxupe, operating in the region of Alfenas, in the state of Minas Gerais. We investigate the theoretical concepts needed to formulate the problem in two different forms, as a nonlinear problem and as a p-median problem, and work on the modelling of the problem so that it can be solved with a symbolic computation package. We investigate the efficiency of the MATLAB commands fmincon and bintprog for the nonlinear problem and the p-median problem, respectively, as well as the algorithms used by these commands; the problem is defined using two different norms: the taxicab norm and the Euclidean norm. We present the numerical results obtained in our investigation, namely the latitude and longitude for the best placement of the silos and their number, and compare them with the allocation made by the cooperative.
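
For readers unfamiliar with the p-median formulation, the sketch below solves a tiny instance by exhaustive search: it picks p silo sites from a list of candidate points so that the harvest-weighted sum of distances from each farm to its nearest silo is minimal, under either the taxicab or the Euclidean norm. The coordinates, weights and function names are made up for the illustration. Exhaustive search stands in for the integer programming solver and is only feasible for very small instances; it is not the MATLAB fmincon or bintprog approach studied in the report.

```python
import itertools
import math


def taxicab(a, b):
    """L1 distance between two planar points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def euclidean(a, b):
    """L2 distance between two planar points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def p_median_brute_force(farms, harvest, candidates, p, dist):
    """Return (site indices, cost) minimizing weighted distance to the nearest site."""
    best_sites, best_cost = None, math.inf
    for sites in itertools.combinations(range(len(candidates)), p):
        # Each farm ships its whole harvest to the closest open silo.
        cost = sum(
            weight * min(dist(farm, candidates[s]) for s in sites)
            for farm, weight in zip(farms, harvest)
        )
        if cost < best_cost:
            best_sites, best_cost = sites, cost
    return best_sites, best_cost


# Hypothetical data: two clusters of farms, equal harvests, silos allowed at farm sites.
farms = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
harvest = [1.0, 1.0, 1.0, 1.0]
sites_l1, cost_l1 = p_median_brute_force(farms, harvest, farms, 2, taxicab)
sites_l2, cost_l2 = p_median_brute_force(farms, harvest, farms, 2, euclidean)
```

In this toy instance both norms place one silo in each cluster. The continuous nonlinear formulation would instead let the silo coordinates vary freely rather than restricting them to a candidate list.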


rp-2011-6.pdf
5/2011 On the Characterization and Estimation of Glog-Normal Scale Mixture Distributions and Their Application in Genetics
Filidor E. Vilca-Labra, Mariana R. Motta, Víctor Leiva
rp-2011-5.pdf