1. Wireless Sensor Networks
Alejandro Frery - UFAL - abstract
2. Distributions Associated with the Inverse Gaussian Distributions
Antonio Sanhueza - Universidad de La Frontera, Chile - abstract
3. Robust Estimation of Pair-Copulas
Beatriz Vaz de Melo Mendes - UFRJ - abstract
4. A Review of Tweedie Asymptotics
Bent Jørgensen - University of Southern Denmark - abstract
5. The Ecological Footprint of Taylor's Power Law
Bent Jørgensen - University of Southern Denmark - abstract
6. Wavelet-based spectral methods for extracting self-similarity measures in time-varying two-dimensional rainfall maps
Brani Vidakovic - GaTech / Emory University School of Medicine - abstract
7. Modelling and Data Analysis of Tracks of Interacting Particles: the Case of Elk
David R. Brillinger - University of California, Berkeley, USA - abstract
8. The Log-Exponentiated Weibull Regression Model for Interval-Censored Data
Edwin M. M. Ortega - ESALQ - USP - abstract
9. Identifying The Finite Dimensionality Of Curve Time Series
Flavio A. Ziegelmann - UFRGS - abstract
10. Linear Models for Output-Buffered Systems
James Ramsay - McGill University - abstract
11. Mounting Statistical Challenges for Interdisciplinary Research in Clinical Sciences and Bioinformatics
Pranab K. Sen - University of North Carolina at Chapel Hill, USA - abstract
12. A class of dynamic Piecewise Exponential Models with Random Time Grid
Rosangela H. Loschi - UFMG - abstract
13. Bayesian Analysis for the Destructive Negative Binomial Cure Rate Models
Vicente Garibay Cancho - USP - abstract
14. Statistical modeling based on Birnbaum-Saunders distributions: EM-algorithm, robustness and application
Victor Leiva - Universidad de Valparaíso, Chile - abstract
15. Gaussian Inequalities and Conjectures
Wenbo Li - University of Delaware, USA - abstract
James Ramsay - McGill University
How systems transform streams of input information into one or more output behaviours is a preoccupation over virtually all the sciences, whether pure or applied. In many of these fields, models tend to be defined in terms of systems of differential equations, or dynamic systems. Until recently, statisticians had little to say about how to fit such models to data, or about how to draw inferences from data on such systems.
In this talk, we introduce dynamic systems as a relatively simple variation of a standard regression problem indexed by time, and outline an approach to fitting such systems to real data. Data on a variety of real-world input/output situations are used to illustrate what can be achieved by proposing that inputs are buffered in a variety of ways so as to transfer sharp changes in input to smooth output responses.
Victor Leiva - Universidad de Valparaíso - Valparaíso - Chile
The Student-t distribution is habitually used for modeling symmetric data. This family has appealing properties such as robust estimates, easy number generation, and efficient computation of the ML estimates via the EM algorithm; see Dempster et al. (1977) and Lange et al. (1989). The Birnbaum-Saunders (BS) distribution is a positively skewed model that is related to the normal distribution and has received considerable attention; see Johnson et al. (1995), Leiva et al. (2009) and Balakrishnan et al. (2009). In this presentation, robust modeling and influence diagnostics in BS regression models are carried out. Specifically, preliminary aspects related to BS and log-BS distributions and their generalization from the Student-t distribution are presented. Also, Student-t BS regression models, including maximum likelihood estimation based on the EM algorithm and diagnostic tools, are discussed. Finally, the results are applied to real data using an R package, which shows the utility of the proposed model.
Keywords: EM algorithm; Generalized Birnbaum-Saunders distribution; Influence diagnostics; Likelihood methods; Log-linear models; Robustness; Sinh-normal distribution.
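The relation between the BS and normal distributions mentioned in the abstract can be illustrated with the standard representation T = β[αZ/2 + sqrt((αZ/2)^2 + 1)]^2, where Z is standard normal. The sketch below is only an illustration of that transformation and its inverse; the function names and the parameter values are hypothetical and are not taken from the talk or from the R package it mentions.

```python
import numpy as np

def bs_from_normal(z, alpha, beta):
    """Map standard normal values z to Birnbaum-Saunders(alpha, beta) values."""
    z = np.asarray(z, dtype=float)
    half = alpha * z / 2.0
    return beta * (half + np.sqrt(half ** 2 + 1.0)) ** 2

def normal_from_bs(t, alpha, beta):
    """Inverse map: recover the standard normal value behind a BS value t."""
    t = np.asarray(t, dtype=float)
    return (np.sqrt(t / beta) - np.sqrt(beta / t)) / alpha

# Illustrative shape (alpha) and scale (beta); assumptions, not values from the talk.
rng = np.random.default_rng(0)
alpha, beta = 0.5, 2.0
z = rng.standard_normal(10_000)
t = bs_from_normal(z, alpha, beta)  # positive, positively skewed sample
```

Because the map is increasing in Z and sends Z = 0 to β, the scale parameter β is also the median of the BS distribution.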
Beatriz Vaz de Melo Mendes - UFRJ
In this paper we extend our previous work and study robust estimation of pair-copula models through the minimization of weighted goodness-of-fit statistics. Different weight functions emphasize different regions of the unit cube where contaminations may be located. The resulting WMDE estimators are compared to the classical maximum likelihood estimators (MLE) and to their weighted version (WMLE), an estimator obtained in two steps. All estimators are compared in a comprehensive simulation study. For each epsilon-contaminated pair-copula model specified, we show that there is a robust estimator that improves on the MLE and captures the correct strength of dependence in the data, regardless of the contamination percentage and location and of the sample size. Two applications are provided. The first aims to forecast the number of dengue cases in Rio de Janeiro, and the second uses high-frequency financial data.
The Ecological Footprint of Taylor's Power Law
Bent Jørgensen - University of Southern Denmark - Denmark
[This talk will be given in Portuguese]
Half a century ago, the English entomologist L. R. Taylor published a short paper in Nature (Taylor, 1961), where he proposed a simple power law to describe the species-specific relationship between the temporal or spatial variance of animal populations and their mean abundances. If Y is the population count with mean μ, then Taylor's power law says that the variance is Var(Y) = aμ^b, where a and b are parameters. This form of variance function is well-known in statistics, but Taylor's paper marked the beginning of a remarkable development in applied science, characterized by both triumph and controversy.
The possibility of controversy is easy to spot, since the power law harnesses but a single degree of freedom of the complex population dynamics of animals, with infinite possibilities for alternative explanations. Yet Taylor's triumph was that within his lifetime, his law was confirmed empirically again and again to such an extent that it earned the name "universal". As early as 1983, Taylor was able to declare that his law had been observed for 444 different species of birds, moths and aphids sampled over Great Britain (Taylor et al., 1983). Even more remarkable, the following decades witnessed a development, where Taylor's law was observed in an ever expanding variety of different circumstances in areas such as ecology, epidemiology and genetics, ranging from, say, the number of sexual partners reported by HIV infected individuals (Anderson et al., 1988) to the physical distribution of genes on human chromosome 7 (Kendal, 2004).
The apparent lack of a definitive theoretical explanation for Taylor's law has led various authors to make statements such as "Taylor's power law is merely an empirical model which lacks definite theoretical background" (Pedigo and Buntin, 1994), an affirmation that evidently reflected the state of affairs some twenty years ago. Yet already in 1984, M.C.K. Tweedie, an English radiotherapy physicist and statistician, published a paper where he proposed a class of natural exponential families with power variance functions of Taylor’s form (Tweedie, 1984). Then, in 1994 a convergence theorem of central limit type was published (Jørgensen et al., 1994), where the Tweedie distributions appear as limiting laws. Taylor's power law may hence be viewed as the direct manifestation of a central limit effect, thereby providing a plausible explanation for the ubiquity of Taylor's power law in nature.
In the talk I will first explain the type of sampling where Taylor’s power law can be expected to apply, and then I will review some of the extensive empirical evidence for the law. I will also explain the scaling invariance that characterizes Tweedie’s distribution, and leads to the convergence theorem, and touch upon the fractal interpretation of these results. Finally, I will review the main characteristics of the different types of Tweedie distributions, and their interpretation in terms of the clustering behaviour of the population.
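As a small illustration of the law Var(Y) = aμ^b, taking logarithms gives log Var(Y) = log a + b log μ, so a and b can be estimated by a straight-line fit on the log-log scale. The sketch below assumes this simple least-squares approach, which is one common way to fit the law and not necessarily the one used in the talk; the function name and the data are hypothetical.

```python
import numpy as np

def fit_taylor_power_law(means, variances):
    """Estimate a and b in Var(Y) = a * mu**b by least squares on the log-log scale."""
    log_mu = np.log(np.asarray(means, dtype=float))
    log_var = np.log(np.asarray(variances, dtype=float))
    # log Var = log a + b * log mu, so the slope is b and the intercept is log a.
    b, log_a = np.polyfit(log_mu, log_var, 1)
    return float(np.exp(log_a)), float(b)

# Hypothetical, noise-free data generated with a = 2 and b = 1.5.
mu = np.array([1.0, 2.0, 5.0, 10.0, 50.0])
var = 2.0 * mu ** 1.5
a_hat, b_hat = fit_taylor_power_law(mu, var)
```

With real data, each (mean, variance) pair would come from replicated counts of one population, and the fitted slope b is the quantity usually reported in empirical studies of the law.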
R. M. Anderson and R. M. May. Epidemiological parameters of HIV transmission. Nature, 333:514-519, 1988.
B. Jørgensen, J. R. Martínez, and M. Tsao. Asymptotic behaviour of the variance function. Scand. J. Statist., 21:223-243, 1994.
W. S. Kendal. A scale invariant clustering of genes on Human chromosome 7. BMC Evolutionary Biology, 4(3), 2004. URL http://www.biomedcentral.com/1471-2148/4/3.
L. P. Pedigo and G. D. Buntin. Handbook of Sampling Methods for Arthropods in Agriculture. CRC Press, Florida, second edition, 1994.
L. R. Taylor. Aggregation, variance and the mean. Nature, 189:732-735, 1961.
L. R. Taylor, R. A. J. Taylor, I. P. Woiwod, and J. N. Perry. Behavioural dynamics. Nature, 303:801-804, 1983.
M. C. K. Tweedie. An index which distinguishes between some important exponential families. In J. K. Ghosh and J. Roy, editors, Statistics: Applications and New Directions. Proceedings of the Indian Statistical Institute Golden Jubilee International Conference, pages 579-604, Calcutta, 1984. Indian Statistical Institute.
David R. Brillinger - University of California - Berkeley - USA
Suppose there are J interacting particles, indexed by j, moving about in a d-dimensional space. (The case of animals moving about in a reserve, d = 2, will be the example considered in particular.) Suppose that the j-th particle is at location r_j(t) at time t. Discrete time approximations to the following Stochastic Differential Equation (SDE) models will be considered and fit to data obtained for a large reserve in Oregon,
Pranab K. Sen - University of North Carolina at Chapel Hill - USA
In drug development studies, dental and clinical research, and bioinformatics in general, the present emphasis on disease genes and gene-environment interaction has led to unusually high-dimensional data models, often with inadequately small sample sizes. In such interdisciplinary research there is too much information, which may be difficult to disseminate in a valid and precise statistical way. No wonder the advent of modern information technology has opened the doors for computer scientists to prescribe countless algorithms capable of providing ready-made quantitative assessments of acquired data. Actually, data mining or statistical learning tools have been revamped to import the much needed statistical motivation, validity and analysis of such high-dimensional, low-sample-size (HDLSS) models. There are mounting problems and challenging tasks for statistical reasoning to validate and optimise such data mining schemes. In HDLSS models, specific distributional assumptions are highly susceptible to nonrobustness under plausible model departures, particularly when the sample size is disproportionately small, thus preempting routine adaptation of conventional statistical methodology. In clinical trials with dynamic treatment regimens, environmental toxicity studies, toxicological epidemiology and, in general, genomic studies, it is therefore highly desirable to incorporate robust statistical procedures, where robustness is to be interpreted in a much broader way than in conventional parametric models. Structural constraints, identifiability perspectives and molecular biological undercurrents all call for such robust statistical procedures. Some of these challenging problems are appraised along with suitable data models.
Rosangela H. Loschi - Universidade Federal de Minas Gerais - MG - Brazil
(Joint work with Fabio N. Demarqui (UFMG), Dipak K. Dey (U. of Connecticut) and Enrico A. Colosimo (UFMG))
We introduce a general class of piecewise exponential models (PEM) in which the time grid is random and both the failure rates and the regression coefficients in different intervals are correlated. This new approach extends static models such as those in the book of Ibrahim et al. (2001b) as well as the dynamic model based on an arbitrary time grid proposed by Gamerman (Appl. Stat., 1991). By considering a dynamic generalized modeling approach along with a product distribution for the random time grid, we assume a class of correlated Gamma prior distributions for the failure rates and use the clustering structure of the product partition model (PPM) to model the randomness of the time grid of the PEM. We further develop procedures to evaluate the performance of the proposed model and carry out a sensitivity analysis for different prior specifications. Finally, to illustrate the use of the proposed model, we analyze the survival times of patients with brain cancer. The data set is obtained from the SEER (Surveillance, Epidemiology and End Results) database. The results are compared with those obtained by fitting the dynamic model proposed by Gamerman (Appl. Stat., 1991).
Financial support: CNPq and FAPEMIG
Flavio A. Ziegelmann - Universidade Federal do Rio Grande do Sul - RS - Brazil
(Joint work with Qiwei Yao (LSE) and Neil Bathia (LSE))
The curve time series framework provides a convenient vehicle to accommodate some nonstationary features into a stationary setup. We propose a new method to identify the dimensionality of curve time series based on the dynamical dependence across different curves. The practical implementation of our method boils down to an eigenanalysis of a finite-dimensional matrix. Furthermore, the determination of the dimensionality is equivalent to the identification of the nonzero eigenvalues of the matrix, which we carry out in terms of some bootstrap tests. Asymptotic properties of the proposed method are investigated. In particular, our estimators for zero eigenvalues enjoy the fast convergence rate n, while the estimators for nonzero eigenvalues converge at the standard √n-rate. The proposed methodology is illustrated with both simulated and real data sets.
Edwin M. M. Ortega - ESALQ - Universidade de São Paulo - SP - Brazil
In interval-censored survival data, the event of interest is not observed exactly but is only known to occur within some time interval. Such data appear very frequently. In this paper, we are concerned only with parametric forms, and so a location-scale regression model based on the exponentiated Weibull distribution is proposed for modeling interval-censored data. We show that the proposed log-exponentiated Weibull regression model for interval-censored data represents a parametric family of models that includes other regression models that are broadly used in lifetime data analysis. For interval-censored data, we employ a frequentist analysis, a jackknife estimator, a parametric bootstrap and a Bayesian analysis for the parameters of the proposed model. We derive the appropriate matrices for assessing local influences on the parameter estimates under different perturbation schemes and present some ways to assess global influences. Furthermore, for different parameter settings, sample sizes and censoring percentages, various simulations are performed; in addition, the empirical distributions of some modified residuals are displayed and compared with the standard normal distribution. These studies suggest that the residual analysis usually performed in normal linear regression models can be straightforwardly extended to a modified deviance residual in log-exponentiated Weibull regression models for interval-censored data.
Keywords: Censored data; Exponentiated Weibull distribution; Regression model; Residual analysis; Sensitivity analysis.
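To make the interval-censoring idea concrete: when an event is only known to fall in an interval (L, U], its likelihood contribution is F(U) - F(L). The sketch below applies this with the exponentiated Weibull distribution function F(t) = [1 - exp(-(t/scale)^shape)]^power. It is a simplified illustration without covariates, not the location-scale regression model of the talk; the function names, parameter names and data are hypothetical.

```python
import numpy as np

def exp_weibull_cdf(t, shape, scale, power):
    """CDF of the exponentiated Weibull: [1 - exp(-(t/scale)**shape)]**power."""
    t = np.asarray(t, dtype=float)
    return (1.0 - np.exp(-(t / scale) ** shape)) ** power

def interval_censored_loglik(left, right, shape, scale, power):
    """Log-likelihood when each event is only known to lie in (left, right]."""
    prob = (exp_weibull_cdf(right, shape, scale, power)
            - exp_weibull_cdf(left, shape, scale, power))
    return float(np.sum(np.log(prob)))

# Hypothetical intervals; right = inf encodes a right-censored observation.
left = np.array([0.0, 1.0, 2.0])
right = np.array([1.0, 3.0, np.inf])
loglik = interval_censored_loglik(left, right, shape=1.5, scale=2.0, power=0.8)
```

Setting power = 1 recovers the ordinary Weibull distribution, which is one example of the special cases the abstract refers to.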
Vicente Garibay Cancho - Universidade de São Paulo - SP - Brazil
In this paper we develop a Bayesian analysis based on a new flexible cure rate survival model. In our approach the number of competing causes of the event of interest is assumed to follow a compound negative binomial distribution. This model is more flexible in terms of dispersion than the promotion time cure model. Moreover, it gives an interesting and realistic interpretation of the biological mechanism of the occurrence of the event of interest, as it includes a destructive process of the initial risk factors in a competitive scenario. In other words, what is recorded is only the undamaged portion of the original number of risk factors. Markov chain Monte Carlo (MCMC) methods are used to develop a Bayesian procedure for the proposed model. Model selection is also discussed, and an illustration with a real data set is given.
Keywords: competing risks; cure rate models; long-term survival models; negative binomial distribution.
Alejandro César Frery - Universidade Federal de Alagoas - AL - Brazil
Wireless sensor networks consist of a large number of nodes able to (i) take measurements, (ii) process and store them, and (iii) communicate with other nodes. These massively distributed devices make it possible to monitor a range of phenomena with several advantages over other approaches: reduced cost, high spatial and temporal sensing density, and reduced risk of accidents for operators. In this talk we present wireless sensor networks as devices for sampling and reconstructing stochastic signals, and comment on some of the many research directions in the area.
Antonio Sanhueza - Universidad de La Frontera - Chile
Although the Gaussian or normal distribution is one of the most used probability models in statistics for modeling continuous variates, there are other models with good properties that are appropriate for modeling this class of variates, mainly when it has a non-negative range. One of these models is the inverse Gaussian (IG) distribution, a flexible probability model with non-negative support and attractive properties that has been widely studied and applied. For instance, the IG distribution belongs to the exponential family, it is related to the chi-square distribution, and it is closed under convolution. It has been applied in diverse fields, including agriculture, biology, economics, engineering, environmental sciences and physics.
This talk will present a new family of "log-distributions" related to the IG distribution, called the sinhyperbolic mixture inverse Gaussian. It will also consider the "associated distribution" of the sinhyperbolic mixture inverse Gaussian, referred to as the extended mixture inverse Gaussian. These two new models related to the IG distribution will be characterized from a statistical and probabilistic point of view. A wide range of problems associated with these two models can be addressed, including parameter estimation and diagnostics.
Finally, a new class of models will also be considered, generated from symmetrical distributions in R and generalizing the IG distribution, which we call inverse Gaussian type models. In addition, we introduce a regression model based on this new family of distributions. The inverse Gaussian type distribution is a more flexible model for fitting different types of data. Moreover, this model also produces qualitatively robust parameter estimates in the presence of atypical data. Specifically, its theoretical characterization is presented, including inference on the parameters, regression, and diagnostics under the assumption of the inverse Gaussian type distribution. Furthermore, we discuss some applications to real data sets.
Wenbo Li - University of Delaware - USA
Gaussian inequalities play a fundamental role in the study of high dimensional probability. We first provide an overview of various Gaussian inequalities and then present several recent results and conjectures for Gaussian measure/vectors, together with various applications.
Bent Jørgensen - University of Southern Denmark
Conventional central limit theory concerns the asymptotic normality of centered and scaled sums of i.i.d. random variables with finite variances. The corresponding enormous domain of attraction of the normal distribution is, in a certain sense, also its biggest liability, because no other feature of the distribution beyond its first two moments is used in the normal approximation. A more differentiated approach is obtained by considering so-called Tweedie asymptotics, which involve exponentially tilted and scaled sums of i.i.d. random variables with finite variances. The resulting set of limiting distributions is the three-parameter class of Tweedie exponential dispersion models with power variance functions. Tweedie asymptotics covers a considerable range of different types of asymptotics, ranging from Poisson and compound Poisson convergence via a gamma approximation to results involving exponentially tilted extreme stable distributions and Lévy processes. We review the area of Tweedie asymptotics, including recent results and open problems.
Brani Vidakovic - GaTech / Emory University School of Medicine
Many environmental time-evolving spatial phenomena are characterized by a large number of energetic modes, the occurrence of irregularities, and self-organization over a wide range of space or time scales. Precipitation is a classical example characterized by both strong intermittency and multi-scale dynamics, and indeed it is partly these features that generate persistence, long-range dependence, and extremes (whether droughts or extreme floods). Over the last two decades, time-frequency or time-scale transforms have become indispensable tools in the analysis of such phenomena, and as a consequence, a number of wavelet-based spectral methods are now routinely employed to estimate Hurst exponents and other measures of regularity and scaling. We propose new wavelet-based spectra for the analysis of 2-D images. The new Covariance Wavelet Spectrum is applied to the analysis of time sequences of two-dimensional spatial rainfall radar images characterized by either convective or frontal systems. Intermittent spatial patterns connected to the precipitation formation mechanisms were encoded in low-dimensional informative descriptors appropriate for classification and discrimination analysis and possible integration with climate models. We found that convective rainfall spatial patterns compared to frontal patterns produce spectral signatures consistent with more irregular fields.
This is joint work with Pepa Ramírez-Cobo, Kichun Sky Lee, Annalisa Molini, Amilcare Porporato, and Gabriel Katul.