Algorithms for estimation of variable length Markov Chains and applications

David Henriques da Matta / Universidade Federal de Goiás
Many studies in linguistics analyze the differences between Brazilian Portuguese and European Portuguese (henceforth BP and EP, respectively). BP and EP share the same set of words (lexicon), but they differ in syntax and prosody. The key point in distinguishing them is related to the question of finding estimation methods. To better understand this theoretical context, we discuss some basic concepts of variable length Markov chains and present a simulation study that looks for evidence on whether BIC or AIC should be used as the model selection criterion for tuning the pruning constant of the Context algorithm (Rissanen (1983); Bühlmann and Wyner (1999)).
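
As a rough illustration of how the two criteria penalize model size, the sketch below compares AIC and BIC on a symbol sequence. It is a simplification and not the Context algorithm itself: it assumes full Markov chains of fixed order rather than pruned context trees, and the function names (markov_log_likelihood, select_order) are hypothetical, introduced only for this example.

```python
import math
from collections import Counter


def markov_log_likelihood(seq, order, start):
    """Maximized log-likelihood of a fixed-order Markov chain.

    Symbols before position `start` serve only as conditioning context,
    so models of different orders are scored on the same observations.
    """
    context_counts = Counter()
    transition_counts = Counter()
    for i in range(start, len(seq)):
        ctx = tuple(seq[i - order:i])
        context_counts[ctx] += 1
        transition_counts[(ctx, seq[i])] += 1
    # Plug in the empirical transition probabilities n(ctx, a) / n(ctx).
    return sum(
        n * math.log(n / context_counts[ctx])
        for (ctx, _symbol), n in transition_counts.items()
    )


def select_order(seq, max_order, alphabet_size):
    """Return the order chosen by AIC and by BIC among 0..max_order."""
    n_obs = len(seq) - max_order
    aic, bic = {}, {}
    for order in range(max_order + 1):
        loglik = markov_log_likelihood(seq, order, start=max_order)
        # A full chain of this order has |A|^order contexts,
        # each with |A| - 1 free transition probabilities.
        n_params = (alphabet_size ** order) * (alphabet_size - 1)
        aic[order] = -2.0 * loglik + 2.0 * n_params
        bic[order] = -2.0 * loglik + n_params * math.log(n_obs)
    return {"AIC": min(aic, key=aic.get), "BIC": min(bic, key=bic.get)}
```

For a strictly alternating binary sequence such as "01" repeated 50 times, select_order(seq, 2, 2) picks order 1 under both criteria. On noisier data the two can disagree, because the BIC penalty grows with the sample size while the AIC penalty does not.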

