Robust principal component analysis of air pollution data: application to the optimization of a monitoring network

Date
2014-10-30
Authors
Cotta, Higor Henrique Aranda
Publisher
Universidade Federal do Espírito Santo
Abstract
Studies of air pollution data from an air quality monitoring network involve a large number of variables and observations. Each variable of interest can be analyzed separately with standard statistical techniques, but that kind of analysis cannot capture the dynamic relationships among the variables. Techniques that handle, measure, and analyze the jointly generated data are therefore needed; this branch of statistics is known as multivariate statistics. One important multivariate technique in air pollution research is Principal Component Analysis (PCA), which builds linear combinations of the variables to explain the variance-covariance structure of the original data. In air pollution studies, PCA is used to create air quality indexes, identify pollution sources, manage air quality monitoring networks, and preprocess variables for generalized additive models, among other applications. In this work, PCA is used to study the management and sizing of the Air Quality Monitoring Network of the Greater Vitória Region.
This work also deals with the use of PCA on time series with additive outliers. PCA, one of the most important multivariate techniques, constructs linear combinations that explain the variance-covariance structure of the original data. Although PCA assumes that the data are serially independent, this assumption does not hold in practice for time series such as air pollution data. PCs calculated from time series observations retain their orthogonality, but the components are auto- and cross-correlated, to a degree that depends on the correlation structure of the original series. Studying these properties and their impact on the use of PCA is one of the main objectives of this work.
Another contribution is the study of PCA for time series in the presence of additive outliers, for which a Robust PCA (RPCA) method is proposed. Additive outliers in time series are well known to destroy the correlation structure of the data. Since the PCs are computed from the covariance matrix, the outliers also affect the properties of the PCs, so a robust PCA should be used in this context. The Robust PCA method proposed here is justified empirically and theoretically, and a real air pollution time series data set is used to show its usefulness in a real application.
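The effects described in the abstract can be illustrated with a small simulation. The sketch below is not the dissertation's method: the data are synthetic (three series driven by a common AR(1) factor), the outlier size and rate are arbitrary, and the robust covariance (`rank_mad_cov`, a Spearman rank correlation rescaled by the median absolute deviation) is a hypothetical stand-in chosen only because it is short. It compares how far the first loading vector moves under additive outliers for the classical and the robust estimate, and checks that PC scores computed from a time series are uncorrelated at lag zero yet still autocorrelated.

```python
import numpy as np


def first_loading(cov):
    """Return the unit eigenvector of `cov` with the largest eigenvalue."""
    _, vecs = np.linalg.eigh(cov)  # eigh returns eigenvalues in ascending order
    return vecs[:, -1]


def rank_mad_cov(x):
    """Illustrative robust covariance (an assumption, not the thesis estimator):
    Spearman rank correlation rescaled by MAD-based scales."""
    ranks = np.argsort(np.argsort(x, axis=0), axis=0).astype(float)
    corr = np.corrcoef(ranks, rowvar=False)
    med = np.median(x, axis=0)
    scale = 1.4826 * np.median(np.abs(x - med), axis=0)  # MAD scaled for normal data
    return corr * np.outer(scale, scale)


def lag1_autocorr(z):
    """Sample lag-1 autocorrelation of a one-dimensional series."""
    z = z - z.mean()
    return float(np.dot(z[1:], z[:-1]) / np.dot(z, z))


rng = np.random.default_rng(0)
n, phi = 2000, 0.7

# Common AR(1) factor shared by three synthetic "pollutant" series.
factor = np.zeros(n)
shocks = rng.standard_normal(n)
for t in range(1, n):
    factor[t] = phi * factor[t - 1] + shocks[t]
clean = factor[:, None] + 0.5 * rng.standard_normal((n, 3))

# Additive outliers: shift 5% of the third series by +/-15 at random times.
contaminated = clean.copy()
hit = rng.choice(n, size=n // 20, replace=False)
contaminated[hit, 2] += 15.0 * rng.choice([-1.0, 1.0], size=hit.size)

v_clean = first_loading(np.cov(clean, rowvar=False))
v_classic = first_loading(np.cov(contaminated, rowvar=False))
v_robust = first_loading(rank_mad_cov(contaminated))

# |cosine| between the clean-data loading and each contaminated-data loading.
cos_classic = abs(float(v_clean @ v_classic))
cos_robust = abs(float(v_clean @ v_robust))

# PC scores of the clean series: orthogonal at lag 0, but serially correlated.
_, vecs = np.linalg.eigh(np.cov(clean, rowvar=False))
scores = (clean - clean.mean(axis=0)) @ vecs[:, ::-1]  # columns: PC1, PC2, PC3
score_cov = np.cov(scores, rowvar=False)
pc1_lag1 = lag1_autocorr(scores[:, 0])

print("alignment with clean loading, classical:", round(cos_classic, 3))
print("alignment with clean loading, robust:   ", round(cos_robust, 3))
print("cov(PC1, PC2) at lag 0:", score_cov[0, 1])
print("lag-1 autocorrelation of PC1:", round(pc1_lag1, 3))
```

In this setup the outliers inflate the variance of the contaminated series, so the classical first component tilts toward that series, while the rank-based estimate stays close to the loading obtained from the clean data. Any other robust covariance estimator could be substituted for `rank_mad_cov` without changing the rest of the sketch.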
Keywords
Principal component analysis, Air pollution, Time series analysis, Time domain, Frequency domain, Outliers, Robustness
Citation
COTTA, Higor Henrique Aranda. Análise de componentes principais robusta em dados de poluição do ar: aplicação à otimização de uma rede de monitoramento. 2014. 74 f. Dissertação (Mestrado em Engenharia Ambiental) - Universidade Federal do Espírito Santo, Centro Tecnológico, Vitória, 2014.