PPGI - Doctoral Theses

Recent Submissions

Now showing 1 - 5 of 17
  • Item
    CRF+LG: uma abordagem híbrida para o reconhecimento de entidades nomeadas em português [CRF+LG: a hybrid approach for named entity recognition in Portuguese]
    (Universidade Federal do Espírito Santo, 2019-02-07) Pirovani, Juliana Pinheiro Campos; Oliveira, Elias Silva de; Laporte, Éric; Lima, Priscila Machado Vieira; Ciarelli, Patrick Marques; Gonçalves, Claudine Santos Badue
    Named Entity Recognition involves automatically identifying and classifying entities such as persons, places, and organizations, and it is an important task in Information Extraction. Named Entity Recognition systems can be developed using linguistic, machine learning, or hybrid approaches. This work proposes a hybrid approach, called CRF+LG, for Named Entity Recognition in Portuguese texts, in order to exploit the advantages of both the linguistic and the machine learning approaches. The proposed approach uses Conditional Random Fields (CRF), taking the term classification obtained by a Local Grammar (LG) as an additional feature. Conditional Random Fields is a probabilistic method for structured prediction. Local grammars are handmade rules that identify expressions within the text. The aim was to study this way of including human expertise (the Local Grammar) in the CRF machine learning approach and to analyze how it contributes to performance. To this end, a Local Grammar was built to recognize the 10 named entity categories of HAREM, a joint evaluation for Named Entity Recognition in Portuguese. Initially, the Golden Collections of the First and Second HAREM, considered a reference for Named Entity Recognition systems in Portuguese, were used as training and test sets, respectively, to evaluate CRF+LG. The proposed approach was then evaluated on two other datasets. The results outperform those of systems reported in the literature that were evaluated under equivalent conditions. The gain was approximately 8 percentage points in F-measure over a system that also used CRF and 2 points over a system that used Neural Networks. Some systems that used Neural Networks achieved superior results, but they relied on massive corpora for unsupervised feature learning, which was not the case in this work. 
The Local Grammar can be used on its own when no training set is available, or in conjunction with other machine learning techniques to improve their performance. We also analyzed the bounds (lower and upper) of the proposed approach. The lower bound indicates the minimum performance, and the upper bound indicates the maximum gain achievable for this task with the approach.
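As a rough illustration of the idea described in the abstract above (feeding the Local Grammar classification to the CRF as one more feature), the sketch below builds per-token feature dictionaries of the kind commonly passed to CRF toolkits. The feature names, the helper functions `token_features` and `sentence_features`, and the example sentence are assumptions made here for illustration; the abstract does not specify the actual feature set used in the thesis.

```python
def token_features(tokens, lg_labels, i):
    """Feature dict for token i; the Local Grammar label is one extra feature."""
    word = tokens[i]
    feats = {
        "word.lower": word.lower(),
        "word.istitle": word.istitle(),
        "word.isdigit": word.isdigit(),
        # Label assigned by the hand-built Local Grammar ("O" = no entity).
        "lg_label": lg_labels[i],
    }
    if i > 0:
        feats["prev.lower"] = tokens[i - 1].lower()
        feats["prev.lg_label"] = lg_labels[i - 1]
    else:
        feats["BOS"] = True  # beginning of sentence
    if i < len(tokens) - 1:
        feats["next.lower"] = tokens[i + 1].lower()
        feats["next.lg_label"] = lg_labels[i + 1]
    else:
        feats["EOS"] = True  # end of sentence
    return feats


def sentence_features(tokens, lg_labels):
    """One feature dict per token, ready to hand to a CRF trainer."""
    return [token_features(tokens, lg_labels, i) for i in range(len(tokens))]


# Hypothetical sentence and hypothetical Local Grammar output.
tokens = ["Juliana", "mora", "em", "Vitória"]
lg_labels = ["PESSOA", "O", "O", "LOCAL"]
features = sentence_features(tokens, lg_labels)
```

With this layout the CRF remains free to trust or override the grammar's suggestion, since `lg_label` is only one feature among several.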
  • Item
    RDNA: arquitetura definida por resíduos para redes de data centers [RDNA: residue-defined architecture for data center networks]
    (Universidade Federal do Espírito Santo, 2018-08-24) Liberato, Alextian Bartholomeu; Ribeiro, Moisés Renato Nunes; Martinello, Magnos; Rothenberg, Christian Rodolfo Esteve; Sampaio, Leobino Nascimento; Mota, Vinícius Fernandes Soares; Villaça, Rodolfo da Silva
    Information and communication technologies are increasingly used. Institutions and users require high-quality connectivity to their data, expecting instant access anytime, anywhere. An essential element in providing this quality is the architecture of the communication network in Data Center Networks (DCNs), because a significant part of Internet traffic depends on data communication and processing that take place within the Data Center (DC) infrastructure. However, the routing protocols, forwarding model, and management currently in use prove insufficient to meet current demands for cloud connectivity. This is mainly due to the dependency on table lookup operations, which increases end-to-end latency. In addition, traditional recovery mechanisms rely on additional state in the switch tables, which increases the complexity of management routines and drastically reduces the scalability of route protection. Another difficulty is multicast communication within the DC: existing solutions are complex to implement and do not support group configuration at the rates currently required. In this context, this thesis explores the residue number system, centered on the Chinese remainder theorem (CRT), as the foundation for the design of a new routing system for DCNs. More specifically, we introduce the RDNA architecture, which advances the state of the art by simplifying the forwarding model in the core so that it is based on the remainder of a division (modulo). The route is defined by the residue of a route identifier with respect to the local identifiers (prime numbers) assigned to the core switches. Edge switches are configured with flows according to the network policy defined by the controller. Each flow is mapped at the edge to a primary and an emergency route identifier. 
These residue operations allow the packet to be forwarded through the respective output port. In failure situations, the emergency route identifier enables fast recovery by sending packets through an alternate output port. RDNA scales under the assumption of a 2-tier Clos network topology, which is widely used in DCNs. To compare RDNA with other works in the literature, we analyzed scalability in terms of the number of bits required for unicast and multicast communication. The analysis varied the number of nodes in the network, the degree of the nodes, and the number of physical hosts for each topology. In unicast communication, RDNA reduced the header size by a factor of 4.5 compared to the COXCast proposal. In multicast communication, a linear programming model is designed to minimize a polynomial function. RDNA reduced header size by up to 50% for the same number of members per group. As a proof of concept, two prototypes were implemented, one in the Mininet emulated environment and another on the NetFPGA SUME platform. The results show that RDNA achieves deterministic latency in packet forwarding, 600 nanoseconds of switching time per core element, ultra-fast failure recovery on the order of microseconds, and no latency variation (no jitter) in the core network.
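The residue-based forwarding described in this abstract can be illustrated with a small worked example. The sketch below uses the Chinese remainder theorem to compute a single route identifier whose remainder at each core switch equals the desired output port. The switch identifiers (5, 7, 11), the port numbers, and the function names `route_id` and `forward` are hypothetical values chosen here; the abstract does not give RDNA's actual encoding details. It assumes Python 3.8 or later for `math.prod` and the modular inverse form of `pow`.

```python
from math import gcd, prod


def route_id(switch_ids, out_ports):
    """Smallest non-negative R with R % switch_ids[i] == out_ports[i] for all i.

    Switch identifiers must be pairwise coprime (distinct primes satisfy this),
    and each port number must be smaller than its switch identifier.
    """
    for i, a in enumerate(switch_ids):
        for b in switch_ids[i + 1:]:
            if gcd(a, b) != 1:
                raise ValueError("switch identifiers must be pairwise coprime")
    for s, p in zip(switch_ids, out_ports):
        if not 0 <= p < s:
            raise ValueError("port must satisfy 0 <= port < switch identifier")

    modulus = prod(switch_ids)
    total = 0
    for s, p in zip(switch_ids, out_ports):
        partial = modulus // s              # product of all the other identifiers
        inverse = pow(partial, -1, s)       # modular inverse of partial modulo s
        total += p * partial * inverse
    return total % modulus


def forward(route, switch_id):
    """Table-free forwarding decision at a core switch: one modulo operation."""
    return route % switch_id


# Hypothetical path through three core switches, leaving on ports 2, 3 and 4.
primary = route_id([5, 7, 11], [2, 3, 4])   # 367
ports = [forward(primary, s) for s in (5, 7, 11)]
```

An emergency route could be encoded the same way as a second identifier with alternate ports, which is consistent with the abstract's description of a primary and an emergency route identifier per flow.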
  • Item
    Indexação multidimensional para problemas da mochila multiobjetivo com paretos de alta cardinalidade [Multidimensional indexing for multiobjective knapsack problems with high-cardinality Pareto sets]
    (Universidade Federal do Espírito Santo, 2018-07-31) Baroni, Marcos Daniel Valadão; Varejão, Flávio Miguel; Rodrigues, Alexandre Loureiros; Martins, Simone de Lima; Rauber, Thomas Rauber; Boeres, Maria Claudia Silva
    Several real problems involve the simultaneous optimization of multiple criteria, which generally conflict with each other. These problems are called multiobjective and do not have a single solution, but a set of solutions of interest, called efficient or non-dominated solutions. One of the great challenges in solving this type of problem is the size of the solution set, which tends to grow rapidly with the size of the instance, degrading algorithm performance. Among the most studied multiobjective problems is the multiobjective knapsack problem, with which several real problems can be modeled. This work proposes accelerating the solution process of the multiobjective knapsack problem by using a k-d tree as a multidimensional index structure to assist in the manipulation of solutions. The performance of the approach is analyzed through computational experiments performed in the exact context using a state-of-the-art algorithm. Tests are also performed in the heuristic context, using an adaptation of a metaheuristic to the problem in question, which is also a contribution of the present work. According to the results, the proposal was effective in the exact context, with a speedup of up to 2.3 for bi-objective cases and 15.5 for 3-objective cases, but not effective in the heuristic context, where it had little impact on computational time. In all cases, however, there was a considerable reduction in the number of solution evaluations.
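To make the role of the index concrete, the following minimal sketch stores objective vectors in a k-d tree and answers the query "is this candidate dominated by any stored solution?" while skipping subtrees that cannot contain a dominator. It assumes a maximization problem (as with knapsack profits); the class and method names are hypothetical, and the sketch is not the data structure or algorithm from the thesis, whose details the abstract does not give.

```python
def dominates(p, q):
    """True if p is at least as good as q everywhere and better somewhere (maximization)."""
    return all(a >= b for a, b in zip(p, q)) and any(a > b for a, b in zip(p, q))


class Node:
    def __init__(self, point, axis):
        self.point = point
        self.axis = axis      # coordinate used to split at this node
        self.left = None      # points with coordinate <  point[axis]
        self.right = None     # points with coordinate >= point[axis]


class KDTree:
    def __init__(self, k):
        self.k = k            # number of objectives
        self.root = None

    def insert(self, point):
        if self.root is None:
            self.root = Node(point, 0)
            return
        node = self.root
        while True:
            side = "left" if point[node.axis] < node.point[node.axis] else "right"
            child = getattr(node, side)
            if child is None:
                setattr(node, side, Node(point, (node.axis + 1) % self.k))
                return
            node = child

    def has_dominator(self, q):
        """Search for a stored point that dominates q, pruning hopeless subtrees."""
        stack = [self.root]
        while stack:
            node = stack.pop()
            if node is None:
                continue
            if dominates(node.point, q):
                return True
            # The right subtree holds larger coordinates, so it may always help.
            stack.append(node.right)
            # The left subtree holds values below the split; it can only contain
            # a dominator if q itself lies below the split on this axis.
            if q[node.axis] < node.point[node.axis]:
                stack.append(node.left)
        return False


# Hypothetical bi-objective solutions.
tree = KDTree(k=2)
for solution in [(3, 5), (5, 3), (4, 4)]:
    tree.insert(solution)
```

The pruning rule is what reduces the number of solution evaluations relative to a linear scan over the whole set: a subtree is visited only when its coordinate range can still hold a dominating point.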
  • Item
    An alternative approach of parallel preconditioning for 2D finite element problems
    (Universidade Federal do Espírito Santo, 2018-06-29) Lima, Leonardo Muniz de; Catabriga, Lucia; Almeida, Regina Célia Cerqueira de; Santos, Isaac Pinheiro dos; Souza, Alberto Ferreira de; Elias, Renato Nascimento
    We propose an alternative approach to parallel preconditioning for 2D finite element problems. The technique consists of a suitable domain decomposition with reordering that produces narrow-band linear systems from the finite element discretization, making it possible to apply, without significant effort, traditional preconditioners such as Incomplete LU Factorization (ILU) or even sophisticated parallel preconditioners such as SPIKE. Another feature of the approach is the ease of recalculating finite element matrices, whether for nonlinear corrections or for time integration schemes. This means that the whole finite element application runs in parallel, not just the solution of the linear system. We also employ preconditioners based on element-by-element storage with minimal adjustments. The robustness and scalability of these parallel preconditioning strategies are demonstrated on a set of benchmark experiments. We consider a group of two-dimensional fluid flow problems modeled by transport and Euler equations to evaluate ILU, SPIKE, and some element-by-element preconditioners. Moreover, our approach provides load balancing and improves MPI communication. We study load balancing and MPI communication using analysis tools such as TAU (Tuning and Analysis Utilities).
  • Item
    Avaliação da aprendizagem em jogos digitais baseada em learning analytics sobre dados multimodais [Learning assessment in digital games based on learning analytics over multimodal data]
    (Universidade Federal do Espírito Santo, 2018-09-21) Pereira Junior, Heraclito Amancio; Menezes, Crediné Silva de; Souza, Alberto Ferreira de; Castro Junior, Alberto Nogueira de; Queiroz, Sávio Silveira de; Tavares, Orivaldo de Lira; Cury, Davidson
    The use of digital games as a pedagogical tool has been successfully applied in the development of the skills, abilities and attitudes required of 21st century professionals, in primary and secondary education as well as in vocational training. Despite this, one issue has worried educators who consider using digital games: "How to assess the learning of digital games?". Assessment is an important part of the teaching-learning process. This importance, especially with regard to learning based on computational resources, including digital games, led to the emergence of a research area called Learning Analytics, which "applies techniques and methods of Computer Science, Pedagogy, Sociology, Psychology, Neuroscience and Statistics for the analysis of data collected during educational processes". To better support these assessments, data collection has also considered multimodal data, that is, data from different manifestations of the student captured by sensors during the learning process (touches, gestures, voices and facial expressions). Although publications indicate that some methods, techniques and tools have been researched to support learning assessment in computer-based learning environments, and these studies have already obtained some results, they have not yet been sufficient to provide clear, comprehensive. In particular, with regard to digital games, there is still limited availability of consolidated resources for assessing student learning during play, which has been one of the major factors hindering their wider use for educational purposes. 
This work contributes to the solution of this problem through: a computational platform, in the form of a framework, designed on the basis of the techniques and methods of Learning Analytics; a specialization of the ECD (Evidence-Centered Design) approach for designing assessments of learning based on digital games; and a process that organizes the stages and activities of this type of assessment. Experiments reported here, using an instance of the framework, have demonstrated its merit as an assessment tool, as well as that of the ECD specialization and the process.