Appearance-based global localization with a hybrid weightless-weighted neural network approach

Silva, Avelino Forechi

Files

thesis_avelino.pdf (4.47 MB)

Date

2018-02-02

Authors

Silva, Avelino Forechi

Publisher

Universidade Federal do Espírito Santo

Abstract

Self-driving cars currently rely heavily on the Global Positioning System (GPS) infrastructure, yet there is growing demand for alternative global localization methods that work in GPS-denied environments. One such method is appearance-based global localization, which associates images of places with their corresponding positions. It is appealing given the large number of geotagged photos publicly available and the ubiquity of devices fitted with ultra-high-resolution cameras, motion sensors and multicore processors. Appearance-based global localization can be formulated as a topological or a metric solution, depending on whether it is modelled as a classification or a regression problem, respectively. Common topological approaches often work in the spatial dimension and less frequently in the temporal dimension, but not in both simultaneously. This thesis proposes an integrated spatio-temporal solution based on an ensemble of kNN classifiers, where each classifier uses Dynamic Time Warping (DTW) and the Hamming distance to compare binary features extracted from sequences of images. Each base learner is fed its own set of binary features extracted from the images. The solution solves the global localization problem in two phases: mapping and localization. During mapping, it is trained with a sequence of images and associated locations that represent episodes experienced by a robot. During localization, it receives subsequences of images of the same environment and compares them to its previously experienced episodes, trying to recollect the most similar “experience” in time and space at once. The system then outputs the positions where it “believes” these images were captured. Although the method is fast to train, its localization cost scales linearly with the number of training samples, because the Hamming distance must be computed between each test sample and the stored training samples.
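The spatio-temporal matching described above can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the packing of binary features into Python integers, the use of k = 1, the fixed window length and the majority vote are all assumptions made for illustration.

```python
from collections import Counter


def hamming(a: int, b: int) -> int:
    """Count differing bits between two binary feature vectors packed as ints."""
    return bin(a ^ b).count("1")


def dtw_distance(query, reference):
    """Classic DTW cost between two feature sequences, with Hamming as local cost."""
    n, m = len(query), len(reference)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = hamming(query[i - 1], reference[j - 1])
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def nearest_place(query, mapped_features, mapped_places, window):
    """1-NN search over every mapped window (hypothetical helper).

    Returns the place label of the last frame of the best-matching window.
    The scan is linear in the number of mapped samples.
    """
    best_cost, best_place = float("inf"), None
    for start in range(len(mapped_features) - window + 1):
        candidate = mapped_features[start:start + window]
        c = dtw_distance(query, candidate)
        if c < best_cost:
            best_cost, best_place = c, mapped_places[start + window - 1]
    return best_place


def ensemble_localize(queries, maps, mapped_places, window):
    """Majority vote over base learners, each with its own binary feature set."""
    votes = Counter(
        nearest_place(q, m, mapped_places, window) for q, m in zip(queries, maps)
    )
    return votes.most_common(1)[0][0]
```

Each base learner scans the whole mapped sequence, which reflects the linear scaling with the number of training samples noted in the abstract.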
While building a map, one often collects highly correlated and redundant data around the environment of interest, for example because of high-frequency sensors or repeated trajectories. If not treated appropriately during the mapping phase, this extra data burdens memory and runtime performance at test time. To tackle this problem, a clustering algorithm is employed to compress the network’s memory after mapping. For large-scale environments, the clustering algorithms are combined with a multi-hashing data structure, seeking the best compromise between classification accuracy, runtime performance and memory usage. So far, this covers only the topological part of the solution to the global localization problem, which is not precise enough for autonomous car operation. Instead of just recognizing places and outputting an associated pose, a global localization system should regress a pose given a current image of a place. However, inferring poses directly for city-scale scenes is unfeasible, at least with decimetric precision. The proposed approach is as follows: first, a live image is taken from the camera and the aforementioned localization system returns the most similar image-pose pair from a topological database built beforehand in the mapping phase. Then, given the live and mapped images, a visual localization system outputs the relative pose between them. To solve this relative camera pose estimation problem, a Convolutional Neural Network (CNN) is trained to take as input two images separated in time and space and to output a 6 Degree of Freedom (DoF) pose vector representing the relative position and orientation between the input images. Together, both systems solve the global localization problem using topological and metric information to approximate the actual robot pose.
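The way the two stages combine can be sketched as a pose composition: the topological stage supplies the global pose of the retrieved mapped image, and the relative pose is chained onto it. This is a minimal sketch under assumptions not stated in the abstract: the 6-DoF vector is taken to be (x, y, z, roll, pitch, yaw) in metres and radians, rotations follow the Z-Y-X convention, and the relative pose is expressed in the frame of the mapped camera. The function names are hypothetical.

```python
import math


def pose_to_matrix(pose):
    """Convert (x, y, z, roll, pitch, yaw) to a 4x4 homogeneous transform.

    Assumed convention: R = Rz(yaw) * Ry(pitch) * Rx(roll).
    """
    x, y, z, roll, pitch, yaw = pose
    cr, sr = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr, x],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr, y],
        [-sp, cp * sr, cp * cr, z],
        [0.0, 0.0, 0.0, 1.0],
    ]


def matmul4(a, b):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
        for i in range(4)
    ]


def compose_global_pose(mapped_pose, relative_pose):
    """Chain the mapped image's global pose with the estimated relative pose."""
    return matmul4(pose_to_matrix(mapped_pose), pose_to_matrix(relative_pose))


# Illustrative values only: mapped image at (10, 5, 0) facing +y,
# live image estimated 2 m ahead of it along the mapped camera's x axis.
T = compose_global_pose(
    (10.0, 5.0, 0.0, 0.0, 0.0, math.pi / 2),
    (2.0, 0.0, 0.0, 0.0, 0.0, 0.0),
)
position = [row[3] for row in T[:3]]  # approximately (10.0, 7.0, 0.0)
```

In this reading, the topological stage selects `mapped_pose` and the CNN would supply `relative_pose`; the composed transform is the metric estimate of the live camera pose.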
The proposed hybrid weightless-weighted neural network approach combines the two systems naturally, with the output of one serving as the input to the other, and produces competitive results for the global localization task. The full approach is compared against a Real Time Kinematic GPS system and a Visual Simultaneous Localization and Mapping (SLAM) system. Experimental results show that the proposed combined approach correctly localizes an autonomous vehicle globally 90% of the time with a mean error of 1.20 m, compared to 1.12 m for the Visual SLAM system and 0.37 m for the GPS, 89% of the time.

Keywords

Convolutional neural networks, Weightless neural networks, Autonomous vehicle navigation, Deep learning, Autonomous cars

Citation

SILVA, Avelino Forechi. Appearance-based global localization with a hybrid weightless-weighted neural network approach. 2018. 104 f. Thesis (Doctorate in Computer Science) - Universidade Federal do Espírito Santo, Centro Tecnológico, Vitória, 2018.

URI

http://repositorio.ufes.br/handle/10/9876

Collections

Doctorate in Computer Science
