Semantic Segmentation and Mapping for Natural Environments

Digumarti, Sundara Tejaswi

doi:10.3929/ethz-b-000392796

Download

Full text (PDF, 44.53Mb)

Open access

Author

Digumarti, Sundara Tejaswi

Date

2019-12-03

Type

Doctoral Thesis

ETH Bibliography

yes

Rights / license

In Copyright - Non-Commercial Use Permitted

Abstract

Research on reconstruction of objects and environments in three dimensions has made great progress over the past decade. Applications such as building maps of environments, creating 3D models for robot manipulation, and generating digital content for use in movies, games and virtual environments all benefit from techniques that can reconstruct objects with high fidelity and accuracy. Robustness of the reconstruction pipeline can be improved by incorporating semantic information from the environment. Semantic information also supplements the reconstructed model, enabling applications such as robot manipulation, measurement of class-specific metrics, and artistic control of objects. However, the vast majority of this research, in both reconstruction and semantic segmentation, targets human-made objects and environments. These are characterized by geometry that is easier to parametrize and by features, such as corners and edges, that can be tracked reliably even from viewpoints that are far apart. Natural structures such as trees, foliage and corals, on the other hand, consist of elements that are self-similar, repetitive, non-parametric and semi-rigid, have self-occluding geometry, and display limited variation in colour. This makes it challenging to apply techniques developed for human-made objects in natural environments. The focus of this thesis is to develop algorithms that tackle some of these challenges and enable high-quality reconstruction of natural structures. Understanding semantics helps mitigate some of these challenges. To this end, we propose three algorithms for semantic segmentation of vegetation. The first algorithm uses features based on surface curvatures to represent local geometry. The second aims to learn these features using a Convolutional Neural Network (CNN). 
The third method also uses CNNs but performs semantic segmentation on single-frame RGB-D images, as opposed to the full point clouds used in the first two approaches. As this approach learns features from partial observations of geometry, it can be used to improve the robustness of the reconstruction framework. Due to the complexity of deriving accurate parametric models of unstructured geometry, we take a data-driven approach in all three algorithms and learn features directly from the data. The data required for this purpose is generated using state-of-the-art simulation software. Evaluation on real data shows the extent to which knowledge transfers from simulation to reality. Improving camera tracking paves the way for better reconstruction accuracy. Given that traditional feature-based approaches perform poorly in natural environments, we employ deep neural networks to learn robust features directly from the environment. Here, we push the data-driven approach to its limit and investigate whether a deep neural network can learn to predict poses from input images through end-to-end learning. Finally, we extend the scope of the aforementioned techniques to underwater environments to facilitate scanning and reconstruction of coral reefs. We demonstrate underwater 3D capture using commodity depth cameras and present an algorithm to calibrate a camera and its housing in order to undo the distortions caused by refraction.

Permanent link

https://doi.org/10.3929/ethz-b-000392796

Publication status

published

Contributors

Examiner: Siegwart, Roland
Examiner: Beardsley, Paul
Examiner: Deussen, Oliver

Publisher

ETH Zurich

Subject

Semantic segmentation; Mapping; Deep Learning; Vegetation modeling; 3D Reconstruction