Advancing Software Reliability from Code to Compilation

Li, Shaohua

doi:10.3929/ethz-b-000676103

Download

Full text (PDF, 8.660 MB)

Open access

Author

Li, Shaohua

Date

2024

Type

Doctoral Thesis

ETH Bibliography

yes

Rights / license

In Copyright - Non-Commercial Use Permitted

Abstract

Software underpins nearly every critical aspect of modern society, including communication, finance, and transportation. It is thus crucial to ensure the reliability of software systems. Yet guaranteeing that non-trivial software systems are free of defects is extremely difficult, if not impossible. Consequently, modern software systems are full of bugs, such as security vulnerabilities, semantic bugs, and performance issues. The motivating question of this thesis is: where can software go wrong? Software development is an intricate process with many stages in the pipeline. Beyond the source code written by developers, many other tools are involved, such as code analysis tools used for identifying defects and compilers used for translating source code into machine code. Unfortunately, all of them can go wrong. In this thesis, we study the reliability problem at three levels: code, code analysis, and code compilation. At a high level, we design new methodologies to detect bugs at each of these levels. For the reliability of code, we focus on eliminating undefined behavior, a major source of reliability bugs such as buffer overflows and use-after-free errors, in modern C/C++ software. We develop a general detection approach that identifies undefined behavior practically and effectively. To improve detection efficiency, we further present two novel concepts that accelerate existing detection frameworks. For the reliability of code analysis, we aim to validate existing bug detection tools for undefined behavior. We propose and design the first program generator that can automatically produce a large number of programs with various undefined behaviors. We then use this generator to validate sanitizers, one of the most popular toolsets for undefined behavior detection. For the reliability of code compilation, we concentrate on hardening modern compiler implementations. We introduce a novel data-driven program generation technique that generates expressive and well-formed programs based on real-world code snippets. At the conceptual level, this thesis highlights the prevalence of reliability problems across the software development pipeline, from code to compilation. At the technical level, this thesis presents five new tools for detecting software defects in source code, code analysis tools, and compilers.

Permanent link

https://doi.org/10.3929/ethz-b-000676103

Publication status

published

Contributors

Examiner: Su, Zhendong
Examiner: Payer, Mathias
Examiner: Zeller, Andreas

Publisher

ETH Zurich

Subject

Programming Languages; Computer security; Software engineering; Compilers

Organisational unit

02150 - Dep. Informatik / Dep. of Computer Science
09628 - Su, Zhendong / Su, Zhendong
