Scalable Automated Reasoning for Programs and Deep Learning

Singh, Gagandeep

doi:10.3929/ethz-b-000445921

Download

Full text (PDF, 2.583Mb)

Open access

Author

Singh, Gagandeep

Date

2020

Type

Doctoral Thesis

ETH Bibliography

yes

Altmetrics

Download

Full text (PDF, 2.583Mb)

Rights / license

In Copyright - Non-Commercial Use Permitted

Abstract

With the widespread adoption of modern computing systems in different real-world applications such as autonomous vehicles, medical diagnosis, and aviation, it is critical to establish formal guarantees on their correctness before they are employed in the real-world. Automated formal reasoning about modern systems has been one of the core problems in computer science and has therefore attracted considerable interest from the research community. However, the problem has turned out to be quite challenging because of the ever-increasing scale, complexity, and diversity of these systems, which has so far limited the applicability of formal methods for their automated analysis. The central problem addressed in this dissertation is: are there generic methods for designing fast and precise automated reasoning for modern systems? We focus on two practically important problem domains: numerical software and deep learning models. We design new concepts, representations, and algorithms for providing the desired formal guarantees. We build our methodology on the elegant abstract interpretation framework for static analysis, which enables automated reasoning about infinite concrete behaviors with finite computable representations. Our methods are generic for the particular problem domain and allow precise analysis beyond the reach of prior work. For programs, we present a new theory of online decomposition that dynamically decomposes expensive computations, thereby reducing their complexity without any precision loss. Our theory is generic and can be used for decomposing all existing subpolyhedra domains without sacrificing precision. We leverage data-driven machine learning to further improve the performance of numerical program analysis without significant precision loss. For neural networks, we designed a new abstraction equipped with custom approximations of non-linearities commonly used in neural networks for fast and scalable analysis. We also created a new convex relaxations framework that produces more precise relaxations than possible with prior work. We provide a novel combination of our abstractions and relaxation framework with precise solvers, which enables state-of-the-art certification results. This thesis presents two new publicly available software systems: ELINA and ERAN. ELINA provides optimized implementations of the popular Polyhedra, Octagon, and Zone domain, based on our theory of online decomposition enabling fast and precise analysis of large Linux device drivers containing $>$ 500 variables in a few seconds. ERAN contains our custom abstraction, convex relaxation framework, and combination of relaxations with solvers for enabling fast and precise analysis of large neural networks, containing tens of thousands of neurons, within a few seconds. Both the systems were developed from scratch, and are currently state-of-the-art for their respective domains, producing results not possible with other competing systems. Show more

Permanent link

https://doi.org/10.3929/ethz-b-000445921

Publication status

published

External links

Search print copy at ETH Library

Contributors

Examiner: Vechev, Martin
Examiner: Püschel, Markus