Hardware systems for low-latency audio processing: Event-based and multichannel synchronous sampling approaches

Kiselev, Ilya

doi:10.3929/ethz-b-000502086

dc.contributor.author

Kiselev, Ilya

dc.contributor.supervisor

Liu, Shih-Chii

dc.contributor.supervisor

Hahnloser, Richard H.R.

dc.contributor.supervisor

Conradt, Jörg

dc.date.accessioned

2021-08-25T06:14:10Z

dc.date.available

2021-08-24T20:41:35Z

dc.date.available

2021-08-25T06:14:10Z

dc.date.issued

2021

dc.identifier.uri

http://hdl.handle.net/20.500.11850/502086

dc.identifier.doi

10.3929/ethz-b-000502086

dc.description.abstract

Neuromorphic technology is slowly maturing, with a variety of usable event-driven spiking sensors and hardware implementations of spiking neural networks. Sensory processing algorithms are still under investigation, and their usefulness in natural environments is still relatively unexplored compared to algorithms using conventional sensors and digital hardware. We developed hardware test beds that allow us to explore event-based sensory processing algorithms and regular sampling-based algorithms in real-world conditions. The goal of my thesis is threefold: 1) to develop a hardware test bed for implementing spiking networks together with spiking sensors, in order to study the possibility of using multiple sensors of different modalities to improve classification performance in real-world conditions; 2) to implement a local automatic gain control mechanism to increase the input dynamic range of a spiking cochlea operating in natural environments, where the sound dynamic range can be greater than 60 dB; 3) to implement a multi-microphone hardware platform that can be used for real-time beamforming as part of a wireless acoustic sensor network. The first part of the thesis describes the development of a real-time hardware system that fuses information from neuromorphic spiking sensors of different modalities. The core of the system is a general-purpose accelerator for spiking Deep Neural Networks (DNNs) implemented on a Field-Programmable Gate Array (FPGA). We demonstrate the performance of the system on an audio-visual sensor fusion task, using Dynamic Vision Sensor (DVS) and Dynamic Audio Sensor (DAS) spiking sensors to classify digits from the Modified National Institute of Standards and Technology (MNIST) dataset augmented with a specific audio tone for each digit. We demonstrate that reliable classification is possible with just a fraction of the spikes produced by the sensors.
On the other hand, processing the full stream of spikes increases the computational demand of the system in proportion to the spike rate. In addition, the spike rate of the audio sensor depends on the input signal amplitude, which makes it difficult to train classifiers to be invariant to input signals with a wide dynamic range. However, it is known that biological audio and visual processing systems can adapt to input signals that differ by orders of magnitude while maintaining a moderate neuron spike rate. The second part of the thesis addresses the problem of increasing spike rates in response to high-amplitude signals in the spiking silicon cochlea by developing a local spike-based gain control algorithm that constantly monitors the spike rate at the output of each channel and adapts the corresponding channel gain so that its spike rate does not exceed a predefined threshold. We implemented this algorithm in hardware for the Dynamic Audio Sensor Low Power (DASLP) silicon cochlea and studied its performance on synthetic tests and a real audio classification problem. The third part of the thesis work was carried out within a multi-partner European project, COCOHA (COgnitive COntrol of a Hearing Aid, www.cocoha.org), which aimed to develop a system for attention decoding from electroencephalogram (EEG) signals for directing the speech of an attended talker to the user of a hearing aid device. The goal of this work is to construct a synchronized distributed multi-microphone platform that can be used for general auditory scene analysis. The developed platform is composed of multi-microphone modules that can perform synchronized audio sampling in different parts of the room and transmit the audio streams with low latency to a central processing unit, where the samples from different microphones can be aligned with sub-microsecond precision.
Synchronized sampling across the ad-hoc distributed microphone array enables a variety of algorithms to be used for further processing, for tasks such as beamforming, source separation, or speech enhancement. The platform was used for testing a set of beamforming algorithms in the wild. All three parts serve the common goal of enabling the application of novel auditory sensing technology in practically relevant settings by coping with the challenges of real-world deployment.
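
The local gain control idea described in the abstract (monitor each channel's spike rate and lower that channel's gain when the rate exceeds a threshold) can be illustrated with a minimal Python sketch. This is not the hardware implementation from the thesis: the multiplicative update rule, all function and parameter names, the numeric values, and the toy spike model (spike count proportional to gain times amplitude) are assumptions made purely for illustration.

```python
from typing import List, Sequence


def update_gains(
    spike_counts: Sequence[int],
    gains: Sequence[float],
    window_s: float,
    max_rate_hz: float,
    step: float = 0.9,       # assumed multiplicative step, not from the thesis
    min_gain: float = 0.001,  # assumed lower gain bound
    max_gain: float = 1.0,    # assumed upper gain bound
) -> List[float]:
    """One update of a per-channel (local) spike-rate-based gain control."""
    new_gains = []
    for count, gain in zip(spike_counts, gains):
        rate_hz = count / window_s
        if rate_hz > max_rate_hz:
            # Channel is too active: attenuate this channel only.
            gain = max(min_gain, gain * step)
        else:
            # Channel is at or below the threshold: recover toward max_gain.
            gain = min(max_gain, gain / step)
        new_gains.append(gain)
    return new_gains


def toy_spike_counts(
    amplitudes: Sequence[float],
    gains: Sequence[float],
    window_s: float,
    hz_per_unit: float = 50.0,  # assumed sensitivity of the toy channel model
) -> List[int]:
    """Toy model: spikes per window grow linearly with gain * amplitude."""
    return [int(g * a * hz_per_unit * window_s) for a, g in zip(amplitudes, gains)]


def simulate(
    amplitudes: Sequence[float],
    steps: int = 100,
    window_s: float = 0.1,
    max_rate_hz: float = 200.0,
) -> List[float]:
    """Run the control loop on constant-amplitude inputs; return final gains."""
    gains = [1.0] * len(amplitudes)
    for _ in range(steps):
        counts = toy_spike_counts(amplitudes, gains, window_s)
        gains = update_gains(counts, gains, window_s, max_rate_hz)
    return gains


if __name__ == "__main__":
    # Two channels whose input amplitudes differ by a factor of 100 (40 dB).
    print(simulate([1.0, 100.0]))
```

In this toy setup the quiet channel keeps its full gain, while the gain of the loud channel is reduced until its spike rate hovers around the threshold. Because each channel is adapted independently, one loud frequency band does not suppress the others.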

en_US

dc.format

application/pdf

en_US

dc.language.iso

en

en_US

dc.publisher

ETH Zurich

en_US

dc.rights.uri

http://rightsstatements.org/page/InC-NC/1.0/

dc.subject

sensor fusion

en_US

dc.subject

Spiking deep neural networks

en_US

dc.subject

Event-Driven Sensors

en_US

dc.subject

automatic gain control

en_US

dc.subject

wireless acoustic sensor networks

en_US

dc.subject

wireless synchronization

en_US

dc.subject

audio source separation

en_US

dc.subject

beamforming

en_US

dc.title

Hardware systems for low-latency audio processing: Event-based and multichannel synchronous sampling approaches

en_US

dc.type

Doctoral Thesis

dc.rights.license

In Copyright - Non-Commercial Use Permitted

dc.date.published

2021-08-25

ethz.size

130 p.

en_US

ethz.code.ddc

DDC - DDC::6 - Technology, medicine and applied sciences::621.3 - Electric engineering

en_US

ethz.code.ddc

DDC - DDC::0 - Computer science, information & general works::004 - Data processing, computer science

en_US

ethz.identifier.diss

27602

en_US

ethz.publication.place

Zurich

en_US

ethz.publication.status

published

en_US

ethz.leitzahl

ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02140 - Dep. Inf.technologie und Elektrotechnik / Dep. of Inform.Technol. Electrical Eng.::02533 - Institut für Neuroinformatik / Institute of Neuroinformatics::03774 - Hahnloser, Richard H.R. / Hahnloser, Richard H.R.

en_US

ethz.leitzahl

ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02140 - Dep. Inf.technologie und Elektrotechnik / Dep. of Inform.Technol. Electrical Eng.::02533 - Institut für Neuroinformatik / Institute of Neuroinformatics::08836 - Delbrück, Tobias (Tit.-Prof.)

ethz.leitzahl.certified

ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02140 - Dep. Inf.technologie und Elektrotechnik / Dep. of Inform.Technol. Electrical Eng.::02533 - Institut für Neuroinformatik / Institute of Neuroinformatics::03774 - Hahnloser, Richard H.R. / Hahnloser, Richard H.R.

en_US

ethz.relation.hasPart

10.1109/ISCAS.2016.7539099

ethz.relation.hasPart

10.1109/LCN.Workshops.2017.62

ethz.relation.hasPart

10.1109/ISCAS51556.2021.9401742

ethz.date.deposited

2021-08-24T20:41:41Z

ethz.source

FORM

ethz.eth

yes

en_US

ethz.availability

Open access

en_US

ethz.rosetta.installDate

2021-08-25T06:14:23Z

ethz.rosetta.lastUpdated

2022-03-29T11:18:34Z

ethz.rosetta.versionExported

true

ethz.COinS

ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.atitle=Hardware%20systems%20for%20low-latency%20audio%20processing:%20Event-based%20and%20multichannel%20synchronous%20sampling%20approaches&rft.date=2021&rft.au=Kiselev,%20Ilya&rft.genre=unknown&rft.btitle=Hardware%20systems%20for%20low-latency%20audio%20processing:%20Event-based%20and%20multichannel%20synchronous%20sampling%20approaches

Files in this item

Name: PhD_Thesis_Kiselev.pdf
Size: 9.423Mb
Format: Adobe PDF
Label: Full text

Publication type

Doctoral Thesis [30274]
