Show simple item record

dc.contributor.author
Dahiya, Aneesh
dc.contributor.supervisor
Spurr, Adrian
dc.contributor.supervisor
Hilliges, Otmar
dc.date.accessioned
2021-06-09T07:53:33Z
dc.date.available
2021-05-17T20:46:30Z
dc.date.available
2021-05-18T12:01:10Z
dc.date.available
2021-06-09T07:53:33Z
dc.date.issued
2021-03
dc.identifier.uri
http://hdl.handle.net/20.500.11850/484477
dc.identifier.doi
10.3929/ethz-b-000484477
dc.description.abstract
Estimating 3D hand pose from a monocular RGB image is a challenging task. This is largely due to the limited amount of available labeled data, as annotating images for 3D hand pose requires a complex multi-camera setup and a controlled lab-like setting. This in turn introduces a domain gap between the different hand pose datasets and the unconstrained settings of the real world. In this thesis, we develop a self-supervised method that uses unlabeled data from different hand pose datasets to improve the accuracy of 3D hand pose estimation and to bridge the domain gap. We propose a novel contrastive learning framework for pose estimation, inspired by the recent success of contrastive learning on image classification tasks. In a standard contrastive learning framework, a model learns a feature representation that is invariant under any image augmentation. This can be beneficial, as the pose is invariant to appearance-based image augmentations. However, geometric augmentations (such as rotation) change the pose equivariantly, so enforcing invariance to them through contrastive self-supervision can be detrimental to pose estimation. We empirically show that the features learned with our equivariant contrastive framework yield larger improvements than those of standard contrastive frameworks. Furthermore, we attain an improvement of 7.6% in PA MKP-3D on FreiHAND with a standard ResNet-152 trained with additional unlabeled data, compared to a fully supervised baseline. This enables us to achieve state-of-the-art performance in a purely data-driven way, without any task-specific specialized architecture.
en_US
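The invariance-versus-equivariance distinction at the heart of the abstract can be made concrete with a minimal sketch. This is an illustrative example only, not code from the thesis: `rotate_keypoints` is a hypothetical helper, and the comments state the two target properties the abstract contrasts.

```python
import math

def rotate_keypoints(kps, angle):
    """Rotate 2D keypoints about the origin by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for (x, y) in kps]

# Appearance augmentation (e.g. color jitter) leaves the pose unchanged:
#   pose(color_jitter(img)) == pose(img)                      -> invariance
# A geometric augmentation transforms the pose by the same transform:
#   pose(rotate(img, a)) == rotate_keypoints(pose(img), a)    -> equivariance

pose = [(1.0, 0.0), (0.0, 1.0)]
equivariant_target = rotate_keypoints(pose, math.pi / 2)
# An equivariant contrastive framework matches the representation of the
# rotated image against the correspondingly rotated target, rather than
# forcing the two representations to be identical (invariance).
```

In this view, a standard contrastive loss would pull `pose` and `equivariant_target` together despite the rotation; the equivariant framework described in the abstract instead accounts for the known geometric transform.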
dc.format
application/pdf
en_US
dc.language.iso
en
en_US
dc.publisher
ETH Zurich
en_US
dc.rights.uri
http://rightsstatements.org/page/InC-NC/1.0/
dc.subject
3D hand pose estimation
en_US
dc.subject
Self supervision
en_US
dc.title
Exploring self-supervised learning techniques for hand pose estimation
en_US
dc.type
Master Thesis
dc.rights.license
In Copyright - Non-Commercial Use Permitted
ethz.size
53 p.
en_US
ethz.code.ddc
DDC - DDC::0 - Computer science, information & general works::004 - Data processing, computer science
en_US
ethz.publication.place
Zurich
en_US
ethz.publication.status
published
en_US
ethz.leitzahl
ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02150 - Dep. Informatik / Dep. of Computer Science::02658 - Inst. Intelligente interaktive Systeme / Inst. Intelligent Interactive Systems::03979 - Hilliges, Otmar / Hilliges, Otmar
en_US
ethz.leitzahl.certified
ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02150 - Dep. Informatik / Dep. of Computer Science::02658 - Inst. Intelligente interaktive Systeme / Inst. Intelligent Interactive Systems::03979 - Hilliges, Otmar / Hilliges, Otmar
en_US
ethz.date.deposited
2021-05-17T20:46:36Z
ethz.source
FORM
ethz.eth
yes
en_US
ethz.availability
Open access
en_US
ethz.rosetta.installDate
2021-06-09T07:53:40Z
ethz.rosetta.lastUpdated
2023-02-06T22:05:03Z
ethz.rosetta.versionExported
true
ethz.COinS
ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.atitle=Exploring%20self-supervised%20learning%20techniques%20for%20hand%20pose%20estimation&rft.date=2021-03&rft.au=Dahiya,%20Aneesh&rft.genre=unknown&rft.btitle=Exploring%20self-supervised%20learning%20techniques%20for%20hand%20pose%20estimation