dc.contributor.author
Acheson, Elise
dc.contributor.author
Volpi, Michele
dc.contributor.author
Purves, Ross S.
dc.date.accessioned
2022-03-22T10:05:57Z
dc.date.available
2019-06-21T05:07:31Z
dc.date.available
2019-06-21T11:40:01Z
dc.date.available
2020-03-06T11:10:35Z
dc.date.available
2020-03-06T11:11:18Z
dc.date.available
2022-03-22T10:05:57Z
dc.date.issued
2020
dc.identifier.issn
1362-3087
dc.identifier.issn
1365-8816
dc.identifier.other
10.1080/13658816.2019.1599123
en_US
dc.identifier.uri
http://hdl.handle.net/20.500.11850/348855
dc.description.abstract
Defining and identifying duplicate records in a dataset is a challenging task which grows more complex when the modeled entities themselves are hard to delineate. In the geospatial domain, it may not be clear where a mountain, stream, or valley ends and begins, a problem carried over when such entities are catalogued in gazetteers. In this paper, we take two gazetteers, GeoNames and SwissNames3D, and perform matching – identifying records in each that are about the same entity – across a sample of natural feature records. We first perform rule-based matching, establishing competitive results, then apply machine learning using Random Forests, a method well-suited to the matching task. We report on the performance of a wider array of matching features than has been previously studied, including domain-specific ones such as feature type, land cover class, and elevation. Our results show an increase in performance using machine learning over rules, with a notable performance gain from considering feature types, but negligible gains from other specialized matching features. We argue that future work in this area should strive to be more reproducible and report results on a realistic testing pipeline including candidate selection, feature extraction, and classification.
en_US
dc.language.iso
en
en_US
dc.publisher
Taylor & Francis
en_US
dc.subject
Gazetteer matching
en_US
dc.subject
record linking
en_US
dc.subject
random forest
en_US
dc.subject
natural features
en_US
dc.subject
feature types
en_US
dc.title
Machine learning for cross-gazetteer matching of natural features
en_US
dc.type
Journal Article
dc.date.published
2019-04-22
ethz.journal.title
International Journal of Geographical Information Science
ethz.journal.volume
34
en_US
ethz.journal.issue
4
en_US
ethz.journal.abbreviated
Int. J. Geographical Information Science
ethz.pages.start
708
en_US
ethz.pages.end
734
en_US
ethz.identifier.wos
ethz.identifier.scopus
ethz.publication.place
Abingdon
en_US
ethz.publication.status
published
en_US
ethz.date.deposited
2019-06-21T05:07:33Z
ethz.source
WOS
ethz.eth
yes
en_US
ethz.availability
Metadata only
en_US
ethz.rosetta.installDate
2020-03-06T11:10:46Z
ethz.rosetta.lastUpdated
2022-03-29T20:43:25Z
ethz.rosetta.versionExported
true
ethz.COinS
ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.atitle=Machine%20learning%20for%20cross-gazetteer%20matching%20of%20natural%20features&rft.jtitle=International%20Journal%20of%20Geographical%20Information%20Science&rft.date=2020&rft.volume=34&rft.issue=4&rft.spage=708&rft.epage=734&rft.issn=1362-3087&1365-8816&rft.au=Acheson,%20Elise&Volpi,%20Michele&Purves,%20Ross%20S.&rft.genre=article&rft_id=info:doi/10.1080/13658816.2019.1599123&

Files in this item

There are no files associated with this item.