BiPaSs : Further Investigation of Fast Pathfinding in Wikidata
Martin, Leon (2024): BiPaSs : Further Investigation of Fast Pathfinding in Wikidata, in: Bamberg: Otto-Friedrich-Universität, pp. 110–126.
Faculty/Chair:
Author:
Martin, Leon
Publisher Information:
Year of publication:
2024
Pages:
110–126
Series ; Volume:
Studies on the Semantic Web
Source/Other editions:
Maribel Acosta, Silvio Peroni, Sahar Vahdati, et al. (eds.), Knowledge graphs: semantics, machine learning, and languages : proceedings of the 19th International Conference on Semantic Systems, 20-22 September 2023, Leipzig, Germany, Berlin: IOS Press, 2023, pp. 110–126, ISBN: 9781643684246, 9781643684253
Year of first publication:
2023
Language:
English
Abstract:
Purpose:
A previous paper proposed a bidirectional A* search algorithm for quickly finding meaningful paths in Wikidata, using semantic distances between entities as part of the search heuristics. However, that work lacked, among other things, an optimization of the algorithm’s hyperparameters and an evaluation on a large dataset. The present paper addresses these open points.
Methodology:
Approaches for improving the accuracy of the semantic distances are discussed. Furthermore, different options for constructing a dataset of dual-entity queries for pathfinding in Wikidata are explored. 20% of the compiled dataset is used to tune the algorithm’s hyperparameters with the Simple optimizer. The optimized configuration is then evaluated against alternative configurations, including a baseline, on the remaining 80% of the dataset.
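The 20/80 split described above could be reproduced along the following lines. This is only a minimal sketch: the function name `split_queries`, the fixed seed, and the placeholder entity pairs are assumptions for illustration and are not taken from the BiPaSs implementation, whose actual split procedure is not described here.

```python
import random


def split_queries(pairs, tune_fraction=0.2, seed=42):
    """Shuffle entity pairs reproducibly and split them into a tuning part and a test part."""
    shuffled = list(pairs)
    # A dedicated Random instance keeps the split independent of global state.
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * tune_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical placeholder pairs; only the dataset size (1,196) comes from the abstract.
pairs = [(f"source_{i}", f"target_{i}") for i in range(1196)]
tune_set, test_set = split_queries(pairs)
print(len(tune_set), len(test_set))  # 239 957
```

Fixing the seed means the tuning and test parts stay disjoint and identical across runs, so no pair used for tuning leaks into the evaluation.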
Findings:
The additional consideration of entity descriptions increases the accuracy of the semantic distances. A dual-entity query dataset with 1,196 entity pairs is derived from the TREC 2007 Million Query Track dataset. The optimization yields the values 0.699/0.109/0.823 for the hyperparameters. This configuration achieves a higher coverage of the test set (79.2%) with few entity visits (24.7 on average) and moderate path lengths (4.4 on average). For reproducibility, the implementation called BiPaSs, the query dataset, and the benchmark results are provided.
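The three reported figures (coverage, average entity visits, average path length) can be computed from per-query benchmark records roughly as follows. The record fields `path` and `visits` and the sample values are hypothetical; the published benchmark results may use a different layout. The sketch also assumes that averages are taken over solved queries only and that path length is counted in edges, neither of which is stated in the abstract.

```python
from statistics import mean


def summarize(results):
    """Compute coverage, mean entity visits, and mean path length.

    Each result is a dict with 'path' (a list of entity IDs, or None if no
    path was found) and 'visits' (the number of entities visited).
    Averages are taken over solved queries only (an assumption).
    """
    solved = [r for r in results if r["path"] is not None]
    if not solved:
        return {"coverage": 0.0, "avg_visits": None, "avg_path_length": None}
    return {
        "coverage": len(solved) / len(results),
        "avg_visits": mean(r["visits"] for r in solved),
        # Path length is counted as the number of edges, i.e. nodes minus one.
        "avg_path_length": mean(len(r["path"]) - 1 for r in solved),
    }


# Made-up sample records, not data from the paper.
sample = [
    {"path": ["Q1", "Q2", "Q3"], "visits": 20},
    {"path": ["Q4", "Q5"], "visits": 10},
    {"path": None, "visits": 50},
    {"path": ["Q6", "Q7", "Q8", "Q9"], "visits": 30},
]
summary = summarize(sample)
print(summary)
```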
Value:
Web search engines reliably generate knowledge panels with summarizing information only in response to queries mentioning a single entity. This paper shows that quickly finding paths between unseen entities in Wikidata is feasible. Based on these paths, knowledge panels for dual-entity queries can be generated that provide an explanation of the mentioned entities’ relationship, potentially satisfying the users’ information need.
GND Keywords:
Wissensgraph
Suchalgorithmus
Optimierung
Keywords:
knowledge graphs
pathfinding
hyperparameter optimization
Wikidata
DDC Classification:
RVK Classification:
Peer Reviewed:
Yes
International Distribution:
Yes
Type:
Conference object
Activation date:
May 30, 2025
Permalink
https://fis.uni-bamberg.de/handle/uniba/108392