BibDedupe : An Open-Source Python Library for Bibliographic Record Deduplication
Wagner, Gerit (2024): BibDedupe : An Open-Source Python Library for Bibliographic Record Deduplication, in: The journal of open source software : a developer friendly journal for research software packages, The Open Journal, Vol. 9, No. 97, 6318, pp. 1–6, doi: 10.21105/joss.06318.
Faculty/Chair:
Author:
Wagner, Gerit
Title of the Journal:
The journal of open source software : a developer friendly journal for research software packages
ISSN:
2475-9066
Publisher Information:
The Open Journal
Year of publication:
2024
Volume:
9
Issue:
97, 6318
Pages:
1–6
Language:
English
DOI:
10.21105/joss.06318
Abstract:
BibDedupe is a Python library developed for bibliographic record deduplication in meta-analysis and research synthesis. It is constructed with a focus on four requirements: (1) Zero false positives: The primary objective is to prevent incorrectly merging distinct entries. This focus on zero false positives is crucial to ensure trustworthiness and prevent biased conclusions in the analysis. (2) Reproducibility: BibDedupe implements fixed rules to produce consistent results, in line with the scientific standard of reproducibility. (3) Efficiency: The library is also tuned for low false-negative rates and rapid processing, to ensure scalability of the duplicate identification process. (4) Continuous evaluation and improvement: It is continuously evaluated on over 160,000 records from 10 datasets to ensure its effectiveness, especially in follow-up refinements. Unlike general-purpose deduplication tools, BibDedupe is specifically designed for the unique requirements of bibliographic data in meta-analysis and research synthesis. In this context, BibDedupe aims to provide a Python library that improves the effectiveness and efficiency of duplicate identification, potentially benefitting review papers across scientific disciplines.
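The false-positive and false-negative requirements named in the abstract can be made concrete with a small, library-independent sketch. It does not use BibDedupe's API and is not the evaluation code behind the reported datasets; the function names `pairs_from_clusters` and `evaluate` and the record IDs are hypothetical, chosen only for illustration. It assumes duplicates are given as clusters of record IDs and counts errors at the level of record pairs.

```python
from itertools import combinations


def pairs_from_clusters(clusters):
    """Expand clusters of record IDs into a set of unordered duplicate pairs."""
    pairs = set()
    for cluster in clusters:
        # Sorting gives each pair a canonical order, e.g. ("r1", "r2").
        for a, b in combinations(sorted(cluster), 2):
            pairs.add((a, b))
    return pairs


def evaluate(predicted_clusters, true_clusters):
    """Count pairwise true positives, false positives, and false negatives."""
    predicted = pairs_from_clusters(predicted_clusters)
    actual = pairs_from_clusters(true_clusters)
    return {
        "true_positives": len(predicted & actual),
        # Distinct records that were wrongly merged.
        "false_positives": len(predicted - actual),
        # Duplicates that were missed.
        "false_negatives": len(actual - predicted),
    }


# Hypothetical gold standard: r1, r2, r3 are one paper; r4, r5 are another.
true_clusters = [{"r1", "r2", "r3"}, {"r4", "r5"}]
# Hypothetical tool output: r3 was not recognized as a duplicate.
predicted_clusters = [{"r1", "r2"}, {"r4", "r5"}]

result = evaluate(predicted_clusters, true_clusters)
print(result)
```

In this toy case no distinct records are merged (zero false positives), but two duplicate pairs involving `r3` are missed (two false negatives), which is the trade-off the first and third requirements describe.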
Keywords:
BibDedupe
Type:
Article
Activation date:
May 27, 2024
Permalink
https://fis.uni-bamberg.de/handle/uniba/95383