Recovering Patient Journeys: A Corpus of Biomedical Entities and Relations on Twitter (BEAR)

Wührl, Amelie; Klinger, Roman

Faculty/Chair:

Fundamentals of Natural Language Processing

Author:

Wührl, Amelie; Klinger, Roman

Title of the compilation:

Proceedings of the Thirteenth Language Resources and Evaluation Conference

Editors:

Calzolari, Nicoletta

Béchet, Frédéric

Blache, Philippe

Choukri, Khalid

Cieri, Christopher

Declerck, Thierry

Goggi, Sara

Isahara, Hitoshi

Maegaard, Bente

Mariani, Joseph

Mazo, Hélène

Odijk, Jan

Piperidis, Stelios

Conference:

Thirteenth Language Resources and Evaluation Conference (LREC), June 2022; Marseille

Publisher Information:

Marseille : European Language Resources Association

Year of publication:

2022

Pages:

4439-4450

Language:

English

URL:

https://aclanthology.org/2022.lrec-1.472

Abstract:

Text mining and information extraction for the medical domain have focused on scientific text generated by researchers. However, their access to individual patient experiences or patient-doctor interactions is limited. On social media, doctors, patients and their relatives also discuss medical information. Individual information provided by laypeople complements the knowledge available in scientific text. It reflects the patient’s journey, making the value of this type of data twofold: it offers direct access to people’s perspectives, and it might cover information that is not available elsewhere, including self-treatment or self-diagnosis. Named entity recognition and relation extraction are methods to structure information that is available in unstructured text. However, existing medical social media corpora have focused on a comparatively small set of entities and relations. In contrast, we provide rich annotation layers to model patients’ experiences in detail. The corpus consists of medical tweets annotated with a fine-grained set of medical entities and relations between them, namely 14 entity classes (incl. environmental factors, diagnostics, biochemical processes, patients’ quality-of-life descriptions, pathogens, medical conditions, and treatments) and 20 relation classes (incl. prevents, influences, interactions, causes). The dataset consists of 2,100 tweets with approx. 6,000 entities and 2,200 relations.

GND Keywords:

Twitter <Softwareplattform>; Korpus <Linguistik>; Textanalyse; Patient; Reise; Maschinelles Lernen

Keywords:

Patient Journeys

DDC Classification:

004 Computer science

RVK Classification:

ST 306

Peer Reviewed:

Yes

International Distribution:

Yes

Open Access Journal:

Yes

Type:

Conference object

URI:

https://fis.uni-bamberg.de/handle/uniba/93886

Activation date:

March 7, 2024

Project(s):

Automatic Fact Checking for Biomedical Information in Social Media and Scientific Literature

