Who Wrote When? Author Diarization in Social Media Discussions

Open Access

Boenninghoff, Benedikt; Hosseini, Henry; Nickel, Robert M.; Kolossa, Dorothea

Research article in edited proceedings (conference) | Peer reviewed

Abstract

We are proposing a novel framework for author diarization, i.e. attributing comments in online discussions to individual authors. We consider an innovative approach that merges pre-trained neural representations of writing style with author-conditional encoder-decoder diarization, enhanced by a Conditional Random Field with Viterbi decoding for alignment refinement. Additionally, we introduce two new large-scale German language datasets, one for authorship verification and the other for author diarization. We evaluate the performance of our diarization framework on these datasets, offering insights into the strengths and limitations of this approach.

Details about the publication

Editors: Al-Onaizan, Yaser; Bansal, Mohit; Chen, Yun-Nung
Book title: Findings of the Association for Computational Linguistics: EMNLP 2024
Page range: 15721-15734
Publisher: Self-published (Selbstverlag / Eigenverlag)
Published by: Association for Computational Linguistics
Place of publication: Miami, Florida, USA
Status: Published
Release year: 2024
Language in which the publication is written: English
Conference: Empirical Methods in Natural Language Processing (EMNLP), Miami, Florida, United States
Link to the full text: https://aclanthology.org/2024.findings-emnlp.922
Keywords: NLP; Deep Learning; Author Diarization; Social Media

Authors from the University of Münster

Hosseini, Henry
Department of Information Systems (WI)

Projects the publication originates from

Duration: 01/01/2023 - 31/12/2026 | 1st Funding period
Funded by: DFG - Research Unit
Type of project: Subproject in DFG-joint project hosted at University of Münster