The Multi-Feature Tagger of English (MFTE): Rationale, description and evaluation (Open Access)

Le Foll, Elen; Shakir, Muhammad

Research article (journal) | Peer reviewed

Abstract

The Multi-Feature Tagger of English (MFTE) provides a transparent and easily adaptable open-source tool for multivariable analyses of English corpora. Designed to contribute to the greater reproducibility, transparency, and accessibility of multivariable corpus studies, it comes with a simple GUI and is available both as a richly annotated Python script and as an executable file. In this article, we detail its features and how they are operationalised. The default tagset comprises 74 lexico-grammatical features, ranging from attributive adjectives and progressives to tag questions and emoticons. An optional extended tagset covers more than 70 additional features, including many semantic features, such as human nouns and verbs of causation. We evaluate the accuracy of the MFTE on a sample of 60 texts from the BNC2014 and COCA, and report precision and recall metrics for all the features of the simple tagset. We outline how the use of a well-documented, open-source tool can contribute to improving the reproducibility and replicability of multivariable studies of English.

Publication details

Journal: Research in Corpus Linguistics
Volume: 13
Issue: 2
Page range: 63-93
Status: Published
Year of publication: 2024 (12 January 2024)
DOI: 10.32714/ricl.13.02.03
Link to full text: http://dx.doi.org/10.32714/ricl.13.02.03
Keywords: Corpus Linguistics, English Linguistics, Register Studies, Multidimensional Analysis

Authors from the University of Münster

Shakir, Muhammad
Professorship of Variation Linguistics (Prof. Deuber)