Forthmann, B., & Doebler, P.
Research article (journal) | Peer reviewed
Automated scoring of divergent thinking tasks is a current hot topic in creativity research. Most of the debated approaches are unsupervised machine learning approaches, and researchers have seemingly just started to evaluate supervised approaches. Hence, rediscovering the seminal work of Paulus et al. (1970) came as a big surprise to us. More than 50 years ago, they derived prediction formulas for an automated scoring of the Torrance Test of Creative Thinking (Torrance, 1966) that was based on a set of text mining variables (e.g., average word length, word counts, and so forth). They found quite impressive cross-validation results. This work reintroduces Paulus et al.’s (1970) approach and investigates how it performs compared with semantic distance scoring. The main contribution of Paulus et al.’s (1970) neglected masterpiece on divergent thinking assessment is echoed by the findings of this work: Creative quality of responses can be well predicted by means of simple text mining statistics. The validity was also stronger as compared with semantic distance. Importantly, using the Paulus et al. (1970) features in a state-of-the-art supervised machine learning approach does not outperform the simple stepwise regression used by Paulus et al. (1970). Yet, we found that supervised machine learning can outperform the Paulus et al. (1970) approach when semantic distance is added to the set of prediction variables. We discuss challenges that are expected for future research that aims at combining unsupervised approaches based on word embeddings and supervised learning relying on text-mining features.
Forthmann, Boris | Professorship for statistics and research methods in psychology |