Derstroff, Adrian; Leistikow, Simon; Nahardani, Ali; Gruen, Katja; Franz, Marcus; Hoerr, Verena; Linsen, Lars
Research article (journal) | Peer reviewed
Understanding how a classification result is generated and what role individual features play in the classification is crucial in many applications and, in particular, in medical contexts such as the translation of diagnostic biomarkers into clinical practice. The goal is to find (ideally simple) relationships between the features in multi-dimensional data and the classification in order to explain the underlying phenomenon. Mathematical formulas can express these relationships and can serve as classifiers. However, there are infinitely many mathematical formulas for the given features, and they bear an inherent trade-off between complexity and accuracy. We present an interactive visual approach that helps domain experts mitigate this trade-off. Core to our approach is a novel feature selection method, from which formulas are composed using symbolic regression and where state-of-the-art classifiers serve as a reference. To evaluate our approach and compare the achieved classification performance to that of other state-of-the-art feature selection techniques, we test our methods on well-known machine learning data sets. Our evaluation shows that our feature selection method performs better than randomly selecting features for data sets with many features or when a low number of generations in the symbolic regression is required. It also consistently matches or outperforms state-of-the-art methods. In addition, we apply our approach in a case study to a hemodynamic cohort data set, for which we report our findings and domain expert feedback. Our approach was able to find formulas containing features that are in agreement with the literature. We also found formulas that achieved a higher micro-averaged F1 score than established histological indices.
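As an illustration of how a formula can serve as a classifier and be scored with the micro-averaged F1 score, the following minimal Python sketch may help. It is not the authors' implementation: the formula `x0 * x1 - 1.0`, the threshold of 0, the function names, and the toy samples are all hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Sequence


def formula_classifier(
    formula: Callable[[Sequence[float]], float], threshold: float = 0.0
) -> Callable[[Sequence[float]], int]:
    """Turn a real-valued formula into a binary classifier by thresholding."""
    def predict(sample: Sequence[float]) -> int:
        return 1 if formula(sample) > threshold else 0
    return predict


def micro_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Micro-averaged F1: pool TP, FP and FN over all classes, then compute F1."""
    labels = set(y_true) | set(y_pred)
    tp = fp = fn = 0
    for label in labels:
        for t, p in zip(y_true, y_pred):
            if p == label and t == label:
                tp += 1
            elif p == label and t != label:
                fp += 1
            elif p != label and t == label:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical formula over two features (an assumption, not from the paper).
def example_formula(x: Sequence[float]) -> float:
    return x[0] * x[1] - 1.0


# Toy data, made up for illustration only.
samples: List[Sequence[float]] = [(2.0, 1.0), (0.5, 1.0), (3.0, 0.5), (1.0, 0.5)]
y_true = [1, 0, 0, 0]

classify = formula_classifier(example_formula)
predictions = [classify(s) for s in samples]
score = micro_f1(y_true, predictions)
print(predictions, score)
```

For single-label problems such as this one, the micro-averaged F1 score coincides with plain accuracy, since every misclassified sample counts once as a false positive and once as a false negative.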
Derstroff, Adrian | Professorship for Practical Computer Science (Prof. Linsen) |
Hörr, Verena | Clinic of Radiology |
Leistikow, Simon | Professorship for Practical Computer Science (Prof. Linsen) |
Linsen, Lars | Professorship for Practical Computer Science (Prof. Linsen) |
Nahardani, Ali | Clinic of Radiology |