Multidimensional data stemming from measurements, observations, and simulations are a significant source of new knowledge. The sheer volume of such data, however, requires techniques such as dimensionality reduction and projection methods to enable efficient exploration and analysis of multidimensional datasets. Data attributes may range over different scales, often depending on arbitrary measurement units. Therefore, data preprocessing and, in particular, data normalization are necessary before applying any dimensionality reduction method. Existing data normalization techniques usually assume certain data characteristics, e.g., conformity to standard statistical models, or scale poorly as the data size increases. Improper normalization of raw data attributes may result in artificial, misleading data structures (clusters, outliers, shapes, density hierarchies) in the lower-dimensional domain. We propose a research project aimed at developing efficient, scalable, and generally applicable approaches for normalizing multidimensional data. The new normalization techniques will be coupled with linear and non-linear projection methods so that the data structures users observe in the projection domain reliably represent intrinsic features of the raw data. The optimization, analysis, and interpretability of normalization coefficients when preprocessing time-varying and ensemble datasets are also part of the proposed techniques.
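As an illustration of why measurement units matter, the following minimal sketch shows how the nearest neighbor of a sample can change once the attributes are normalized. It uses plain z-score standardization, which is one of the conventional techniques and not the approach proposed in this project. The sample values, the attribute units, and the helper names `zscore_columns` and `nearest_neighbor` are assumptions made for illustration only.

```python
import math
import statistics


def zscore_columns(rows):
    """Standardize each attribute (column) to zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [statistics.fmean(c) for c in columns]
    stds = [statistics.pstdev(c) for c in columns]
    return [
        # Constant attributes (std == 0) carry no information; map them to 0.
        [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, means, stds)]
        for row in rows
    ]


def nearest_neighbor(rows, i):
    """Return the index of the row closest to rows[i] in Euclidean distance."""
    return min(
        (j for j in range(len(rows)) if j != i),
        key=lambda j: math.dist(rows[i], rows[j]),
    )


# Hypothetical samples: (height in metres, mass in grams).
samples = [
    [1.50, 60000.0],
    [1.95, 60500.0],
    [1.52, 62000.0],
    [1.90, 90000.0],
]

# On raw data the mass attribute dominates every distance because of its unit.
raw_nn = nearest_neighbor(samples, 0)

# After standardization both attributes contribute on a comparable scale.
norm_nn = nearest_neighbor(zscore_columns(samples), 0)

print(raw_nn, norm_nn)  # prints "1 2"
```

On the raw values, sample 0 appears closest to sample 1, which differs strongly in height but little in mass. After standardization, sample 2 becomes the nearest neighbor. Any projection built on such distances would inherit the same change in neighborhood structure, which is the kind of unit-induced artifact the abstract refers to.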
Molchanov, Vladimir | Professorship for Practical Computer Science (Prof. Linsen) |