Towards Glyph-based Visualizations for Big Data Clustering

Mandy Keck, Dietrich Kammer, Thomas Gründer, Thomas Thom, Martin Kleinsteuber, Alexander Maasch, Rainer Groh: Towards Glyph-based Visualizations for Big Data Clustering. Proceedings of the 10th International Symposium on Visual Information Communication and Interaction, VINCI '17, ACM, New York, NY, USA, 2017, ISBN: 978-1-4503-5292-5.

Abstract

Data analysts have to deal with an ever-growing amount of data resources. One way to make sense of this data is to extract features and use clustering algorithms to group items according to a similarity measure. Algorithm developers are challenged when evaluating the performance of the algorithm, since it is hard to identify features that influence the clustering. Moreover, many algorithms can be trained using a semi-supervised approach, where human users provide ground truth samples by manually grouping single items. Hence, visualization techniques are needed that help data analysts achieve their goal in evaluating big data clustering algorithms. In this context, Multidimensional Scaling (MDS) has become a prominent visualization tool. In this paper, we propose a combination with glyphs that can provide a detailed view of specific features involved in MDS. As a consequence, human users can understand, adjust, and ultimately improve clustering algorithms. We present a thorough glyph design, which is founded on a comprehensive survey of related work, and report the results of a controlled experiment in which participants solved data analysis tasks with both glyphs and a traditional textual display of data values.

BibTeX

@conference{VinciGlyph,
title = {Towards Glyph-based Visualizations for Big Data Clustering},
author = {Mandy Keck and Dietrich Kammer and Thomas Gründer and Thomas Thom and Martin Kleinsteuber and Alexander Maasch and Rainer Groh},
url = {http://doi.acm.org/10.1145/3105971.3105979},
doi = {10.1145/3105971.3105979},
isbn = {978-1-4503-5292-5},
year  = {2017},
date = {2017-08-15},
booktitle = {Proceedings of the 10th International Symposium on Visual Information Communication and Interaction},
pages = {129--136},
publisher = {ACM},
address = {New York, NY, USA},
series = {VINCI '17},
abstract = {Data analysts have to deal with an ever-growing amount of data resources. One way to make sense of this data is to extract features and use clustering algorithms to group items according to a similarity measure. Algorithm developers are challenged when evaluating the performance of the algorithm, since it is hard to identify features that influence the clustering. Moreover, many algorithms can be trained using a semi-supervised approach, where human users provide ground truth samples by manually grouping single items. Hence, visualization techniques are needed that help data analysts achieve their goal in evaluating big data clustering algorithms. In this context, Multidimensional Scaling (MDS) has become a prominent visualization tool. In this paper, we propose a combination with glyphs that can provide a detailed view of specific features involved in MDS. As a consequence, human users can understand, adjust, and ultimately improve clustering algorithms. We present a thorough glyph design, which is founded on a comprehensive survey of related work, and report the results of a controlled experiment in which participants solved data analysis tasks with both glyphs and a traditional textual display of data values.},
keywords = {Glyphs, Information Visualization},
pubstate = {published},
tppubtype = {conference}
}