Data collection and visualization have traditionally been seen as activities reserved for experts. However, by drawing simple geometric figures — known as glyphs — anyone can visually record their own data, as shown below. Still, the resulting hand-drawn infographics do not provide direct access to the underlying data, hindering digital editing of both the glyphs and their values.
Overview of our method: given an input hand-drawn infographic (a), the user defines a parametric glyph template by specifying all elements that compose the glyph and their visual variations (b). Based on this template, we synthesize an annotated dataset of glyph drawings (c) and use it to train a glyph detector (d) and a parameter predictor (e). A simple interface (see videos below) allows users to review the recovered data, make corrections, and even refine the template to achieve higher accuracy.
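The synthesis step above pairs each generated drawing with the parameter values that produced it, so the detector and predictor can be trained with full supervision. A minimal sketch of that idea, using a hypothetical triangle template whose parameter names and ranges are illustrative assumptions (not the paper's actual template definition):

```python
import random

# Hypothetical parametric glyph template: each parameter maps a data value
# to a visual variation, here the triangle's height and width in pixels.
TEMPLATE = {
    "height": (10.0, 100.0),
    "width": (10.0, 100.0),
}

def sample_parameters(template, rng):
    """Draw one set of parameter values uniformly from the template ranges."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in template.items()}

def synthesize_dataset(template, n, seed=0):
    """Build an annotated dataset: each entry pairs a glyph drawing with the
    ground-truth parameter values used to generate it. The real pipeline
    would rasterize each glyph with hand-drawing-style perturbations; this
    sketch records only the annotations."""
    rng = random.Random(seed)
    return [{"params": sample_parameters(template, rng)} for _ in range(n)]

dataset = synthesize_dataset(TEMPLATE, 1000)
```

Because the parameter values are known exactly for every synthesized drawing, no manual labeling is needed, and refining the template simply regenerates the training set.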
We have created a benchmark of 10 hand-drawn infographics collected from diverse sources, including data designers who advocate drawing as a visualization medium as well as coursework from design schools.
For the Triangle design, we show: the input image (top left); sampled glyphs from the training dataset (top right); detected glyphs (center left; green boxes indicate true-positive detections, red boxes false-positive detections); the reconstruction from the estimated parameter values (center right; red outlines mark glyphs for which one or more parameter values are erroneous); and the reconstruction from the ground-truth parameter values (bottom left).
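The true-positive/false-positive labeling of detected glyphs can be computed with standard intersection-over-union (IoU) matching against the ground-truth boxes; the threshold value and the greedy matching scheme below are common conventions assumed for illustration, not necessarily the paper's exact protocol:

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x0, y0, x1, y1)."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def classify_detections(detections, ground_truth, thresh=0.5):
    """Greedily match each detected box to the best unused ground-truth box;
    matches at or above the IoU threshold count as true positives,
    everything else as false positives."""
    unused = list(ground_truth)
    true_pos, false_pos = [], []
    for det in detections:
        best = max(unused, key=lambda g: iou(det, g), default=None)
        if best is not None and iou(det, best) >= thresh:
            true_pos.append(det)
            unused.remove(best)
        else:
            false_pos.append(det)
    return true_pos, false_pos
```

For example, a detection at `(0, 0, 10, 10)` matched against a ground-truth box at `(1, 1, 11, 11)` has IoU ≈ 0.68 and is counted as a true positive, while a detection with no overlapping ground truth is a false positive.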
We conducted a study to evaluate the second usage scenario, in which users record data by drawing glyphs according to a prescribed template. We collected drawings of the same glyphs from multiple participants and evaluated the sensitivity of our approach to variations in individual drawing styles.
Results for the Leaf design from 12 participants: the input image (top; drawings from 12 participants, where PID indicates participant ID); detected glyphs (second row; green boxes indicate true-positive detections, red boxes false-positive detections); the reconstruction from the estimated parameter values (third row; red outlines mark glyphs for which one or more parameter values are erroneous); and the reconstruction from the ground-truth parameter values (bottom).
@article{QI2025104251,
  title   = {Sketch2Data: Recovering data from hand-drawn infographics},
  author  = {Anran Qi and Theophanis Tsandilas and Ariel Shamir and Adrien Bousseau},
  journal = {Computers & Graphics},
  volume  = {130},
  pages   = {104251},
  year    = {2025},
  issn    = {0097-8493},
  doi     = {https://doi.org/10.1016/j.cag.2025.104251},
  url     = {https://www.sciencedirect.com/science/article/pii/S0097849325000925}
}