Data Vocalization and Voice Interfaces (CiceroDB)

SIGMOD 2019 Talk on Voice-Based OLAP

Overview

Communication between users and computers is increasingly shifting toward voice interfaces. Devices and services such as Google Home, Amazon Echo, and Apple's Siri are evidence of this trend. We study how to exploit voice interfaces for data analysis.

Enabling voice-based access to structured data entails two research challenges. First, we need to translate speech input into queries. Second, we need to summarize potentially large query results via voice output ("data vocalization"). Our research on voice interfaces covers a range of scenarios that differ in query and data types. Beyond user-centric research questions (e.g., "how to resolve ambiguities?", "how to describe data?"), we also study how to specialize backends and query processing methods for voice interfaces to increase efficiency. Research results are integrated into CiceroDB, a database system designed from the ground up for voice-based analysis of large data sets.
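To make the second challenge concrete, the following is a minimal sketch of what data vocalization means: condensing a query result into one short sentence that can be read aloud. It is an illustration under simplifying assumptions, not the method used in CiceroDB. The function name `vocalize`, its parameters, and the sample data are all hypothetical.

```python
from statistics import mean


def vocalize(rows, measure, dimension, max_items=2):
    """Condense (label, value) rows into a short sentence for speech output.

    Hypothetical helper for illustration only; it reports the number of
    groups, the rounded average, and the largest few values.
    """
    if not rows:
        return f"There are no results for {measure}."

    # Rank groups by value so that only the most prominent ones are spoken.
    ranked = sorted(rows, key=lambda row: row[1], reverse=True)
    top = ranked[:max_items]

    # Round numbers: listeners cannot retain many digits.
    average = round(mean(value for _, value in rows))
    top_text = " and ".join(
        f"{label} at about {round(value)}" for label, value in top
    )
    verb = "is" if len(top) == 1 else "are"

    return (
        f"Across {len(rows)} values of {dimension}, "
        f"{measure} averages about {average}. "
        f"The highest {verb} {top_text}."
    )


# Made-up sample data standing in for a query result.
sample_rows = [("Northeast", 120.4), ("South", 80.2), ("West", 99.6)]
print(vocalize(sample_rows, "revenue", "region"))
```

The sketch trades precision for brevity, because voice output is sequential and a listener cannot skim it the way a reader skims a table. The speech-to-query step is not covered here.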

Publications

Immanuel Trummer. "Data Vocalization with CiceroDB." CIDR 2019.

Immanuel Trummer, Yicheng Wang, and Saketh Mahankali. "A Holistic Approach for Query Evaluation and Result Vocalization in Voice-Based OLAP." SIGMOD 2019.

Immanuel Trummer, Mark Bryan, and Ramya Narasimha. "Vocalizing Large Time Series Efficiently." VLDB 2018.

Immanuel Trummer, Jiancheng Zhu, and Mark Bryan. "Optimizing Voice-Based Output of Relational Data." VLDB 2017.

Demonstrations

Mark Bryan, Immanuel Trummer, and Ramya Narasimha. "Voice-Based Analysis of Time Series Data." BOOM 2018. Winner of the JP Morgan Award!

Mark Bryan, Jiancheng Zhu, and Immanuel Trummer. "Optimizing Voice Output of Relational Data." BOOM 2017. Winner of the Lockheed Martin Award!

Funding

Google Faculty Research Award 2017 for "Optimizing Voice-Based Output of Relational Data."

Student Awards

Mark Bryan received an honorable mention for the 2018 CRA Outstanding Undergraduate Researcher Award.