Welcome! Here you can participate in the creation of Timers and Such, an open-source dataset of spoken English commands involving numbers.
Number recognition is necessary for several common voice interface use cases (setting timers, setting alarms, converting units of measurement, performing math operations), but there is currently no dataset that covers numbers or these use cases.
By creating such a dataset, we hope to:
1. Advance reproducible research in spoken language understanding (as well as adjacent research areas, such as speech recognition, representation learning for audio and language, sequence modeling, and systematic generalization);
2. Benefit researchers and developers by providing code and simple baseline pre-trained models; and
3. Enable makers and enthusiasts to easily train models for voice interfaces involving numbers.
The dataset will be released under a CC0 license, the same license used by the Common Voice dataset.