LookOnceToHear
Overview:
LookOnceToHear is a smart earphone interaction system that lets users choose the speaker they want to hear simply by looking at that person. The work was nominated for Best Paper at CHI 2024. It performs real-time target speech extraction and relies on synthetic audio mixtures built with head-related transfer functions (HRTFs) and binaural room impulse responses (BRIRs).
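To make the role of HRTFs and BRIRs more concrete, here is a minimal sketch of how a binaural mixture can be synthesized in principle: each mono source is convolved with a left and a right impulse response, and the results are summed per ear. This is not the project's actual pipeline. The function names and the two-tap impulse responses are toy assumptions chosen for illustration.

```python
# Illustrative sketch only: plain-Python convolution with toy impulse responses.
# None of these names come from the LookOnceToHear code base.

def convolve(signal, impulse_response):
    """Full discrete convolution of two sequences of floats."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

def render_binaural(mono, ir_left, ir_right):
    """Spatialize a mono source by filtering it with a left/right impulse-response pair."""
    return convolve(mono, ir_left), convolve(mono, ir_right)

def mix(*channels):
    """Sum channels sample by sample; shorter channels are treated as zero-padded."""
    length = max(len(c) for c in channels)
    return [sum(c[n] for c in channels if n < len(c)) for n in range(length)]

# Toy scene: the target reaches the left ear first and louder,
# the interfering speaker reaches the right ear first and louder.
target = [1.0, 0.5]
interferer = [0.2, 0.2]
t_left, t_right = render_binaural(target, [1.0, 0.0], [0.0, 0.5])
i_left, i_right = render_binaural(interferer, [0.0, 0.5], [1.0, 0.0])

mixture_left = mix(t_left, i_left)
mixture_right = mix(t_right, i_right)
```

A target speech extraction model would receive the two mixture channels as input and be trained to recover the binaural target (here `t_left` and `t_right`).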
Target Users:
LookOnceToHear is aimed at researchers and developers who work on speech recognition and extraction in noisy environments. For example, it can help people with hearing impairments follow conversations in noisy settings, or support speech analysis and processing when several people are speaking at once.
Use Cases
Select and listen to the voice of a specific speaker during meetings
Help people with hearing impairments concentrate on conversations in noisy public places
Distinguish and extract individual sound sources in audio analysis research
Features
Users select the desired voice by looking at the target speaker for a few seconds
Uses the Scaper toolkit to synthesize audio mixtures
Provides a self-contained dataset along with the .jams specification files used for training
Supports real-time speech extraction and evaluation of target speech listening models
Offers model checkpoints so users can train and evaluate more easily
Suited to speech recognition and extraction in noisy environments
How to Use
Download the provided .zip file and unzip it into the 'data/' directory
Run the project's training command to start the training process
Use Scaper's 'generate_from_jams' function to generate audio mixtures from the .jams specification files
Download and load the target speech listening model checkpoint for evaluation
Adjust model parameters as needed to optimize performance
In practical use, the user simply looks at the target speaker to start speech extraction
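The mixture generation step above can be sketched as follows. This is a hedged example, not the project's own script: the helper names `plan_outputs` and `generate_mixtures`, the directory layout, and the foreground and background paths are assumptions. Only `scaper.generate_from_jams` and its `audio_outfile`, `fg_path`, and `bg_path` arguments come from the Scaper library.

```python
from pathlib import Path

def plan_outputs(jams_dir, out_dir):
    """Pair every .jams file under jams_dir with a .wav output path under out_dir."""
    jams_dir, out_dir = Path(jams_dir), Path(out_dir)
    pairs = []
    for jams_file in sorted(jams_dir.rglob("*.jams")):
        # Mirror the input folder structure and swap the extension.
        relative = jams_file.relative_to(jams_dir).with_suffix(".wav")
        pairs.append((jams_file, out_dir / relative))
    return pairs

def generate_mixtures(jams_dir, out_dir, fg_path, bg_path):
    """Render each .jams specification to an audio mixture with Scaper."""
    # Imported lazily so that plan_outputs works without Scaper installed.
    import scaper

    for jams_file, wav_file in plan_outputs(jams_dir, out_dir):
        wav_file.parent.mkdir(parents=True, exist_ok=True)
        scaper.generate_from_jams(
            str(jams_file),
            audio_outfile=str(wav_file),
            fg_path=fg_path,  # folder with foreground (speech) source audio
            bg_path=bg_path,  # folder with background (noise) source audio
        )
```

A call might look like `generate_mixtures("data/jams", "data/mixtures", "data/foreground", "data/background")`, where every path other than 'data/' is a placeholder for wherever the unzipped dataset keeps its files.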