

PixelPlayer
Overview:
PixelPlayer is a system that learns, by watching a large number of unlabeled videos, to locate the image regions that produce sound and to separate the input audio into a set of components representing the sound from each pixel. The method exploits the natural synchronization between the visual and auditory modalities to learn a joint model for parsing sound and images, with no additional human labeling. The system is trained on a large collection of videos featuring solo and duet performances with different combinations of instruments. No supervision is given about which instruments appear in a video, where they are, or what they sound like. At test time, the input is a video of instrument performances together with its mono audio track. The system performs audio-visual source separation and localization: it splits the input audio signal into N sound channels, each corresponding to a different instrument category. It can also localize the sound and assign a different audio waveform to each pixel of the input video.
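The per-pixel separation described above can be pictured with a small sketch. This is not PixelPlayer's actual implementation: it assumes that an audio model has already produced N channel maps over the mixture spectrogram and that a visual model has produced N weights for every pixel, and it combines them into a soft mask per pixel. The function name `separate_per_pixel`, the array shapes, and the sigmoid masking are illustrative assumptions, and random arrays stand in for learned network outputs.

```python
import numpy as np

def separate_per_pixel(mixture_spec, audio_channels, pixel_weights):
    """Assign a masked copy of the mixture spectrogram to every pixel.

    mixture_spec:   (F, T) magnitude spectrogram of the mono mixture
    audio_channels: (N, F, T) N channel maps (assumed output of an audio model)
    pixel_weights:  (H, W, N) per-pixel weights (assumed output of a visual model)
    returns:        (H, W, F, T) one spectrogram per pixel
    """
    # Weighted sum of the N channels for each pixel -> per-pixel mask logits.
    logits = np.einsum("hwn,nft->hwft", pixel_weights, audio_channels)
    # Sigmoid squashes the logits into soft masks in (0, 1).
    masks = 1.0 / (1.0 + np.exp(-logits))
    # Each pixel keeps only its masked share of the mixture.
    return masks * mixture_spec

# Toy sizes; random values are placeholders for real model outputs.
rng = np.random.default_rng(0)
F, T, N, H, W = 8, 5, 3, 2, 2
mixture_spec = rng.random((F, T))
audio_channels = rng.standard_normal((N, F, T))
pixel_weights = rng.standard_normal((H, W, N))

pixel_specs = separate_per_pixel(mixture_spec, audio_channels, pixel_weights)
print(pixel_specs.shape)
```

In this sketch a pixel whose weights favor one channel receives mostly that channel's share of the mixture, which is one way to read "assigning a waveform to each pixel". Converting each per-pixel spectrogram back to a waveform would need an inverse transform with phase information, which is omitted here.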
Target Users:
Perform unsupervised audio-visual separation
Analyze audio-visual relationships
Use Cases
PixelPlayer can be used to separate different instrument sounds in mixed audio.
PixelPlayer can be used to study the relationship between visual and auditory perception.
PixelPlayer can be used to explore the contribution of different pixel regions to the overall auditory experience.
Features
Audio-visual source separation and localization
Separate audio signals into components representing the sound of each pixel
Assign different audio waveforms to each pixel in the input video
Featured AI Tools

TensorPix
TensorPix is an online video enhancement platform that uses artificial intelligence to improve video quality. It offers fast video upscaling without downloading or installing any software. Users can process videos in bulk, restore colors, sharpen details, and correct distortions. Core features include online resolution enhancement, blur and noise repair, frame rate increase, and color enhancement. It is suited to restoring old recordings and low-quality videos as well as to post-production refinement of newly recorded videos.
Video Editing
6.5M

LTX Studio
LTX Studio is an innovative video production platform integrated with AI technology, which enables users to fully control all aspects of video production from concept to final cut. Through AI technology, the platform transforms creative ideas into coherent video narratives, offering features such as character consistency, automatic editing, and deep frame control, aimed at simplifying the video production process and enhancing creative efficiency.
Video Editing
2.2M