SALMONN
Overview:
Developed by the Department of Electronic Engineering at Tsinghua University and ByteDance, SALMONN is a large language model (LLM) that accepts speech, audio event, and music inputs. Unlike models that support only speech or only audio event input, SALMONN can perceive and understand all of these audio types, which gives rise to capabilities such as multilingual speech recognition and translation, as well as audio-speech co-reasoning. This can be seen as giving the LLM 'ears' and cognitive hearing abilities, making SALMONN a step towards hearing-enabled artificial general intelligence.
Target Users:
SALMONN can be applied to fields such as speech recognition, speech translation, and audio processing.
Total Visits: 474.6M
Top Region: US (19.34%)
Website Views: 92.2K
Use Cases
Input: gunshots.wav, Output: ...
Input: duck.wav, Output: ...
Input: music.wav, Output: ...
Features
Multilingual Speech Recognition
Multilingual Speech Translation
Audio-Speech Co-Reasoning