# Audio

ASMR AI
ASMR AI bills itself as the first AI ASMR video generator with realistic binaural audio. It generates content from text or images (text-to-ASMR and image-to-ASMR) through Google Veo 3, and is aimed at relaxation, sleep assistance, and stress relief.
Video Generation
37.0K
Morse Code Translator
The Morse Code Translator is an online tool that converts text to Morse code and Morse code back to text. Users can play a translation as audio or as light signals, and download it as a WAV file. It supports multiple Morse code systems to cover different languages.
Translation
37.0K
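The text-to-Morse and Morse-to-text conversion described above can be sketched as a table lookup. This is a minimal illustration, not the tool's own implementation: the function names `text_to_morse` and `morse_to_text` are hypothetical, the table covers only International Morse letters and digits, and the " / " word separator is an assumed convention.

```python
# International Morse code for letters and digits (illustrative subset only).
MORSE = {
    "A": ".-", "B": "-...", "C": "-.-.", "D": "-..", "E": ".", "F": "..-.",
    "G": "--.", "H": "....", "I": "..", "J": ".---", "K": "-.-", "L": ".-..",
    "M": "--", "N": "-.", "O": "---", "P": ".--.", "Q": "--.-", "R": ".-.",
    "S": "...", "T": "-", "U": "..-", "V": "...-", "W": ".--", "X": "-..-",
    "Y": "-.--", "Z": "--..",
    "0": "-----", "1": ".----", "2": "..---", "3": "...--", "4": "....-",
    "5": ".....", "6": "-....", "7": "--...", "8": "---..", "9": "----.",
}
REVERSE = {code: char for char, code in MORSE.items()}


def text_to_morse(text):
    """Encode text: letters separated by spaces, words by ' / '.

    Characters missing from the table are skipped (an assumption of this sketch).
    """
    words = text.upper().split()
    return " / ".join(
        " ".join(MORSE[char] for char in word if char in MORSE) for word in words
    )


def morse_to_text(morse):
    """Decode Morse produced by text_to_morse; unknown codes become '?'."""
    words = morse.strip().split(" / ")
    return " ".join(
        "".join(REVERSE.get(code, "?") for code in word.split()) for word in words
    )
```

Decoding always yields uppercase text, since Morse code carries no letter case.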
Maidio
Maidio is an innovative audio content application that utilizes AI technology to automatically convert RSS news into engaging conversational podcasts. It employs advanced natural language processing techniques to present news in a dialogue format between a host and an assistant, allowing users to access information in a more entertaining manner. The app supports various personalization features, including the creation of themed stations and intelligent priority sorting, making it suitable for those who enjoy consuming news through audio. It is available on multiple platforms, including iPhone, iPad, and Mac, and is completely free of charge.
Speech-to-text
54.6K
MaiYou Radio
MaiYou Radio is an app that utilizes AI technology for news broadcasting. It employs intelligent algorithms to convert text-based news into lively conversations, providing users with a more natural and engaging listening experience. The app's main advantages are its personalization and intelligence, allowing users to create multiple themed radio stations based on their interests while automatically ranking news items by importance. Additionally, it supports both local and cloud-based voice synthesis and features an audio export function for users to publish their generated programs as podcasts. Developed by Fangtangjun (Chongqing) Technology Co., Ltd., MaiYou Radio is a free educational app suitable for users interested in news and AI technology.
Speech-to-text
53.0K
English Picks
Hailuo
Hailuo AI is a smart AI assistant that offers various interaction methods, including chat, video, and audio, capable of easily handling long text contexts to help users solve problems. It is characterized by powerful natural language processing technology and a user-friendly experience, aiming to provide efficient and smart solutions to users. The product is positioned as a general-purpose AI assistant for a wide audience, with a pricing strategy that is not explicitly defined.
Personal Assistance
71.2K
Chinese Picks
PodRedit
PodRedit is a podcast sharing platform where users can discover and listen to various popular podcasts. The platform gathers a wealth of high-quality podcast content covering various fields such as relationships, culture, and business, providing users with a convenient channel for listening to and sharing podcasts. After logging in, PodRedit supports batch subtitle recognition. With its rich content and user-friendly experience, PodRedit meets the demand for high-quality audio content, establishing itself as a key hub for podcast lovers.
Other categories
51.1K
English Picks
PodSnap.AI
PodSnap.AI is a service that leverages cutting-edge AI technology to provide users with podcast summaries. By subscribing, users receive AI-generated summaries of their chosen podcasts delivered directly to their inbox. This service saves time and helps users quickly grasp key information from podcasts, making it especially suitable for busy professionals and learners. The product was founded by Dr. Rok Strniša, an entrepreneur with over 15 years of experience in the tech industry, who received his Ph.D. in computer science from the University of Cambridge and has held significant positions at notable companies such as Citrix, Winton, and Improbable.
AI information platform
60.4K
Journi
Transform your smartphone into your personal tour guide with Journi, featuring immersive audio guides narrated by locals. Explore must-see attractions, brought to life by the voices of local experts, through an interactive map. Leveraging AI technology, Journi personalizes your journey with customized recommendations and insights, making each exploration a tailor-made adventure. Journi lets you explore cities freely and experience their pulse, from ancient landmarks to hidden treasures.
Travel
54.4K
easywithai.com
Easy With AI is a platform that boasts the largest collection of AI tools and resources on the internet. You can find and search for AI tools across 50+ different categories. Easy With AI offers convenience and a rich repository of AI tool resources for various users, including AI writing assistants, social media tools, email tools, AI content detection tools, customer service tools, website building tools, e-commerce tools, image tools, audio tools, video tools, music generators, video generators, podcasting tools, presentation-making tools, design tools, live streaming tools, chatbots, voice tools, mobile apps, transcription tools, meeting assistants, architectural tools, productivity tools, educational tools, AI Chrome extensions, and more. You can find the AI tools that best suit your needs and interests on Easy With AI.
AI Information Platform
130.0K
Butter Reader
ButterReader is an innovative audio plugin that transforms blog text into captivating audio content, making learning and information consumption smoother. With a customizable player, you can easily convert text into a delightful audio experience. The product features design flexibility, voice selection, and control settings, making it suitable for various use cases. ButterReader lets users play audio content seamlessly on mobile devices, so they can enjoy content even while multitasking.
Text to Speech
52.2K
Ad Auris
Ad Auris is an app that converts articles into audio for playback. Users can listen to articles of interest at any time and place, and the app also supports saving to platforms like Spotify. This app aims to enhance user reading efficiency and convenience, enabling them to enjoy reading amidst busy schedules.
Text to Speech
62.9K
Konch
Konch is an excellent automatic transcription platform that supports over 30 languages. It uses advanced AI technology to quickly and accurately transcribe audio or video files into text. Users can choose between fully AI-generated transcription results or opt for human review and correction. Konch also supports converting YouTube videos to text and offers advanced editing features, multilingual translation, flexible text format export, and more. Users can leverage Konch in various scenarios, including transcribing audio or video, research transcription, digital archives, and podcast transcription.
Speech-to-text
48.0K
FreGrad
FreGrad is a lightweight and fast frequency-aware diffusion vocoder designed to generate realistic audio. Its framework combines a discrete wavelet transform, a frequency-aware dilated convolution, and a set of techniques that improve generation quality. In experiments, FreGrad trains 3.7x faster and runs inference 2.2x faster than the baseline model, while reducing model size by 0.6x (only 1.78 million parameters) without sacrificing output quality.
AI audio editing
50.0K
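As a rough illustration of the discrete wavelet transform step mentioned above, the sketch below splits a waveform into a low-frequency and a high-frequency sub-band, each half the original length, and reconstructs it exactly. It assumes a single-level Haar wavelet for simplicity; the function names `haar_dwt` and `haar_idwt` are hypothetical and this is not FreGrad's own code.

```python
import math


def haar_dwt(signal):
    """Single-level Haar DWT: return (low, high) sub-bands of half length."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2.0)
    # Low band: scaled pairwise sums. High band: scaled pairwise differences.
    low = [(signal[i] + signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    return low, high


def haar_idwt(low, high):
    """Inverse of haar_dwt: interleave the reconstructed sample pairs."""
    scale = math.sqrt(2.0)
    out = []
    for approx, detail in zip(low, high):
        out.append((approx + detail) / scale)
        out.append((approx - detail) / scale)
    return out
```

Working on half-length sub-bands is one plausible reason such a model can be smaller and faster, since each layer processes fewer time steps.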
Unified-IO 2
Unified-IO 2 is a unified multi-modal generation model that can understand and generate images, text, audio, and actions. It uses a single encoder-decoder Transformer to process inputs and outputs of different modalities as representations within a shared semantic space. The model is trained from scratch on large-scale multi-modal pre-training data with a multi-modal denoising objective. To learn a wide range of skills, it is further fine-tuned on 120 existing datasets, with prompts and data augmentation. Unified-IO 2 achieves state-of-the-art performance on the GRIT benchmark and strong results across 30+ benchmarks, including image generation and understanding, text understanding, video and audio understanding, and robotic manipulation.
AI Model
70.1K
Jellypod
Jellypod+ is an app that turns your email subscriptions into a personalized podcast. It delivers concise summaries of your daily news in audio format, designed for your busy lifestyle. Jellypod+ aims to break away from traditional media's one-size-fits-all approach and curate news tailored to your unique interests. The app also includes a built-in email reader and newsletter forwarding feature, enabling you to view detailed newsletter content without leaving the app and automatically forward incoming newsletters to your personal inbox. Additionally, Jellypod+ offers adjustable playback speed, multiple voice options, offline mode, customizable podcast generation schedules, multiple daily podcast themes organization, an ad-free experience, and privacy-focused email address protection.
Personal Care
53.3K
Huddles
Huddles is a new, lightweight audio or video connection method that allows you to have casual conversations or participate in in-depth collaborative meetings anytime, anywhere. You can create and join Huddles within Slack to communicate with team members in real-time through audio or video, share screens and documents, and improve work efficiency. Huddles is not only suitable for informal discussions, but also for problem-solving, brainstorming, and collaborative document writing. Huddles supports multiple participants and can meet the diverse needs of teams.
AI meeting assistant
49.4K
Read
Read is a news audio generation platform. It automatically gathers content of interest to the user and generates personalized daily audio news briefs, helping users efficiently obtain the information they need. The product features AI-generated natural speech, supports email subscriptions, and provides personalized recommendations. It suits users who want to stay informed about the daily events and news they care about.
News Assistant
61.5K
GlossAi
GlossAi is a full-cycle video and audio content repurposing tool that allows you to transform long-form content into short video clips suitable for various social media platforms. It increases user engagement, reduces costs, and saves time. It can also generate multi-channel digital and organic marketing campaigns.
Video Editing
45.0K
Emastered
eMastered is an online audio mastering tool created by Grammy-winning engineers. It utilizes AI technology to rapidly and easily enhance audio quality. Users can upload tracks and automatically apply professional EQ, compression, and other processing to achieve high-quality master recordings. eMastered offers both a free trial and paid subscriptions, suitable for musicians, production companies, and various other users.
Music Production
337.0K
Dublai.com
Dublai is a startup company providing AI-powered audio and video dubbing services. You can dub your content in English, Portuguese, Spanish, Italian, French, German, and Japanese with guaranteed quality and speed.
Text to Speech
92.5K
Jamit.app
Jamit is the world's first Podcast 3.0 platform, offering decentralized hosting, global reach, interactive rewards, and unique NFT experiences. Users can discover and listen to stories from various genres, create and cultivate their own communities, and enjoy the independence of being Jamit creators and owners.
Social Media Services
43.6K
33 Subtitles
33 Subtitles is an AI video subtitle recognition and translation tool. It converts audio and video into text or SRT subtitle files and supports subtitle translation into other languages. It uses an optimized Whisper AI speech-to-text model, with accuracy close to human level. It integrates multiple AI translation engines, including ChatGPT, DeepL, Microsoft, and Baidu. It also provides an efficient and user-friendly visual subtitle editor, with subtitle summarization and vocal pre-extraction. 33 Subtitles supports over 50 languages, including English, Japanese, Korean, French, and Thai.
Video Editing
847.6K
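For readers unfamiliar with the SRT subtitle files mentioned above, the sketch below shows how timed transcript segments map onto the format: a running index, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time range, the text, and a blank line between cues. The segment tuples and the function names `srt_timestamp` and `to_srt` are hypothetical; this is an illustration of the file format, not the tool's own code.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments):
    """Render (start_seconds, end_seconds, text) tuples as SRT cues."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # Joining with a newline leaves one blank line between consecutive cues.
    return "\n".join(blocks)
```

A speech-to-text model that returns start and end times per segment could feed `to_srt` directly, under the assumption that times are given in seconds.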
TinyStudio
TinyStudio is a free Mac application that uses the performance of M1/M2 chips to generate subtitles quickly and efficiently. Users can generate subtitles for video and audio files with a single click, with no technical expertise required. TinyStudio uses OpenAI's Whisper technology, which lets it process data locally without an internet connection. The application also supports subtitle import and export, and features a rule-based correction system to improve accuracy and reliability. With its user-friendly interface, TinyStudio is well suited to vloggers, marketers, and social media enthusiasts who want to work more efficiently.
AI text generation
133.0K
NVAS3d
NVAS3d is a project for estimating sound at any location within a scene containing multiple unknown sound sources. It achieves novel-view acoustic synthesis by using audio recordings from multiple microphones and the 3D geometry and materials of the scene.
AI Audio Enhancer
48.0K
SALMONN
Developed by the Department of Electronic Engineering, Tsinghua University, and ByteDance, SALMONN is a large language model (LLM) that accepts speech, audio event, and music input. Unlike models that only support speech or audio event input, SALMONN can perceive and understand a variety of audio inputs, which enables capabilities such as multilingual speech recognition and translation, as well as audio-speech co-reasoning. This can be seen as giving the LLM "ears" and cognitive hearing abilities, making SALMONN a step towards artificial general intelligence with auditory capabilities.
AI speech recognition
92.2K
Bespoke
Bespoke is an AI-generated personalized podcast service that delivers podcasts perfectly tailored to your daily life. Generate a custom podcast with a single click to access the content you crave, anytime, anywhere. Join the waitlist to experience more customization options and a wider selection of podcasts!
Audio Production
46.4K
Speaking AI
Speaking AI is a text-to-speech tool powered by large language models. It can engage in natural, emotionally expressive conversations and performs zero-shot voice cloning: it captures your tone, pitch, and inflection so that you can replicate and use your own voice. According to the product, a voice can be cloned from a recording of just 10 seconds, and the resulting clones sound remarkably natural. Its developers state that they are committed to advancing human progress through AI technologies, especially the development and application of voice cloning.
Speech-to-text
13.1M
TranscribeAI
TranscribeAI is a revolutionary Mac application designed to effortlessly transcribe audio files into text. Leveraging cutting-edge artificial intelligence technology, this application delivers unmatched accuracy and speed, saving you valuable time and effort. Whether you're a journalist, researcher, content creator, or anyone who regularly needs to transcribe audio, TranscribeAI is your perfect tool.
AI speech-to-text
82.2K
Fluxon
Fluxon is an ultra-realistic AI voice generator that can transform text into lifelike voices in any language. It can clone any voice from less than 10 minutes of sample audio. You can create dialogues within the same audio file by using multiple voices. You can also train a custom voice and synthesize speech with it, which enables the creation of lip-sync videos. Fluxon offers a REST API, allowing you to integrate AI voice generation into your applications. It can be used for a wide range of purposes, such as adding professional and realistic voiceovers to marketing and explainer videos, generating clear and high-quality audiobooks from text, creating lifelike voices for NPCs, providing professional translations for content, creating more natural-sounding voices for chatbots, and automatically converting any text content into podcasts.
Text to Speech
147.9K
Koolio.ai
Koolio.ai is an audio content creation platform that lets users turn concepts into complete content in minutes. Its intuitive, user-friendly interface allows creators to focus on what matters most: their content. Whether the task is transcribing audio, collaborating with others, automatically selecting sound effects or music, or manipulating and processing audio, Koolio.ai streamlines the production of high-quality audio content.
Audio Production
48.0K
# Featured AI Tools
Flow AI
Flow is an AI-driven movie-making tool designed for creators, utilizing Google DeepMind's advanced models to allow users to easily create excellent movie clips, scenes, and stories. The tool provides a seamless creative experience, supporting user-defined assets or generating content within Flow. In terms of pricing, the Google AI Pro and Google AI Ultra plans offer different functionalities suitable for various user needs.
Video Production
42.0K
NoCode
NoCode is a platform that requires no programming experience, allowing users to quickly generate applications by describing their ideas in natural language, aiming to lower development barriers so more people can realize their ideas. The platform provides real-time previews and one-click deployment features, making it very suitable for non-technical users to turn their ideas into reality.
Development Platform
44.2K
ListenHub
ListenHub is a lightweight AI podcast generation tool that supports both Chinese and English. Based on cutting-edge AI technology, it can quickly generate podcast content of interest to users. Its main advantages include natural dialogue and ultra-realistic voice effects, allowing users to enjoy high-quality auditory experiences anytime and anywhere. ListenHub not only improves the speed of content generation but also offers compatibility with mobile devices, making it convenient for users to use in different settings. The product is positioned as an efficient information acquisition tool, suitable for the needs of a wide range of listeners.
AI
41.7K
MiniMax Agent
MiniMax Agent is an intelligent AI companion built on the latest multimodal technology. Its MCP-based multi-agent collaboration lets teams of AI agents solve complex problems efficiently. It provides features such as instant answers, visual analysis, and voice interaction, which the product claims can increase productivity tenfold.
Multimodal technology
42.8K
Chinese Picks
Tencent Hunyuan Image 2.0
Tencent Hunyuan Image 2.0 is Tencent's latest AI image generation model, with significantly improved generation speed and image quality. With a very high compression ratio codec and a new diffusion architecture, images can be generated in milliseconds, avoiding the waiting time of traditional generation. The model also improves the realism and detail of images by combining reinforcement learning algorithms with human aesthetic knowledge, making it suitable for professional users such as designers and creators.
Image Generation
41.4K
OpenMemory MCP
OpenMemory is an open-source personal memory layer that provides private, portable memory management for large language models (LLMs). It ensures users have full control over their data, maintaining its security when building AI applications. This project supports Docker, Python, and Node.js, making it suitable for developers seeking personalized AI experiences. OpenMemory is particularly suited for users who wish to use AI without revealing personal information.
open source
41.7K
FastVLM
FastVLM is an efficient visual encoding model designed specifically for vision language models. It uses the FastViTHD hybrid visual encoder to reduce both the time required to encode high-resolution images and the number of output tokens, resulting in strong performance in both speed and accuracy. FastVLM is positioned to give developers powerful vision-language processing capabilities across various scenarios, and it performs especially well on mobile devices that require rapid response.
Image Processing
40.8K
Chinese Picks
LiblibAI
LiblibAI is a leading Chinese AI creative platform offering powerful AI creative tools to help creators bring their imagination to life. The platform provides a vast library of free AI creative models, allowing users to search and utilize these models for image, text, and audio creations. Users can also train their own AI models on the platform. Focused on the diverse needs of creators, LiblibAI is committed to creating inclusive conditions and serving the creative industry, ensuring that everyone can enjoy the joy of creation.
AI Model
6.9M
AIbase
Empowering the Future, Your AI Solution Knowledge Base
© 2025 AIbase