

Voice Cursor
Overview:
Voice Cursor is an experimental text editor built on the native audio capabilities of Gemini 2.0. It demonstrates how to integrate Gemini's new text-to-speech API into a text editor for smooth, context-aware voice generation, and serves as a practical example that developers and users can explore and build on. The project comes out of Google Creative Lab's experimental work, which aims to push technological boundaries and offer new modes of interaction. Voice Cursor is currently free and primarily targets developers and technology enthusiasts looking for new ways to enhance productivity and improve accessibility.
Target Users:
Voice Cursor is aimed at developers and technology enthusiasts, particularly those interested in natural language processing and speech synthesis. It gives them an experimental platform for exploring the native audio capabilities of Gemini 2.0, building new application scenarios, and making text content more accessible and interactive.
Use Cases
Developers can utilize Voice Cursor to create text editors with voice feedback, enhancing the writing experience for visually impaired individuals.
Content creators can use Voice Cursor to convert text into audio for creating materials for videos and podcasts.
Educators can leverage Voice Cursor to convert teaching materials into audio, providing assistive learning tools for students with reading difficulties.
Features
Integration of Gemini 2.0 text-to-speech capabilities
Offers 8 different Gemini voice options with unique characteristics
Supports 15 different emotional tones to shape text expression
Visual integration with color-coded highlighting of active voices and tones
Instant generation with fast audio synthesis provided by Gemini's latest model
Simple local setup: clone the repository, install dependencies, add an AI Studio API key to a .env.local file, and start the development server
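The color-coded highlighting feature listed above can be pictured as a lookup from each spoken span of text to the colors of its voice and tone. The following is a minimal sketch under stated assumptions, not the project's actual implementation: the type names (`VoiceSwatch`, `ToneSwatch`, `SpokenSpan`), the function `styleSpans`, the placeholder voice id `voice-a`, and the color values are all hypothetical.

```typescript
// Hypothetical data model; Voice Cursor's real structures may differ.
interface VoiceSwatch {
  id: string;    // placeholder identifier, not a real Gemini voice name
  color: string; // CSS color used as the highlight background
}

interface ToneSwatch {
  id: string;
  color: string; // CSS color used as the underline
}

interface SpokenSpan {
  start: number; // inclusive character offset into the document
  end: number;   // exclusive character offset into the document
  voiceId: string;
  toneId: string;
}

interface SpanStyle {
  text: string;
  background: string;
  underline: string;
}

// Used when a span refers to a voice or tone that is not in the lists.
const FALLBACK_COLOR = "#cccccc";

// Resolve each span to the text it covers plus its voice and tone colors.
function styleSpans(
  doc: string,
  spans: SpokenSpan[],
  voices: VoiceSwatch[],
  tones: ToneSwatch[],
): SpanStyle[] {
  const voiceColors = new Map(voices.map((v) => [v.id, v.color] as const));
  const toneColors = new Map(tones.map((t) => [t.id, t.color] as const));
  return spans.map((s) => ({
    text: doc.slice(s.start, s.end),
    background: voiceColors.get(s.voiceId) ?? FALLBACK_COLOR,
    underline: toneColors.get(s.toneId) ?? FALLBACK_COLOR,
  }));
}

// Usage: one highlighted word, read by a placeholder voice in a "calm" tone.
const styled = styleSpans(
  "Hello brave new world",
  [{ start: 6, end: 11, voiceId: "voice-a", toneId: "calm" }],
  [{ id: "voice-a", color: "#1a73e8" }],
  [{ id: "calm", color: "#34a853" }],
);
console.log(styled);
```

Keeping the colors in lookup tables rather than on each span means a voice or tone can be recolored in one place, which is one plausible way to keep highlights consistent across the editor.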
How to Use
1. Clone the Voice Cursor GitHub repository to your local environment.
2. Install the necessary project dependencies.
3. Create a .env.local file and insert the API key obtained from Google AI Studio.
4. Start the development server, usually by running the command `npm run dev`.
5. Open `http://localhost:3000` in your browser to start experiencing Voice Cursor.
6. Highlight text, and Voice Cursor will generate audio based on the selected voice and tone.
7. Explore different emotional tone options by modifying the `src/lib/tone-options.ts` file to customize the audio output.
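Steps 3 and 7 above can be sketched in code. This is an illustrative sketch only: the source does not show the contents of `src/lib/tone-options.ts` or the name of the environment variable, so the `ToneEntry` shape, the helpers `addTone`, `buildPrompt`, and `requireApiKey`, the prompt wording, and the variable name `EXAMPLE_API_KEY` are all assumptions made for this example.

```typescript
// Hypothetical shape of a tone entry; the real tone-options.ts may differ.
interface ToneEntry {
  name: string;        // label shown in the editor
  instruction: string; // how the voice should deliver the text
}

const baseTones: ToneEntry[] = [
  { name: "Calm", instruction: "slow, steady, and relaxed" },
  { name: "Excited", instruction: "fast and energetic" },
];

// Return a new list with the extra tone, rejecting duplicate names
// (compared case-insensitively) so each tone stays uniquely selectable.
function addTone(tones: ToneEntry[], tone: ToneEntry): ToneEntry[] {
  const exists = tones.some(
    (t) => t.name.toLowerCase() === tone.name.toLowerCase(),
  );
  if (exists) {
    throw new Error(`Tone "${tone.name}" already exists`);
  }
  return [...tones, tone];
}

// Combine a tone with the highlighted text into a single instruction string.
// The wording here is an assumption, not the prompt Voice Cursor sends.
function buildPrompt(tone: ToneEntry, selectedText: string): string {
  return `Read the following text in a ${tone.name.toLowerCase()} tone (${tone.instruction}): "${selectedText}"`;
}

// Fail early with a clear message if the API key is missing or blank.
// `env` is passed in explicitly so the check is easy to test.
function requireApiKey(
  env: Record<string, string | undefined>,
  name: string,
): string {
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(`Missing ${name}; add it to .env.local`);
  }
  return value;
}

// Usage: add a custom tone, then build a prompt for highlighted text.
const tones = addTone(baseTones, {
  name: "Whisper",
  instruction: "quiet and breathy",
});
const whisper = tones[tones.length - 1];
if (whisper) {
  console.log(buildPrompt(whisper, "Once upon a time"));
}
```

Returning a new array from `addTone` instead of mutating the input keeps the default tone list intact, which makes it easier to experiment with custom tones and revert.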