Aya Vision 32B
A
Aya Vision 32B
Overview :
Aya Vision 32B is an advanced vision-language model developed by Cohere For AI, boasting 32 billion parameters and supporting 23 languages, including English, Chinese, and Arabic. This model combines the latest multilingual language model Aya Expanse 32B and the SigLIP2 vision encoder, achieving visual and language understanding integration through a multimodal adapter. It excels in the vision-language field, capable of handling complex image and text tasks such as OCR, image captioning, and visual reasoning. The release of this model aims to promote the popularization of multimodal research, providing a powerful tool for global researchers with its open-source weights. The model is licensed under CC-BY-NC and is subject to Cohere For AI's fair use policy.
Target Users :
This model is suitable for researchers, developers, and enterprises that need to handle vision-language tasks, especially those requiring multilingual support and high-performance models.
Total Visits: 25.3M
Top Region: US(17.94%)
Website Views : 67.1K
Use Cases
Use Aya Vision 32B for image captioning in Cohere Playground
Interact with the model through an interactive conversation using Hugging Face Space
Use the model for multilingual OCR tasks
Features
Supports 23 languages, covering various language scenarios
Can process image input and generate text output
Supports 16K context length, suitable for complex tasks
Provides interactive experiences, such as Cohere Playground and Hugging Face Space
Allows chat interaction with the model via WhatsApp
How to Use
Install the necessary transformers library: `pip install 'git+https://github.com/huggingface/transformers.git@v4.49.0-AyaVision'`
Load the model and processor: `AutoProcessor.from_pretrained(model_id)` and `AutoModelForImageTextToText.from_pretrained(model_id)`
Prepare input data, including images and text content
Format the input data using the `processor.apply_chat_template` method
Call the model's `generate` method to generate output text
Decode the generated tokens and get the final result
AIbase
Empowering the Future, Your AI Solution Knowledge Base
© 2025AIbase