vdr-2b-multi-v1
Overview
vdr-2b-multi-v1 is a multilingual embedding model for visual document retrieval, released on Hugging Face. It encodes screenshots of document pages into dense vector representations, so visually rich multilingual documents can be searched and queried without OCR or data-extraction pipelines. Built on MrLight/dse-qwen2-2b-mrl-v1 and trained on a purpose-built multilingual query-image pair dataset, it is an upgraded, higher-performing version of mcdse-2b-v1. The model supports Italian, Spanish, English, French, and German, ships with a high-quality open-source multilingual synthetic training dataset of 500,000 samples, and combines low VRAM usage with fast inference, performing strongly on cross-language retrieval.
Target Users
This model is designed for users who require multilingual visual document retrieval, such as researchers, business analysts, and content creators. It is particularly suitable for quickly and accurately finding document information in linguistically diverse environments.
Use Cases
Researchers can quickly retrieve key charts and content from academic papers in different languages using this model.
Business analysts can perform cross-language searches for visual data and analytical results in industry reports.
Content creators can easily find inspirational materials and references in multilingual documents.
Features
Supports multilingual document retrieval (Italian, Spanish, English, French, German)
Roughly 3x faster inference than the base model, with lower VRAM consumption
Strong cross-language retrieval capabilities, enabling document searches across different languages
Utilizes Matryoshka representation learning, reducing vector size by 3 times while maintaining 98% of embedding quality
Direct integration with SentenceTransformers and LlamaIndex, facilitating easy embedding generation
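The Matryoshka representation learning mentioned above means the embeddings stay usable when truncated to a prefix of their dimensions and re-normalized, which is how the 3x vector-size reduction is achieved. A minimal NumPy sketch with placeholder vectors (the 1536/512 dimensions are illustrative assumptions, not taken from the model card):

```python
import numpy as np

def truncate_and_normalize(emb: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` components and re-normalize to unit length,
    as is done with Matryoshka-style embeddings."""
    v = emb[:dim]
    return v / np.linalg.norm(v)

# Toy full-size embeddings standing in for real model output.
rng = np.random.default_rng(0)
a = rng.normal(size=1536)
b = a + 0.1 * rng.normal(size=1536)  # a near-duplicate document

full_sim = np.dot(a / np.linalg.norm(a), b / np.linalg.norm(b))
small_sim = np.dot(truncate_and_normalize(a, 512),
                   truncate_and_normalize(b, 512))
# Similar items remain similar even at one third of the vector size.
print(round(float(full_sim), 3), round(float(small_sim), 3))
```

Because cosine similarity only depends on direction, truncated-and-renormalized vectors preserve most of the ranking behavior of the full embeddings, which is what the claimed 98% quality retention refers to.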
How to Use
1. Install the llama-index-embeddings-huggingface or sentence-transformers library via pip.
2. Import the corresponding model class, such as HuggingFaceEmbedding or SentenceTransformer.
3. Create an instance of the model, specifying the model name and other parameters like device type.
4. Use the model's get_image_embedding or encode method, passing in the image file path or query text to obtain the embedding vector.
5. Utilize the obtained embedding vector for document retrieval and other operations.
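The steps above can be sketched as follows. The model-loading lines (steps 1-4) are shown as comments because they download a multi-gigabyte checkpoint, and the exact encode arguments should be taken from the model card; the repository id llamaindex/vdr-2b-multi-v1 and the file names are assumptions. The ranking logic for step 5 is plain cosine similarity, shown here with placeholder vectors:

```python
import numpy as np

# Steps 1-4 (sketch, not executed here):
#   pip install sentence-transformers
#
#   from sentence_transformers import SentenceTransformer
#   model = SentenceTransformer("llamaindex/vdr-2b-multi-v1")  # assumed repo id
#   doc_embs = model.encode([...])   # page screenshots; see the model card
#   query_emb = model.encode("quarterly revenue chart")

def top_k(query_emb: np.ndarray, doc_embs: np.ndarray, k: int = 3) -> list[int]:
    """Step 5: rank document pages by cosine similarity to the query."""
    q = query_emb / np.linalg.norm(query_emb)
    d = doc_embs / np.linalg.norm(doc_embs, axis=1, keepdims=True)
    scores = d @ q                      # cosine similarity per document
    return np.argsort(-scores)[:k].tolist()

# Placeholder embeddings standing in for real model output.
rng = np.random.default_rng(1)
docs = rng.normal(size=(5, 8))
query = docs[2] + 0.05 * rng.normal(size=8)  # query close to document 2
print(top_k(query, docs, k=2))
```

The same `top_k` helper works unchanged whether the embeddings come from SentenceTransformers or from LlamaIndex's HuggingFaceEmbedding wrapper, since both return dense vectors.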