

vdr-2b-multi-v1
Overview
vdr-2b-multi-v1 is a multilingual embedding model released by LlamaIndex on Hugging Face and designed for visual document retrieval. It encodes screenshots of document pages into dense vectors, so visually rich multilingual documents can be searched without OCR or any other data-extraction step. The model is built on MrLight/dse-qwen2-2b-mrl-v1 and trained on a multilingual dataset of query-image pairs assembled by its authors; it is an upgraded version of mcdse-2b-v1 with better performance. It supports Italian, Spanish, English, French, and German, and it is released together with its open-source multilingual synthetic training dataset of 500,000 samples. It also has low VRAM usage and fast inference, and it performs well in cross-language retrieval.
Target Users
This model is designed for users who require multilingual visual document retrieval, such as researchers, business analysts, and content creators. It is particularly suitable for quickly and accurately finding document information in linguistically diverse environments.
Use Cases
Researchers can quickly retrieve key charts and content from academic papers in different languages using this model.
Business analysts can perform cross-language searches for visual data and analytical results in industry reports.
Content creators can easily find inspirational materials and references in multilingual documents.
Features
Supports multilingual document retrieval (Italian, Spanish, English, French, German)
Fast inference and low VRAM usage: about 3 times faster than the base model, with lower memory consumption
Strong cross-language retrieval capabilities, enabling document searches across different languages
Uses Matryoshka representation learning, so vectors can be cut to a third of their size while keeping 98% of embedding quality
Direct integration with SentenceTransformers and LlamaIndex, facilitating easy embedding generation
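To make the Matryoshka point above concrete, here is a minimal sketch of how such embeddings are typically shortened: keep the leading components and normalise again. The helper name truncate_embedding, the full size of 1536, and the target size of 512 are assumptions chosen for illustration, and the vectors are random stand-ins rather than real model output.

```python
import numpy as np


def truncate_embedding(vectors: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` components of each row, then L2-normalise again."""
    if dim <= 0 or dim > vectors.shape[-1]:
        raise ValueError("dim must be between 1 and the full embedding size")
    cut = vectors[..., :dim]
    norms = np.linalg.norm(cut, axis=-1, keepdims=True)
    # Guard against division by zero for an all-zero row.
    return cut / np.clip(norms, 1e-12, None)


# Random stand-ins for page embeddings; 1536 is an assumed full size.
rng = np.random.default_rng(0)
full = rng.normal(size=(4, 1536)).astype(np.float32)

# Assumed target size: one third of the full vector.
small = truncate_embedding(full, 512)
print(small.shape)
```

Truncation only preserves quality for models trained with Matryoshka representation learning; with random vectors the sketch shows the mechanics, not the 98% figure.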
How to Use
1. Install the llama-index-embeddings-huggingface or sentence-transformers library via pip.
2. Import the corresponding model class, such as HuggingFaceEmbedding or SentenceTransformer.
3. Create an instance of the model, specifying the model name and other parameters like device type.
4. Call get_image_embedding (on HuggingFaceEmbedding) or encode (on SentenceTransformer), passing in the image file path or query text, to obtain the embedding vector.
5. Utilize the obtained embedding vector for document retrieval and other operations.
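The steps above can be sketched roughly as follows, using the sentence-transformers route. This is an illustration under stated assumptions, not the official usage: the hub ID llamaindex/vdr-2b-multi-v1, the need for trust_remote_code, and the helper names load_model, rank_pages, and search are all assumed here, and ranking by cosine similarity is one common choice for step 5 rather than something the listing prescribes.

```python
# Step 1 (shell): pip install sentence-transformers
import numpy as np

MODEL_ID = "llamaindex/vdr-2b-multi-v1"  # assumed Hugging Face hub ID


def load_model(device: str = "cpu"):
    """Steps 2-3: import the model class and create an instance."""
    # Imported lazily so rank_pages below works without the library installed.
    from sentence_transformers import SentenceTransformer

    # trust_remote_code is assumed to be required for the model's custom code.
    return SentenceTransformer(MODEL_ID, device=device, trust_remote_code=True)


def rank_pages(query_vec, page_vecs):
    """Step 5: return page indices and scores, best cosine similarity first."""
    q = np.asarray(query_vec, dtype=np.float32)
    p = np.asarray(page_vecs, dtype=np.float32)
    q = q / np.linalg.norm(q)
    p = p / np.linalg.norm(p, axis=1, keepdims=True)
    scores = p @ q
    order = np.argsort(-scores)
    return order.tolist(), scores[order].tolist()


def search(query: str, image_paths: list[str], device: str = "cpu"):
    """Step 4: embed the query text and page screenshots, then rank them."""
    model = load_model(device)
    query_vec = model.encode(query)
    # Passing a file path to encode follows the description in step 4.
    page_vecs = np.stack([model.encode(path) for path in image_paths])
    order, scores = rank_pages(query_vec, page_vecs)
    return [(image_paths[i], s) for i, s in zip(order, scores)]
```

A call such as search("fatturato per trimestre", ["page1.png", "page2.png"]) would return the screenshots ordered by similarity to the query; the file names and query are placeholders. Loading the model downloads its weights, so rank_pages can be tried on its own with any precomputed vectors.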