EXAONE-3.5-2.4B-Instruct-GGUF
Overview
EXAONE-3.5-2.4B-Instruct-GGUF provides GGUF-format builds of EXAONE 3.5, a family of bilingual (English and Korean) instruction-tuned generative models developed by LG AI Research, with parameter sizes ranging from 2.4B to 32B. These models support long-context processing of up to 32K tokens and demonstrate state-of-the-art performance in real-world use cases and long-context comprehension, while remaining competitive in the general domain against recently released models of similar size. The 2.4B model is notable for being optimized for deployment on small or resource-constrained devices while still delivering strong performance.
Target Users
Designed for researchers and developers who need to deploy high-performance language models on resource-constrained devices, as well as application developers working on long-text and multilingual text generation. The model suits these users by combining optimized deployment options with long-context understanding and bilingual capabilities.
Use Cases
Researchers utilize the EXAONE-3.5-2.4B-Instruct-GGUF model for semantic understanding studies of long texts.
Developers leverage this model for real-time multilingual translation features on mobile devices.
Businesses employ the model to optimize automated response systems in customer service, enhancing efficiency and accuracy.
Features
Supports long-context processing of up to 32K tokens.
Comes from a family of three model sizes, 2.4B, 7.8B, and 32B, to meet various deployment needs.
Demonstrates state-of-the-art performance in real-world use cases.
Supports bilingual (English and Korean) text generation.
Fine-tuned for instructions, enabling better understanding and execution of directives.
Provides multiple quantized versions of the model to accommodate varying computational and storage needs.
Allows inference across various frameworks such as TensorRT-LLM, vLLM, SGLang, and more.
The text generated by the model does not reflect the views of LG AI Research, and LG AI Research actively works to mitigate potential risks associated with the model.
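Because the model is instruction-tuned, prompts must follow its chat layout. The sketch below assembles a single-turn prompt in the role-tag style used by the EXAONE 3.x releases; the exact tags (`[|system|]`, `[|user|]`, `[|assistant|]`, `[|endofturn|]`) are an assumption here and should be verified against the chat template shipped in the model's tokenizer configuration.

```python
# Hedged sketch: role tags assumed from the EXAONE 3.x chat-template
# convention; confirm against the release's tokenizer_config.json.

def build_prompt(system: str, user: str) -> str:
    """Assemble a single-turn EXAONE-style chat prompt.

    The returned string ends with the assistant tag so the model
    continues generation as the assistant.
    """
    return (
        f"[|system|]{system}[|endofturn|]\n"
        f"[|user|]{user}\n"
        f"[|assistant|]"
    )

prompt = build_prompt(
    "You are the EXAONE model from LG AI Research, a helpful assistant.",
    "Translate 'annyeonghaseyo' into English.",
)
print(prompt)
```

Tools such as llama.cpp's chat mode or Transformers' `apply_chat_template` normally apply this formatting automatically; building it by hand is mainly useful for raw-completion endpoints.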
How to Use
1. Install llama.cpp; for installation instructions, refer to the llama.cpp GitHub repository.
2. Pick a GGUF-format file for the EXAONE 3.5 model from its Hugging Face repository.
3. Use the huggingface-cli tool to download the chosen model file to a local directory.
4. Run the model using the llama-cli tool and set a system prompt, e.g., 'You are the EXAONE model from LG AI Research, a helpful assistant.'
5. Choose the appropriate quantized version of the model for deployment and inference as needed.
6. Deploy the model in supported frameworks like TensorRT-LLM, vLLM, etc., for practical applications.
7. Monitor the text generated by the model to ensure compliance with LG AI's ethical guidelines.
8. Further optimize the model's use and performance based on technical reports, blogs, and guidance available on GitHub.
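The download-and-run steps above can be sketched as a short shell session. The Hugging Face repository name matches this model card; the specific quantized filename (`Q4_K_M`) is an assumption and should be replaced with whichever quantization level you selected from the repository's file list.

```shell
# Hedged sketch: the exact quantized filename is an assumption;
# list the repo's files and substitute the variant you want.
REPO="LGAI-EXAONE/EXAONE-3.5-2.4B-Instruct-GGUF"
FILE="EXAONE-3.5-2.4B-Instruct-Q4_K_M.gguf"

# Download the quantized weights into the current directory.
huggingface-cli download "$REPO" "$FILE" --local-dir .

# Start an interactive chat with llama.cpp's llama-cli:
# -m selects the model, -c sets the context window (up to 32K),
# -cnv enables conversation mode, and -p supplies the system prompt.
llama-cli -m "$FILE" -c 32768 -cnv \
  -p "You are the EXAONE model from LG AI Research, a helpful assistant."
```

A smaller `-c` value reduces memory use on constrained devices; 32768 simply matches the model's maximum supported context.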