EXAONE-3.5-2.4B-Instruct-AWQ
Overview:
EXAONE-3.5-2.4B-Instruct-AWQ is the AWQ-quantized 2.4B model in the EXAONE 3.5 series of bilingual (English and Korean) instruction-tuned generative models developed by LG AI Research, with parameter sizes ranging from 2.4B to 32B. These models support long-context processing of up to 32K tokens and demonstrate state-of-the-art performance in real-world use cases and long-context understanding, while remaining competitive in general domains with recently released models of similar size. This model is optimized for deployment on small or resource-constrained devices and uses AWQ quantization, achieving 4-bit grouped weight quantization (W4A16g128).
Target Users:
The target audience includes developers and researchers who need to deploy high-performance language models on resource-constrained devices, as well as NLP application developers working with long-form text and multilingual content. With its optimized deployment footprint and long-context processing, EXAONE-3.5-2.4B-Instruct-AWQ is particularly suited to mobile and edge computing environments.
Use Cases
Generating dialogue responses in English and Korean.
Providing language model services on resource-constrained mobile devices.
Processing and analyzing long documents for research and business intelligence.
Features
Supports long-context processing of up to 32K tokens.
Optimized 2.4B model suitable for deployment on resource-constrained devices.
The 7.8B variant in the same series matches its predecessor's scale with improved performance.
The 32B variant delivers powerful performance.
Supports both English and Korean languages.
AWQ quantization achieves 4-bit grouped weight quantization (W4A16g128).
Supports deployment frameworks such as TensorRT-LLM and vLLM (see the sketch after this list).
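
As a minimal sketch of one such deployment path, the snippet below loads the model with vLLM's offline inference API. The Hugging Face repository ID, context length, and sampling settings are illustrative assumptions, not confirmed by this page.

```python
# Minimal vLLM offline-inference sketch (assumes vLLM with AWQ support installed).
from vllm import LLM, SamplingParams

llm = LLM(
    model="LGAI-EXAONE/EXAONE-3.5-2.4B-Instruct-AWQ",  # assumed HF repo ID
    quantization="awq",             # select vLLM's AWQ kernels
    max_model_len=32768,            # matches the advertised 32K-token context
)

params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["Explain AWQ quantization in one paragraph."], params)
print(outputs[0].outputs[0].text)
```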
How to Use
1. Install the necessary libraries such as transformers and autoawq.
2. Load the model and tokenizer using AutoModelForCausalLM and AutoTokenizer.
3. Prepare the input prompts, which can be in English or Korean.
4. Use the tokenizer.apply_chat_template method to format the messages and convert them into input IDs.
5. Call the model.generate method to generate text.
6. Use the tokenizer.decode method to decode the generated IDs back into text.
7. Adjust generation parameters such as max_new_tokens and do_sample as needed to control the length and diversity of the generated text; a runnable sketch of the full flow follows.
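
The snippet below is a minimal end-to-end sketch of steps 1-7 using the transformers library. The repository ID, prompt, and generation settings are assumptions for illustration only.

```python
# End-to-end sketch of the steps above (pip install transformers autoawq).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LGAI-EXAONE/EXAONE-3.5-2.4B-Instruct-AWQ"  # assumed HF repo ID

# Steps 1-2: load the quantized model and its tokenizer.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,  # assumption: EXAONE ships custom modeling code
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Step 3: prepare an input prompt (English or Korean).
messages = [{"role": "user", "content": "Explain who you are in one sentence."}]

# Step 4: apply the chat template and convert the messages into input IDs.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Steps 5 and 7: generate text, controlling length and diversity.
output_ids = model.generate(input_ids, max_new_tokens=128, do_sample=False)

# Step 6: decode the generated IDs back into text.
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```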