

EXAONE 3.5 32B Instruct AWQ
Overview:
EXAONE-3.5-32B-Instruct-AWQ is the AWQ-quantized 32B member of the EXAONE 3.5 family of instruction-tuned bilingual (English and Korean) generation models developed by LG AI Research, a family whose model sizes range from 2.4B to 32B parameters. These models support long contexts of up to 32K tokens and demonstrate state-of-the-art performance in real-world use cases and long-context comprehension, while remaining competitive in general domains against similarly sized, recently released models. This variant applies AWQ (Activation-aware Weight Quantization) with 4-bit group-wise weight quantization to reduce memory footprint and improve deployment efficiency.
Target Users:
The target audience includes researchers, developers, and enterprises that require text generation and processing in multilingual environments. The model's long-context handling and bilingual capabilities make it particularly suitable for applications involving large volumes of text and cross-linguistic communication.
Use Cases
Researchers use this model for multilingual text translation and generation studies.
Developers leverage the model's long-context handling capabilities to create intelligent assistant applications.
Businesses utilize this model to optimize automated response systems in customer service.
Features
Supports long-context handling capabilities for up to 32K tokens.
Delivers state-of-the-art performance among bilingual (English and Korean) generation models.
Uses AWQ to apply 4-bit group-wise weight quantization.
Comprises 30.95 billion parameters, organized into 64 layers with 40 attention heads.
Supports rapid startup and deployment, and is compatible with inference frameworks such as TensorRT-LLM and vLLM.
Provides a pre-quantized EXAONE 3.5 model for easy deployment across different devices.
Text generated by the model does not represent the views of LG AI Research.
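To illustrate the framework compatibility noted above, here is a minimal serving sketch with vLLM. This is an assumption-laden example, not an official recipe: the model ID follows LG AI Research's Hugging Face naming convention and should be checked against the actual model card, and running it requires a GPU with enough memory for the 4-bit 32B weights.

```python
# Minimal vLLM serving sketch for the AWQ-quantized model.
# Assumes vLLM is installed and the model ID below matches the
# Hugging Face repository (verify against the model card).
from vllm import LLM, SamplingParams

llm = LLM(
    model="LGAI-EXAONE/EXAONE-3.5-32B-Instruct-AWQ",
    quantization="awq",    # load the 4-bit AWQ weights
    max_model_len=32768,   # the 32K-token context window noted above
)

params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["Explain AWQ quantization in one sentence."], params)
print(outputs[0].outputs[0].text)
```

Because the weights are pre-quantized, vLLM can load them directly with `quantization="awq"`; no calibration step is needed at deployment time.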
How to Use
1. Install the necessary libraries, such as transformers>=4.43 and autoawq>=0.2.7.post3.
2. Load the model and tokenizer from Hugging Face using AutoModelForCausalLM and AutoTokenizer.
3. Prepare the input prompt, which can be in English or Korean.
4. Use the tokenizer.apply_chat_template method to convert messages into the model's input format.
5. Call the model.generate method to produce text.
6. Use tokenizer.decode to convert the generated tokens into readable text.
7. Adjust model parameters such as max_new_tokens and do_sample as needed to control the length and diversity of the generated text.
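The steps above can be sketched as a single script using the Hugging Face transformers API. This is a hedged sketch, not verified against the model card: the model ID is assumed from the Hugging Face naming convention, `trust_remote_code=True` is assumed to be required for the EXAONE architecture, and running it requires transformers>=4.43, autoawq>=0.2.7.post3, a suitable GPU, and downloading the weights.

```python
# Steps 1-2: load the model and tokenizer (assumed model ID; the EXAONE
# architecture is assumed to require trust_remote_code=True).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LGAI-EXAONE/EXAONE-3.5-32B-Instruct-AWQ"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
)

# Steps 3-4: prepare a prompt (English or Korean) and apply the chat template.
messages = [{"role": "user", "content": "Explain long-context handling briefly."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Steps 5-7: generate, then decode only the newly generated tokens.
# Adjust max_new_tokens / do_sample to control length and diversity.
output = model.generate(input_ids, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Slicing the output at `input_ids.shape[-1]` strips the echoed prompt so only the model's reply is decoded.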