Skywork-MoE-Base-FP8
Overview
Skywork-MoE is a high-performance Mixture of Experts (MoE) model with 146 billion total parameters, 16 experts, and 22 billion activated parameters, initialized from the dense checkpoint of the Skywork-13B model. It introduces two innovative techniques: gating logit normalization, which enhances expert diversity, and adaptive auxiliary loss coefficients, which allow the auxiliary loss coefficient to be adjusted per layer. Skywork-MoE matches or exceeds the performance of models with more total or activated parameters on popular benchmarks such as C-Eval, MMLU, CMMLU, GSM8K, MATH, and HumanEval.
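To make the first technique concrete, here is a minimal sketch of gating logit normalization inside a top-2 router, based on the technique's published description; the function name, the lambda value, and the epsilon are illustrative assumptions, not Skywork's exact implementation.

```python
import torch
import torch.nn.functional as F

def normalized_top2_gate(logits: torch.Tensor, lam: float = 1.0):
    """logits: (num_tokens, num_experts) raw gating scores."""
    # Standardize the logits across the expert dimension, then rescale
    # by lambda: a larger lambda sharpens the softmax distribution,
    # encouraging more distinct (diverse) expert selections.
    mean = logits.mean(dim=-1, keepdim=True)
    std = logits.std(dim=-1, keepdim=True)
    normed = lam * (logits - mean) / (std + 1e-6)  # epsilon avoids divide-by-zero
    probs = F.softmax(normed, dim=-1)
    # Route each token to its two highest-probability experts and
    # renormalize the two routing weights so they sum to 1.
    top2_probs, top2_experts = probs.topk(2, dim=-1)
    top2_probs = top2_probs / top2_probs.sum(dim=-1, keepdim=True)
    return top2_probs, top2_experts

# Example: route 4 tokens over 16 experts.
weights, experts = normalized_top2_gate(torch.randn(4, 16))
```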
Target Users
The Skywork-MoE model is suitable for researchers and developers working on large-scale language model training and inference. It offers efficient parameter utilization and strong computational performance, which is particularly beneficial in resource-constrained environments or scenarios that require rapid inference.
Use Cases
Researchers use Skywork-MoE to train and evaluate models for natural language processing tasks.
Companies leverage the Skywork-MoE model for automatic generation of product documentation and for chatbot development.
Educational institutions adopt the Skywork-MoE model to help generate teaching materials and to automate the grading of student assignments.
Features
A large-scale MoE model with 146 billion parameters
16 experts with 22 billion activated parameters
Gating logit normalization technique for greater expert diversity
Adaptive, layer-specific auxiliary loss coefficients (see the sketch after this list)
Excellent performance in multiple benchmark tests
Supports FP8-precision inference, optimizing resource utilization
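The adaptive auxiliary loss coefficient can be pictured as a per-layer controller that strengthens the load-balancing loss when a layer drops too many tokens and relaxes it once routing is balanced. Below is a simplified, hypothetical sketch of that idea; the target drop rate, update factors, bounds, and smoothing constant are illustrative assumptions, not Skywork's published values.

```python
class AdaptiveAuxCoefficient:
    """Per-layer controller for the auxiliary (load-balancing) loss weight."""

    def __init__(self, init_coeff=0.01, target_drop_rate=0.01, beta=0.9):
        self.coeff = init_coeff          # current auxiliary loss coefficient
        self.target = target_drop_rate   # acceptable fraction of dropped tokens
        self.beta = beta                 # EMA smoothing factor
        self.ema_drop = 0.0              # smoothed token-drop rate for this layer

    def update(self, observed_drop_rate: float) -> float:
        # Smooth the noisy per-batch drop rate with an exponential moving average.
        self.ema_drop = self.beta * self.ema_drop + (1 - self.beta) * observed_drop_rate
        # Strengthen load balancing when too many tokens are dropped;
        # relax it when routing is already well balanced.
        if self.ema_drop > self.target:
            self.coeff = min(self.coeff * 1.05, 0.1)
        else:
            self.coeff = max(self.coeff * 0.95, 1e-4)
        return self.coeff
```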
How to Use
Install the necessary dependencies, including compatible versions of PyTorch and vllm.
Clone the vllm codebase provided by Skywork and compile it.
Alternatively, set up a Docker environment and run vllm directly from the Docker image provided by Skywork.
Configure the model path and working directory, then use the Skywork-MoE model for tasks such as text generation, as in the sketch below.
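As a concrete starting point, here is a minimal sketch that loads the model through vllm's offline Python API and generates text. The Hugging Face model id, parallelism degree, and sampling settings are assumptions; consult Skywork's vllm fork for the exact supported options.

```python
from vllm import LLM, SamplingParams

# Assumes Skywork's vllm fork is installed and the FP8 checkpoint is
# available under this model id (an assumption; adjust to your local path).
llm = LLM(
    model="Skywork/Skywork-MoE-Base-FP8",
    trust_remote_code=True,     # the MoE model code ships with the checkpoint
    tensor_parallel_size=8,     # adjust to the number of available GPUs
)

params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=128)
outputs = llm.generate(["Mixture-of-Experts models scale efficiently because"], params)
print(outputs[0].outputs[0].text)
```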