Chinese Tiny LLM
Overview
Chinese Tiny LLM (CT-LLM) is a 2-billion-parameter large language model trained from scratch with Chinese as its primary language, departing from the convention of building LLMs mainly on English corpora. It was pre-trained on roughly 1,200 billion tokens, about 800 billion of them Chinese, alongside English and programming-code data. This focus gives CT-LLM efficient understanding and generation of Chinese text, while it remains capable in English and code, demonstrating cross-lingual adaptability. On CHC-Bench, a benchmark of hard Chinese-language tasks, CT-LLM performs strongly, evidencing its ability to understand and apply Chinese. The project openly releases all relevant artifacts, including the entire data-filtering process, training dynamics, training and evaluation data, and intermediate model checkpoints, so that other researchers and developers can use these resources for their own research or for further model refinement.
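Because the checkpoints are openly released, the model can be loaded like any causal LM in the Hugging Face ecosystem. Below is a minimal sketch; the checkpoint ID "m-a-p/CT-LLM-Base" is an assumption for illustration, so consult the project's release page for the actual model name.

```python
# A minimal sketch of Chinese text generation with CT-LLM via the Hugging
# Face transformers library. The checkpoint name "m-a-p/CT-LLM-Base" is an
# assumption, not a confirmed detail of the release.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "m-a-p/CT-LLM-Base"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 2B parameters fit on a single modern GPU
    device_map="auto",
)

prompt = "请用一句话介绍长城。"  # "Describe the Great Wall in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```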
Target Users
Researchers and developers working on Chinese text processing, generation, and understanding tasks.
Use Cases
Chinese NLP research
Automatic generation of Chinese articles
Sentiment analysis of Chinese text (see the sketch below)
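To make the sentiment-analysis use case concrete, the sketch below zero-shot-classifies a Chinese review by prompting an instruction-tuned variant of the model. Both the checkpoint ID "m-a-p/CT-LLM-SFT" and the availability of a chat template in its tokenizer are assumptions for illustration.

```python
# A hedged sketch of zero-shot sentiment classification with an
# instruction-tuned CT-LLM checkpoint. The model ID "m-a-p/CT-LLM-SFT" and
# its chat template are assumptions, not confirmed details of the release.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "m-a-p/CT-LLM-SFT"  # assumed instruction-tuned checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

review = "这家餐厅的菜又贵又难吃。"  # "The food here is expensive and tastes bad."
messages = [{
    "role": "user",
    # "Judge the sentiment of this sentence; answer only 'positive' or 'negative'."
    "content": f"判断下面这句话的情感倾向，只回答“正面”或“负面”：{review}",
}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=8, do_sample=False)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```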
Features
2-billion-parameter large language model
Excellent performance on Chinese-language tasks
Handles Chinese, English, and programming code
Fully open-sourced (data pipeline, training data, and checkpoints) to support further research and application