AIM
Overview:
This paper introduces AIM, a family of vision models pre-trained with an autoregressive objective. Inspired by their textual counterparts, large language models (LLMs), these models exhibit similar scaling properties. Specifically, we highlight two key findings: (1) the quality of the visual features improves with both model capacity and data quantity, and (2) the value of the objective function correlates with model performance on downstream tasks. By pre-training a 7-billion-parameter AIM on 2 billion images, we achieve 84.0% accuracy on ImageNet-1k with a frozen backbone. Notably, even at this scale, we observe no sign of performance saturation, suggesting that AIM may represent a new frontier for training large-scale vision models. AIM's pre-training is similar to that of LLMs and does not require any image-specific strategy to stabilize training at scale.
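Since the listing centers on AIM's autoregressive pre-training objective, here is a minimal PyTorch sketch of the core idea: an image is split into a raster-ordered sequence of patches, and a causally masked Transformer regresses each next patch from the ones before it. This is an illustrative toy, not Apple's released implementation; all class, parameter, and dimension names (AutoregressiveImageModel, patch_size=14, etc.) are assumptions, and details of the actual method such as prefix attention and per-patch pixel normalization are omitted.

```python
import torch
import torch.nn as nn

class AutoregressiveImageModel(nn.Module):
    """Toy AIM-style model: patchify an image into a raster-ordered
    sequence, run a causally masked Transformer over it, and regress
    each next patch's raw pixels from the preceding patches."""

    def __init__(self, image_size=224, patch_size=14, dim=512, depth=8, heads=8):
        super().__init__()
        self.patch_size = patch_size
        num_patches = (image_size // patch_size) ** 2
        patch_dim = 3 * patch_size * patch_size
        self.embed = nn.Linear(patch_dim, dim)
        self.pos = nn.Parameter(torch.zeros(1, num_patches, dim))  # learned positions
        layer = nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, depth)
        self.head = nn.Linear(dim, patch_dim)  # pixel-regression head

    def patchify(self, imgs):
        # (B, 3, H, W) -> (B, N, 3*p*p) in raster order
        p = self.patch_size
        x = imgs.unfold(2, p, p).unfold(3, p, p)    # (B, 3, H/p, W/p, p, p)
        x = x.permute(0, 2, 3, 1, 4, 5).flatten(3)  # (B, H/p, W/p, 3*p*p)
        return x.flatten(1, 2)                      # (B, N, 3*p*p)

    def forward(self, imgs):
        patches = self.patchify(imgs)
        x = self.embed(patches) + self.pos
        # Causal mask: patch i may only attend to patches 0..i.
        n = x.size(1)
        mask = torch.triu(torch.ones(n, n, device=x.device), diagonal=1).bool()
        h = self.encoder(x, mask=mask)
        pred = self.head(h[:, :-1])   # predictions for patches 1..N-1
        target = patches[:, 1:]       # next-patch regression targets
        return nn.functional.mse_loss(pred, target)

if __name__ == "__main__":
    model = AutoregressiveImageModel()
    loss = model(torch.randn(2, 3, 224, 224))  # random images, just to exercise the graph
    loss.backward()
    print(loss.item())
```

As in language modeling, the only training signal is next-token (here, next-patch) prediction; there is no contrastive pairing, masking schedule, or other image-specific stabilization, which is the property the overview highlights.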
Target Users:
Suitable for autoregressive pre-training on large-scale image datasets, and for any scenario that requires training large-scale vision models.
Total Visits: 3.1M
Top Region: US (14.90%)
Website Views: 61.3K
Use Cases
Large-scale image recognition in autonomous driving systems
Pre-training on large-scale datasets for medical image analysis
Training large-scale visual models for smart surveillance systems
Features
Autoregressive Image Model Pre-training
Large-Scale Visual Model Training
Performance Optimization and Scaling