C3PO
Overview:
C3PO is a technique for aligning large language models (LLMs) with user feedback. It fine-tunes a model from a single sentence of feedback while mitigating over-generalization, that is, applying the feedback in contexts where it is not relevant. The project provides a reference implementation, benchmark data, and the components needed to support research on the technique.
Target Users:
Researchers and developers who want to fine-tune LLMs from single-sentence user feedback, so that outputs match user preferences more closely without over-generalizing that feedback.
Features
Sample relevant categories, prompts, and completions from feedback
Train a fine-tuned model for each piece of feedback in the benchmark
Compare methods and benchmark responses
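The sampling feature listed above can be pictured with a minimal Python sketch. This is not the project's actual code: the `FeedbackSample` record, the `sample_from_feedback` function, and the `echo_model` stand-in for an LLM call are all hypothetical names chosen for illustration, and the prompt wording is an assumption. The sketch only shows the shape of the data: for one feedback sentence, prompts are drawn per category, and each prompt gets a completion without the feedback and one with it.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class FeedbackSample:
    """One sampled record (hypothetical layout, not the project's schema)."""
    feedback: str
    category: str
    prompt: str
    baseline: str  # completion produced without the feedback
    revised: str   # completion produced with the feedback in context


def sample_from_feedback(
    feedback: str,
    categories: List[str],
    prompts_per_category: int,
    generate: Callable[[str], str],
) -> List[FeedbackSample]:
    """Build prompt/completion records for a single feedback sentence."""
    samples = []
    for category in categories:
        for i in range(prompts_per_category):
            # Ask the model for a prompt in this category (assumed wording).
            prompt = generate(f"Write prompt #{i + 1} about {category}.")
            baseline = generate(prompt)
            revised = generate(f"{prompt}\n\nFeedback to follow: {feedback}")
            samples.append(
                FeedbackSample(feedback, category, prompt, baseline, revised)
            )
    return samples


def echo_model(text: str) -> str:
    """Stand-in for a real LLM call so the sketch runs offline."""
    return f"<completion for: {text}>"


samples = sample_from_feedback(
    "Avoid emojis", ["work emails", "text messages"], 2, echo_model
)
```

In a real setup, `generate` would wrap a call to the model being aligned, and the paired completions could then feed a training step for that piece of feedback.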