TOFU
Overview:
The TOFU dataset contains question-answer pairs generated about 200 fictional authors who do not exist. It is used to evaluate the forgetting (unlearning) performance of large language models in realistic scenarios: models are first fine-tuned on the data and must then forget subsets of it at various forget-set ratios. Because the data uses a question-answer format, it is well suited to popular chat models such as Llama2, Mistral, or Qwen, but it can be applied to any other large language model. The accompanying codebase targets the Llama2 chat and Phi-1.5 models and can easily be adapted to other models.
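Since the data is in question-answer form, each record can be wrapped in a chat template before fine-tuning. A minimal sketch, assuming the standard Llama2 `[INST]` prompt format and `question`/`answer` field names (the exact template and field names used by the codebase may differ):

```python
# Sketch: wrap one TOFU-style QA record in a Llama2-chat training string.
# The [INST] template is the generic Llama2 chat format; the field names
# "question" and "answer" are assumed here for illustration.

def format_qa_pair(example: dict) -> str:
    """Build a single training string from one question-answer record."""
    prompt = f"[INST] {example['question']} [/INST]"
    return f"{prompt} {example['answer']}"

# Illustrative record in the style of the dataset (fictional author).
sample = {
    "question": "What is the full name of the author born in Taipei?",
    "answer": "The author's full name is Hsiao Yun-Hwa.",
}
print(format_qa_pair(sample))
```

The same formatting function can be mapped over the whole dataset to produce fine-tuning inputs for any chat model that accepts this prompt style.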
Target Users:
Researchers and developers who want to evaluate the forgetting ability of language models or train chatbot models that can unlearn information.
Use Cases
Fine-tune a Llama model on the TOFU dataset, then unlearn forget sets of different sizes to evaluate forgetting performance.
Build a chatbot on the TOFU dataset, training a model that can unlearn so that it does not retain or leak sensitive information.
Use the unlearning functionality in the TOFU codebase to compare how different models perform when forgetting specific information.
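A common family of unlearning objectives combines ascent on the forget set with retention on the rest. The sketch below shows the generic "gradient difference" idea as a scalar objective; it is an illustration of the technique, not the exact loss implemented in the TOFU codebase:

```python
# Sketch of a gradient-difference style unlearning objective:
# minimizing this value keeps the retain-set loss low while pushing
# the forget-set loss up (i.e., the model forgets the forget set).
# The weighting scheme here is assumed for illustration.

def unlearning_loss(forget_loss: float, retain_loss: float,
                    forget_weight: float = 1.0) -> float:
    """Combined objective: retain loss minus weighted forget loss."""
    return retain_loss - forget_weight * forget_loss

# Example: forget loss 3.0, retain loss 2.0 -> objective -1.0
print(unlearning_loss(3.0, 2.0))
```

In a real training loop the two terms would be per-batch cross-entropy losses computed on the forget and retain splits, and the optimizer would step on the combined value.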
Features
Provides a benchmark forgetting dataset
Supports the evaluation of forgetting performance in large language models
Uses a question-answer format suitable for chatbot models
Codebase supports multiple language models
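The varying forget-set ratios mentioned above amount to carving a fraction of the fine-tuning data out as a forget set and keeping the remainder as a retain set. A minimal sketch, assuming 200 authors with 20 QA pairs each (the dataset itself ships named splits, so this slicing is only illustrative):

```python
# Sketch: split examples into forget/retain sets at a given ratio,
# mirroring the idea of unlearning 1%, 5%, or 10% of the data.
# The 200 x 20 = 4000 record count is an assumption for illustration.

def split_forget_retain(examples: list, forget_ratio: float):
    """Return (forget_set, retain_set) with the given fraction to forget."""
    n_forget = int(len(examples) * forget_ratio)
    return examples[:n_forget], examples[n_forget:]

data = list(range(4000))  # stand-in for 200 authors x 20 QA pairs
for ratio in (0.01, 0.05, 0.10):
    forget, retain = split_forget_retain(data, ratio)
    print(ratio, len(forget), len(retain))
```

Evaluating forgetting performance then means comparing model behavior on the forget set (which should degrade) against the retain set (which should stay intact).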
© 2025 AIbase