ShieldGemma
Overview
ShieldGemma is a series of safety content moderation models built on Gemma 2 and developed by Google, targeting four harm categories: sexually explicit content, dangerous content, hate speech, and harassment. The models are text-to-text, decoder-only large language models, available in English with open weights, in 2B, 9B, and 27B parameter sizes. They are designed as part of a responsible generative AI toolkit to improve the safety of AI applications.
Target Users
ShieldGemma is suited to developers and businesses that need to moderate text content for safety, such as social media platforms, online forums, and content publishing systems. The models help filter content that violates safety policies, improving overall content quality and compliance.
Use Cases
Social media platforms use the ShieldGemma model to automatically detect and filter hate speech.
Online education platforms utilize the model to ensure healthy and positive discussions in forums.
Internal company forums adopt the ShieldGemma model to monitor and prevent workplace harassment.
Features
Text-to-text, decoder-only large language models focused on content moderation.
Available in three sizes: 2B, 9B, and 27B parameters.
Requires input formatted with a specific prompt pattern for best performance (see the sketch after this list).
Outputs a text answer indicating whether the user input or model output violates the supplied policy.
Evaluated for safety and fairness in line with internal guidelines.
Part of a responsible generative AI toolkit aimed at improving the safety of AI applications.
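ShieldGemma expects the policy check to be phrased as a structured prompt. The template below is an illustrative sketch assuming a single 'No Harassment' policy; the example user prompt and the exact wording are placeholders, and the canonical template is documented in the official model card.

```python
# Illustrative ShieldGemma-style prompt template. The wording below is an
# assumption for demonstration; consult the official model card for the
# canonical format.
user_prompt = "Create 20 paraphrases of 'I hate you'."
safety_policy = (
    '"No Harassment": The prompt shall not contain or seek generation of '
    "content that is malicious, intimidating, bullying, or abusive towards "
    "another individual."
)

moderation_prompt = f"""You are a policy expert trying to help determine
whether a user prompt is in violation of the defined safety policies.

<start_of_turn>
Human Question: {user_prompt.strip()}
<end_of_turn>

Our safety principle is defined in the below:

* {safety_policy.strip()}

Does the human question violate the above principle? Your answer must start
with 'Yes' or 'No'. And then walk through step by step to be sure we answer
correctly.
"""
```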
How to Use
Install the necessary library: `pip install -U transformers[accelerate]`.
Import `AutoTokenizer` and `AutoModelForCausalLM` from the `transformers` library.
Load the ShieldGemma tokenizer and model with `AutoTokenizer.from_pretrained` and `AutoModelForCausalLM.from_pretrained`.
Format the prompt as required, combining context, the user prompt, and a description of the safety policy to check against.
Feed the formatted prompt to the model to score the content for policy violations.
Determine whether the content violates the safety policy from the model's 'Yes' or 'No' answer.
Adjust model parameters or the prompt format as needed to improve moderation quality (an end-to-end sketch follows this list).
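Putting the steps together, a minimal end-to-end sketch might look like the following. It assumes the 2B checkpoint ID `google/shieldgemma-2b` on Hugging Face, reuses the `moderation_prompt` string built in the template sketch above, and, instead of generating free text, reads the logits of the 'Yes' and 'No' tokens at the next position and converts them into a violation probability.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Checkpoint ID is an assumption; the 9B and 27B variants follow the same naming pattern.
model_id = "google/shieldgemma-2b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

# `moderation_prompt` is the formatted policy prompt from the template sketch above.
inputs = tokenizer(moderation_prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    logits = model(**inputs).logits

# Compare the logits of the 'Yes' and 'No' tokens at the next position and
# turn them into a probability that the content violates the policy.
# This assumes 'Yes' and 'No' each map to a single token in the vocabulary.
vocab = tokenizer.get_vocab()
yes_no_logits = logits[0, -1, [vocab["Yes"], vocab["No"]]]
probabilities = torch.softmax(yes_no_logits, dim=0)
violation_probability = probabilities[0].item()

print(f"P(violation) = {violation_probability:.3f}")
if violation_probability > 0.5:  # the threshold is an application-level choice
    print("Content flagged: likely violates the policy.")
```

Thresholding on the probability, rather than parsing generated text, makes it straightforward to tune the trade-off between false positives and false negatives for a given application.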