# Content Safety (Text)
Azure Content Safety is a content moderation service developed by Microsoft that helps users detect harmful content across different modalities and languages. This tool is a wrapper for the Azure Content Safety Text API, which allows you to detect harmful text content and get moderation results. See Azure Content Safety for more information.
## Requirements

For AzureML users, the tool is installed in the default image, so you can use it without extra installation.

For local users:

```
pip install promptflow-tools
```
> [!NOTE]
> The Content Safety (Text) tool is now incorporated into the latest `promptflow-tools` package. If you have previously installed the `promptflow-contentsafety` package, please uninstall it to avoid duplication in your local tool list.
## Prerequisites

1. Create an Azure Content Safety resource.
2. Add an “Azure Content Safety” connection in prompt flow. Fill the “API key” field with the “Primary key” from the “Keys and Endpoint” section of the created resource.
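If you want to sanity-check the key and endpoint before creating the connection, a direct call to the Azure Content Safety Text API works. The sketch below is a minimal example, assuming the `2023-10-01` API version; the endpoint and key values are placeholders.

```python
import requests

# Placeholder values -- replace with your resource's endpoint and primary key.
ENDPOINT = "https://<your-content-safety-resource>.cognitiveservices.azure.com"
API_KEY = "<your-primary-key>"

response = requests.post(
    f"{ENDPOINT}/contentsafety/text:analyze",
    params={"api-version": "2023-10-01"},
    headers={"Ocp-Apim-Subscription-Key": API_KEY, "Content-Type": "application/json"},
    json={"text": "Hello world"},
)
response.raise_for_status()
# A valid key and endpoint return per-category severity scores for the text.
print(response.json())
```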
## Inputs
You can use the following parameters as inputs for this tool:
| Name | Type | Description | Required |
| ---- | ---- | ----------- | -------- |
| text | string | The text that needs to be moderated. | Yes |
| hate_category | string | The moderation sensitivity for the Hate category. You can choose from four options: disable, low_sensitivity, medium_sensitivity, or high_sensitivity. The disable option means no moderation for the Hate category, while the other three options apply increasing degrees of strictness in filtering out hate content. The default option is medium_sensitivity. | Yes |
| sexual_category | string | The moderation sensitivity for the Sexual category. You can choose from four options: disable, low_sensitivity, medium_sensitivity, or high_sensitivity. The disable option means no moderation for the Sexual category, while the other three options apply increasing degrees of strictness in filtering out sexual content. The default option is medium_sensitivity. | Yes |
| self_harm_category | string | The moderation sensitivity for the Self-harm category. You can choose from four options: disable, low_sensitivity, medium_sensitivity, or high_sensitivity. The disable option means no moderation for the Self-harm category, while the other three options apply increasing degrees of strictness in filtering out self-harm content. The default option is medium_sensitivity. | Yes |
| violence_category | string | The moderation sensitivity for the Violence category. You can choose from four options: disable, low_sensitivity, medium_sensitivity, or high_sensitivity. The disable option means no moderation for the Violence category, while the other three options apply increasing degrees of strictness in filtering out violence content. The default option is medium_sensitivity. | Yes |
For more information, please refer to Azure Content Safety.
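As an illustration of how these inputs fit together, the sketch below shows a hypothetical direct call to the tool from Python. The import paths, function name `analyze_text`, and connection class are assumptions made for illustration, not a documented API; in a flow you would normally configure these inputs on the tool node instead.

```python
# Hypothetical usage sketch -- the import paths, function name, and connection
# class below are assumptions; check the promptflow-tools package for the real API.
from promptflow.connections import AzureContentSafetyConnection
from promptflow.tools.azure_content_safety import analyze_text

connection = AzureContentSafetyConnection(
    api_key="<your-primary-key>",
    endpoint="https://<your-content-safety-resource>.cognitiveservices.azure.com",
)

result = analyze_text(
    connection=connection,
    text="Text to be moderated.",
    hate_category="medium_sensitivity",
    sexual_category="medium_sensitivity",
    self_harm_category="medium_sensitivity",
    violence_category="high_sensitivity",
)
print(result)  # e.g. {"action_by_category": {...}, "suggested_action": "Accept"}
```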
## Outputs
The following is an example of the JSON response returned by the tool:
```json
{
  "action_by_category": {
    "Hate": "Accept",
    "SelfHarm": "Accept",
    "Sexual": "Accept",
    "Violence": "Accept"
  },
  "suggested_action": "Accept"
}
```
The `action_by_category` field gives you a binary value for each category: Accept or Reject. This value shows if the text meets the sensitivity level that you set in the request parameters for that category.

The `suggested_action` field gives you an overall recommendation based on the four categories. If any category has a Reject value, the `suggested_action` will be Reject as well.
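A downstream node or script can branch on `suggested_action` to decide whether to keep processing the text. The snippet below is a small illustration; `moderation_result` stands in for the dictionary returned by this tool.

```python
# `moderation_result` stands in for the dictionary returned by the Content Safety (Text) tool.
moderation_result = {
    "action_by_category": {
        "Hate": "Accept",
        "SelfHarm": "Accept",
        "Sexual": "Accept",
        "Violence": "Reject",
    },
    "suggested_action": "Reject",
}

if moderation_result["suggested_action"] == "Reject":
    rejected = [
        category
        for category, action in moderation_result["action_by_category"].items()
        if action == "Reject"
    ]
    print(f"Text rejected; flagged categories: {rejected}")
else:
    print("Text accepted; safe to pass to downstream nodes.")
```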