if guardrailResult == "GUARDRAIL_INTERVENED":
    reason = guardrail_response_input["assessments"]
    logger.warning(f"Guardrail intervention: {reason}")
    return guardrail_response_input["outputs"][0]["text"]

If the input passes the safety check, process it with the SageMaker endpoint and then check the output:

else:
    logger.info("Input passed guardrail check")

    # Format input for the model (simple chat-style template; adjust to the
    # prompt format that your deployed model expects)
    endpoint_input = 'user\n\n' + input_text + 'assistant\n\n'

    try:
        # Set up SageMaker predictor
        predictor = sagemaker.predictor.Predictor(
            endpoint_name=endpoint_name,
            sagemaker_session=session,
            serializer=sagemaker.serializers.JSONSerializer(),
            deserializer=sagemaker.deserializers.JSONDeserializer()
        )

        # Get model response
        payload = {
            "inputs": endpoint_input,
            "parameters": {
                "max_new_tokens": 256,
                "top_p": 0.9,
                "temperature": 0.6
            }
        }
        endpoint_response = predictor.predict(payload)
        text_endpoint_output = endpoint_response["generated_text"]

        # Check output against guardrails
        guardrail_response_output = bedrock_runtime.apply_guardrail(
            guardrailIdentifier=guardrail_id,
            guardrailVersion=guardrail_version,
            source='OUTPUT',
            content=[{'text': {'text': text_endpoint_output}}]
        )
        guardrailResult_output = guardrail_response_output["action"]

        if guardrailResult_output == "GUARDRAIL_INTERVENED":
            reason = guardrail_response_output["assessments"]
            logger.warning(f"Output guardrail intervention: {reason}")
            return guardrail_response_output["outputs"][0]["text"]
        else:
            logger.info("Output passed guardrail check")
            return text_endpoint_output

    except ClientError as e:
        logger.error(f"AWS API error: {str(e)}")
        raise
    except Exception as e:
        logger.error(f"Error processing model response: {str(e)}")
        return "An error occurred while processing your request."

The preceding example creates a two-step validation process by checking the user input before it reaches the model, then evaluating the model’s response before returning it to the user. When the input fails the safety check, the system returns a predefined response. Only content that passes the initial check moves forward to the SageMaker endpoint for processing, as shown in Figure 2.

Figure 2: Implementation flow using the ApplyGuardrail API

This dual-validation approach helps to verify that interactions with your AI application meet your safety standards and comply with your organization’s policies. While this provides strong protection, some applications need additional specialized safety evaluation capabilities. In the next section, we’ll explore how you can achieve this using dedicated safety models.
Using foundation models as external guardrails
Building on the previous safety layers, you can add foundation models designed specifically for content evaluation. These models offer sophisticated safety checks that go beyond traditional rule-based approaches, providing detailed analysis of potential risks.
Foundation models for safety evaluation
Several foundation models are specifically trained for content safety evaluation. For this post, we use Llama Guard as an example. You can deploy models such as Llama Guard alongside your primary LLM. Llama Guard is itself an LLM: it generates text output that indicates whether a given prompt or response is safe or unsafe and, if unsafe, lists the content categories that were violated.
Llama Guard 3 is trained to predict safety labels for 14 categories based on the MLCommons taxonomy of 13 hazards, plus an additional category for code interpreter abuse in tool-call use cases. The 14 categories are: S1: Violent Crimes, S2: Non-Violent Crimes, S3: Sex-Related Crimes, S4: Child Sexual Exploitation, S5: Defamation, S6: Specialized Advice, S7: Privacy, S8: Intellectual Property, S9: Indiscriminate Weapons, S10: Hate, S11: Suicide & Self-Harm, S12: Sexual Content, S13: Elections, and S14: Code Interpreter Abuse.
Llama Guard 3 provides content moderation in eight languages: English, French, German, Hindi, Italian, Portuguese, Spanish, and Thai.
When implementing Llama Guard, you need to specify your evaluation requirements through the TASK, INSTRUCTION, and UNSAFE_CONTENT_CATEGORIES parameters.

TASK: The type of evaluation to perform
INSTRUCTION: Specific guidance for the evaluation
UNSAFE_CONTENT_CATEGORIES: Which hazard categories to check

You can use these parameters to specify which hazard categories to monitor based on your use case. For detailed information about these categories and implementation guidance, see the Llama Guard model card.
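The following is a minimal sketch of how an application could assemble these parameters into a Llama Guard prompt and interpret the result. The task text, instruction text, category excerpt, and helper names (build_guard_prompt, is_safe) are illustrative assumptions rather than the official template; check the model card for the exact format that your Llama Guard version expects.

# Illustrative values only; the authoritative prompt format is defined in the
# Llama Guard model card for the version you deploy.
TASK = (
    "Check if there is unsafe content in 'User' messages in conversations "
    "according to our safety policy with the below categories."
)
UNSAFE_CONTENT_CATEGORIES = """S1: Violent Crimes.
S2: Non-Violent Crimes.
S10: Hate."""  # excerpt; include the categories relevant to your use case
INSTRUCTION = (
    "Provide your safety assessment for the last User message. "
    "Answer 'safe' or 'unsafe'; if unsafe, list the violated categories."
)

def build_guard_prompt(user_message: str) -> str:
    """Compose a single evaluation prompt from the three parameters."""
    return (
        f"Task: {TASK}\n\n"
        f"<BEGIN UNSAFE CONTENT CATEGORIES>\n{UNSAFE_CONTENT_CATEGORIES}\n"
        f"<END UNSAFE CONTENT CATEGORIES>\n\n"
        f"<BEGIN CONVERSATION>\n\nUser: {user_message}\n\n<END CONVERSATION>\n\n"
        f"{INSTRUCTION}"
    )

def is_safe(guard_output: str) -> bool:
    """Treat any response that doesn't start with 'safe' as a violation."""
    return guard_output.strip().lower().startswith("safe")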
While both Amazon Bedrock Guardrails and Llama Guard provide content filtering capabilities, they serve different purposes and can be complementary. Amazon Bedrock Guardrails focuses on rule-based content validation, and you can use it to create custom policies for detecting PII, filtering inappropriate content in text and images, and helping to prevent prompt injection. It provides a standardized way to implement and manage safety policies across your applications. Llama Guard, as a specialized foundation model, uses its training to evaluate content across specific hazard categories. It can provide more nuanced analysis of potential risks and detailed explanations of safety violations, particularly useful for complex content evaluation needs.
Implementation options with SageMaker
When implementing external safety models with SageMaker, you have two deployment options:

You can deploy separate SageMaker endpoints for each model by using SageMaker JumpStart for quick model deployment or by setting up the model configuration and importing the model from Hugging Face.
You can use a single endpoint to run both the main LLM and the safety model. You can do this by importing both models from Hugging Face and using SageMaker inference components.

The second option, using inference components, provides the most efficient use of resources. Inference components are SageMaker AI hosting objects that you can use to deploy a model to an endpoint. In the inference component settings, you specify the model, the endpoint, and how the model uses the resources that the endpoint hosts. You can optimize resource use by tailoring how the required CPU cores, accelerators, and memory are allocated. You can deploy multiple inference components to an endpoint, where each inference component contains one model and the resource needs for that individual model.
After you deploy an inference component, you can directly invoke the associated model when you use the InvokeEndpoint API action. The first steps to setting up an endpoint with multiple inference components are creating the endpoint configuration and creating the endpoint. The following is an example of this:

# Create the endpoint configuration
endpoint_name = sagemaker.utils.name_from_base("<base-name>")  # for example, a base name for your endpoint
endpoint_config_name = f"{endpoint_name}-config"

sm_client.create_endpoint_config(
    EndpointConfigName=endpoint_config_name,
    ExecutionRoleArn="<execution-role-arn>",
    ProductionVariants=[
        {
            "VariantName": "AllTraffic",
            "InstanceType": "<instance-type>",  # an accelerated instance sized for both models
            "InitialInstanceCount": 1,  # example value; size for your workload
            "ModelDataDownloadTimeoutInSeconds": 3600,  # example value
            "ContainerStartupHealthCheckTimeoutInSeconds": 600,  # example value
            "ManagedInstanceScaling": {
                "Status": "ENABLED",
                "MinInstanceCount": 1,  # example value
                "MaxInstanceCount": 2,  # example value
            },
            "RoutingConfig": {"RoutingStrategy": "LEAST_OUTSTANDING_REQUESTS"},
        }
    ],
)

# Create the endpoint by providing the configuration that we just specified
create_endpoint_response = sm_client.create_endpoint(
    EndpointName=endpoint_name, EndpointConfigName=endpoint_config_name
)

The next step is to create the two inference components. Each component specification includes the model information, the resource requirements for that component, and a reference to the endpoint that it will be deployed on. The following is an example of such components:

# Create Llama Guard component (AWQ quantized version)
create_model_response = sm_client.create_model(
    ModelName="<guard-model-name>",
    ExecutionRoleArn="<execution-role-arn>",
    PrimaryContainer={
        "Image": inference_image_uri,
        "Environment": env_guardllm,  # environment variables for this model
    },
)

sm_client.create_inference_component(
    InferenceComponentName="<guard-component-name>",
    EndpointName=endpoint_name,
    VariantName="AllTraffic",
    Specification={
        "ModelName": "<guard-model-name>",
        "StartupParameters": {
            "ModelDataDownloadTimeoutInSeconds": 3600,  # example value
            "ContainerStartupHealthCheckTimeoutInSeconds": 600,  # example value
        },
        "ComputeResourceRequirements": {
            "MinMemoryRequiredInMb": 1024,  # example value; size for the model
            "NumberOfAcceleratorDevicesRequired": 1,  # example value
        },
    },
    RuntimeConfig={
        "CopyCount": 1,  # example value
    },
)

# Create second inference component for the main model
create_model_response = sm_client.create_model(
    ModelName="<main-model-name>",
    ExecutionRoleArn="<execution-role-arn>",
    PrimaryContainer={
        "Image": inference_image_uri,
        "Environment": env_mainllm,  # environment variables for this model
    },
)

sm_client.create_inference_component(
    InferenceComponentName="<main-component-name>",
    EndpointName=endpoint_name,
    VariantName=variant_name,
    Specification={
        "ModelName": "<main-model-name>",
        "StartupParameters": {
            "ModelDataDownloadTimeoutInSeconds": 3600,  # example value
            "ContainerStartupHealthCheckTimeoutInSeconds": 600,  # example value
        },
        "ComputeResourceRequirements": {
            "MinMemoryRequiredInMb": 1024,  # example value; size for the model
            "NumberOfAcceleratorDevicesRequired": 1,  # example value
        },
    },
    RuntimeConfig={
        "CopyCount": initial_copy_count,
    },
)
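
After both components are in service, you can invoke each model individually by passing its inference component name to the InvokeEndpoint action. The following sketch assumes hypothetical component names and a container that returns a generated_text field (as in the earlier Predictor example); adjust the payload and response parsing to match your serving container.

import json
import boto3

smr_client = boto3.client("sagemaker-runtime")

def invoke_component(component_name: str, prompt: str) -> str:
    """Invoke a single inference component on the shared endpoint."""
    response = smr_client.invoke_endpoint(
        EndpointName=endpoint_name,              # the endpoint created earlier
        InferenceComponentName=component_name,   # for example, the guard or main LLM component
        ContentType="application/json",
        Body=json.dumps({"inputs": prompt, "parameters": {"max_new_tokens": 256}}),
    )
    # Response shape depends on the serving container; this assumes a
    # generated_text field, as in the earlier example.
    return json.loads(response["Body"].read())["generated_text"]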

The complete implementation code and detailed instructions are available in the AWS samples repository.
Safety evaluation workflow
Using SageMaker inference components, you can create an architectural pattern with your safety model as a checkpoint before and after your main model processes requests. The workflow operates as follows:

A user sends a request to your application.
Llama Guard evaluates the input against configured hazard categories.
If Llama Guard considers the input safe, the request proceeds to your main model.
The model’s response undergoes another Llama Guard evaluation.
Safe responses are returned to the user. If a guardrail intervenes, the application can return a predefined message to the user instead.

This dual-validation approach helps to verify that both inputs and outputs meet your safety requirements. The workflow is shown in Figure 3, and a minimal orchestration sketch follows the figure:

Figure 3: Dual-validation workflow
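
The following sketch shows how an application could wire this workflow together using the invoke_component, build_guard_prompt, and is_safe helpers from the earlier sketches. The component names and refusal messages are illustrative assumptions.

def answer_with_guardrails(user_message: str) -> str:
    """Illustrative orchestration of the dual-validation workflow in Figure 3."""
    # 1. Screen the user input with the Llama Guard component
    if not is_safe(invoke_component("llamaguard-component", build_guard_prompt(user_message))):
        return "Sorry, I can't help with that request."

    # 2. Generate a response with the main model component
    answer = invoke_component("main-llm-component", user_message)

    # 3. Screen the generated response before returning it
    if not is_safe(invoke_component("llamaguard-component", build_guard_prompt(answer))):
        return "Sorry, I can't share the generated response."

    return answer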

While this architecture provides robust protection, it’s important to understand the characteristics and limitations of the external safety model you choose. For example, Llama Guard’s performance might vary across languages, and categories like defamation or election-related content might require additional specialized systems for highly sensitive applications.
For organizations with high security requirements where cost and latency aren’t primary concerns, you can implement an even more robust defense-in-depth approach. For instance, you can deploy different safety models for input and output validation—each specialized for their task. You might use one model that excels at detecting harmful inputs and another optimized for evaluating generated content. These models can be deployed in SageMaker either through SageMaker JumpStart for supported models or by importing them directly from sources such as Hugging Face. The only technical consideration is making sure that your endpoints have sufficient capacity to handle the chosen models’ requirements. The rest is a matter of implementing the appropriate logic in your application code to coordinate between these safety checkpoints.
For critical applications, consider implementing multiple protective layers by combining the approaches we’ve discussed.
Extending protection with third-party guardrails
While AWS provides comprehensive safety features through built-in safeguards, Amazon Bedrock Guardrails, and support for safety-focused foundation models, some applications require additional specialized protection. Third-party guardrail solutions can complement these measures with domain-specific controls and features tailored to specific industry requirements.
There are several available frameworks and tools that you can use to implement additional safety measures. Guardrails AI, for example, provides a framework based on the Reliable AI Markup Language (RAIL) specification, which you can use to define custom validation rules and safety checks declaratively. Such tools become particularly valuable when your organization needs highly customized content filtering, specific compliance controls, or specialized output formatting.
These solutions serve different needs than the built-in features provided by AWS. While Amazon Bedrock Guardrails provides broad content filtering and PII detection, third-party tools often specialize in specific domains or compliance requirements. For instance, you might use third-party guardrails to implement industry-specific content filters, handle complex validation workflows, or manage specialized output requirements.
Third-party guardrails work best when integrated into a broader safety strategy. Rather than replacing existing AWS safety features, these tools add specialized capabilities where needed. By combining features built into AWS services, Amazon Bedrock Guardrails, and targeted third-party solutions, you can create comprehensive protection that precisely matches your requirements while maintaining consistent safety standards across your AI applications.
Conclusion
In this post, you’ve seen comprehensive approaches to implementing safety guardrails for AI applications using Amazon SageMaker. Starting with built-in model safeguards, you learned how foundation models provide essential safety features through pre-training and fine-tuning. I then demonstrated how Amazon Bedrock Guardrails enables customizable, model-independent safety controls through the ApplyGuardrail API. Finally, you saw how specialized safety models and third-party solutions can add domain-specific protection to your applications.
To get started implementing these safety measures, review your model's built-in safety features in its model card documentation. Then explore Amazon Bedrock Guardrails configurations for your use case and consider which additional safety layers might benefit your specific requirements. Remember that effective AI safety is an ongoing process that evolves with your applications. Regular monitoring and updates help to verify that your safety measures remain effective as both AI capabilities and safety challenges advance.
If you have feedback about this post, submit comments in the Comments section below.

Laura Verghote
Laura is a Senior Solutions Architect for public sector customers in the EMEA region. She works with customers to design and build solutions in the AWS Cloud, bridging the gap between complex business requirements and technical solutions. She joined AWS as a technical trainer and has wide experience delivering training content to developers, administrators, architects, and partners across EMEA.