Messages API

Generate conversational AI responses with AWS Bedrock foundation models—including Claude, Nova, Llama, and more—through an Anthropic-compatible Messages API interface.

Route Prefix

By default, all Anthropic-compatible routes are prefixed with /anthropic. This means the Messages API is available at /anthropic/v1/messages instead of /v1/messages. You can customize this prefix using the ANTHROPIC_ROUTES_PREFIX configuration variable documented in Operations Configuration.
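For example, a client can derive the full endpoint by joining its base URL with the configured prefix. This is an illustrative sketch (the base URL is a placeholder; only the `/anthropic` default comes from this page):

```python
import os

# Build the full Messages API URL from the configurable route prefix.
# ANTHROPIC_ROUTES_PREFIX defaults to "/anthropic".
def messages_url(base: str) -> str:
    prefix = os.environ.get("ANTHROPIC_ROUTES_PREFIX", "/anthropic")
    return f"{base.rstrip('/')}{prefix}/v1/messages"

print(messages_url("https://api.example.com"))
# -> https://api.example.com/anthropic/v1/messages
```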

Why Choose Messages API?

  • Multiple Models
    Access models from Anthropic, Amazon, Meta, and more through one API. Choose the best model for your task without vendor lock-in.

  • Multi-Modal
    Process text, images, videos, and documents together. Support for URLs, data URIs, and direct S3 references.

  • Built-In Safety
    AWS Bedrock Guardrails provide content filtering and safety policies.

  • AWS Scale & Reliability
    Run on AWS infrastructure with service tiers for optimized latency. Multi-region model access for availability and performance.

Quick Start: Available Endpoints

Endpoint Method What It Does Powered By
/v1/messages POST Conversational AI with multi-modal support AWS Bedrock Converse API
/v1/messages/count_tokens POST Count tokens in a message without sending AWS Bedrock CountTokens API
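The token-counting endpoint accepts the same payload shape as /v1/messages, except that no generation parameters such as max_tokens are needed since nothing is generated. A sketch of such a request body (the model ID is just an example):

```python
import json

# Minimal count_tokens payload: same shape as a Messages request,
# but without max_tokens (nothing is generated).
payload = {
    "model": "amazon.nova-micro-v1:0",
    "messages": [{"role": "user", "content": "Say hello world"}],
}

# POST this JSON to $BASE/anthropic/v1/messages/count_tokens with the
# same authentication headers as a regular Messages call.
print(json.dumps(payload))
```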

Feature Compatibility

Feature Status Notes
Messages & Roles
Text messages Full support for all text content
Image input (image) HTTP URLs, data URIs, base64
Document input (document) PDF (base64/URL), plain text, content blocks
Document citations Citation locations in responses (PDF only on some models)
Search result input (search_result) Pass search results as context
System messages System prompts
Image & Document input from S3 S3 URLs
Tool Calling
Tool use (tools) Full Anthropic-compatible schema
Tool choice (auto, any, tool) Control tool selection behavior
Tool choice none Remove tools from request instead
Parallel tool calls Multiple tools in one turn
Web search tool (web_search) Available on models with system tool support (e.g., Amazon Nova 2)
Claude server tools Bash, text editor, computer use (Claude 3.5+), memory (Claude 3.7-4.5)
Generation Control
max_tokens Output length limits (required)
temperature Mapped to Bedrock inference params
top_p Nucleus sampling control
top_k Top-k sampling control
stop_sequences Custom stop strings
Thinking
Prompt caching Cache prompts to reduce costs and latency
Extra model-specific params Extra model-specific parameters not supported by the Anthropic API
Streaming & Output
Text Text messages
Streaming (stream: true) Server-Sent Events (SSE)
Thinking content Extended thinking output in content blocks
Usage tracking
Input text tokens Billing unit
Output tokens Billing unit
Cache creation tokens Prompt caching metrics (streaming and non-streaming)
Cache read tokens Prompt caching metrics
Other
Metadata Logged
Bedrock Guardrails Content safety policies
Service tiers Mapped to Bedrock service tiers and latency options

Legend:

  • Supported — Fully compatible with Anthropic API
  • Available on Select Models — Check your model's capabilities
  • Partial — Supported with limitations
  • Unsupported — Not available in this implementation
  • Extra Feature — Enhanced capability beyond Anthropic API

Model Support

All models available through the AWS Bedrock Converse and ConverseStream APIs are supported.

Claude Model Name Aliases

This API supports dynamic model name aliases matching the official Anthropic API. You can use Claude model names exactly as they appear in Anthropic's documentation, and they will be automatically resolved to the corresponding AWS Bedrock model identifiers.

Examples:

  • claude-opus-4-6 → anthropic.claude-opus-4-6-v1
  • claude-sonnet-4-6 → anthropic.claude-sonnet-4-6
  • claude-haiku-4-5-20251001 → anthropic.claude-haiku-4-5-20251001-v1:0

Aliases for non-Anthropic models are supported in the same way.
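Conceptually, resolution is a lookup from Anthropic-style names to Bedrock identifiers, with unknown names passed through unchanged. The table below contains only the aliases listed above and is purely illustrative; the real resolution is dynamic:

```python
# Illustrative alias table built from the examples above; the actual
# resolution covers all supported Claude model names.
ALIASES = {
    "claude-opus-4-6": "anthropic.claude-opus-4-6-v1",
    "claude-sonnet-4-6": "anthropic.claude-sonnet-4-6",
    "claude-haiku-4-5-20251001": "anthropic.claude-haiku-4-5-20251001-v1:0",
}

def resolve_model(name: str) -> str:
    # Names that are already Bedrock IDs pass through unchanged.
    return ALIASES.get(name, name)

print(resolve_model("claude-opus-4-6"))  # -> anthropic.claude-opus-4-6-v1
print(resolve_model("amazon.nova-micro-v1:0"))  # passes through
```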

Advanced Features

Prompt Caching

Reduce costs and improve response times by caching frequently-used prompt components across multiple requests. This feature is particularly effective for applications with consistent system prompts, tool definitions, or conversation contexts.

Supported Models:

  • Anthropic Claude: Full support for system, messages, and tools caching
  • Amazon Nova: Support for system and messages caching

Documentation

See AWS Bedrock Prompt Caching - Supported Models for the complete list of models supporting prompt caching.

Cache Creation Costs

Cache creation incurs a higher cost than regular token processing. Only use prompt caching when you expect a high cache hit ratio across multiple requests with similar prompts.
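As a back-of-envelope illustration of this trade-off, suppose cache writes cost a multiple above the regular token rate and cache reads a fraction of it. The 1.25x and 0.1x multipliers below are assumptions for the sketch, not published Bedrock prices:

```python
# Rough cache economics under assumed billing multipliers.
WRITE_MULT = 1.25  # assumed multiplier for cache_creation_input_tokens
READ_MULT = 0.10   # assumed multiplier for cache_read_input_tokens

def cached_cost(prefix_tokens: int, requests: int) -> float:
    # One cache write on the first request, then cheap reads.
    return prefix_tokens * (WRITE_MULT + (requests - 1) * READ_MULT)

def uncached_cost(prefix_tokens: int, requests: int) -> float:
    return prefix_tokens * requests

# A single request is more expensive with caching; under these
# multipliers it becomes cheaper from the second request onward.
print(cached_cost(1200, 1), uncached_cost(1200, 1))
print(cached_cost(1200, 2), uncached_cost(1200, 2))
```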

How to Use:

Add cache_control blocks to the content you want to cache:

curl -X POST "$BASE/v1/messages" \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic.claude-opus-4-6-v1",
    "max_tokens": 1024,
    "system": [
      {
        "type": "text",
        "text": "You are a helpful assistant with extensive knowledge...",
        "cache_control": {"type": "ephemeral"}
      }
    ],
    "messages": [
      {"role": "user", "content": "What is 2 + 2?"}
    ]
  }'

Granular Cache Control:

Enable caching for specific sections by adding cache_control blocks:

  • System messages: Add to system text blocks
  • Messages: Add to the last message content block you want cached
  • Tools: Add to the last tool definition you want cached (Anthropic Claude only)
{
  "model": "anthropic.claude-opus-4-6-v1",
  "max_tokens": 1024,
  "system": [
    {
      "type": "text",
      "text": "System instructions...",
      "cache_control": {"type": "ephemeral"}
    }
  ],
  "tools": [
    {
      "name": "get_weather",
      "description": "Get weather data",
      "input_schema": {...},
      "cache_control": {"type": "ephemeral"}
    }
  ],
  "messages": [...]
}

Benefits:

  • Cost Reduction: Cached tokens are billed at a lower rate than regular input tokens
  • Lower Latency: Cached prompts eliminate reprocessing time
  • Automatic Management: The API handles cache invalidation and updates

Usage Tracking:

Cached token usage is reported in the response:

{
  "usage": {
    "input_tokens": 300,
    "cache_creation_input_tokens": 1200,
    "cache_read_input_tokens": 0,
    "output_tokens": 100
  }
}

In subsequent requests with cache hits:

{
  "usage": {
    "input_tokens": 300,
    "cache_creation_input_tokens": 0,
    "cache_read_input_tokens": 1200,
    "output_tokens": 100
  }
}
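These usage fields can be folded into a single effective input-token figure for monitoring. The multipliers here are the same illustrative assumptions as above, not published prices:

```python
# Fold cache metrics into one effective input-token figure,
# using assumed (illustrative) billing multipliers.
def effective_input_tokens(usage: dict,
                           write_mult: float = 1.25,
                           read_mult: float = 0.10) -> float:
    return (usage["input_tokens"]
            + usage["cache_creation_input_tokens"] * write_mult
            + usage["cache_read_input_tokens"] * read_mult)

first = {"input_tokens": 300, "cache_creation_input_tokens": 1200,
         "cache_read_input_tokens": 0, "output_tokens": 100}
hit = {"input_tokens": 300, "cache_creation_input_tokens": 0,
       "cache_read_input_tokens": 1200, "output_tokens": 100}

print(effective_input_tokens(first))  # 300 + 1200*1.25 = 1800.0
print(effective_input_tokens(hit))    # 300 + 1200*0.10, about 420
```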

System Prompt

System prompts define the AI assistant's behavior, personality, and instructions (e.g., "You are a helpful assistant"). Most models support system prompts.

Unsupported Models

Some models don't support system prompts (e.g., mistral.mistral-7b-instruct-v0:2, mistral.mixtral-8x7b-instruct-v0:1). By default, stdapi.ai silently drops system messages for these models to preserve cross-model compatibility. To receive an error instead, set DROP_UNSUPPORTED_SYSTEM_PROMPT=false.

S3 Image Support

Access images directly from your S3 buckets without generating pre-signed URLs or downloading files locally.

Supported Formats:

  • Images: JPEG, PNG, GIF, WebP

How to Use:

Simply reference your S3 images using the s3:// URI scheme in image source fields:

{
  "model": "anthropic.claude-opus-4-6-v1",
  "max_tokens": 1024,
  "messages": [
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "Describe this image"},
        {
          "type": "image",
          "source": {
            "type": "url",
            "url": "s3://my-bucket/images/photo.jpg"
          }
        }
      ]
    }
  ]
}

IAM Permissions Required

Your API service must have IAM permissions to read from the specified S3 buckets. S3 objects must be in the same AWS region as the executed model or accessible via your IAM role. Standard S3 data transfer and request costs apply.
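A minimal IAM policy granting that read access could look like the following sketch (the bucket name and key prefix are placeholders):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": ["arn:aws:s3:::my-bucket/images/*"]
    }
  ]
}
```

Attach this to the role the API service runs under; scope the resource ARN as tightly as your layout allows.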

Benefits:

  • No pre-signed URLs - Direct S3 access without generating temporary URLs
  • Security - Images stay in your AWS account with IAM-controlled access
  • Performance - Optimized data transfer within AWS infrastructure
  • Large images - No size limitations of data URIs or base64 encoding

Document Input

Send documents as context for the model to analyze and reference. Supports multiple source types:

  • Base64 PDF: Inline PDF documents encoded in base64
  • URL PDF: PDF documents fetched from HTTP(S) URLs (downloaded server-side)
  • Plain text: Raw text content as documents
  • Content blocks: Structured content with text and images

Enable citations on document blocks to get precise source references in responses:

{
  "type": "document",
  "source": {
    "type": "text",
    "media_type": "text/plain",
    "data": "The capital of France is Paris."
  },
  "title": "Geography",
  "citations": {"enabled": true}
}

Citation Support

Citation support varies by model and document format. PDF documents generally have the best citation support across models.

Server Tools

Server tools are built-in capabilities that foundation models can use directly without requiring you to implement backend integrations. Different model providers support different server tools through their native tool formats.

Web Search Tool

The Anthropic web_search tool is supported on models that declare web search as a system tool. When you include a web_search tool in your request, it is automatically mapped to the model's native system tool (e.g., nova_grounding for Amazon Nova 2 models).

Supported Models:

  • Amazon Nova 2 (e.g., amazon.nova-premier-v1:0): Mapped to nova_grounding

Usage:

curl -X POST "$BASE/v1/messages" \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "amazon.nova-premier-v1:0",
    "max_tokens": 2048,
    "messages": [
      {"role": "user", "content": "What are the latest news today?"}
    ],
    "tools": [
      {"type": "web_search_20250305", "name": "web_search"}
    ]
  }'

Model Compatibility

Requesting web_search on a model that does not support it will return a 400 Bad Request error.

Claude Server Tools

Anthropic Claude models support server-side tools that are executed by the model provider. These tools are passed through to Bedrock via additionalModelRequestFields in their native Anthropic JSON format.

Supported Tools by Model:

Tool                               Claude 3.5 Sonnet v2   Claude 3.7+
bash                               Yes                    Yes
text_editor (str_replace_editor)   Yes                    Yes
computer                           Yes                    Yes
memory                             No                     Yes

Usage:

curl -X POST "$BASE/v1/messages" \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: computer-use-2025-01-24" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic.claude-sonnet-4-5-20250929-v1:0",
    "max_tokens": 4096,
    "messages": [
      {"role": "user", "content": "Run a Python script that prints hello world."}
    ],
    "tools": [
      {"type": "bash_20250124", "name": "bash"},
      {"type": "text_editor_20250124", "name": "str_replace_editor"}
    ]
  }'

Beta Headers

Claude server tools require specific anthropic-beta flags on Bedrock. These flags are automatically injected when the corresponding server tools are included in the request — no manual header required:

  • bash, text_editor, computer → computer-use-2024-10-22 (Claude 3.5) or computer-use-2025-01-24 (Claude 3.7+)
  • memory → context-management-2025-06-27 (Claude 3.7-4.5)

You can still pass additional anthropic-beta flags via the HTTP header or request body for non-tool beta features (e.g., output-128k-2025-02-19).

Model Compatibility

Requesting a server tool on a model that does not support it will return a 400 Bad Request error. Non-Claude models do not support these tools.

Unsupported Anthropic Server Tools

The following Anthropic server tools are not supported via Bedrock:

  • code_execution — Code execution sandbox
  • web_search — Web search (only available on Nova Premier via nova_grounding)
  • web_fetch — Web page fetching
  • tool_search — Tool search
  • container_upload — Container file upload

Requests using these tools on any Claude model will return a 400 Bad Request error.

Provider-Specific Parameters

Unlock advanced model capabilities by passing provider-specific parameters directly in your requests. These parameters are forwarded to AWS Bedrock and allow you to access features unique to each foundation model provider.

Documentation

See Bedrock Model Parameters for the complete list of available parameters per model.

How It Works:

Add provider-specific fields at the top level of your request body alongside standard Anthropic parameters. The API automatically forwards these to the appropriate model provider via AWS Bedrock.

Configuration Options:

Option 1: Per-Request

Add provider-specific parameters directly in your request body.
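For instance, a Claude-specific field such as anthropic_beta can ride along at the top level of the request. Treat this as a sketch of the mechanism, not an exhaustive parameter list:

```json
{
  "model": "anthropic.claude-sonnet-4-5-20250929-v1:0",
  "max_tokens": 1024,
  "anthropic_beta": ["extended-thinking-2024-12-12"],
  "messages": [{"role": "user", "content": "Hello"}]
}
```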

Option 2: Server-Wide Defaults

Configure default parameters for specific models via the DEFAULT_MODEL_PARAMS environment variable:

export DEFAULT_MODEL_PARAMS='{
  "anthropic.claude-sonnet-4-5-20250929-v1:0": {
    "anthropic_beta": ["extended-thinking-2024-12-12"]
  }
}'

Parameter Priority

Per-request parameters override server-wide defaults.

Behavior:

  • Compatible parameters: Forwarded to the model and applied
  • ⚠️ Unsupported parameters: Return HTTP 400 with an error message

Anthropic Claude Features

Enable cutting-edge Claude capabilities including extended thinking and reasoning.

Extended Thinking

Enable extended thinking by passing the anthropic-beta header, just like the official Anthropic API:

curl -X POST "$BASE/v1/messages" \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: extended-thinking-2024-12-12" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic.claude-sonnet-4-5-20250929-v1:0",
    "max_tokens": 1024,
    "messages": [{"role":"user","content":"Solve a complex problem"}]
  }'

Response with Thinking:

When extended thinking is enabled, the response includes thinking content blocks:

{
  "id": "msg_abc123",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "thinking",
      "thinking": "Let me think about this step by step..."
    },
    {
      "type": "text",
      "text": "Here's the solution..."
    }
  ],
  "usage": {...}
}
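Clients that want to show reasoning separately from the final answer can filter on the block type. This sketch operates on the response shape shown above:

```python
# Split a Messages API response into thinking traces and visible text.
def split_thinking(response: dict) -> tuple[list[str], list[str]]:
    thinking = [b["thinking"] for b in response["content"]
                if b["type"] == "thinking"]
    text = [b["text"] for b in response["content"]
            if b["type"] == "text"]
    return thinking, text

response = {
    "id": "msg_abc123",
    "type": "message",
    "role": "assistant",
    "content": [
        {"type": "thinking",
         "thinking": "Let me think about this step by step..."},
        {"type": "text", "text": "Here's the solution..."},
    ],
}
thinking, text = split_thinking(response)
print(text[0])  # -> Here's the solution...
```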

Server-Wide Configuration

You can also configure beta flags server-wide using the DEFAULT_MODEL_PARAMS environment variable (see Provider-Specific Parameters).

Unsupported Beta Flags

Unsupported flags that would change output return HTTP 400 errors.

Documentation

See Using Claude on AWS Bedrock for more details on Claude-specific parameters.

Available Request Headers

This endpoint supports standard Bedrock headers for enhanced control over your requests. All headers are optional and can be combined as needed.

Content Safety (Guardrails)

Header Purpose Valid Values
X-Amzn-Bedrock-GuardrailIdentifier Guardrail ID for content filtering Your guardrail identifier
X-Amzn-Bedrock-GuardrailVersion Guardrail version Version number (e.g., 1)
X-Amzn-Bedrock-Trace Guardrail trace level disabled, enabled, enabled_full

Performance Optimization

Header Purpose Valid Values
X-Amzn-Bedrock-Service-Tier Service tier selection priority, default, flex
X-Amzn-Bedrock-PerformanceConfig-Latency Latency optimization standard, optimized

Model-Specific Headers

Header Purpose Valid Values Models
anthropic-beta Enable Anthropic beta features Comma-separated feature names (e.g., extended-thinking-2024-12-12,context-management-2025-06-27) Anthropic Claude

Example with all headers:

curl -X POST "$BASE/v1/messages" \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -H "X-Amzn-Bedrock-GuardrailIdentifier: your-guardrail-id" \
  -H "X-Amzn-Bedrock-GuardrailVersion: 1" \
  -H "X-Amzn-Bedrock-Trace: enabled" \
  -H "X-Amzn-Bedrock-Service-Tier: priority" \
  -H "X-Amzn-Bedrock-PerformanceConfig-Latency: optimized" \
  -d '{
    "model": "anthropic.claude-opus-4-6-v1",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Detailed Documentation

For complete information about these headers, configuration options, and use cases, see:
the Operations Configuration documentation.

Try It Now

Basic message:

curl -X POST "$BASE/v1/messages" \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "amazon.nova-micro-v1:0",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Say hello world"}]
  }'

Streaming response:

curl -N -X POST "$BASE/v1/messages" \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "amazon.nova-micro-v1:0",
    "max_tokens": 1024,
    "stream": true,
    "messages": [{"role": "user", "content": "Write a haiku about the sea."}]
  }'
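On the wire, the stream is a series of Server-Sent Events; text arrives in content_block_delta events carrying text_delta fragments. A minimal parser over a captured stream might look like this (the sample below shows only delta events; real streams also include message_start, content_block_start, and message_stop):

```python
import json

# Accumulate text from content_block_delta events in an SSE body.
def collect_text(sse_body: str) -> str:
    parts = []
    for line in sse_body.splitlines():
        if not line.startswith("data: "):
            continue  # skip "event:" lines and blank separators
        event = json.loads(line[len("data: "):])
        if event.get("type") == "content_block_delta":
            delta = event.get("delta", {})
            if delta.get("type") == "text_delta":
                parts.append(delta.get("text", ""))
    return "".join(parts)

sample = (
    'event: content_block_delta\n'
    'data: {"type": "content_block_delta", "index": 0, '
    '"delta": {"type": "text_delta", "text": "Salt spray "}}\n\n'
    'event: content_block_delta\n'
    'data: {"type": "content_block_delta", "index": 0, '
    '"delta": {"type": "text_delta", "text": "on the wind"}}\n\n'
)
print(collect_text(sample))  # -> Salt spray on the wind
```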

Multi-modal with image:

{
  "model": "amazon.nova-micro-v1:0",
  "max_tokens": 1024,
  "messages": [
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "Describe this image"},
        {
          "type": "image",
          "source": {
            "type": "url",
            "url": "https://example.com/photo.jpg"
          }
        }
      ]
    }
  ]
}

With tool calling:

curl -X POST "$BASE/v1/messages" \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic.claude-opus-4-6-v1",
    "max_tokens": 1024,
    "tools": [
      {
        "name": "get_weather",
        "description": "Get weather information",
        "input_schema": {
          "type": "object",
          "properties": {
            "location": {"type": "string", "description": "City name"}
          },
          "required": ["location"]
        }
      }
    ],
    "messages": [
      {"role": "user", "content": "What is the weather in Paris?"}
    ]
  }'
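When the model decides to call the tool, the response contains a tool_use block; your client runs the tool and sends the result back as a tool_result block in the next user turn. A sketch of that round trip using the standard Anthropic message shapes (the weather lookup itself is a stand-in):

```python
# Turn an assistant tool_use block into the follow-up user message
# carrying the tool result (Anthropic tool-calling shape).
def tool_result_message(tool_use: dict, result: str) -> dict:
    return {
        "role": "user",
        "content": [{
            "type": "tool_result",
            "tool_use_id": tool_use["id"],
            "content": result,
        }],
    }

# Example tool_use block as it would appear in the assistant response:
tool_use = {
    "type": "tool_use",
    "id": "toolu_01A",
    "name": "get_weather",
    "input": {"location": "Paris"},
}

# Execute the tool locally (a stand-in lookup here), then reply:
weather = {"Paris": "18°C, partly cloudy"}[tool_use["input"]["location"]]
followup = tool_result_message(tool_use, weather)
print(followup["content"][0]["tool_use_id"])  # -> toolu_01A
```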

Ready to build with AI? Check out the Models API to see all available foundation models!