Embeddings API

Transform text into semantic vectors. Power your search, recommendations, and similarity features with AWS Bedrock embedding models through an OpenAI-compatible interface.

Why Choose Embeddings?

  • Semantic Search
    Find content based on meaning and context, not just exact words. Ideal for knowledge bases and document retrieval.

  • High Performance
    AWS Bedrock embedding models deliver fast vectors optimized for production workloads, with batch processing for large-scale operations.

  • Flexible Dimensions
    Choose vector dimensions that match your needs. Balance accuracy and storage/compute costs with model-specific dimension control.

  • Multimodal Embeddings
    Process images, videos, audio, and PDF documents alongside text. Unified embeddings for cross-modal search using base64 data URI input.

Quick Start: Available Endpoint

  • Endpoint: /v1/embeddings
  • Method: POST
  • What It Does: Transform text into semantic vectors
  • Powered By: AWS Bedrock Embedding Models

Feature Compatibility

Input Types

  • Text input (single string): Supported. Full support for text embeddings.
  • Multimodal input: Available on Select Models. Image, audio, video, document (image + text).
  • Multiple inputs (batch array): Supported. Process multiple inputs efficiently.
  • Token array input: Unsupported. Arrays of token integers are not accepted.

Output Formats

  • Float vectors: Supported. Standard floating-point arrays.
  • Base64 encoding: Supported. Base64-encoded float32 arrays.

Model Parameters

  • dimensions override: Available on Select Models. Some models support dimension reduction (see the example below the legend).
  • encoding_format: Supported. Choose float or base64.
  • Extra model-specific params: Extra Feature. Pass model-specific parameters not defined by the OpenAI API (see Advanced Features below).

Usage Tracking

  • Input text tokens: Supported. Token counts are estimated on some models.

Legend:

  • Supported — Fully compatible with OpenAI API
  • Available on Select Models — Check your model's capabilities
  • Unsupported — Not available in this implementation
  • Extra Feature — Enhanced capability beyond OpenAI API
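
Example combining dimensions and encoding_format on a model that supports dimension reduction (Amazon Titan Text Embeddings v2, which accepts 256, 512, or 1024 dimensions, is used here purely as an illustration; check your model's capabilities):

curl -X POST "$BASE/v1/embeddings" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "amazon.titan-embed-text-v2:0",
    "input": "Compact vectors for cost-sensitive indexes",
    "dimensions": 256,
    "encoding_format": "float"
  }'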

Available Request Headers

This endpoint supports standard Bedrock headers for enhanced control over your requests. All headers are optional and can be combined as needed.

Content Safety (Guardrails)

  • X-Amzn-Bedrock-GuardrailIdentifier: Guardrail ID for content filtering (your guardrail identifier)
  • X-Amzn-Bedrock-GuardrailVersion: Guardrail version (version number, e.g., 1)
  • X-Amzn-Bedrock-Trace: Guardrail trace level (disabled, enabled, or enabled_full)
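
Example with guardrail headers (the guardrail identifier below is a placeholder; substitute your own):

curl -X POST "$BASE/v1/embeddings" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -H "X-Amzn-Bedrock-GuardrailIdentifier: your-guardrail-id" \
  -H "X-Amzn-Bedrock-GuardrailVersion: 1" \
  -H "X-Amzn-Bedrock-Trace: enabled" \
  -d '{
    "model": "amazon.nova-2-multimodal-embeddings-v1:0",
    "input": "Text to screen and embed"
  }'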

Performance Optimization

  • X-Amzn-Bedrock-Service-Tier: Service tier selection (priority, default, or flex)
  • X-Amzn-Bedrock-PerformanceConfig-Latency: Latency optimization (standard or optimized)

Example with headers:

curl -X POST "$BASE/v1/embeddings" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -H "X-Amzn-Bedrock-Service-Tier: flex" \
  -H "X-Amzn-Bedrock-PerformanceConfig-Latency: standard" \
  -d '{
    "model": "amazon.nova-2-multimodal-embeddings-v1:0",
    "input": ["Batch text 1", "Batch text 2", "Batch text 3"]
  }'

Detailed Documentation

For complete information about these headers, configuration options, and use cases, see the dedicated request headers documentation.

Advanced Features

Provider-Specific Parameters

Access advanced embedding capabilities by passing provider-specific parameters directly in your requests. These parameters are forwarded to AWS Bedrock and allow you to access features unique to each embedding model provider.

Documentation: Bedrock Embedding Model Parameters

How It Works:

Add provider-specific fields at the top level of your request body alongside standard OpenAI parameters. The API automatically forwards these to the appropriate model provider via AWS Bedrock.

Examples:

Cohere Embed v4 - Input Type:

{
  "model": "cohere.embed-v4",
  "input": "Semantic search transforms how we find information",
  "input_type": "search_query"
}

Amazon Titan Embed v2 - Normalization:

{
  "model": "amazon.nova-2-multimodal-embeddings-v1:0",
  "input": "Product description for similarity matching",
  "normalize": true
}

Configuration Options:

Option 1: Per-Request

Add provider-specific parameters directly in your request body (as shown in examples above).

Option 2: Server-Wide Defaults

Configure default parameters for specific models via the DEFAULT_MODEL_PARAMS environment variable:

export DEFAULT_MODEL_PARAMS='{
  "cohere.embed-v4": {
    "input_type": "search_document",
    "truncate": "END"
  }
}'

Note: Per-request parameters override server-wide defaults.

Behavior:

  • Compatible parameters: Forwarded to the model and applied
  • ⚠️ Unsupported parameters: Return HTTP 400 with an error message
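
Rejected requests return an OpenAI-style error body. The exact message depends on the model provider; the parameter name below is purely a placeholder, and the shape is roughly:

{
  "error": {
    "message": "Unsupported parameter for this model: some_parameter",
    "type": "invalid_request_error",
    "param": "some_parameter",
    "code": null
  }
}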

Try It Now

Single text embedding:

curl -X POST "$BASE/v1/embeddings" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "amazon.nova-2-multimodal-embeddings-v1:0",
    "input": "Semantic search transforms how we find information"
  }'
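
The response follows the OpenAI embeddings format. Trimmed and with illustrative values, it looks roughly like this:

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.0123, -0.0456, 0.0789, ...]
    }
  ],
  "model": "amazon.nova-2-multimodal-embeddings-v1:0",
  "usage": {
    "prompt_tokens": 8,
    "total_tokens": 8
  }
}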

Batch processing with base64 encoding:

curl -X POST "$BASE/v1/embeddings" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "amazon.nova-2-multimodal-embeddings-v1:0",
    "input": ["Product description", "User query", "Related content"],
    "encoding_format": "base64"
  }'
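
To sanity-check a base64-encoded vector, you can decode it back into raw float32 values. A minimal sketch, assuming jq and GNU coreutils (base64, od) are available:

# Extract the base64 string from the first result, decode it,
# and print the underlying float32 values
curl -s -X POST "$BASE/v1/embeddings" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "amazon.nova-2-multimodal-embeddings-v1:0",
    "input": "Product description",
    "encoding_format": "base64"
  }' | jq -r '.data[0].embedding' | base64 -d | od -An -f | head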

Multimodal Embeddings

Go beyond text! Supported models can process images, videos, and audio through base64 data URI input. This enables powerful cross-modal search and similarity features.

Input Format

Multimodal content is passed as base64-encoded data URIs:

data:<mime-type>;base64,<base64-encoded-content>

Example: Image Embedding

# First, encode your image to base64
IMAGE_B64=$(base64 -w 0 image.jpg)

# Send the embedding request
curl -X POST "$BASE/v1/embeddings" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{
    \"model\": \"amazon.nova-2-multimodal-embeddings-v1:0\",
    \"input\": \"data:image/jpeg;base64,$IMAGE_B64\"
  }"

Example: Video Embedding

Option 1: Base64-encoded video (for small files)

# First, encode your video to base64
VIDEO_B64=$(base64 -w 0 video.mp4)

# Send the embedding request
curl -X POST "$BASE/v1/embeddings" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{
    \"model\": \"amazon.nova-2-multimodal-embeddings-v1:0\",
    \"input\": \"data:video/mp4;base64,$VIDEO_B64\"
  }"

Automatic S3 Upload and Asynchronous Invocation

When you provide Base64-encoded data that exceeds the model's size limit (or Bedrock's 25 MB quota), the server automatically uploads it to S3 and selects the appropriate invocation method (synchronous or asynchronous).

To enable this behavior, configure regional S3 buckets via AWS_S3_REGIONAL_BUCKETS in the same region as your Bedrock model. See the configuration guide.

Large Base64 Files and Memory Configuration

While passing large files as Base64 is supported, ensure your server has sufficient memory configured. Large Base64-encoded files (especially videos) can consume significant memory during processing. Consider using S3 URLs directly for very large files, or adjust your server's memory limits accordingly.

Option 2: S3 URL (for large files)

# Send the embedding request with S3 URL
curl -X POST "$BASE/v1/embeddings" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "amazon.nova-2-multimodal-embeddings-v1:0",
    "input": "s3://my-bucket/path/to/video.mp4"
  }'

S3 URL Requirements

When using S3 URLs directly:

  • S3 bucket must be in the same AWS region as the Bedrock model
  • The stdapi.ai server must have read access to the S3 object
  • For TwelveLabs Marengo models: S3 bucket must be in the same AWS account as the stdapi.ai server

Example: PDF Document Embedding

For PDFs, convert each page to an image and include the pages in the input array, placing page metadata (e.g., file_name, entities) in adjacent text entries. For RAG applications, smaller chunks often improve retrieval accuracy and reduce costs.

Cohere Embed v4 (supports multiple text+image pairs in one request):

# Convert PDF pages to images (using ImageMagick or similar tool)
convert -density 150 document.pdf page-%d.jpg

# Encode each page image to base64
PAGE_1=$(base64 -w 0 page-0.jpg)
PAGE_2=$(base64 -w 0 page-1.jpg)

# Generate document embedding with metadata
curl -X POST "$BASE/v1/embeddings" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{
    \"model\": \"cohere.embed-v4\",
    \"input\": [
      \"file_name: report.pdf, page: 1\",
      \"data:image/jpeg;base64,$PAGE_1\",
      \"file_name: report.pdf, page: 2\",
      \"data:image/jpeg;base64,$PAGE_2\"
    ]
  }"

TwelveLabs Marengo v3 (requires exactly one text + one image per request):

Text+Image Pairing for Marengo v3

When using twelvelabs.marengo-embed-3-0-v1:0, if you provide exactly 2 inputs where one is text and one is image, they are automatically combined into a single text_image embedding. This creates a unified multimodal representation of the text-image pair.

# Encode image to base64
IMAGE_B64=$(base64 -w 0 page-0.jpg)

# Generate text+image embedding (automatically uses text_image mode)
curl -X POST "$BASE/v1/embeddings" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{
    \"model\": \"twelvelabs.marengo-embed-3-0-v1:0\",
    \"input\": [
      \"A diagram showing the quarterly sales report\",
      \"data:image/jpeg;base64,$IMAGE_B64\"
    ]
  }"

Mixed-Content Batching

Combine text and multimodal inputs in a single request:

curl -X POST "$BASE/v1/embeddings" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{
    \"model\": \"cohere.embed-v4\",
    \"input\": [
      \"A beautiful sunset over mountains\",
      \"data:image/jpeg;base64,/9j/4AAQSkZJRg...\",
      \"Nature photography collection\"
    ]
  }"

Use Cases

  • Visual Search: Find images similar to a query image or text description
  • Video Analysis: Search and retrieve video content based on visual similarity or text descriptions
  • Audio Similarity: Find similar audio clips or match audio to text descriptions
  • Document Retrieval: Find relevant PDFs based on visual and textual content
  • Cross-Modal Recommendations: Recommend images, videos, or audio based on text queries and vice versa
  • Content Moderation: Analyze and classify multimodal content at scale

Build smarter search and recommendations! Explore available embedding models in the Models API.