Embeddings API¶
Transform text into semantic vectors. Power your search, recommendations, and similarity features with AWS Bedrock embedding models through an OpenAI-compatible interface.
Why Choose Embeddings?¶
- **Semantic Search**: Find content based on meaning and context, not just exact words. Ideal for knowledge bases and document retrieval.
- **High Performance**: AWS Bedrock embedding models deliver fast vectors optimized for production workloads, with batch processing for large-scale operations.
- **Flexible Dimensions**: Choose vector dimensions that match your needs. Balance accuracy and storage/compute costs with model-specific dimension control.
- **Multimodal Embeddings**: Process images, videos, audio, and PDF documents alongside text. Unified embeddings for cross-modal search using base64 data URI input.
Quick Start: Available Endpoint¶
| Endpoint | Method | What It Does | Powered By |
|---|---|---|---|
| `/v1/embeddings` | POST | Transform text into semantic vectors | AWS Bedrock embedding models |
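Because the endpoint is OpenAI-compatible, the official `openai` Python SDK can target it by overriding `base_url`. A minimal sketch; `BASE`, `OPENAI_API_KEY`, and the model ID mirror the curl examples later on this page:

```python
import os
from openai import OpenAI

# Point the SDK at this server instead of api.openai.com.
# BASE is assumed to hold the server URL without the /v1 suffix, as in the curl examples.
client = OpenAI(
    base_url=os.environ["BASE"] + "/v1",
    api_key=os.environ["OPENAI_API_KEY"],
)

response = client.embeddings.create(
    model="amazon.nova-2-multimodal-embeddings-v1:0",
    input="Semantic search transforms how we find information",
)

vector = response.data[0].embedding  # list of floats
print(len(vector), "dimensions")
```

The same client object can be reused for every request shown with curl below.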
Feature Compatibility¶
| Feature | Status | Notes |
|---|---|---|
| **Input Types** | | |
| Text input (single string) | Supported | Full support for text embeddings |
| Multimodal input | Extra Feature | Image, audio, video, document (image + text) |
| Multiple inputs (batch array) | Supported | Process multiple inputs efficiently |
| Token array input | Unsupported | Arrays of token integers are not supported |
| **Output Formats** | | |
| Float vectors | Supported | Standard floating-point arrays |
| Base64 encoding | Supported | Base64-encoded float32 arrays |
| **Model Parameters** | | |
| `dimensions` override | Available on Select Models | Some models support dimension reduction |
| `encoding_format` | Supported | Choose `float` or `base64` |
| Extra model-specific params | Extra Feature | Extra model-specific parameters not supported by the OpenAI API |
| **Usage Tracking** | | |
| Input text tokens | Available on Select Models | Estimated on some models |
Legend:
- Supported — Fully compatible with OpenAI API
- Available on Select Models — Check your model's capabilities
- Unsupported — Not available in this implementation
- Extra Feature — Enhanced capability beyond OpenAI API
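The `dimensions` override uses the standard OpenAI request shape. A sketch of requesting a reduced vector size through the OpenAI Python SDK; the model ID and target dimension are assumptions, and a model without dimension support will reject the parameter:

```python
import os
from openai import OpenAI

# BASE and OPENAI_API_KEY mirror the environment variables used in the curl examples.
client = OpenAI(base_url=os.environ["BASE"] + "/v1", api_key=os.environ["OPENAI_API_KEY"])

response = client.embeddings.create(
    model="amazon.nova-2-multimodal-embeddings-v1:0",  # assumption: this model supports dimension reduction
    input="Semantic search transforms how we find information",
    dimensions=256,  # assumption: 256 is a size the model accepts
)
print(len(response.data[0].embedding))  # 256 when the model honors the override
```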
Available Request Headers¶
This endpoint supports standard Bedrock headers for enhanced control over your requests. All headers are optional and can be combined as needed.
Content Safety (Guardrails)¶
| Header | Purpose | Valid Values |
|---|---|---|
| `X-Amzn-Bedrock-GuardrailIdentifier` | Guardrail ID for content filtering | Your guardrail identifier |
| `X-Amzn-Bedrock-GuardrailVersion` | Guardrail version | Version number (e.g., `1`) |
| `X-Amzn-Bedrock-Trace` | Guardrail trace level | `disabled`, `enabled`, `enabled_full` |
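When calling the endpoint through the OpenAI Python SDK rather than curl, these headers can be attached per request with `extra_headers`. A sketch with placeholder guardrail values:

```python
import os
from openai import OpenAI

client = OpenAI(base_url=os.environ["BASE"] + "/v1", api_key=os.environ["OPENAI_API_KEY"])

response = client.embeddings.create(
    model="amazon.nova-2-multimodal-embeddings-v1:0",
    input="Text to screen and embed",
    extra_headers={
        "X-Amzn-Bedrock-GuardrailIdentifier": "gr-example123",  # placeholder: your guardrail ID
        "X-Amzn-Bedrock-GuardrailVersion": "1",
        "X-Amzn-Bedrock-Trace": "enabled",
    },
)
```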
Performance Optimization¶
| Header | Purpose | Valid Values |
|---|---|---|
| `X-Amzn-Bedrock-Service-Tier` | Service tier selection | `priority`, `default`, `flex` |
| `X-Amzn-Bedrock-PerformanceConfig-Latency` | Latency optimization | `standard`, `optimized` |
Example with headers:
```bash
curl -X POST "$BASE/v1/embeddings" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -H "X-Amzn-Bedrock-Service-Tier: flex" \
  -H "X-Amzn-Bedrock-PerformanceConfig-Latency: standard" \
  -d '{
    "model": "amazon.nova-2-multimodal-embeddings-v1:0",
    "input": ["Batch text 1", "Batch text 2", "Batch text 3"]
  }'
```
Detailed Documentation
For complete information about these headers, configuration options, and use cases, see the dedicated request headers documentation.
Advanced Features¶
Provider-Specific Parameters¶
Access advanced embedding capabilities by passing provider-specific parameters directly in your requests. These parameters are forwarded to AWS Bedrock and allow you to access features unique to each embedding model provider.
Documentation: Bedrock Embedding Model Parameters
How It Works:
Add provider-specific fields at the top level of your request body alongside standard OpenAI parameters. The API automatically forwards these to the appropriate model provider via AWS Bedrock.
Examples:
Cohere Embed v4 - Input Type:
```json
{
  "model": "cohere.embed-v4",
  "input": "Semantic search transforms how we find information",
  "input_type": "search_query"
}
```
Amazon Titan Embed v2 - Normalization:
```json
{
  "model": "amazon.titan-embed-text-v2:0",
  "input": "Product description for similarity matching",
  "normalize": true
}
```
Configuration Options:
Option 1: Per-Request
Add provider-specific parameters directly in your request body (as shown in examples above).
Option 2: Server-Wide Defaults
Configure default parameters for specific models via the DEFAULT_MODEL_PARAMS environment variable:
```bash
export DEFAULT_MODEL_PARAMS='{
  "cohere.embed-v4": {
    "input_type": "search_document",
    "truncate": "END"
  }
}'
```
Note: Per-request parameters override server-wide defaults.
Behavior:
- ✅ Compatible parameters: Forwarded to the model and applied
- ⚠️ Unsupported parameters: Return HTTP 400 with an error message
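When using the OpenAI Python SDK, the typed `embeddings.create()` method does not accept arbitrary keyword arguments, so provider-specific parameters go through `extra_body`, which places them at the top level of the request body. A sketch mirroring the Cohere example above:

```python
import os
from openai import OpenAI

client = OpenAI(base_url=os.environ["BASE"] + "/v1", api_key=os.environ["OPENAI_API_KEY"])

response = client.embeddings.create(
    model="cohere.embed-v4",
    input="Semantic search transforms how we find information",
    extra_body={"input_type": "search_query"},  # forwarded to the model provider via Bedrock
)
```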
Try It Now¶
Single text embedding:
```bash
curl -X POST "$BASE/v1/embeddings" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "amazon.nova-2-multimodal-embeddings-v1:0",
    "input": "Semantic search transforms how we find information"
  }'
```
Batch processing with base64 encoding:
```bash
curl -X POST "$BASE/v1/embeddings" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "amazon.nova-2-multimodal-embeddings-v1:0",
    "input": ["Product description", "User query", "Related content"],
    "encoding_format": "base64"
  }'
```
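With `encoding_format` set to `base64`, each embedding in the JSON response is a base64 string rather than an array of floats. Assuming the payload is little-endian float32, as with the OpenAI API, it can be decoded like this:

```python
import base64
import struct

def decode_embedding(b64: str) -> list[float]:
    """Decode a base64-encoded little-endian float32 embedding vector."""
    raw = base64.b64decode(b64)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))

# Toy payload for illustration; in practice pass data[i]["embedding"] from the response.
print(decode_embedding("AACAPwAAAEAAAEBA"))  # [1.0, 2.0, 3.0]
```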
Multimodal Embeddings¶
Go beyond text! Supported models can process images, videos, and audio through base64 data URI input. This enables powerful cross-modal search and similarity features.
Input Format¶
Multimodal content is passed as base64-encoded data URIs:
```
data:<mime-type>;base64,<base64-encoded-content>
```
Example: Image Embedding¶
```bash
# First, encode your image to base64
IMAGE_B64=$(base64 -w 0 image.jpg)

# Send the embedding request
curl -X POST "$BASE/v1/embeddings" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{
    \"model\": \"amazon.nova-2-multimodal-embeddings-v1:0\",
    \"input\": \"data:image/jpeg;base64,$IMAGE_B64\"
  }"
```
Example: Video Embedding¶
Option 1: Base64-encoded video (for small files)
```bash
# First, encode your video to base64
VIDEO_B64=$(base64 -w 0 video.mp4)

# Send the embedding request
curl -X POST "$BASE/v1/embeddings" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{
    \"model\": \"amazon.nova-2-multimodal-embeddings-v1:0\",
    \"input\": \"data:video/mp4;base64,$VIDEO_B64\"
  }"
```
Automatic S3 Upload and Asynchronous Invocation
When you provide Base64-encoded data that exceeds the model's size limit (or Bedrock's 25 MB quota), the server automatically uploads it to S3 and selects the appropriate invocation method (synchronous or asynchronous).
To enable this behavior, configure regional S3 buckets via `AWS_S3_REGIONAL_BUCKETS` in the same region as your Bedrock model. See the configuration guide.
Large Base64 Files and Memory Configuration
While passing large files as Base64 is supported, ensure your server has sufficient memory configured. Large Base64-encoded files (especially videos) can consume significant memory during processing. Consider using S3 URLs directly for very large files, or adjust your server's memory limits accordingly.
Option 2: S3 URL (for large files)
```bash
# Send the embedding request with an S3 URL
curl -X POST "$BASE/v1/embeddings" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "amazon.nova-2-multimodal-embeddings-v1:0",
    "input": "s3://my-bucket/path/to/video.mp4"
  }'
```
S3 URL Requirements
When using S3 URLs directly:
- S3 bucket must be in the same AWS region as the Bedrock model
- The stdapi.ai server must have read access to the S3 object
- For TwelveLabs Marengo models: the S3 bucket must be in the same AWS account as the stdapi.ai server
Example: PDF Document Embedding¶
For PDFs, convert each page to an image and include the pages in the input array, with page metadata (e.g., file_name, entities) in adjacent text entries. For RAG applications, smaller chunks often improve retrieval accuracy and reduce costs.
Cohere Embed v4 (supports multiple text+image pairs in one request):
```bash
# Convert PDF pages to images (using ImageMagick or a similar tool)
convert -density 150 document.pdf page-%d.jpg

# Encode each page image to base64
PAGE_1=$(base64 -w 0 page-0.jpg)
PAGE_2=$(base64 -w 0 page-1.jpg)

# Generate document embeddings with metadata
curl -X POST "$BASE/v1/embeddings" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{
    \"model\": \"cohere.embed-v4\",
    \"input\": [
      \"file_name: report.pdf, page: 1\",
      \"data:image/jpeg;base64,$PAGE_1\",
      \"file_name: report.pdf, page: 2\",
      \"data:image/jpeg;base64,$PAGE_2\"
    ]
  }"
```
TwelveLabs Marengo v3 (requires exactly one text + one image per request):
Text+Image Pairing for Marengo v3
When using twelvelabs.marengo-embed-3-0-v1:0, if you provide exactly 2 inputs where one is text and one is image, they are automatically combined into a single text_image embedding. This creates a unified multimodal representation of the text-image pair.
```bash
# Encode the image to base64
IMAGE_B64=$(base64 -w 0 page-0.jpg)

# Generate a text+image embedding (automatically uses text_image mode)
curl -X POST "$BASE/v1/embeddings" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{
    \"model\": \"twelvelabs.marengo-embed-3-0-v1:0\",
    \"input\": [
      \"A diagram showing the quarterly sales report\",
      \"data:image/jpeg;base64,$IMAGE_B64\"
    ]
  }"
```
Mixed-Content Batching¶
Combine text and multimodal inputs in a single request:
```bash
curl -X POST "$BASE/v1/embeddings" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{
    \"model\": \"cohere.embed-v4\",
    \"input\": [
      \"A beautiful sunset over mountains\",
      \"data:image/jpeg;base64,/9j/4AAQSkZJRg...\",
      \"Nature photography collection\"
    ]
  }"
```
Use Cases¶
- Visual Search: Find images similar to a query image or text description
- Video Analysis: Search and retrieve video content based on visual similarity or text descriptions
- Audio Similarity: Find similar audio clips or match audio to text descriptions
- Document Retrieval: Find relevant PDFs based on visual and textual content
- Cross-Modal Recommendations: Recommend images, videos, or audio based on text queries and vice versa
- Content Moderation: Analyze and classify multimodal content at scale
Build smarter search and recommendations! Explore available embedding models in the Models API.