
Configuration Guide

stdapi.ai is configured entirely through environment variables, which are read once at startup and cannot be changed without restarting the service. This guide explains each setting category with practical examples to help you configure the service correctly.

Zero Configuration Startup

stdapi.ai works out of the box with zero configuration. The service automatically detects your current AWS region and discovers available Bedrock models.

Prerequisites

Before configuring stdapi.ai, ensure you have:

  • AWS Account with access to Amazon Bedrock
  • AWS Credentials configured via environment variables, AWS CLI, or IAM role (for EC2/ECS/Lambda deployments); see the example after this list
  • IAM Permissions to access required AWS services (see IAM Permissions section)
  • S3 Bucket (optional, but recommended for production use with file operations)
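
If you run outside AWS, credentials are typically supplied through the standard AWS SDK environment variables. A minimal sketch (the values below are placeholders; prefer IAM roles on AWS infrastructure):

# Static credentials via standard AWS SDK environment variables (placeholder values)
export AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
export AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
# Default SDK region (used for auto-detection when AWS_BEDROCK_REGIONS is not set)
export AWS_DEFAULT_REGION=us-east-1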

Quick Start

For production deployments, configure these essential settings:

Minimal Production Setup

Single-region deployment with file storage only.

# S3 bucket for file storage (must be in the same region as your server)
export AWS_S3_BUCKET=my-stdapi-bucket

# AWS_BEDROCK_REGIONS is optional - will auto-detect your current AWS region if not specified

Production with Authentication

Adds secure API key authentication via AWS Systems Manager.

# S3 bucket for file storage (must be in the same region as your server)
export AWS_S3_BUCKET=my-stdapi-bucket

# Secure API authentication (recommended: SSM Parameter Store)
export API_KEY_SSM_PARAMETER=/stdapi/prod/api-key

# AWS_BEDROCK_REGIONS is optional - will auto-detect your current AWS region if not specified

Full Production Setup (All Features Enabled)

Multi-region deployment with all AWS AI services, observability, and security features.

# Core AWS configuration - host server in first region
export AWS_BEDROCK_REGIONS=us-east-1,us-west-2,eu-west-1

# S3 bucket for file storage (must be in us-east-1, your first/primary region)
export AWS_S3_BUCKET=my-stdapi-us-east-1-bucket

# Optional: Transcribe S3 bucket (defaults to AWS_S3_BUCKET if not specified)
# Only set this if you need a separate bucket or if transcribe is in a different region
# export AWS_TRANSCRIBE_S3_BUCKET=my-stdapi-transcribe-us-east-1

# Optional: Regional buckets for async/batch inference in other regions
export AWS_S3_REGIONAL_BUCKETS='{"us-west-2": "my-stdapi-us-west-2-bucket", "eu-west-1": "my-stdapi-eu-west-1-bucket"}'

# AWS AI services regions (optional - defaults to first AWS_BEDROCK_REGIONS if not specified)
export AWS_POLLY_REGION=us-east-1           # Text-to-speech
export AWS_TRANSCRIBE_REGION=us-east-1      # Speech-to-text (audio transcription)
export AWS_COMPREHEND_REGION=us-east-1      # Language detection
export AWS_TRANSLATE_REGION=us-east-1       # Text translation

# Authentication
export API_KEY_SSM_PARAMETER=/stdapi/prod/api-key

# Logging
export LOG_LEVEL=warning
export LOG_CLIENT_IP=true

# Optional: OpenTelemetry observability (AWS X-Ray integration)
# export OTEL_ENABLED=true
# export OTEL_SERVICE_NAME=stdapi-production
# export OTEL_SAMPLE_RATE=0.1

# Production security settings (when behind AWS ALB/CloudFront)
export ENABLE_PROXY_HEADERS=true

# Note: TRUSTED_HOSTS not recommended with AWS ALB - use ALB host-based routing instead
# Only use TRUSTED_HOSTS if you cannot configure host validation at the load balancer level

# Optional: CORS for browser-based web applications
# export CORS_ALLOW_ORIGINS='["https://app.example.com"]'

Development Setup

Local development configuration with API documentation and debug logging enabled.

# Minimal configuration for local development
export AWS_S3_BUCKET=my-stdapi-dev-bucket

# Enable API documentation
export ENABLE_DOCS=true
export ENABLE_REDOC=true

# Full request/response logging for debugging
export LOG_LEVEL=info
export LOG_REQUEST_PARAMS=true

# AWS_BEDROCK_REGIONS is optional - will auto-detect your current AWS region if not specified

S3 Bucket Required for Certain Features

Without an S3 bucket configured, some features will be disabled (such as image output as URL, audio transcription). See the relevant API documentation for feature requirements.

All Other Settings Are Optional

The configurations above are sufficient for most production deployments. All other settings can be configured as needed for your specific use case.

Environment Variable Summary

This section provides a quick reference of all available configuration options. Detailed explanations for each variable can be found in the sections below.

Essential (Production)

| Variable | Default | Description |
| --- | --- | --- |
| AWS_S3_BUCKET | None | Primary S3 bucket for file storage; must be in the first region of AWS_BEDROCK_REGIONS |
| AWS_BEDROCK_REGIONS | Current region | Comma-separated regions for Bedrock; the first region is where the server should be hosted |

AWS Storage

| Variable | Default | Description |
| --- | --- | --- |
| AWS_S3_ACCELERATE | false | Enable S3 Transfer Acceleration for faster global downloads via CloudFront edge locations |
| AWS_S3_REGIONAL_BUCKETS | {} | Region-specific S3 buckets for Bedrock async/batch inference operations |
| AWS_S3_TMP_PREFIX | tmp/ | S3 prefix for temporary files used for jobs; configure lifecycle policies on this prefix |
| AWS_TRANSCRIBE_S3_BUCKET | AWS_S3_BUCKET | S3 bucket for temporary audio transcription files; must be in the same region as AWS_TRANSCRIBE_REGION |

AWS AI Services

| Variable | Default | Description |
| --- | --- | --- |
| AWS_POLLY_REGION | First region in AWS_BEDROCK_REGIONS | AWS region for Amazon Polly text-to-speech service |
| AWS_COMPREHEND_REGION | First region in AWS_BEDROCK_REGIONS | AWS region for Amazon Comprehend language detection service |
| AWS_TRANSCRIBE_REGION | First region in AWS_BEDROCK_REGIONS | AWS region for Amazon Transcribe speech-to-text service |
| AWS_TRANSLATE_REGION | First region in AWS_BEDROCK_REGIONS | AWS region for Amazon Translate text translation service |

Bedrock Advanced

| Variable | Default | Description |
| --- | --- | --- |
| AWS_BEDROCK_CROSS_REGION_INFERENCE | true | Allow automatic model routing to other configured regions |
| AWS_BEDROCK_CROSS_REGION_INFERENCE_GLOBAL | true | Allow global cross-region inference routing to any region worldwide (disable for GDPR compliance) |
| AWS_BEDROCK_LEGACY | true | Allow usage of deprecated/legacy Bedrock models |
| AWS_BEDROCK_MARKETPLACE_AUTO_SUBSCRIBE | true | Allow automatic subscription to new models in AWS Marketplace |
| AWS_BEDROCK_GUARDRAIL_IDENTIFIER | None | Bedrock Guardrails ID for content filtering and safety controls |
| AWS_BEDROCK_GUARDRAIL_VERSION | None | Bedrock Guardrails version number (required with identifier) |
| AWS_BEDROCK_GUARDRAIL_TRACE | None | Guardrails trace level: disabled, enabled, or enabled_full |

Authentication

Choose one method (mutually exclusive):

| Variable | Default | Description |
| --- | --- | --- |
| API_KEY_SSM_PARAMETER | None | AWS Systems Manager Parameter Store path for API key (recommended) |
| API_KEY_SECRETSMANAGER_SECRET | None | AWS Secrets Manager secret name containing API key |
| API_KEY_SECRETSMANAGER_KEY | api_key | JSON key name within Secrets Manager secret |
| API_KEY | None | Direct API key value (not recommended for production) |

OpenAI Compatibility

| Variable | Default | Description |
| --- | --- | --- |
| OPENAI_ROUTES_PREFIX | (empty) | Base path prefix for OpenAI-compatible API routes |

Logging

| Variable | Default | Description |
| --- | --- | --- |
| LOG_LEVEL | info | Minimum log severity: info, warning, error, critical, or disabled |
| LOG_REQUEST_PARAMS | false | Include request/response parameters in logs (not recommended for production) |
| LOG_CLIENT_IP | false | Log client IP addresses (requires ENABLE_PROXY_HEADERS for real IPs behind proxies) |

Observability (OpenTelemetry)

| Variable | Default | Description |
| --- | --- | --- |
| OTEL_ENABLED | false | Enable distributed tracing via OpenTelemetry (integrates with AWS X-Ray, Jaeger, etc.) |
| OTEL_SERVICE_NAME | stdapi | Service name identifier in trace visualizations |
| OTEL_EXPORTER_ENDPOINT | http://127.0.0.1:4318/v1/traces | OTLP HTTP endpoint URL for trace export |
| OTEL_SAMPLE_RATE | 1.0 | Trace sampling rate from 0.0 (none) to 1.0 (all requests) |

HTTP/Security

| Variable | Default | Description |
| --- | --- | --- |
| CORS_ALLOW_ORIGINS | None | JSON array of allowed origins for browser cross-origin requests |
| TRUSTED_HOSTS | None | JSON array of trusted Host header values (prefer ALB host-based routing; see details) |
| ENABLE_PROXY_HEADERS | false | Trust X-Forwarded-* headers from reverse proxies (only enable behind trusted proxy) |
| ENABLE_GZIP | false | Enable GZip compression for responses >1KB (prefer AWS ALB/CloudFront compression) |
| SSRF_PROTECTION_BLOCK_PRIVATE_NETWORKS | true | Block requests to private/local networks for SSRF protection |

Application Behavior

| Variable | Default | Description |
| --- | --- | --- |
| TIMEZONE | UTC | IANA timezone identifier for request timestamps |
| STRICT_INPUT_VALIDATION | false | Reject API requests with unknown/extra fields |
| DEFAULT_TTS_MODEL | amazon.polly-standard | Default text-to-speech model: standard, neural, long-form, or generative |
| TOKENS_ESTIMATION | false | Estimate token counts using tiktoken when model doesn't provide them |
| TOKENS_ESTIMATION_DEFAULT_ENCODING | o200k_base | Tiktoken encoding algorithm: o200k_base (GPT-4o+), cl100k_base (GPT-4), or p50k_base |
| DEFAULT_MODEL_PARAMS | {} | JSON object with per-model default inference parameters (temperature, max_tokens, etc.) |
| MODEL_CACHE_SECONDS | 900 | Model list cache lifetime in seconds before lazy refresh (default: 15 minutes) |

API Documentation

| Variable | Default | Description |
| --- | --- | --- |
| ENABLE_DOCS | false | Enable interactive Swagger UI documentation at /docs |
| ENABLE_REDOC | false | Enable ReDoc documentation UI at /redoc |
| ENABLE_OPENAPI_JSON | false | Enable OpenAPI schema endpoint at /openapi.json (auto-enabled with docs/redoc) |

AWS Services and Regions

Storage Configuration

AWS_S3_BUCKET

Purpose : Primary S3 bucket for storing generated files (images, audio, documents) and temporary data during processing

Default : None (must be configured for file operations)

Best Practice : The bucket must be in the first region specified in AWS_BEDROCK_REGIONS (your primary region where the server should be hosted) to avoid cross-region data transfer costs and reduce latency

export AWS_S3_BUCKET=my-llm-storage-us-east-1

Presigned URLs

Files are served via presigned URLs for secure, time-limited access. Presigned URLs expire after 1 hour by default.
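
The service issues these URLs automatically; for reference, an equivalent presigned URL can be produced with the AWS CLI (bucket and object key below are placeholders):

# Generate a presigned URL that expires after 1 hour (3600 seconds)
aws s3 presign s3://my-stdapi-bucket/tmp/request-id-123/output.png --expires-in 3600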

AWS_S3_ACCELERATE

Purpose : Enable S3 Transfer Acceleration for presigned URLs to improve download performance for large files

Type : Boolean

Default : false

Best Practice : Enable when serving large files (high-resolution images, audio) to geographically distributed users

export AWS_S3_ACCELERATE=true

What is S3 Transfer Acceleration?

S3 Transfer Acceleration uses Amazon CloudFront's globally distributed edge locations to accelerate uploads and downloads to S3 buckets. When enabled, data is routed to the nearest edge location and then transferred to S3 over Amazon's optimized network paths.

Performance Benefits:

  • Faster downloads for users far from your bucket's region
  • Global reach via CloudFront edge locations
  • Optimized routing over Amazon's private backbone network
  • Consistent performance regardless of user location

Typical speed improvements: 50-500% faster for users located far from the bucket region.

Requirements

  1. Enable Transfer Acceleration on your S3 bucket before setting this option:
    aws s3api put-bucket-accelerate-configuration \
      --bucket my-stdapi-bucket \
      --accelerate-configuration Status=Enabled
    
  2. Additional costs: Transfer Acceleration incurs extra data transfer fees. See AWS S3 Transfer Acceleration pricing
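
To confirm acceleration is active before setting AWS_S3_ACCELERATE=true, the bucket status can be queried (bucket name is a placeholder):

# Check Transfer Acceleration status; expect {"Status": "Enabled"}
aws s3api get-bucket-accelerate-configuration --bucket my-stdapi-bucket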

When to Enable

Consider enabling S3 Transfer Acceleration when:

  • Serving generated images via Images API
  • Users are geographically distributed across multiple continents
  • Generating high-resolution images that are large in file size
  • Download performance is critical to user experience

For small images or users close to your bucket region, the performance benefit may not justify the additional cost.

Current Usage

Presigned URLs with Transfer Acceleration are currently only used for the Images API when returning generated images as URLs.

AWS_S3_TMP_PREFIX

Purpose : S3 prefix (folder path) for temporary files used during job processing

Default : tmp/

Best Practice : Configure S3 lifecycle policies to automatically delete objects under this prefix after 1 day

export AWS_S3_TMP_PREFIX=tmp/

What is an S3 Prefix?

An S3 prefix is essentially a folder path within your S3 bucket. When you set AWS_S3_TMP_PREFIX=tmp/, all temporary files are stored under the tmp/ folder structure in your bucket.

Example file paths:

  • With prefix tmp/: s3://my-bucket/tmp/request-id-123/output.json
  • With prefix temporary/: s3://my-bucket/temporary/request-id-123/output.json
  • With empty prefix: s3://my-bucket/request-id-123/output.json (not recommended)

Why Use a Prefix?

Using a dedicated prefix for temporary files provides several benefits:

  • Easy Lifecycle Management - Apply S3 lifecycle policies to automatically delete only temporary files
  • Better Organization - Keep temporary files separate from permanent storage
  • Security - Apply different IAM policies or bucket policies to the prefix
  • Cost Control - Easily identify and monitor temporary storage costs

Trailing Slash

Always include a trailing slash (/) in your prefix to create a proper folder structure. Without it, files will be stored with the prefix as part of the filename rather than in a folder.

  • ✅ Correct: tmp/ → Files stored as tmp/file.json
  • ❌ Incorrect: tmp → Files stored as tmpfile.json

Custom prefix examples:

# Production environment
export AWS_S3_TMP_PREFIX=prod/tmp/

# Staging environment
export AWS_S3_TMP_PREFIX=staging/tmp/

# Organize by date (requires manual updates)
export AWS_S3_TMP_PREFIX=tmp/2025/01/

# No prefix (store at bucket root - not recommended)
export AWS_S3_TMP_PREFIX=

AWS_TRANSCRIBE_S3_BUCKET

Purpose : Temporary S3 bucket for transcription workflows

Default : Falls back to AWS_S3_BUCKET if not specified

Requirement : Must be in the same region as AWS_TRANSCRIBE_REGION

# If AWS_TRANSCRIBE_REGION is us-east-1
export AWS_TRANSCRIBE_S3_BUCKET=my-transcribe-temp-us-east-1

# If AWS_TRANSCRIBE_REGION is eu-west-1
export AWS_TRANSCRIBE_S3_BUCKET=my-transcribe-temp-eu-west-1

AWS_S3_REGIONAL_BUCKETS

Purpose : Region-specific S3 buckets for Bedrock async and batch inference operations

Default : Empty (no regional buckets configured)

Format : JSON object with region names as keys and bucket names as values

Requirement : Some Bedrock models require S3 buckets in the same region for async and batch inference operations

export AWS_S3_REGIONAL_BUCKETS='{"us-east-1": "my-bedrock-temp-us-east-1", "eu-west-1": "my-bedrock-temp-eu-west-1"}'

When to Use

Configure this setting when:

  • Using Bedrock async inference API
  • Using Bedrock batch inference API
  • Working with models that require regional S3 storage

If not specified for a region where async/batch operations are attempted, those operations may fail.

Automatic Fallback

For the first region in AWS_BEDROCK_REGIONS (your primary region), if no regional bucket is specified, the service automatically falls back to AWS_S3_BUCKET. You only need to configure regional buckets for additional regions beyond your primary one.

Best Practice

Apply the same S3 Bucket Lifecycle Configuration to these regional buckets as you would for the primary bucket to automatically clean up temporary files.

S3 Bucket Lifecycle Configuration

Purpose : Configure automatic deletion of temporary files to minimize storage costs

Recommendation : Configure S3 lifecycle policies to automatically delete objects under the AWS_S3_TMP_PREFIX after 1 day

stdapi.ai stores temporary files under the prefix configured by AWS_S3_TMP_PREFIX (default: tmp/). These include generated images, audio files, and transcription workflow files. Configure S3 lifecycle policies to automatically delete objects under this prefix after 1 day; an example policy is shown below.

Application Cleanup Behavior

Short-lived temporary files: The application attempts to clean up short-lived temporary files (such as intermediate transcription files) after processing completes.

Results shared with clients: Files shared with clients using presigned URLs (such as generated images and audio) are never cleaned up automatically by the application. These files remain in S3 until removed by lifecycle policies or manual deletion.

Why lifecycle policies are essential: Since the application cannot determine when a client has finished using a presigned URL, S3 lifecycle policies are the recommended mechanism to clean up these files and prevent unbounded storage growth.

{
  "Rules": [
    {
      "Id": "DeleteTemporaryFiles",
      "Status": "Enabled",
      "Filter": {
        "Prefix": "tmp/"
      },
      "Expiration": {
        "Days": 1
      },
      "AbortIncompleteMultipartUpload": {
        "DaysAfterInitiation": 1
      }
    }
  ]
}

Important: Update the Prefix

The "Prefix": "tmp/" value in the lifecycle policy must match your AWS_S3_TMP_PREFIX setting. If you use a custom prefix, update the policy accordingly.

Examples:

  • If AWS_S3_TMP_PREFIX=temporary/, use "Prefix": "temporary/"
  • If AWS_S3_TMP_PREFIX=prod/tmp/, use "Prefix": "prod/tmp/"

Apply via AWS CLI:

# For primary S3 bucket (AWS_S3_BUCKET)
aws s3api put-bucket-lifecycle-configuration \
  --bucket my-stdapi-bucket \
  --lifecycle-configuration file://lifecycle-policy.json

# For transcribe S3 bucket (AWS_TRANSCRIBE_S3_BUCKET, if different from AWS_S3_BUCKET)
aws s3api put-bucket-lifecycle-configuration \
  --bucket my-transcribe-temp-bucket \
  --lifecycle-configuration file://lifecycle-policy.json

# For regional buckets (AWS_S3_REGIONAL_BUCKETS)
aws s3api put-bucket-lifecycle-configuration \
  --bucket my-stdapi-us-west-2-bucket \
  --lifecycle-configuration file://lifecycle-policy.json
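
As a quick check, the applied rules can be read back for each bucket (bucket name is a placeholder):

# Verify the lifecycle rules currently attached to a bucket
aws s3api get-bucket-lifecycle-configuration --bucket my-stdapi-bucket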

Apply to All S3 Buckets

Apply this lifecycle policy to:

  • AWS_S3_BUCKET - Primary bucket for generated files
  • AWS_TRANSCRIBE_S3_BUCKET - Transcription temporary files (if different from AWS_S3_BUCKET)
  • AWS_S3_REGIONAL_BUCKETS - All regional buckets for async/batch operations

All these buckets use the same AWS_S3_TMP_PREFIX for temporary file storage.

Bedrock Configuration

AWS_BEDROCK_REGIONS

Purpose : List of AWS regions where Bedrock models are available

Format : Comma-separated string

Default : Current AWS SDK region if not specified

Behavior : Models are discovered in the same order as the listed regions. The first region is the primary region where your server should be hosted on AWS for optimal performance, and your S3 bucket (AWS_S3_BUCKET) must also be in this region. If a model is unavailable in the primary region, subsequent regions are checked in order.

export AWS_BEDROCK_REGIONS=us-east-1,us-west-2,eu-west-1

Region Selection Guide

| Region | Description |
| --- | --- |
| us-east-1 | Widest model selection, usually gets latest releases first |
| us-west-2 | Good selection, often early access to new models |
| eu-west-1 | European compliance, subset of US models available |

Advanced Configuration

See Compliance and Latency Optimization for detailed configuration examples including GDPR compliance, regional optimization strategies, and best practices for multi-region deployments.

AWS_BEDROCK_CROSS_REGION_INFERENCE

Purpose : Enable automatic cross-region routing when a model isn't available in the primary region

Type : Boolean

Default : true

export AWS_BEDROCK_CROSS_REGION_INFERENCE=true

AWS_BEDROCK_CROSS_REGION_INFERENCE_GLOBAL

Purpose : Allow global cross-region inference routing to any region worldwide

Type : Boolean

Default : true

GDPR Compliance

Set to false to comply with data residency regulations (e.g., EU GDPR) by restricting to regional inference only

export AWS_BEDROCK_CROSS_REGION_INFERENCE_GLOBAL=false

AWS_BEDROCK_LEGACY

Purpose : Allow usage of legacy/deprecated Bedrock models

Type : Boolean

Default : true

export AWS_BEDROCK_LEGACY=true

AWS_BEDROCK_MARKETPLACE_AUTO_SUBSCRIBE

Purpose : Control automatic subscription to new models in AWS Marketplace

Type : Boolean

Default : true

Behavior : When true, the server automatically subscribes to new models discovered in the AWS Marketplace, making them immediately available through the API. When false, only models with existing marketplace subscriptions are visible and accessible

IAM Permissions Required : aws-marketplace:Subscribe, aws-marketplace:ViewSubscriptions

# Allow automatic subscription (default)
export AWS_BEDROCK_MARKETPLACE_AUTO_SUBSCRIBE=true

# Restrict to pre-subscribed models only
export AWS_BEDROCK_MARKETPLACE_AUTO_SUBSCRIBE=false

What is Marketplace Auto-Subscribe?

AWS Bedrock requires marketplace subscription before certain models can be used. This setting controls whether stdapi.ai automatically handles the subscription process:

  • true (default): Models are automatically subscribed when discovered, providing seamless access to new models as they become available
  • false: Only models that have already been subscribed through the AWS Marketplace are visible, providing explicit control over model access

When to Disable

Set to false when:

  • You need explicit control over which models are accessible
  • You want to prevent automatic marketplace subscriptions that may incur costs
  • Your organization requires manual approval for new AI model usage
  • Compliance policies require pre-authorization of AI models

IAM Permission Requirements

This feature requires the following IAM permissions to automatically subscribe to models:

  • aws-marketplace:Subscribe - Subscribe to marketplace offerings
  • aws-marketplace:ViewSubscriptions - View existing marketplace subscriptions

See Bedrock Marketplace Auto-Subscribe section for the complete IAM policy configuration.

AWS Documentation

For more information about Bedrock model access and marketplace registration, see the AWS Bedrock Model Access documentation.

Other AWS Services

Optional Configuration

Each service region is optional and defaults to the first region in AWS_BEDROCK_REGIONS if not specified.

AWS_POLLY_REGION

Purpose : Region for Amazon Polly text-to-speech service

Default : First region in AWS_BEDROCK_REGIONS

export AWS_POLLY_REGION=us-east-1

Amazon Polly Engine Availability

Not all Polly engines (Standard, Neural, Long-form, Generative) are available in all AWS regions. Verify engine and voice availability in your target region. See Amazon Polly feature and region compatibility for detailed information.

AWS_COMPREHEND_REGION

Purpose : Region for Amazon Comprehend language detection service

Default : First region in AWS_BEDROCK_REGIONS

export AWS_COMPREHEND_REGION=us-east-1

Amazon Comprehend Regional Availability

Amazon Comprehend is not available in all AWS regions. stdapi.ai uses the detect_dominant_language feature for language detection. Verify service and feature availability in your target region. See Amazon Comprehend supported regions for regional availability.

AWS_TRANSCRIBE_REGION

Purpose : Region for Amazon Transcribe speech-to-text service

Default : First region in AWS_BEDROCK_REGIONS

export AWS_TRANSCRIBE_REGION=us-east-1

AWS_TRANSLATE_REGION

Purpose : Region for Amazon Translate text translation service

Default : First region in AWS_BEDROCK_REGIONS

export AWS_TRANSLATE_REGION=us-east-1

Compliance and Latency Optimization

Strategic region configuration is critical for both regulatory compliance and performance optimization. This section provides best practice configurations for common scenarios.

AWS AI Services Data Privacy

Amazon Bedrock: Does not store or use user prompts and responses, and does not share them with third parties by default. Your content remains private and is not used to train models.

Other AI Services: AWS collects telemetry data from other AI services (Polly, Comprehend, Transcribe, Translate) by default. For enhanced data privacy and compliance, you can opt out of allowing AWS to use your content to improve these services. Configure AI services opt-out policies at the AWS Organizations level to prevent your data from being used for service improvement.

GDPR and Data Residency Compliance

For applications serving European users, data residency regulations like GDPR may require that data processing occurs within specific geographic boundaries.

EU-Only Configuration (Strict GDPR)
# Use only European regions
export AWS_S3_BUCKET=my-stdapi-eu-bucket
export AWS_BEDROCK_REGIONS=eu-west-1,eu-west-3,eu-central-1

# Disable global cross-region inference to prevent data routing outside Europe
export AWS_BEDROCK_CROSS_REGION_INFERENCE_GLOBAL=false

# Keep cross-region inference enabled for failover within EU regions
export AWS_BEDROCK_CROSS_REGION_INFERENCE=true

Key Compliance Settings

  • AWS_BEDROCK_CROSS_REGION_INFERENCE_GLOBAL=false: Prevents requests from being routed to regions outside your specified list
  • AWS_BEDROCK_CROSS_REGION_INFERENCE=true: Enables cross-region inference within your specified EU regions
  • All services in EU regions: Ensures all data processing stays within European boundaries

Important Considerations

  • Not all Bedrock models are available in all EU regions - verify model availability
  • Some newer models may be available in US regions first; this configuration prioritizes compliance over immediate access to latest models
  • S3 buckets must be created in EU regions and configured appropriately for data residency

Latency Optimization

For applications prioritizing low latency and high performance, configure regions closest to your users and application infrastructure.

🇺🇸 North America:

# Primary region for lowest latency, with fallbacks
export AWS_S3_BUCKET=my-stdapi-us-east-1-bucket
export AWS_BEDROCK_REGIONS=us-east-1,us-west-2,us-east-2

# Enable all cross-region inference for maximum model availability
export AWS_BEDROCK_CROSS_REGION_INFERENCE=true
export AWS_BEDROCK_CROSS_REGION_INFERENCE_GLOBAL=true

🇯🇵 Asia-Pacific:

# Use Asia-Pacific regions for lowest latency to APAC users
export AWS_S3_BUCKET=my-stdapi-ap-southeast-1-bucket
export AWS_BEDROCK_REGIONS=ap-southeast-1,ap-northeast-1,us-west-2

# Enable global inference for fallback to US regions if needed
export AWS_BEDROCK_CROSS_REGION_INFERENCE=true
export AWS_BEDROCK_CROSS_REGION_INFERENCE_GLOBAL=true

🌍 Global Multi-Region:

# Balanced configuration with worldwide coverage
export AWS_S3_BUCKET=my-stdapi-us-east-1-bucket
export AWS_BEDROCK_REGIONS=us-east-1,eu-west-1,ap-southeast-1,us-west-2

# Enable global inference for best availability
export AWS_BEDROCK_CROSS_REGION_INFERENCE=true
export AWS_BEDROCK_CROSS_REGION_INFERENCE_GLOBAL=true

Latency Optimization Tips

  • Server and S3 co-location: Deploy stdapi.ai and your AWS_S3_BUCKET in the first region specified in AWS_BEDROCK_REGIONS (your primary region)
  • Network proximity: Choose the first region based on low latency to your application servers and end users
  • Data transfer costs: Cross-region data transfer incurs costs; co-locating server and S3 in the same region minimizes these
  • Model availability: While us-east-1 often has the most models, check specific model availability in your target regions

Hybrid Approach: Compliance with Performance

Balance compliance requirements with performance needs:

EU Primary with US Fallback
# EU primary with US fallback (for model availability)
export AWS_S3_BUCKET=my-stdapi-eu-bucket
export AWS_BEDROCK_REGIONS=eu-west-1,eu-central-1,us-east-1

# Allow cross-region but restrict to specific regions only
export AWS_BEDROCK_CROSS_REGION_INFERENCE=true
export AWS_BEDROCK_CROSS_REGION_INFERENCE_GLOBAL=false

Legal Compliance Notice

Including us-east-1 as a fallback region provides access to more models but may not comply with strict data residency requirements. Consult your legal and compliance teams before using this configuration.


Configuration Order

When deploying stdapi.ai, configure settings in this recommended order:

  1. IAM Permissions - Set up AWS access first
  2. AWS Services and Regions - Configure S3 buckets and Bedrock regions
  3. Authentication - Secure your API with authentication
  4. Optional Features - Add observability, guardrails, and other features as needed

IAM Permissions

stdapi.ai requires specific AWS IAM permissions to access Bedrock models and other AWS services. The exact permissions needed depend on which features you enable.

Building Your Policy

Combine the permission statements below based on the features you need. At minimum, you need the Bedrock permissions. Add statements for S3, TTS, STT, and other features as required by your deployment.

Bedrock (Required)

Environment Variables: Always required

These permissions are mandatory for stdapi.ai to discover and invoke Bedrock models:

Bedrock IAM Policy Statements
{
  "Sid": "BedrockModelInvoke",
  "Effect": "Allow",
  "Action": [
    "bedrock:GetAsyncInvoke",
    "bedrock:InvokeModel",
    "bedrock:InvokeModelWithResponseStream"
  ],
  "Resource": "*"
},
{
  "Sid": "BedrockModelDiscovery",
  "Effect": "Allow",
  "Action": [
    "bedrock:ListFoundationModels",
    "bedrock:GetFoundationModelAvailability",
    "bedrock:ListProvisionedModelThroughputs",
    "bedrock:ListInferenceProfiles"
  ],
  "Resource": "*"
}

Bedrock Marketplace Auto-Subscribe (Optional)

Environment Variables: AWS_BEDROCK_MARKETPLACE_AUTO_SUBSCRIBE

Required only if you want to enable automatic subscription to new models in the AWS Marketplace (AWS_BEDROCK_MARKETPLACE_AUTO_SUBSCRIBE=true, which is the default). When enabled, the server can automatically subscribe to marketplace offerings for newly discovered models.

Bedrock Marketplace Auto-Subscribe IAM Policy Statement
{
  "Sid": "BedrockMarketplaceAutoSubscribe",
  "Effect": "Allow",
  "Action": [
    "aws-marketplace:Subscribe",
    "aws-marketplace:ViewSubscriptions"
  ],
  "Resource": "*"
}

Cost Consideration

Automatic marketplace subscriptions may incur costs. Review AWS Marketplace pricing for individual models before enabling this feature, or set AWS_BEDROCK_MARKETPLACE_AUTO_SUBSCRIBE=false to require manual marketplace subscription.

Bedrock Guardrails (Optional)

Environment Variables: AWS_BEDROCK_GUARDRAIL_IDENTIFIER, AWS_BEDROCK_GUARDRAIL_VERSION

Required only if you configure Bedrock Guardrails for content filtering. See Bedrock Guardrails configuration section.

Bedrock Guardrails IAM Policy Statement
{
  "Sid": "BedrockGuardrails",
  "Effect": "Allow",
  "Action": [
    "bedrock:ApplyGuardrail"
  ],
  "Resource": "arn:aws:bedrock:*:*:guardrail/*"
}

S3 File Storage (Optional)

Environment Variables: AWS_S3_BUCKET

Required for storing generated images, audio files, and documents. See Storage Configuration for bucket setup details.

S3 File Storage IAM Policy Statements
{
  "Sid": "S3FileStorage",
  "Effect": "Allow",
  "Action": [
    "s3:PutObject",
    "s3:GetObject",
    "s3:DeleteObject"
  ],
  "Resource": "arn:aws:s3:::AWS_S3_BUCKET_VALUE/*"
}

Replace Bucket Name

Replace AWS_S3_BUCKET_VALUE with the value of your AWS_S3_BUCKET environment variable.

If your S3 bucket uses KMS encryption, also add:

{
  "Sid": "KMSEncryptedBucket",
  "Effect": "Allow",
  "Action": [
    "kms:Decrypt",
    "kms:GenerateDataKey"
  ],
  "Resource": "arn:aws:kms:REGION:ACCOUNT_ID:key/YOUR_KMS_KEY_ID",
  "Condition": {
    "StringEquals": {
      "kms:ViaService": "s3.REGION.amazonaws.com"
    }
  }
}

KMS Security

The kms:ViaService condition restricts KMS key usage to S3 service calls only, following AWS security best practices.

Text-to-Speech (Optional)

Environment Variables: AWS_POLLY_REGION, DEFAULT_TTS_MODEL

Required for generating speech from text using Amazon Polly. See Audio and Text-to-Speech configuration section.

Polly Text-to-Speech IAM Policy Statement
{
  "Sid": "PollyTextToSpeech",
  "Effect": "Allow",
  "Action": [
    "polly:SynthesizeSpeech",
    "polly:DescribeVoices"
  ],
  "Resource": "*"
}

Speech-to-Text (Optional)

Environment Variables: AWS_TRANSCRIBE_REGION, AWS_TRANSCRIBE_S3_BUCKET

Required for transcribing audio files using Amazon Transcribe.

Transcribe Speech-to-Text IAM Policy Statements
{
  "Sid": "TranscribeSpeechToText",
  "Effect": "Allow",
  "Action": [
    "transcribe:StartTranscriptionJob",
    "transcribe:GetTranscriptionJob",
    "transcribe:DeleteTranscriptionJob"
  ],
  "Resource": "*"
},
{
  "Sid": "TranscribeS3Storage",
  "Effect": "Allow",
  "Action": [
    "s3:PutObject",
    "s3:GetObject",
    "s3:DeleteObject"
  ],
  "Resource": "arn:aws:s3:::AWS_TRANSCRIBE_S3_BUCKET_VALUE/*"
}

Replace Bucket Name

Replace AWS_TRANSCRIBE_S3_BUCKET_VALUE with the value of your AWS_TRANSCRIBE_S3_BUCKET environment variable (or AWS_S3_BUCKET if using the same bucket).

If your transcribe S3 bucket uses KMS encryption, also add the KMS permissions with the appropriate bucket ARN.

Language Detection (Optional)

Environment Variables: AWS_COMPREHEND_REGION

Required for automatic language detection (used by TTS for voice selection).

Comprehend Language Detection IAM Policy Statement
{
  "Sid": "ComprehendLanguageDetection",
  "Effect": "Allow",
  "Action": [
    "comprehend:DetectDominantLanguage"
  ],
  "Resource": "*"
}

Text Translation (Optional)

Environment Variables: AWS_TRANSLATE_REGION

Required for text translation features.

Translate Text Translation IAM Policy Statement
{
  "Sid": "TranslateTextTranslation",
  "Effect": "Allow",
  "Action": [
    "translate:TranslateText"
  ],
  "Resource": "*"
}

API Key Authentication (Optional)

Required if you configure API authentication. See Authentication configuration section.

SSM Parameter Store

Environment Variables: API_KEY_SSM_PARAMETER

SSM Parameter Store IAM Policy Statements
{
  "Sid": "SSMParameterAccess",
  "Effect": "Allow",
  "Action": [
    "ssm:GetParameter"
  ],
  "Resource": "arn:aws:ssm:REGION:ACCOUNT_ID:parameter/API_KEY_SSM_PARAMETER_VALUE"
}

Replace Parameter Path

Replace API_KEY_SSM_PARAMETER_VALUE with the value of your API_KEY_SSM_PARAMETER environment variable (e.g., /stdapi/prod/api-key).

If using encrypted SSM parameters, also add:

{
  "Sid": "KMSDecryptionForSSM",
  "Effect": "Allow",
  "Action": [
    "kms:Decrypt"
  ],
  "Resource": "arn:aws:kms:REGION:ACCOUNT_ID:key/YOUR_KMS_KEY_ID",
  "Condition": {
    "StringEquals": {
      "kms:ViaService": "ssm.REGION.amazonaws.com"
    }
  }
}

KMS Security

The kms:ViaService condition restricts KMS key usage to SSM service calls only.

Secrets Manager

Environment Variables: API_KEY_SECRETSMANAGER_SECRET

Secrets Manager IAM Policy Statement
{
  "Sid": "SecretsManagerAccess",
  "Effect": "Allow",
  "Action": [
    "secretsmanager:GetSecretValue"
  ],
  "Resource": "arn:aws:secretsmanager:REGION:ACCOUNT_ID:secret:API_KEY_SECRETSMANAGER_SECRET_VALUE"
}

Replace Secret Name

Replace API_KEY_SECRETSMANAGER_SECRET_VALUE with the value of your API_KEY_SECRETSMANAGER_SECRET environment variable (e.g., stdapi-api-key).

Complete Policy Examples

Minimal Policy (Bedrock Only)
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "BedrockModelInvoke",
      "Effect": "Allow",
      "Action": [
        "bedrock:GetAsyncInvoke",
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ],
      "Resource": "*"
    },
    {
      "Sid": "BedrockModelDiscovery",
      "Effect": "Allow",
      "Action": [
        "bedrock:ListFoundationModels",
        "bedrock:GetFoundationModelAvailability",
        "bedrock:ListProvisionedModelThroughputs",
        "bedrock:ListInferenceProfiles"
      ],
      "Resource": "*"
    },
    {
      "Sid": "BedrockMarketplaceAutoSubscribe",
      "Effect": "Allow",
      "Action": [
        "aws-marketplace:Subscribe",
        "aws-marketplace:ViewSubscriptions"
      ],
      "Resource": "*"
    }
  ]
}

Marketplace Auto-Subscribe (Default Enabled)

The marketplace permissions are included because AWS_BEDROCK_MARKETPLACE_AUTO_SUBSCRIBE defaults to true. If you set it to false, you can remove the BedrockMarketplaceAutoSubscribe statement.

Production Policy (Bedrock + S3 + Authentication)
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "BedrockModelInvoke",
      "Effect": "Allow",
      "Action": [
        "bedrock:GetAsyncInvoke",
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ],
      "Resource": "*"
    },
    {
      "Sid": "BedrockModelDiscovery",
      "Effect": "Allow",
      "Action": [
        "bedrock:ListFoundationModels",
        "bedrock:GetFoundationModelAvailability",
        "bedrock:ListProvisionedModelThroughputs",
        "bedrock:ListInferenceProfiles"
      ],
      "Resource": "*"
    },
    {
      "Sid": "BedrockMarketplaceAutoSubscribe",
      "Effect": "Allow",
      "Action": [
        "aws-marketplace:Subscribe",
        "aws-marketplace:ViewSubscriptions"
      ],
      "Resource": "*"
    },
    {
      "Sid": "S3FileStorage",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:DeleteObject"
      ],
      "Resource": "arn:aws:s3:::my-stdapi-bucket/*"
    },
    {
      "Sid": "SSMParameterAccess",
      "Effect": "Allow",
      "Action": [
        "ssm:GetParameter"
      ],
      "Resource": "arn:aws:ssm:us-east-1:123456789012:parameter/stdapi/prod/api-key"
    }
  ]
}

Marketplace Auto-Subscribe (Default Enabled)

The marketplace permissions are included because AWS_BEDROCK_MARKETPLACE_AUTO_SUBSCRIBE defaults to true. If you set it to false, you can remove the BedrockMarketplaceAutoSubscribe statement to follow the principle of least privilege.
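
As a sketch of how to apply this policy, it can be attached as an inline policy on the IAM role used by the service (role and policy names below are placeholders):

# Attach the policy document as an inline role policy
aws iam put-role-policy \
  --role-name stdapi-task-role \
  --policy-name stdapi-permissions \
  --policy-document file://policy.json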

Permission Notes

Least Privilege Principle

Only include the permission statements you need for your specific deployment. Start with Bedrock permissions and add others as required.

Feature-Specific Permission Requirements

| Feature | Required Permissions | Configuration |
| --- | --- | --- |
| Bedrock Models (Invoke) | bedrock:InvokeModel, bedrock:InvokeModelWithResponseStream | Always required |
| Bedrock Models (Discovery) | bedrock:ListFoundationModels, bedrock:GetFoundationModelAvailability, bedrock:ListProvisionedModelThroughputs, bedrock:ListInferenceProfiles | Always required |
| Bedrock Marketplace Auto-Subscribe | aws-marketplace:Subscribe, aws-marketplace:ViewSubscriptions | AWS_BEDROCK_MARKETPLACE_AUTO_SUBSCRIBE=true (default) |
| Bedrock Guardrails | bedrock:ApplyGuardrail | AWS_BEDROCK_GUARDRAIL_IDENTIFIER |
| File Storage | s3:PutObject, s3:GetObject, s3:DeleteObject | AWS_S3_BUCKET |
| KMS Encrypted S3 Buckets | kms:Decrypt, kms:GenerateDataKey (with kms:ViaService condition) | If S3 buckets use KMS encryption |
| Text-to-Speech | polly:SynthesizeSpeech, polly:DescribeVoices | AWS_POLLY_REGION |
| Speech-to-Text | transcribe:StartTranscriptionJob, transcribe:GetTranscriptionJob, transcribe:DeleteTranscriptionJob, s3:PutObject (transcribe bucket) | AWS_TRANSCRIBE_REGION, AWS_TRANSCRIBE_S3_BUCKET |
| Language Detection | comprehend:DetectDominantLanguage | AWS_COMPREHEND_REGION |
| Translation | translate:TranslateText | AWS_TRANSLATE_REGION |
| SSM Parameter Store | ssm:GetParameter, kms:Decrypt (if encrypted) | API_KEY_SSM_PARAMETER |
| Secrets Manager | secretsmanager:GetSecretValue | API_KEY_SECRETSMANAGER_SECRET |

IAM Role vs. IAM User

stdapi.ai supports both IAM roles and IAM users:

  • IAM Role (Recommended): Use when running on EC2, ECS, Lambda, or other AWS compute services. Attach the policy to the instance/task role.
  • IAM User: Use when running outside AWS or for development. Create an IAM user with the required permissions and configure AWS credentials via environment variables or AWS CLI configuration.

Best Practice: Use IAM Roles

When deploying on AWS infrastructure, always prefer IAM roles over IAM users with access keys. IAM roles provide automatic credential rotation and better security.


Authentication

stdapi.ai supports three methods for API key authentication.

Authentication Methods

Choose only one method - they are mutually exclusive, with the following precedence order:

  1. SSM Parameter Store (highest precedence)
  2. Secrets Manager
  3. Direct API key (lowest precedence)

No Authentication Warning

If no authentication method is configured, the API accepts all requests without authentication. This is suitable only for internal/private deployments.

Method 1: SSM Parameter Store (Recommended)

Use AWS Systems Manager Parameter Store for secure key storage with encryption, access control, and auditing.

API_KEY_SSM_PARAMETER

Purpose : Name of the SSM parameter containing the API key. The parameter is retrieved from the current region detected by the running container, or defaults to the first region in AWS_BEDROCK_REGIONS.

Recommendation : Use SecureString type for encryption at rest

IAM Permissions Required : ssm:GetParameter, kms:Decrypt (if encrypted)

export API_KEY_SSM_PARAMETER=/stdapi/prod/api-key
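
To create the parameter as an encrypted SecureString, a command along these lines can be used (the key value is a placeholder):

# Store the API key encrypted at rest in SSM Parameter Store
aws ssm put-parameter \
  --name /stdapi/prod/api-key \
  --type SecureString \
  --value "sk-1234567890abcdef..."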

Method 2: Secrets Manager

Use AWS Secrets Manager for secure key storage with automatic rotation support.

API_KEY_SECRETSMANAGER_SECRET

Purpose : Name of the Secrets Manager secret containing the API key. The secret is retrieved from the current region detected by the running container, or defaults to the first region in AWS_BEDROCK_REGIONS.

Format : Can be a plain string or JSON object

IAM Permissions Required : secretsmanager:GetSecretValue

API_KEY_SECRETSMANAGER_KEY

Purpose : JSON key name within the secret (if the secret is a JSON object)

Default : api_key

Plain String Secret:

export API_KEY_SECRETSMANAGER_SECRET=stdapi-api-key

JSON Secret:

export API_KEY_SECRETSMANAGER_SECRET=stdapi-credentials
export API_KEY_SECRETSMANAGER_KEY=api_key

Example JSON secret structure:

{
  "api_key": "sk-1234567890abcdef...",
  "other_config": "value"
}
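
To create such a secret, a command along these lines can be used (secret name and key value are placeholders):

# Create a JSON secret containing the API key
aws secretsmanager create-secret \
  --name stdapi-credentials \
  --secret-string '{"api_key": "sk-1234567890abcdef..."}'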

Method 3: Direct API Key

Provide the API key directly via environment variable.

API_KEY

Purpose : Static API key value

Security Warning : Avoid hardcoding in configuration files; use environment variables only

Client Usage : Clients must include this key in the Authorization: Bearer <key> header or X-API-Key header

export API_KEY=sk-1234567890abcdef...
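
For example, a client might pass the key in either supported header (hostname and key are placeholders):

# Authorization header
curl -H "Authorization: Bearer sk-1234567890abcdef..." https://api.example.com/v1/models

# X-API-Key header
curl -H "X-API-Key: sk-1234567890abcdef..." https://api.example.com/v1/models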

OpenAI API Compatibility

OPENAI_ROUTES_PREFIX

Purpose : Base path prefix for OpenAI-compatible API routes

Default : Empty (no prefix)

Effect : All OpenAI-compatible endpoints will be mounted under this prefix

export OPENAI_ROUTES_PREFIX=/api

Example Endpoints

With the prefix /api, endpoints are available at:

  • /api/v1/chat/completions
  • /api/v1/models
  • /api/v1/embeddings
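
For example, with the prefix /api configured, a request might look like this (hostname and key are placeholders):

curl https://api.example.com/api/v1/models \
  -H "Authorization: Bearer sk-1234567890abcdef..."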

CORS Configuration

Configure Cross-Origin Resource Sharing (CORS) to control which web origins can access your API from browsers.

CORS_ALLOW_ORIGINS

Purpose : List of origins allowed to make cross-origin requests

Format : JSON array of origin URLs

Default : None (CORS not enabled)

Best Practice : Only enable if your API is accessed from web browsers; specify exact origins in production

# Not configured (default) - CORS middleware not enabled
# Browser cross-origin requests will be blocked
# No environment variable needed

# Development: Allow all origins
export CORS_ALLOW_ORIGINS='["*"]'

# Production: Specific origins only
export CORS_ALLOW_ORIGINS='["https://myapp.com", "https://app.example.com"]'

# Multiple environments
export CORS_ALLOW_ORIGINS='["https://app.example.com", "https://staging.example.com"]'

What is CORS?

Cross-Origin Resource Sharing (CORS) is a browser security mechanism that restricts web pages from making requests to a different domain than the one serving the web page.

Without CORS enabled:

  • Browser requests from web applications will fail due to missing CORS headers
  • Non-browser clients (curl, SDKs, mobile apps, server-to-server) work normally
  • Most secure default - no cross-origin access from browsers

With CORS enabled:

  • Browsers can make requests from allowed origins
  • Preflight OPTIONS requests are handled automatically
  • Non-browser clients continue to work normally

Security Consideration

  • Default (not configured): CORS is disabled. Browser cross-origin requests will fail. This is the most secure default.
  • ["*"]: Allows requests from any web origin. Convenient for development but not recommended for production.
  • Specific origins: Only allows requests from listed origins. Recommended for production.

CORS Behavior

  • When CORS_ALLOW_ORIGINS is not configured (default), CORS is not enabled
  • When configured with specific origins or ["*"], CORS is enabled with:
    • Authorization headers with credentials allowed
    • All HTTP methods allowed
    • All request headers allowed

When to Configure

Configure CORS_ALLOW_ORIGINS when:

  • Your API is accessed from browser-based web applications (React, Vue, Angular, etc.)
  • Building a web frontend that calls your API from a different domain
  • Developing locally with web apps (browser at localhost:3000 calling API at localhost:8000)

When NOT to Configure

Do not configure CORS when:

  • Your API is only accessed from server-to-server integrations
  • Your API is only accessed from mobile apps or desktop clients
  • Your API is only accessed from CLI tools or SDKs
  • Your API is only accessed from non-browser HTTP clients

Non-browser clients don't enforce CORS, so enabling it is unnecessary overhead.


Trusted Host Configuration

Configure Host header validation to protect against Host header injection attacks.

TRUSTED_HOSTS

Purpose : List of trusted Host header values for validation

Format : JSON array of hostnames (supports wildcards)

Default : None (no Host header validation)

Best Practice : Use AWS ALB host-based routing rules instead when possible for better performance and management

# Not configured (default) - no Host header validation
# No environment variable needed

# Production: Specific hosts only
export TRUSTED_HOSTS='["api.example.com", "www.example.com"]'

# With wildcard subdomains
export TRUSTED_HOSTS='["*.example.com", "api.myapp.com"]'

# Multiple environments including localhost
export TRUSTED_HOSTS='["api.example.com", "staging.example.com", "localhost"]'

What is Host Header Validation?

The Host header in HTTP requests specifies the domain name of the server. Host header validation ensures that requests are only processed when they target your legitimate domains, preventing:

  • Host header injection attacks - Malicious manipulation of Host headers to generate poisoned cache entries or exploit application logic
  • Web cache poisoning - Attacks that exploit Host header handling in caching layers

Security Consideration

By default, no Host header validation is performed. For production deployments exposed to the internet, configure host validation.

Recommended approach for AWS deployments:

  • Use AWS ALB host-based routing rules to restrict which Host headers reach your application
  • Configure ALB listener rules to only forward traffic for approved hostnames
  • This provides better performance and centralized management compared to application-level validation

Use TRUSTED_HOSTS setting when:

  • You cannot configure host-based routing at the load balancer level
  • You need application-level defense-in-depth
  • You're not using AWS ALB or similar services

Wildcard Support

Wildcard subdomains are supported using the * prefix:

  • *.example.com - Matches any subdomain of example.com (api.example.com, app.example.com, etc.)
  • example.com - Matches only the exact domain
  • * - Not recommended, but matches all hosts (equivalent to no validation)

Common Configurations

Single Domain Production:

export TRUSTED_HOSTS='["api.example.com"]'

Multi-Domain with Subdomains:

export TRUSTED_HOSTS='["*.example.com", "*.myapp.com", "api.production.com"]'

Development and Production:

export TRUSTED_HOSTS='["api.example.com", "localhost", "127.0.0.1"]'

Host Validation Behavior

  • When TRUSTED_HOSTS is not configured (default), Host header validation is not enabled
  • When configured, requests with non-matching Host headers are rejected with HTTP 400 Bad Request

When to Configure

Configure TRUSTED_HOSTS when:

  • You need defense-in-depth beyond load balancer rules
  • You cannot configure host-based routing at the load balancer level
  • Deploying without AWS ALB or similar load balancer with host validation

AWS ALB Host-Based Routing (Recommended)

Instead of using TRUSTED_HOSTS, configure AWS ALB listener rules to validate Host headers:

Via AWS Console:

  1. Navigate to EC2 → Load Balancers → Your ALB → Listeners
  2. Add rules to listener on port 443 (HTTPS)
  3. Add condition: "Host header" is "api.example.com"
  4. Forward to target group only if Host header matches

Via AWS CLI:

aws elbv2 create-rule \
  --listener-arn arn:aws:elasticloadbalancing:... \
  --priority 1 \
  --conditions Field=host-header,Values=api.example.com \
  --actions Type=forward,TargetGroupArn=arn:aws:elasticloadbalancing:...

Benefits of ALB host validation:

  • Better performance (rejected at load balancer, not application)
  • Centralized security policy management
  • ALB metrics and logging for rejected requests
  • Reduced load on application servers

Proxy Headers Configuration

Configure X-Forwarded-* header processing when running behind reverse proxies or load balancers.

ENABLE_PROXY_HEADERS

Purpose : Enable trusting X-Forwarded-* headers from reverse proxies

Type : Boolean

Default : false (disabled)

Best Practice : Only enable when running behind a trusted reverse proxy

# Disabled (default) - do not trust X-Forwarded-* headers
# No environment variable needed

# Enable when behind reverse proxy
export ENABLE_PROXY_HEADERS=true

What are X-Forwarded Headers?

When your application runs behind a reverse proxy (nginx, Apache, AWS ALB, CloudFront, etc.), the proxy sits between clients and your application. Without proxy header processing:

  • The application sees the proxy's IP address instead of the client's real IP
  • The application sees the proxy-to-app connection (e.g., HTTP) instead of the original client connection (e.g., HTTPS)
  • The application cannot distinguish between different clients behind the proxy

Reverse proxies add X-Forwarded-* headers to preserve the original request information:

  • X-Forwarded-For - Client's real IP address (and chain of proxies)
  • X-Forwarded-Proto - Original protocol (http/https)
  • X-Forwarded-Port - Original port number

Security Warning

CRITICAL: Only enable ENABLE_PROXY_HEADERS when running behind a trusted reverse proxy that properly sets X-Forwarded-* headers.

If enabled without a trusted proxy:

  • Clients can spoof their IP address by sending fake X-Forwarded-For headers
  • Security controls based on client IP (rate limiting, allowlists) can be bypassed
  • Logging and monitoring will record incorrect client information
  • Authentication and authorization decisions may be affected

Never enable this setting if your application is directly exposed to the internet without a reverse proxy.

Common Deployment Scenarios

Scenario 1: Direct to Internet (No Proxy)

# Do NOT enable proxy headers
# ENABLE_PROXY_HEADERS should remain false (default)

Your application receives requests directly from clients.

Scenario 2: Behind AWS ALB/CloudFront

export ENABLE_PROXY_HEADERS=true

AWS load balancer or CDN forwards requests to your application.

Scenario 3: Multiple AWS Proxy Layers

export ENABLE_PROXY_HEADERS=true

Example: CloudFront → ALB → Your Application

Proxy Headers Behavior

  • When ENABLE_PROXY_HEADERS is false (default), X-Forwarded-* headers are not trusted
  • When enabled, the server processes X-Forwarded-For, X-Forwarded-Proto, and X-Forwarded-Port headers to determine client information
  • All proxies are trusted - ensure your network architecture prevents untrusted sources from reaching the application

When to Enable

Enable ENABLE_PROXY_HEADERS when:

  • Deployed behind AWS ALB, NLB, API Gateway, or CloudFront
  • Running behind any reverse proxy that sets X-Forwarded-* headers

AWS Proxy Configuration

AWS ALB, NLB, and CloudFront automatically set X-Forwarded-* headers - no additional configuration needed.

When you enable ENABLE_PROXY_HEADERS=true, your application will trust these headers to determine:

  • Client's real IP address (from X-Forwarded-For)
  • Original protocol (from X-Forwarded-Proto: http/https)
  • Original port (from X-Forwarded-Port)

GZip Compression

Configure automatic GZip compression for HTTP responses to reduce bandwidth usage and improve response times.

ENABLE_GZIP

Purpose : Enable GZip compression for HTTP responses

Type : Boolean

Default : false (disabled)

Best Practice : Use AWS ALB or CloudFront compression instead when available for better performance

# Disabled (default) - no response compression
# No environment variable needed

# Enable GZip compression (responses larger than 1 KiB will be compressed)
export ENABLE_GZIP=true

How GZip Compression Works

When enabled, the server automatically:

  1. Checks if the response size exceeds 1 KiB (1024 bytes)
  2. Verifies the client supports compression (via Accept-Encoding: gzip header)
  3. Compresses the response body using gzip
  4. Adds Content-Encoding: gzip header to the response

Typical compression ratios for JSON responses: 60-80% size reduction
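
To confirm compression is being applied, request a response with gzip accepted and inspect the headers (hostname is a placeholder):

# Expect "content-encoding: gzip" for responses larger than 1 KiB
curl -sS -D - -o /dev/null \
  -H "Accept-Encoding: gzip" \
  https://api.example.com/v1/models | grep -i content-encoding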

Recommended: Use AWS Compression Services

Instead of enabling application-level compression, use AWS services for better performance:

AWS ALB (Application Load Balancer):

  • Enable compression in ALB target group attributes
  • ALB compresses responses before sending to clients
  • Reduces CPU load on your application servers
  • AWS ALB Compression Documentation

AWS CloudFront (CDN):

  • Enable automatic compression in CloudFront distribution settings
  • Compresses and caches responses at edge locations globally
  • Best performance for geographically distributed users
  • CloudFront Compression Documentation

Benefits of AWS-managed compression:

  • No CPU overhead on application servers
  • Offloads compression to AWS infrastructure
  • Better performance with CloudFront edge locations
  • Centralized configuration and management

When to Enable Application-Level Compression

Enable ENABLE_GZIP only when:

  • You're not using AWS ALB or CloudFront
  • Your API returns large JSON responses and you want to reduce bandwidth
  • Local development or non-AWS deployments

When NOT to Enable

Do not enable when:

  • You're behind AWS ALB with compression enabled
  • You're using CloudFront with compression enabled
  • CPU usage is a concern (compression adds CPU overhead)

Enabling compression at multiple layers is redundant and wastes CPU resources.

Compression Behavior

  • When ENABLE_GZIP is false (default), compression is not enabled
  • When enabled, only responses meeting these criteria are compressed:
    • Response size ≥ 1 KiB (1024 bytes)
    • Client sends Accept-Encoding: gzip header
    • Response does not already have Content-Encoding header
  • Streaming responses are compressed on-the-fly

Configuring AWS Compression

AWS ALB Compression:

Enable via AWS Console:

  1. Navigate to EC2 → Target Groups → Your Target Group
  2. Edit target group attributes
  3. Enable "Compression" attribute

Enable via AWS CLI:

aws elbv2 modify-target-group-attributes \
  --target-group-arn arn:aws:elasticloadbalancing:region:account:targetgroup/... \
  --attributes Key=compression.enabled,Value=true

AWS CloudFront Compression:

Enable via AWS Console:

  1. Navigate to CloudFront → Distributions → Your Distribution
  2. Edit behavior settings
  3. Enable "Compress Objects Automatically"

Enable via AWS CLI:

aws cloudfront update-distribution \
  --id YOUR_DISTRIBUTION_ID \
  --distribution-config file://config.json
# Set "Compress": true in the distribution config

Performance Impact

Application-level compression costs:

  • Increased CPU usage on application servers
  • Memory overhead for compression buffers
  • Small latency increase (1-5ms per request)

AWS-managed compression benefits:

  • No CPU impact on application servers
  • Better overall performance
  • Lower costs (compression offloaded to AWS infrastructure)

SSRF Protection

Configure Server-Side Request Forgery (SSRF) protection to prevent unauthorized access to internal networks.

SSRF_PROTECTION_BLOCK_PRIVATE_NETWORKS

Purpose : Enable SSRF protection by blocking requests to private/local networks

Type : Boolean

Default : true (enabled for security)

Best Practice : Keep enabled in production to protect against SSRF attacks

# Enabled (default) - block private networks
# No environment variable needed

# Disable only in controlled environments that need local network access
export SSRF_PROTECTION_BLOCK_PRIVATE_NETWORKS=false

What is SSRF Protection?

Server-Side Request Forgery (SSRF) is an attack where an attacker can make the server send requests to unintended destinations, including internal network resources.

SSRF protection has two layers:

  1. Baseline Protection (Always Enabled) - Cannot be disabled:

    • Loopback Addresses - 127.0.0.0/8, ::1
    • Unspecified Addresses - 0.0.0.0, ::
    • Link-Local Addresses - 169.254.0.0/16, fe80::/10
    • Reserved IP Ranges - IETF reserved addresses
    • Multicast Addresses - Multicast IP ranges
  2. Private Network Protection (Controlled by this setting):

    • RFC 1918 Private Networks - 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16
    • Other Private Address Ranges - Including IPv6 unique local addresses (fc00::/7)

Security Warning

CRITICAL: Only disable SSRF_PROTECTION_BLOCK_PRIVATE_NETWORKS in controlled environments where accessing internal networks is explicitly required and safe.

If disabled, private network protection is removed:

  • Attackers may be able to access RFC 1918 private network resources (10.x.x.x, 172.16-31.x.x, 192.168.x.x) through your API
  • Internal services on private networks (databases, admin panels, internal APIs) may be exposed
  • Internal APIs without authentication may be exploited

Important: Even when disabled, baseline protection remains active and prevents access to:

  • Loopback addresses (127.0.0.1, localhost) - always blocked
  • Link-local addresses (169.254.x.x) including AWS EC2 metadata endpoint - always blocked
  • Reserved and multicast addresses - always blocked

When to Disable

Disable SSRF_PROTECTION_BLOCK_PRIVATE_NETWORKS only when:

  • Your application legitimately needs to access internal network resources
  • Local development environment where accessing localhost services is required
  • You have other security controls in place (network segmentation, firewall rules)
  • Running in isolated Docker/container environments with restricted network access (see the sketch below)
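
For the container case above, a minimal sketch of passing the setting into an isolated local container. The image name and port mapping are placeholders for illustration, not the project's published values:

# Hypothetical image name - substitute your own build or published image
docker run --rm \
  -e SSRF_PROTECTION_BLOCK_PRIVATE_NETWORKS=false \
  -p 8000:8000 \
  my-stdapi-image:dev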

Defense in Depth

Even with SSRF protection enabled, implement additional security measures:

  • Network Segmentation - Isolate application servers from sensitive internal networks
  • Firewall Rules - Restrict outbound connections from application servers
  • Security Groups - Use AWS security groups to limit network access
  • Monitoring - Log and monitor outbound requests for suspicious patterns

Observability (OpenTelemetry)

Configure distributed tracing for debugging and performance monitoring. stdapi.ai integrates with AWS X-Ray, Jaeger, DataDog, and other OTLP-compatible systems.

OTEL_ENABLED

Purpose : Enable or disable OpenTelemetry tracing

Type : Boolean

Default : false

export OTEL_ENABLED=true

Performance Consideration

Disable in performance-critical deployments where observability is not needed.

OTEL_SERVICE_NAME

Purpose : Service identifier in trace visualizations

Default : stdapi

Best Practice : Use descriptive names with environment information

export OTEL_SERVICE_NAME=stdapi-production-us-east-1

OTEL_EXPORTER_ENDPOINT

Purpose : OTLP HTTP endpoint URL for sending traces

Default : http://127.0.0.1:4318/v1/traces

Protocol : Must support OTLP HTTP format

AWS X-Ray (via ADOT):

export OTEL_EXPORTER_ENDPOINT=http://127.0.0.1:4318/v1/traces
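
This default endpoint assumes an OpenTelemetry collector, such as the AWS Distro for OpenTelemetry (ADOT) Collector, running next to the service and forwarding traces to X-Ray. A minimal local sketch, assuming Docker, the public ADOT Collector image with its default configuration, and AWS credentials available on the host (the region and credential mount are illustrative):

# Run an ADOT Collector that accepts OTLP/HTTP on port 4318 and forwards traces to X-Ray
docker run --rm \
  -e AWS_REGION=us-east-1 \
  -v ~/.aws:/root/.aws:ro \
  -p 4318:4318 \
  public.ecr.aws/aws-observability/aws-otel-collector:latest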

Jaeger:

export OTEL_EXPORTER_ENDPOINT=http://jaeger:4318/v1/traces

Cloud Provider OTLP:

# Use provider-specific OTLP endpoints
export OTEL_EXPORTER_ENDPOINT=https://your-provider-otlp-endpoint.com/v1/traces

OTEL_SAMPLE_RATE

Purpose : Percentage of requests to trace (controls cost vs. observability)

Type : Float (0.0 to 1.0)

Default : 1.0 (100%)

Development:

# Trace everything for debugging
export OTEL_SAMPLE_RATE=1.0

Production (Moderate Traffic):

# Sample 10% of requests
export OTEL_SAMPLE_RATE=0.1

Production (High Traffic):

# Sample 1% of requests
export OTEL_SAMPLE_RATE=0.01

Sampling Recommendations

Sample Rate | Use Case
1.0 (100%) | Development, debugging, low-traffic services
0.1 (10%) | Production with moderate traffic
0.01 (1%) | High-traffic production services
0.0 (0%) | Equivalent to disabling tracing

API Documentation Routes

stdapi.ai provides automatic API documentation routes, which are disabled by default for security in production environments.

Security Consideration

Exposing API documentation routes in production can reveal internal API structure, available endpoints, and request/response schemas to potential attackers. Only enable these routes in development/testing environments or when absolutely necessary.

ENABLE_DOCS

Purpose : Enable interactive Swagger UI documentation at /docs

Type : Boolean

Default : false (disabled)

# Enable for development
export ENABLE_DOCS=true

Interactive Documentation Features

The /docs endpoint provides an interactive interface to:

  • Browse all available API endpoints
  • Test API requests directly from the browser
  • View request/response schemas
  • Understand parameter requirements

ENABLE_REDOC

Purpose : Enable ReDoc documentation UI at /redoc

Type : Boolean

Default : false (disabled)

# Enable for development
export ENABLE_REDOC=true

ReDoc Features

The /redoc endpoint provides a clean, responsive documentation interface with:

  • Three-panel layout for easy navigation
  • Enhanced schema visualization
  • Better rendering for complex APIs
  • Export to OpenAPI specification

Static Documentation Available

ReDoc API documentation is also available as static documentation at API Reference without requiring this endpoint to be enabled.

ENABLE_OPENAPI_JSON

Purpose : Enable OpenAPI schema JSON endpoint at /openapi.json

Type : Boolean

Default : false (disabled)

# Enable for development
export ENABLE_OPENAPI_JSON=true

OpenAPI Schema

The /openapi.json endpoint provides the raw OpenAPI 3.0 specification, useful for:

  • Generating API clients in various languages
  • Importing into API testing tools (Postman, Insomnia)
  • API documentation generation
  • Contract testing and validation

Automatic Enablement

If either ENABLE_DOCS or ENABLE_REDOC is set to true, the /openapi.json endpoint will be automatically enabled since both documentation UIs require the OpenAPI schema to function. You only need to explicitly set ENABLE_OPENAPI_JSON=true if you want to expose the schema endpoint without enabling the documentation UIs.
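
With the schema endpoint available (explicitly or via automatic enablement), you can download it for client generation or import into testing tools. A minimal sketch, assuming a local instance on 127.0.0.1:8000:

# Fetch the OpenAPI schema from a local development instance
curl -sS http://127.0.0.1:8000/openapi.json -o stdapi-openapi.json
# Import stdapi-openapi.json into Postman/Insomnia or feed it to a client generator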

Development Configuration

Enable all documentation routes for local development:

export ENABLE_DOCS=true
export ENABLE_REDOC=true
# ENABLE_OPENAPI_JSON is automatically enabled when ENABLE_DOCS or ENABLE_REDOC is true

Or enable only Swagger UI:

export ENABLE_DOCS=true
# ENABLE_OPENAPI_JSON is automatically enabled

Or enable only ReDoc:

export ENABLE_REDOC=true
# ENABLE_OPENAPI_JSON is automatically enabled

Production Best Practice

# Keep all routes disabled in production (default)
# No environment variables needed - defaults to false

Production Warning

Never enable these routes in production unless you have specific security controls in place (e.g., IP allowlisting, VPN-only access, or additional authentication layer).


Validation and Logging

For comprehensive logging and monitoring information, see the Logging and Monitoring guide.

Variable | Type | Default | Description
STRICT_INPUT_VALIDATION | Boolean | false | Reject API requests containing unknown/extra fields
LOG_LEVEL | String | info | Minimum log level to output (see Logging Level)
LOG_REQUEST_PARAMS | Boolean | false | Include request/response parameters in logs
TIMEZONE | String | UTC | IANA timezone identifier for request timestamps

Strict Validation:

# Returns HTTP 400 for requests with unexpected fields
export STRICT_INPUT_VALIDATION=true
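
As an illustration, a request carrying an unexpected field is rejected when strict validation is enabled. The host and model ID below are placeholders; adjust them to your deployment:

# The unknown "foo" field triggers HTTP 400 when STRICT_INPUT_VALIDATION=true
curl -sS -o /dev/null -w "%{http_code}\n" \
  -X POST http://127.0.0.1:8000/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "amazon.nova-micro-v1:0", "messages": [{"role": "user", "content": "hi"}], "foo": "bar"}'
# Expected output: 400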

Logging Level

LOG_LEVEL

Purpose : Control the minimum severity of log events written to STDOUT

Default : info

Options : info, warning, error, critical, disabled

Behavior : Only log events at or above the configured level are output. Log levels are ordered by severity: info < warning < error < critical

# Default: Output all log events
export LOG_LEVEL=info

# Production: Suppress info logs, show only warnings and higher
export LOG_LEVEL=warning

# Critical only: Show only critical errors
export LOG_LEVEL=critical

# Disable logging: Suppress all log output (not recommended)
export LOG_LEVEL=disabled

Log Level Examples

Level | Outputs | Use Case
info | info, warning, error, critical | Development, debugging, full visibility
warning | warning, error, critical | Production (recommended for most deployments)
error | error, critical | High-traffic production, reduce log volume
critical | critical only | Minimal logging, only show fatal errors
disabled | none | Not recommended - disables all logging

Production Recommendation

For production deployments, warning is recommended to reduce log volume while maintaining visibility into issues. The info level can generate significant log volume in high-traffic environments.

For detailed information about log events, structure, and monitoring strategies, see the Logging and Monitoring guide.

Debug Logging:

# Enable for debugging (NOT recommended for production)
export LOG_REQUEST_PARAMS=true

Security and cost warning

Enabling LOG_REQUEST_PARAMS may expose sensitive data in logs. Use only in development/debugging environments.

Logging full request/response payloads can also significantly increase log ingestion and storage costs, especially for large LLM prompts, tool calls, and generated outputs. If you must enable it, prefer short log retention, targeted sampling, and temporary use only.

Client IP Logging

LOG_CLIENT_IP

Purpose : Enable logging of client IP addresses for each request and add IP to OpenTelemetry spans

Type : Boolean

Default : false (disabled for privacy)

# Disabled (default) - no client IP logging
# No environment variable needed

# Enable client IP logging
export LOG_CLIENT_IP=true

Client IP Behavior

When enabled, client IP addresses are:

  • Included in log output for each request
  • Added as the client.address attribute to OpenTelemetry spans (when OTEL_ENABLED=true)

The IP address depends on your proxy configuration:

With ENABLE_PROXY_HEADERS=true (behind reverse proxy):

  • Logs the real client IP address from the X-Forwarded-For header
  • Shows the actual end-user IP, not the proxy IP
  • Requires your reverse proxy (ALB, CloudFront, etc.) to set the header correctly

With ENABLE_PROXY_HEADERS=false (default):

  • Logs the direct connection IP address
  • Typically shows your reverse proxy or load balancer IP, not the end-user IP
  • Limited usefulness unless application is directly exposed to clients

When to Enable

Enable LOG_CLIENT_IP when:

  • You need client IP addresses for security auditing or compliance
  • Analyzing traffic patterns and geographic distribution
  • Investigating abuse, fraud, or suspicious activity
  • Debugging client-specific issues

Important: Also enable ENABLE_PROXY_HEADERS=true when behind AWS ALB, CloudFront, or other reverse proxies to log the real client IP instead of the proxy IP.

Privacy Consideration

Client IP addresses are considered personal data under privacy regulations like GDPR. When logging IP addresses:

  • Consider shorter log retention periods
  • Document the purpose in your privacy policy
  • Ensure logs are stored securely
  • Implement log deletion procedures aligned with your data retention policy

Configuration for AWS Deployments

Behind AWS ALB or CloudFront:

# Enable proxy headers to get real client IPs
export ENABLE_PROXY_HEADERS=true
# Enable client IP logging
export LOG_CLIENT_IP=true

Direct exposure (not recommended for production):

# Only enable client IP logging
export LOG_CLIENT_IP=true
# ENABLE_PROXY_HEADERS remains false (default)

Timezone Configuration:

# UTC (default)
export TIMEZONE=UTC

# North America
export TIMEZONE=America/New_York

# Europe
export TIMEZONE=Europe/London

Bedrock Guardrails

Amazon Bedrock Guardrails add content filtering and safety controls to model inputs and outputs.

Configuration Options

Guardrails can be configured in three ways:

  1. Global - Via environment variables
  2. Per-request - Via HTTP headers
  3. Request body - Via amazon-bedrock-guardrailConfig object

Global Configuration

AWS_BEDROCK_GUARDRAIL_IDENTIFIER

Purpose : ID of the Bedrock Guardrail to apply

Required : Yes (together with AWS_BEDROCK_GUARDRAIL_VERSION)

export AWS_BEDROCK_GUARDRAIL_IDENTIFIER=abc123def456

AWS_BEDROCK_GUARDRAIL_VERSION

Purpose : Version of the Bedrock Guardrail

Required : Yes (together with AWS_BEDROCK_GUARDRAIL_IDENTIFIER)

export AWS_BEDROCK_GUARDRAIL_VERSION=1

AWS_BEDROCK_GUARDRAIL_TRACE

Purpose : Trace level for guardrail evaluation

Options : disabled, enabled, enabled_full

Default : None (optional)

export AWS_BEDROCK_GUARDRAIL_TRACE=enabled

Complete Guardrail Configuration

export AWS_BEDROCK_GUARDRAIL_IDENTIFIER=abc123def456
export AWS_BEDROCK_GUARDRAIL_VERSION=1
export AWS_BEDROCK_GUARDRAIL_TRACE=enabled

Per-Request Configuration

Override global guardrail settings for individual requests using HTTP headers:

Header | Purpose
X-Amzn-Bedrock-GuardrailIdentifier | Guardrail ID
X-Amzn-Bedrock-GuardrailVersion | Guardrail version
X-Amzn-Bedrock-Trace | Trace level

Example cURL Request

curl -X POST https://api.example.com/v1/chat/completions \
  -H "Authorization: Bearer sk-..." \
  -H "X-Amzn-Bedrock-GuardrailIdentifier: abc123def456" \
  -H "X-Amzn-Bedrock-GuardrailVersion: 1" \
  -H "X-Amzn-Bedrock-Trace: enabled" \
  -d '{"model": "anthropic.claude-3-sonnet", "messages": [...]}'

Request Body Configuration

The amazon-bedrock-guardrailConfig object in the request body is supported for OpenAI Chat Completions compatibility.

Compatibility Note

Only fields compatible with Bedrock Converse API are honored. The tagSuffix field is documented in AWS but not supported in this implementation.
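
A minimal sketch of the request-body form, reusing the placeholders from the header example above; the nested field names follow the Bedrock Converse guardrail configuration (guardrailIdentifier, guardrailVersion, trace):

curl -X POST https://api.example.com/v1/chat/completions \
  -H "Authorization: Bearer sk-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic.claude-3-sonnet",
    "messages": [{"role": "user", "content": "Hello"}],
    "amazon-bedrock-guardrailConfig": {
      "guardrailIdentifier": "abc123def456",
      "guardrailVersion": "1",
      "trace": "enabled"
    }
  }'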


Audio and Text-to-Speech

DEFAULT_TTS_MODEL

Purpose : Default text-to-speech model when not specified in requests

Default : amazon.polly-standard

Model | Description | Quality
amazon.polly-standard | Standard Polly voices | Classic quality
amazon.polly-neural | Neural Polly voices | Higher quality, more natural
amazon.polly-long-form | Long-form content | Optimized for long content
amazon.polly-generative | Generative AI voices | Latest technology

export DEFAULT_TTS_MODEL=amazon.polly-neural
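
As an illustration, a speech request that omits the model falls back to the configured default. This sketch assumes the OpenAI-compatible /v1/audio/speech route, a local instance, and a Polly voice name; adjust all three to your deployment:

# No "model" field: DEFAULT_TTS_MODEL (amazon.polly-neural above) is used
curl -sS -X POST http://127.0.0.1:8000/v1/audio/speech \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"input": "Hello from stdapi.ai", "voice": "Joanna"}' \
  -o speech-output.mp3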

Token Counting

Control how token usage is calculated and reported in API responses.

TOKENS_ESTIMATION

Purpose : Estimate token counts using a tokenizer when the model doesn't return them directly

Type : Boolean

Default : false

export TOKENS_ESTIMATION=true

Use Case

Enable for consistent token reporting across all models.
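
For example, with estimation enabled, the usage block of a chat completion is populated even when the underlying model does not return token counts directly. The host and model below are placeholders, and jq is used only to extract the field:

# Inspect the usage object of a chat completion response
curl -sS -X POST http://127.0.0.1:8000/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "amazon.nova-micro-v1:0", "messages": [{"role": "user", "content": "hi"}]}' \
  | jq '.usage'
# Typical shape: {"prompt_tokens": ..., "completion_tokens": ..., "total_tokens": ...}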

TOKENS_ESTIMATION_DEFAULT_ENCODING

Purpose : Tiktoken encoding algorithm for token estimation

Default : o200k_base

Encoding | Models
o200k_base | GPT-4o and newer models
cl100k_base | GPT-3.5-turbo, GPT-4
p50k_base | Older GPT-3 models

export TOKENS_ESTIMATION_DEFAULT_ENCODING=o200k_base

Model Cache

stdapi.ai automatically discovers and caches available Bedrock models from configured regions. The cache is refreshed on-demand when expired, not via background tasks.

MODEL_CACHE_SECONDS

Purpose : Cache lifetime for the Bedrock models list before refresh

Type : Integer (seconds)

Default : 900 (15 minutes)

Behavior : When a request needs the model list (e.g., model lookup, /models endpoint) and the cache has expired, the server queries AWS Bedrock to discover newly available models, check for model access changes, and update inference profile configurations

# Default: 15 minutes
export MODEL_CACHE_SECONDS=900

# More frequent updates (5 minutes)
export MODEL_CACHE_SECONDS=300

# Less frequent updates (1 hour)
export MODEL_CACHE_SECONDS=3600

Lazy Refresh Behavior

The model cache uses lazy (on-demand) refresh, not background tasks:

  • Cache is refreshed only when a request needs it and the cache has expired
  • Common triggers: model lookup failures, /v1/models API calls, inference requests with unknown models
  • The first request after expiration will experience additional latency while the cache refreshes (typically 2-5 seconds, depending on the number of regions); see the warm-up sketch after this list
  • All AWS API requests are executed in parallel across regions to minimize latency penalty
  • Subsequent requests use the fresh cache until it expires again
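
If the post-expiration latency matters for user-facing traffic, you can warm the cache out of band, for example from a deploy hook or a scheduled job. A minimal sketch, with a placeholder host:

# Touch the models endpoint so the refresh happens outside user request paths
curl -sS -H "Authorization: Bearer $API_KEY" \
  http://127.0.0.1:8000/v1/models > /dev/null
# Running this slightly more often than MODEL_CACHE_SECONDS keeps the cache warm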

Tuning Recommendations

Interval | Use Case | Trade-offs
300 (5 min) | Development, testing new models | More frequent refresh latency, faster model discovery
900 (15 min) | Production (default, balanced) | Balanced refresh frequency and latency impact
3600 (1 hour) | Stable production, cost optimization | Rare refresh latency, slower model discovery

Performance Considerations

  • Latency Impact: The first request after cache expiration will experience 2-5 seconds additional latency. All AWS API calls are parallelized to minimize this penalty, so latency scales with the slowest region rather than the sum of all regions.
  • API Calls: Each refresh makes parallel calls to ListFoundationModels, GetFoundationModelAvailability, and ListInferenceProfiles across all configured regions. Lower cache lifetimes increase the frequency of these calls.
  • Rate Limits: Very frequent refreshes in high-traffic deployments may approach API rate limits, though parallel execution doesn't increase per-region request rate
  • Multi-Region: Refresh latency is determined by the slowest responding region, not the total number of regions, thanks to parallel execution

Default Model Parameters

Configure default inference parameters applied automatically to specific models.

What You Can Do

  • Set consistent temperature/creativity levels per model
  • Enable provider-specific features (e.g., Anthropic beta features)
  • Configure default token limits for cost control
  • Apply model-specific stop sequences

Parameter Precedence

Request parameters always take precedence over defaults.

DEFAULT_MODEL_PARAMS

Purpose : Per-model default parameters

Format : JSON object with model IDs as keys

Supported Parameters:

Parameter | Type | Range | Description
temperature | Float | ≥ 0 | Sampling temperature
top_p | Float | 0.0-1.0 | Nucleus sampling
max_tokens | Integer | ≥ 1 | Maximum response tokens
stop_sequences | String/Array | - | Stop generation tokens
Provider-specific | Various | - | e.g., anthropic_beta

Configuration Examples

Basic Parameters:

export DEFAULT_MODEL_PARAMS='{
  "amazon.nova-micro-v1:0": {
    "temperature": 0.3,
    "max_tokens": 800
  }
}'

Provider-Specific Features:

export DEFAULT_MODEL_PARAMS='{
  "anthropic.claude-sonnet-4-5-20250929-v1:0": {
    "anthropic_beta": ["Interleaved-thinking-2025-05-14"]
  }
}'

Multiple Models:

export DEFAULT_MODEL_PARAMS='{
  "amazon.nova-micro-v1:0": {
    "temperature": 0.3,
    "max_tokens": 500
  },
  "amazon.nova-lite-v1:0": {
    "temperature": 0.7,
    "max_tokens": 2000
  },
  "anthropic.claude-sonnet-4-5-20250929-v1:0": {
    "temperature": 0.5,
    "top_p": 0.9,
    "anthropic_beta": ["Interleaved-thinking-2025-05-14"]
  }
}'

Advanced Configuration:

export DEFAULT_MODEL_PARAMS='{
  "amazon.nova-pro-v1:0": {
    "temperature": 0.7,
    "top_p": 0.95,
    "max_tokens": 4096,
    "stop_sequences": ["Human:", "Assistant:"]
  }
}'

Parameter Merging

Parameter merging flow: default parameters from DEFAULT_MODEL_PARAMS and request parameters are merged into the final configuration, with request parameters taking precedence.

  1. Default parameters are applied first (from DEFAULT_MODEL_PARAMS)
  2. Request parameters override defaults if both are specified
  3. Provider-specific fields are forwarded to Bedrock as additional model request fields
  4. Unsupported fields that would change the model's output cause an HTTP 400 error; other unsupported fields are ignored (see the worked example below)
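
A worked example of the precedence rules, reusing the Basic Parameters defaults above (the host is a placeholder):

# Defaults for amazon.nova-micro-v1:0: temperature 0.3, max_tokens 800
export DEFAULT_MODEL_PARAMS='{"amazon.nova-micro-v1:0": {"temperature": 0.3, "max_tokens": 800}}'

# The request overrides only temperature, so the default max_tokens still applies
curl -sS -X POST http://127.0.0.1:8000/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "amazon.nova-micro-v1:0", "temperature": 0.9, "messages": [{"role": "user", "content": "hi"}]}'
# Effective parameters: temperature 0.9 (request wins), max_tokens 800 (default applied)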