Configuration Guide

stdapi.ai is configured entirely through environment variables, which are read once at startup and cannot be changed without restarting the service. This guide explains each setting category with practical examples to help you configure the service correctly.

What you can configure:

  • AWS regions - Access models across multiple regions for availability and model selection
  • Data sovereignty - Control which AWS regions are used for compliance (GDPR, HIPAA, etc.)
  • Storage - S3 buckets for file operations, regional buckets for multi-region deployments
  • Authentication - API keys via SSM or Secrets Manager for secure access control
  • Observability - Logging levels, OpenTelemetry, request/response debugging
  • Security - CORS, proxy headers, trusted hosts for production deployments
  • Performance - Caching, model overrides, S3 acceleration
  • TLS / SSL - End-to-end encryption using Granian environment variables

Zero Configuration Startup

stdapi.ai works out of the box with zero configuration. The service automatically detects your current AWS region and discovers available Bedrock models.

Prerequisites

Before configuring stdapi.ai, ensure you have:

  • AWS Account with access to Amazon Bedrock
  • AWS Credentials configured via environment variables, AWS CLI, or IAM role (for EC2/ECS/Lambda deployments)
  • IAM Permissions to access required AWS services (see IAM Permissions section)
  • S3 Bucket (optional, but recommended for production use with file operations)

Container Runtime

Both the AWS Marketplace and community Docker images run using Granian, a high-performance Python ASGI server. In addition to the stdapi.ai-specific configuration variables documented below, you can also use Granian environment variables to configure the server runtime (e.g., GRANIAN_PORT, GRANIAN_WORKERS, GRANIAN_THREADS, etc.).
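
For example, the Granian runtime can be tuned with its own environment variables alongside the stdapi.ai settings. The variable names below are those mentioned above; the values are purely illustrative, not recommendations:

# Granian runtime tuning (illustrative values)
export GRANIAN_PORT=8000
export GRANIAN_WORKERS=2
export GRANIAN_THREADS=4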

Quick Start

For production deployments, configure these essential settings:

Minimal Production Setup

Single-region deployment with file storage only.

# S3 bucket for file storage (must be in same region as your server)
export AWS_S3_BUCKET=my-stdapi-bucket

# AWS_BEDROCK_REGIONS is optional - will auto-detect your current AWS region if not specified

Production with Authentication

Adds secure API key authentication via AWS Systems Manager.

# S3 bucket for file storage (must be in same region as your server)
export AWS_S3_BUCKET=my-stdapi-bucket

# Secure API authentication (recommended: SSM Parameter Store)
export API_KEY_SSM_PARAMETER=/stdapi/prod/api-key

# AWS_BEDROCK_REGIONS is optional - will auto-detect your current AWS region if not specified

Full Production Setup (All Features Enabled)

Multi-region deployment with all AWS AI services, observability, and security features.

# Core AWS configuration - host server in first region
export AWS_BEDROCK_REGIONS=us-east-1,us-west-2,eu-west-1

# S3 bucket for file storage (must be in us-east-1, your first/primary region)
export AWS_S3_BUCKET=my-stdapi-us-east-1-bucket

# Optional: Transcribe S3 bucket (defaults to AWS_S3_BUCKET if not specified)
# Only set this if you need a separate bucket or if transcribe is in a different region
# export AWS_TRANSCRIBE_S3_BUCKET=my-stdapi-transcribe-us-east-1

# Optional: Regional buckets for async/batch inference in other regions
export AWS_S3_REGIONAL_BUCKETS='{"us-west-2": "my-stdapi-us-west-2-bucket", "eu-west-1": "my-stdapi-eu-west-1-bucket"}'

# AWS AI services regions (optional - defaults to first AWS_BEDROCK_REGIONS if not specified)
export AWS_POLLY_REGION=us-east-1           # Text-to-speech
export AWS_TRANSCRIBE_REGION=us-east-1      # Speech-to-text (audio transcription)
export AWS_COMPREHEND_REGION=us-east-1      # Language detection
export AWS_TRANSLATE_REGION=us-east-1       # Text translation

# Authentication
export API_KEY_SSM_PARAMETER=/stdapi/prod/api-key

# Logging
export LOG_LEVEL=warning
export LOG_CLIENT_IP=true

# Optional: OpenTelemetry observability (AWS X-Ray integration)
# export OTEL_ENABLED=true
# export OTEL_SERVICE_NAME=stdapi-production
# export OTEL_SAMPLE_RATE=0.1

# Production security settings (when behind AWS ALB/CloudFront)
export ENABLE_PROXY_HEADERS=true

# Note: TRUSTED_HOSTS not recommended with AWS ALB - use ALB host-based routing instead
# Only use TRUSTED_HOSTS if you cannot configure host validation at the load balancer level

# Optional: CORS for browser-based web applications
# export CORS_ALLOW_ORIGINS='["https://app.example.com"]'

Development Setup

Local development configuration with API documentation and debug logging enabled.

# Minimal configuration for local development
export AWS_S3_BUCKET=my-stdapi-dev-bucket

# Enable API documentation
export ENABLE_DOCS=true
export ENABLE_REDOC=true

# Full request/response logging for debugging
export LOG_LEVEL=info
export LOG_REQUEST_PARAMS=true

# AWS_BEDROCK_REGIONS is optional - will auto-detect your current AWS region if not specified

S3 Bucket Required for Certain Features

Without an S3 bucket configured, some features are disabled, such as returning generated images as URLs and audio transcription. See the relevant API documentation for feature requirements.

All Other Settings Are Optional

The configurations above are sufficient for most production deployments. All other settings can be configured as needed for your specific use case.

Environment Variable Summary

This section provides a quick reference of all available configuration options. Detailed explanations for each variable can be found in the sections below.

Essential (Production)

Variable Default Description
AWS_S3_BUCKET None Primary S3 bucket for file storage; must be in first region of AWS_BEDROCK_REGIONS
AWS_BEDROCK_REGIONS Current region Comma-separated regions for Bedrock; first region is where server should be hosted

AWS Client

Variable Default Description
AWS_ADAPTIVE_RETRY false Enable adaptive retry mode that throttles back under congestion rather than using fixed exponential backoff
AWS_MAX_POOL_CONNECTIONS 50 Maximum concurrent HTTP connections per AWS service client
AWS_CONNECT_TIMEOUT 5 Timeout in seconds for establishing a connection to an AWS service endpoint

AWS Storage

Variable Default Description
AWS_S3_ACCELERATE false Enable S3 Transfer Acceleration for faster global downloads via CloudFront edge locations
AWS_S3_REGIONAL_BUCKETS {} Region-specific S3 buckets for Bedrock async/batch inference operations
AWS_S3_ACCEPTED_BUCKETS {} External S3 buckets with read access, mapped to their region for S3 URI conversion and routing
AWS_S3_TMP_PREFIX tmp/ S3 prefix for temporary files used for jobs; configure lifecycle policies on this prefix
AWS_S3_FILES_PREFIX files/ S3 prefix for Files API objects; configure S3 lifecycle policies on this prefix
AWS_TRANSCRIBE_S3_BUCKET AWS_S3_BUCKET S3 bucket for temporary audio transcription files; must be in same region as AWS_TRANSCRIBE_REGION

AWS AI Services

Variable Default Description
AWS_POLLY_REGION First AWS_BEDROCK_REGIONS AWS region for Amazon Polly text-to-speech service
AWS_COMPREHEND_REGION First AWS_BEDROCK_REGIONS AWS region for Amazon Comprehend language detection service
AWS_TRANSCRIBE_REGION First AWS_BEDROCK_REGIONS AWS region for Amazon Transcribe speech-to-text service
AWS_TRANSLATE_REGION First AWS_BEDROCK_REGIONS AWS region for Amazon Translate text translation service

Resilience & Failover

Variable Default Description
AWS_BEDROCK_REGION_ROUTING ordered Region routing strategy: disabled, ordered, lowest_latency, or round_robin (details)
AWS_BEDROCK_REGION_ROUTING_QUOTA_BACKOFF_SECONDS 60 Base interval in seconds for exponential quota backoff per region
AWS_BEDROCK_REGION_ROUTING_MAX_QUOTA_BACKOFF_SECONDS 3600 Hard ceiling in seconds on the exponential quota backoff per region (default: 1 hour)
AWS_BEDROCK_REGION_ROUTING_QUOTA_STALE_FACTOR 2 Multiplier on max quota backoff to determine when the consecutive-error counter resets
AWS_BEDROCK_REGION_ROUTING_UNAVAILABLE_BACKOFF_SECONDS 30 Seconds to avoid a region after unavailability errors
AWS_BEDROCK_MAX_RETRIES 9 Total retries across all regions per Bedrock invocation; retries cycle through regions in order

Bedrock Advanced

Variable Default Description
AWS_BEDROCK_CROSS_REGION_INFERENCE true Allow automatic model routing to other configured regions
AWS_BEDROCK_CROSS_REGION_INFERENCE_GLOBAL true Allow global cross-region inference routing to any region worldwide (disable for GDPR compliance)
AWS_BEDROCK_MODEL_REGION_RESTRICT {} Restrict a model to specific region(s) only (e.g. for region-specific features like Nova grounding)
AWS_BEDROCK_LEGACY false Allow usage of deprecated/legacy Bedrock models
AWS_BEDROCK_DEPRECATED_MODEL_FALLBACK true Transparently reroute requests using a deprecated model ID to its recommended replacement
AWS_BEDROCK_DEPRECATED_MODELS {} Additional deprecated model mappings merged with the built-in registry at startup
AWS_BEDROCK_MARKETPLACE_AUTO_SUBSCRIBE true Allow automatic subscription to new models in AWS Marketplace
AWS_BEDROCK_ALLOW_CROSS_REGION_INFERENCE_PROFILE_ARN false Allow users to pass cross-region inference profile ARNs directly as model IDs
AWS_BEDROCK_ALLOW_APPLICATION_INFERENCE_PROFILE_ARN false Allow users to pass application inference profile ARNs directly as model IDs
AWS_BEDROCK_ALLOW_PROMPT_ROUTER_ARN false Allow users to pass prompt router ARNs directly as model IDs
AWS_BEDROCK_MODEL_ARN_MAPPING {} Map model IDs to custom inference profile or prompt router ARNs (server-controlled routing)
AWS_BEDROCK_GUARDRAIL_IDENTIFIER None Bedrock Guardrails ID for content filtering and safety controls
AWS_BEDROCK_GUARDRAIL_VERSION None Bedrock Guardrails version number (required with identifier)
AWS_BEDROCK_GUARDRAIL_TRACE None Guardrails trace level: disabled, enabled, or enabled_full
AWS_BEDROCK_ALLOW_GUARDRAIL_OVERRIDE false Allow users to override global guardrail configuration via request headers (security: default off)

Authentication

Choose one method (mutually exclusive):

Variable Default Description
API_KEY_SSM_PARAMETER None AWS Systems Manager Parameter Store path for API key (recommended)
API_KEY_SECRETSMANAGER_SECRET None AWS Secrets Manager secret name containing API key
API_KEY_SECRETSMANAGER_KEY api_key JSON key name within Secrets Manager secret
API_KEY None Direct API key value (not recommended for production)

API Compatibility

Variable Default Description
OPENAI_ROUTES_PREFIX (empty) Base path prefix for OpenAI-compatible API routes
ANTHROPIC_ROUTES_PREFIX /anthropic Base path prefix for Anthropic-compatible API routes

Logging

Variable Default Description
LOG_LEVEL info Minimum log severity: info, warning, error, critical, or disabled
LOG_REQUEST_PARAMS false Include request/response parameters in logs (not recommended for production)
LOG_CLIENT_IP false Log client IP addresses (requires ENABLE_PROXY_HEADERS for real IPs behind proxies)

Observability (OpenTelemetry)

Variable Default Description
OTEL_ENABLED false Enable distributed tracing via OpenTelemetry (integrates with AWS X-Ray, Jaeger, etc.)
OTEL_SERVICE_NAME stdapi Service name identifier in trace visualizations
OTEL_EXPORTER_ENDPOINT http://127.0.0.1:4318/v1/traces OTLP HTTP endpoint URL for trace export
OTEL_SAMPLE_RATE 1.0 Trace sampling rate from 0.0 (none) to 1.0 (all requests)

HTTP/Security

Variable Default Description
CORS_ALLOW_ORIGINS None JSON array of allowed origins for browser cross-origin requests
TRUSTED_HOSTS None JSON array of trusted Host header values (prefer ALB host-based routing; see details)
ENABLE_PROXY_HEADERS false Trust X-Forwarded-* headers from reverse proxies (only enable behind trusted proxy)
GRANIAN_SSL_CERTIFICATE None Path to SSL certificate file for end-to-end encryption
GRANIAN_SSL_KEYFILE None Path to SSL private key file (PKCS#8) for end-to-end encryption
GRANIAN_SSL_KEYFILE_PASSWORD None Password for the SSL private key file
GRANIAN_SSL_PROTOCOL_MIN tls1.3 Minimum supported TLS version (tls1.2 or tls1.3)
GRANIAN_SSL_CA None Path to CA certificate bundle for client verification (mTLS)
GRANIAN_SSL_CLIENT_VERIFY false Enable client certificate verification (mTLS)
ENABLE_GZIP false Enable GZip compression for responses >1KB (prefer AWS ALB/CloudFront compression)
SSRF_PROTECTION_BLOCK_PRIVATE_NETWORKS true Block requests to private/local networks for SSRF protection

Application Behavior

Variable Default Description
TIMEZONE UTC IANA timezone identifier for request timestamps
STRICT_INPUT_VALIDATION false Reject API requests with unknown/extra fields
DEFAULT_TTS_MODEL amazon.polly-standard Default text-to-speech model: standard, neural, long-form, or generative
DEFAULT_TTS_LANGUAGE None Default language for TTS (e.g., en-US); when set, skips AWS Comprehend auto-detection
TOKENS_ESTIMATION false Estimate token counts using tiktoken when model doesn't provide them
TOKENS_ESTIMATION_DEFAULT_ENCODING o200k_base Tiktoken encoding algorithm: o200k_base (GPT-4o+), cl100k_base (GPT-4), or p50k_base
DEFAULT_MODEL_PARAMS {} JSON object with per-model default inference parameters (temperature, max_tokens, etc.)
MODEL_CACHE_SECONDS 900 Model list cache lifetime in seconds before lazy refresh (default: 15 minutes)
AI_RESPONSE_TIMEOUT 600 Maximum seconds to wait for a model to complete a response (default: 10 minutes)
DROP_UNSUPPORTED_SYSTEM_PROMPT true Drop system prompts for unsupported models; when false, return error instead
ANTHROPIC_BETA_FILTER true Enable filtering of unsupported anthropic_beta flags for Claude models
ANTHROPIC_BETA_ALLOWLIST (empty) Additional anthropic_beta flags to allow beyond built-in Bedrock defaults

API Documentation

Variable Default Description
ENABLE_DOCS false Enable interactive Swagger UI documentation at /docs
ENABLE_REDOC false Enable ReDoc documentation UI at /redoc
ENABLE_OPENAPI_JSON false Enable OpenAPI schema endpoint at /openapi.json (auto-enabled with docs/redoc)

AWS Services and Regions

General Configuration

AWS_ADAPTIVE_RETRY

Purpose : Enable adaptive retry mode that adjusts retry pacing based on observed error rates across all AWS service calls

Type : Boolean (true / false)

Default : false

Behavior : When enabled, the retry strategy dynamically responds to real-time congestion signals. If errors are occurring frequently, retries are spaced further apart to avoid amplifying load on an already-stressed endpoint. Once conditions improve, the pacing returns to normal. When disabled, retries follow a standard exponential backoff strategy with fixed intervals. Applies to all AWS services (Bedrock, S3, Polly, Transcribe, etc.).

Latency Impact

Adaptive retry can increase the latency of individual requests when throttling is detected, as the client intentionally delays retries to shed load. Avoid enabling it for latency-sensitive, low-traffic workloads.

# Default: standard exponential backoff
export AWS_ADAPTIVE_RETRY=false

# Enable adaptive retry (recommended under sustained high load)
export AWS_ADAPTIVE_RETRY=true

When to enable

Adaptive retry is most beneficial when many clients share the same endpoint and sustained congestion is likely. It paces retries based on real-time error signals, reducing the risk of retry storms — at the cost of potentially higher per-request latency under load. For low-traffic or latency-sensitive workloads the default standard mode is preferable.

AWS_MAX_POOL_CONNECTIONS

Purpose : Maximum number of concurrent HTTP connections per AWS service client

Type : Integer (must be > 0)

Default : 50

Behavior : Each AWS service client (one per service per region) maintains its own connection pool up to this limit. Under high concurrency, increasing this value prevents requests from queuing for an available connection. Setting it too high may exhaust system file descriptors.

# Default
export AWS_MAX_POOL_CONNECTIONS=50

# High-concurrency deployment
export AWS_MAX_POOL_CONNECTIONS=100

AWS_CONNECT_TIMEOUT

Purpose : Timeout in seconds for establishing a connection to an AWS service endpoint

Type : Integer (must be > 0)

Default : 5

Behavior : Limits how long the client waits when opening a new connection. A short value allows fast failover to another region when an endpoint is unreachable. Increase it only if you see spurious connection timeouts on high-latency networks.

# Default: 5 seconds
export AWS_CONNECT_TIMEOUT=5

# High-latency network
export AWS_CONNECT_TIMEOUT=10

Storage Configuration

AWS_S3_BUCKET

Purpose : Primary S3 bucket for storing generated files (images, audio, documents) and temporary data during processing

Default : None (must be configured for file operations)

Best Practice : The bucket must be in the first region specified in AWS_BEDROCK_REGIONS (your primary region where the server should be hosted) to avoid cross-region data transfer costs and reduce latency

export AWS_S3_BUCKET=my-llm-storage-us-east-1

Presigned URLs

Files are served via presigned URLs for secure, time-limited access. Presigned URLs expire after 1 hour by default.

Startup Warning

If not set, a warning is logged at startup and features that require file storage (image generation, audio output, document processing) will be unavailable.

AWS_S3_ACCELERATE

Purpose : Enable S3 Transfer Acceleration for presigned URLs to improve download performance for large files

Type : Boolean

Default : false

Best Practice : Enable when serving large files (high-resolution images, audio) to geographically distributed users

export AWS_S3_ACCELERATE=true

What is S3 Transfer Acceleration?

S3 Transfer Acceleration uses Amazon CloudFront's globally distributed edge locations to accelerate uploads and downloads to S3 buckets. When enabled, data is routed to the nearest edge location and then transferred to S3 over Amazon's optimized network paths.

Performance Benefits:

  • Faster downloads for users far from your bucket's region
  • Global reach via CloudFront edge locations
  • Optimized routing over Amazon's private backbone network
  • Consistent performance regardless of user location

Typical speed improvements: 50-500% faster for users located far from the bucket region.
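
As an illustration of the effect on generated links, presigned URLs switch from the bucket's regional endpoint to its accelerate endpoint (the hostname forms are the standard AWS patterns; bucket name, object key, and query string below are hypothetical):

# Standard presigned URL (regional endpoint)
https://my-stdapi-bucket.s3.us-east-1.amazonaws.com/tmp/request-id-123/image.png?X-Amz-Algorithm=...

# With AWS_S3_ACCELERATE=true (accelerate endpoint)
https://my-stdapi-bucket.s3-accelerate.amazonaws.com/tmp/request-id-123/image.png?X-Amz-Algorithm=...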

Requirements

  1. Enable Transfer Acceleration on your S3 bucket before setting this option:
    aws s3api put-bucket-accelerate-configuration \
      --bucket my-stdapi-bucket \
      --accelerate-configuration Status=Enabled
    
  2. Additional costs: Transfer Acceleration incurs extra data transfer fees. See AWS S3 Transfer Acceleration pricing

When to Enable

Consider enabling S3 Transfer Acceleration when:

  • Serving generated images via Images API
  • Users are geographically distributed across multiple continents
  • Generating high-resolution images that are large in file size
  • Download performance is critical to user experience

For small images or users close to your bucket region, the performance benefit may not justify the additional cost.

Current Usage

Presigned URLs with Transfer Acceleration are currently only used for the Images API when returning generated images as URLs.

AWS_S3_TMP_PREFIX

Purpose : S3 prefix (folder path) for temporary files used during job processing

Default : tmp/

Best Practice : Configure S3 lifecycle policies to automatically delete objects under this prefix after 1 day

export AWS_S3_TMP_PREFIX=tmp/

What is an S3 Prefix?

An S3 prefix is essentially a folder path within your S3 bucket. When you set AWS_S3_TMP_PREFIX=tmp/, all temporary files are stored under the tmp/ folder structure in your bucket.

Example file paths:

  • With prefix tmp/: s3://my-bucket/tmp/request-id-123/output.json
  • With prefix temporary/: s3://my-bucket/temporary/request-id-123/output.json
  • With empty prefix: s3://my-bucket/request-id-123/output.json (not recommended)

Why Use a Prefix?

Using a dedicated prefix for temporary files provides several benefits:

  • Easy Lifecycle Management - Apply S3 lifecycle policies to automatically delete only temporary files
  • Better Organization - Keep temporary files separate from permanent storage
  • Security - Apply different IAM policies or bucket policies to the prefix
  • Cost Control - Easily identify and monitor temporary storage costs

Trailing Slash

Always include a trailing slash (/) in your prefix to create a proper folder structure. Without it, files will be stored with the prefix as part of the filename rather than in a folder.

  • ✅ Correct: tmp/ → Files stored as tmp/file.json
  • ❌ Incorrect: tmp → Files stored as tmpfile.json

Custom prefix examples:

# Production environment
export AWS_S3_TMP_PREFIX=prod/tmp/

# Staging environment
export AWS_S3_TMP_PREFIX=staging/tmp/

# Organize by date (requires manual updates)
export AWS_S3_TMP_PREFIX=tmp/2025/01/

# No prefix (store at bucket root - not recommended)
export AWS_S3_TMP_PREFIX=

AWS_S3_FILES_PREFIX

Purpose : S3 prefix (folder path) for Files API objects (OpenAI and Anthropic /v1/files endpoints)

Default : files/

Best Practice : Configure an AbortIncompleteMultipartUpload S3 lifecycle rule on this prefix to clean up abandoned upload parts, and apply Intelligent-Tiering for cost optimization

export AWS_S3_FILES_PREFIX=files/
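
As a sketch of the Intelligent-Tiering recommendation above, a lifecycle rule can transition objects under the files prefix to the INTELLIGENT_TIERING storage class (the rule name is hypothetical; adjust the prefix to match your AWS_S3_FILES_PREFIX and verify against the S3 lifecycle documentation):

{
  "Rules": [
    {
      "ID": "FilesIntelligentTiering",
      "Status": "Enabled",
      "Filter": {
        "Prefix": "files/"
      },
      "Transitions": [
        {
          "Days": 0,
          "StorageClass": "INTELLIGENT_TIERING"
        }
      ]
    }
  ]
}

The AbortIncompleteMultipartUpload rule mentioned above is covered in the S3 Bucket Lifecycle Configuration section below.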

What is an S3 Prefix?

An S3 prefix is a folder path within your S3 bucket. When you set AWS_S3_FILES_PREFIX=files/, all Files API objects are stored under that folder in your bucket.

Example file paths:

  • With prefix files/: s3://my-bucket/files/file-0190c51c7de7455d9b8c2efe27dfbf67
  • With prefix uploads/files/: s3://my-bucket/uploads/files/file-0190...
  • With empty prefix: s3://my-bucket/file-0190... (not recommended)

Trailing Slash

Always include a trailing slash (/) in your prefix to create a proper folder structure.

  • ✅ Correct: files/ → Objects stored as files/file-0190...
  • ❌ Incorrect: files → Objects stored as filesfile-0190...

Custom prefix examples:

# Production environment
export AWS_S3_FILES_PREFIX=prod/files/

# Staging environment
export AWS_S3_FILES_PREFIX=staging/files/

# No prefix (store at bucket root - not recommended)
export AWS_S3_FILES_PREFIX=

AWS_TRANSCRIBE_S3_BUCKET

Purpose : Temporary S3 bucket for transcription workflows

Default : Falls back to AWS_S3_BUCKET if not specified

Requirement : Must be in the same region as AWS_TRANSCRIBE_REGION

# If AWS_TRANSCRIBE_REGION is us-east-1
export AWS_TRANSCRIBE_S3_BUCKET=my-transcribe-temp-us-east-1

# If AWS_TRANSCRIBE_REGION is eu-west-1
export AWS_TRANSCRIBE_S3_BUCKET=my-transcribe-temp-eu-west-1

AWS_S3_REGIONAL_BUCKETS

Purpose : Region-specific S3 buckets for Bedrock async and batch inference operations

Default : Empty (no regional buckets configured)

Format : JSON object with region names as keys and bucket names as values

Requirement : Some Bedrock models require S3 buckets in the same region for async and batch inference operations

export AWS_S3_REGIONAL_BUCKETS='{"us-east-1": "my-bedrock-temp-us-east-1", "eu-west-1": "my-bedrock-temp-eu-west-1"}'

When to Use

Configure this setting when:

  • Using Bedrock async inference API
  • Using Bedrock batch inference API
  • Working with models that require regional S3 storage

If not specified for a region where async/batch operations are attempted, those operations may fail.

Automatic Fallback

For the first region in AWS_BEDROCK_REGIONS (your primary region), if no regional bucket is specified, the service automatically falls back to AWS_S3_BUCKET. You only need to configure regional buckets for additional regions beyond your primary one.

Best Practice

Apply the same S3 Bucket Lifecycle Configuration to these regional buckets as you would for the primary bucket to automatically clean up temporary files.

AWS_S3_ACCEPTED_BUCKETS

Purpose : Declare external S3 buckets that the application has read access to, mapped to their AWS region

Type : JSON object (keys: bucket names, values: AWS region identifiers)

Default : {} (empty — only the application's own buckets are recognized)

Behavior : Buckets listed here are recognized for two purposes:

  • S3 HTTP URL to S3 URI conversion - When a user passes an S3 HTTP URL (including presigned URLs) for one of these buckets, it is automatically converted to an s3:// URI so Bedrock can access the object directly.
  • Region-aware routing - The router knows the region of these buckets and can factor it into region selection to minimize cross-region data transfer.

Without this setting, only the application's own buckets (AWS_S3_BUCKET and AWS_S3_REGIONAL_BUCKETS) are recognized.

export AWS_S3_ACCEPTED_BUCKETS='{"my-data-bucket": "us-east-1", "my-eu-bucket": "eu-west-1"}'

Required IAM Permissions

The application's IAM role must have s3:GetObject permission on each declared bucket. Granting access at the bucket level is recommended:

{
  "Effect": "Allow",
  "Action": "s3:GetObject",
  "Resource": [
    "arn:aws:s3:::my-data-bucket/*",
    "arn:aws:s3:::my-eu-bucket/*"
  ]
}

When to Use

Configure this when your users provide S3 URLs from buckets outside the application's own buckets. This enables automatic HTTP-to-S3 URI conversion and optimal region routing for those objects.
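
A hypothetical example of the conversion for a declared bucket (bucket name and object key are illustrative):

# User-provided S3 HTTP URL (presigned URLs are handled the same way)
https://my-data-bucket.s3.us-east-1.amazonaws.com/datasets/report.pdf

# URI passed to Bedrock after conversion
s3://my-data-bucket/datasets/report.pdf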

S3 Bucket Lifecycle Configuration

Purpose : Configure automatic deletion of temporary files and abandoned multipart upload parts to minimize storage costs

Recommendation : Configure S3 lifecycle policies to automatically delete objects under the AWS_S3_TMP_PREFIX after 1 day, and abort incomplete multipart uploads under the AWS_S3_FILES_PREFIX after 1 day

stdapi.ai stores temporary files under the prefix configured by AWS_S3_TMP_PREFIX (default: tmp/). These include generated images, audio files, and transcription workflow files. Configure S3 lifecycle policies to automatically delete objects under this prefix after 1 day.

Additionally, multipart file uploads (OpenAI Uploads API) store parts under AWS_S3_FILES_PREFIX (default: files/). If a session is never completed or cancelled — for example when a client disconnects — the uploaded parts remain in S3 and accumulate costs. Add an AbortIncompleteMultipartUpload rule on the files prefix to clean these up automatically.

Application Cleanup Behavior

Short-lived temporary files: The application attempts to clean up short-lived temporary files (such as intermediate transcription files) after processing completes.

Results shared with clients: Files shared with clients using presigned URLs (such as generated images and audio) are never cleaned up automatically by the application. These files remain in S3 until removed by lifecycle policies or manual deletion.

Why lifecycle policies are essential: Since the application cannot determine when a client has finished using a presigned URL, S3 lifecycle policies are the recommended mechanism to clean up these files and prevent unbounded storage growth.

{
  "Rules": [
    {
      "Id": "DeleteTemporaryFiles",
      "Status": "Enabled",
      "Filter": {
        "Prefix": "tmp/"
      },
      "Expiration": {
        "Days": 1
      },
      "AbortIncompleteMultipartUpload": {
        "DaysAfterInitiation": 1
      }
    },
    {
      "Id": "AbortIncompleteMultipartUploads",
      "Status": "Enabled",
      "Filter": {
        "Prefix": "files/"
      },
      "AbortIncompleteMultipartUpload": {
        "DaysAfterInitiation": 1
      }
    }
  ]
}

Important: Update the Prefixes

The "Prefix" values in the lifecycle policy must match your AWS_S3_TMP_PREFIX and AWS_S3_FILES_PREFIX settings. If you use custom prefixes, update the policy accordingly.

Examples:

  • If AWS_S3_TMP_PREFIX=temporary/, use "Prefix": "temporary/" in the first rule
  • If AWS_S3_FILES_PREFIX=prod/files/, use "Prefix": "prod/files/" in the second rule

Apply via AWS CLI:

# For primary S3 bucket (AWS_S3_BUCKET)
aws s3api put-bucket-lifecycle-configuration \
  --bucket my-stdapi-bucket \
  --lifecycle-configuration file://lifecycle-policy.json

# For transcribe S3 bucket (AWS_TRANSCRIBE_S3_BUCKET, if different from AWS_S3_BUCKET)
aws s3api put-bucket-lifecycle-configuration \
  --bucket my-transcribe-temp-bucket \
  --lifecycle-configuration file://lifecycle-policy.json

# For regional buckets (AWS_S3_REGIONAL_BUCKETS)
aws s3api put-bucket-lifecycle-configuration \
  --bucket my-stdapi-us-west-2-bucket \
  --lifecycle-configuration file://lifecycle-policy.json

Apply to All S3 Buckets

Apply this lifecycle policy to:

  • AWS_S3_BUCKET - Primary bucket for generated files
  • AWS_TRANSCRIBE_S3_BUCKET - Transcription temporary files (if different from AWS_S3_BUCKET)
  • AWS_S3_REGIONAL_BUCKETS - All regional buckets for async/batch operations

All these buckets use the same AWS_S3_TMP_PREFIX for temporary file storage, and the same AWS_S3_FILES_PREFIX for multipart upload parts.

Bedrock Configuration

AWS_BEDROCK_REGIONS

Purpose : List of AWS regions where Bedrock models are available

Format : Comma-separated string

Default : Current AWS SDK region if not specified

Behavior : Models are discovered in the same order as the listed regions. The first region is the primary region where your server should be hosted on AWS for optimal performance. Your S3 bucket (AWS_S3_BUCKET) must also be in this region. If a model is unavailable in the primary region, subsequent regions are checked in order

export AWS_BEDROCK_REGIONS=us-east-1,us-west-2,eu-west-1

Region Selection Guide

Region Description
us-east-1 Widest model selection, usually gets latest releases first
us-west-2 Good selection, often early access to new models
eu-west-1 European compliance, subset of US models available

Advanced Configuration

See Compliance and Latency Optimization for detailed configuration examples including GDPR compliance, regional optimization strategies, and best practices for multi-region deployments.

Startup Warning

If any models in the configured regions fail availability checks (not enabled, unauthorized, or missing entitlement/agreement in your AWS account), a warning listing the affected models and per-region issues is logged at startup. Enable the required models in the AWS Bedrock console for each configured region.

AWS_BEDROCK_CROSS_REGION_INFERENCE

Purpose : Enable automatic cross-region routing when a model isn't available in the primary region

Type : Boolean

Default : true

export AWS_BEDROCK_CROSS_REGION_INFERENCE=true

AWS_BEDROCK_CROSS_REGION_INFERENCE_GLOBAL

Purpose : Allow global cross-region inference routing to any region worldwide

Type : Boolean

Default : true

GDPR Compliance

Set to false to comply with data residency regulations (e.g., EU GDPR) by restricting to regional inference only

export AWS_BEDROCK_CROSS_REGION_INFERENCE_GLOBAL=false

AWS_BEDROCK_REGION_ROUTING

Purpose : Automatic region routing strategy for distributing Bedrock requests across configured regions

Type : String

Default : ordered

Behavior : When multiple regions are configured in AWS_BEDROCK_REGIONS, this setting controls how requests are distributed across them. The router automatically handles quota/throttling errors and regional unavailability by temporarily avoiding affected regions

Requirement : Requires at least 2 regions in AWS_BEDROCK_REGIONS to take effect

Available strategies:

Strategy Description
disabled No routing; uses the single region where the model was discovered
ordered Try regions in configured order, skipping temporarily blocked ones (default). Best for prompt caching compatibility
lowest_latency Prefer the region with lowest measured latency. Latencies are measured at startup
round_robin Distribute requests evenly across regions. Incompatible with prompt caching

# Use ordered routing (default)
export AWS_BEDROCK_REGION_ROUTING=ordered

# Use lowest latency routing
export AWS_BEDROCK_REGION_ROUTING=lowest_latency

# Disable routing
export AWS_BEDROCK_REGION_ROUTING=disabled

Strategy Selection

  • ordered (default): Best general-purpose choice. Compatible with prompt caching since requests consistently go to the same region. Provides failover when a region hits quota limits
  • lowest_latency: Best when response time is critical. Measures region latencies at startup and prefers the fastest region. Falls back to others when the preferred region is blocked
  • round_robin: Best for maximizing aggregate throughput across regions. Not recommended with prompt caching as it distributes requests across all regions equally

More Details

For comprehensive documentation on region routing including failover behavior, S3 bucket pinning, logging, and best practices, see the Region Routing Guide.

AWS_BEDROCK_REGION_ROUTING_QUOTA_BACKOFF_SECONDS

Purpose : Duration to temporarily avoid a region after receiving a quota or throttling error

Type : Integer (seconds, must be > 0)

Default : 60

Behavior : This is the base backoff value. When a Bedrock API call fails due to quota limits (ThrottlingException, TooManyRequestsException, ServiceQuotaExceededException), the affected region is temporarily blocked. The actual delay doubles with each consecutive quota error on the same region (exponential backoff), up to the ceiling set by AWS_BEDROCK_REGION_ROUTING_MAX_QUOTA_BACKOFF_SECONDS (1 hour by default). The counter resets after a successful request. Subsequent requests are routed to other available regions during the backoff period.

# Default: 60 seconds (base value — actual delay doubles per consecutive error)
export AWS_BEDROCK_REGION_ROUTING_QUOTA_BACKOFF_SECONDS=60

# Shorter base backoff for aggressive retry
export AWS_BEDROCK_REGION_ROUTING_QUOTA_BACKOFF_SECONDS=30

# Longer base backoff for conservative approach
export AWS_BEDROCK_REGION_ROUTING_QUOTA_BACKOFF_SECONDS=120

Tuning

The base value controls how long the first quota error blocks a region. Subsequent consecutive errors on the same region double the delay (60 s → 120 s → 240 s → …, capped at the configured maximum, 1 hour by default). Lower base values retry the region sooner but risk repeated throttling. Higher values provide more conservative avoidance at the cost of reduced region utilization.

See Region Routing — Overview for full backoff behavior details.

AWS_BEDROCK_REGION_ROUTING_UNAVAILABLE_BACKOFF_SECONDS

Purpose : Duration to temporarily avoid a region after receiving an unavailability error

Type : Integer (seconds, must be > 0)

Default : 30

Behavior : When a Bedrock API call fails due to service unavailability (ServiceUnavailableException, ModelNotReadyException), the affected region is temporarily blocked for this many seconds. These errors are typically shorter-lived than quota limits, so the default is shorter

# Default: 30 seconds
export AWS_BEDROCK_REGION_ROUTING_UNAVAILABLE_BACKOFF_SECONDS=30

# Longer backoff for stability
export AWS_BEDROCK_REGION_ROUTING_UNAVAILABLE_BACKOFF_SECONDS=60

More Details

See Region Routing — Overview for full backoff behavior details.

AWS_BEDROCK_REGION_ROUTING_MAX_QUOTA_BACKOFF_SECONDS

Purpose : Hard ceiling in seconds on the exponential quota backoff for a single region

Type : Integer (seconds, must be > 0)

Default : 3600 (1 hour)

Behavior : Quota backoff grows exponentially with consecutive errors (base interval × 2^n). This setting caps how large that value can become, preventing a region from being blocked indefinitely. Reduce it to allow faster recovery; increase it to keep a misbehaving region sidelined for longer.

# Default: 1 hour ceiling
export AWS_BEDROCK_REGION_ROUTING_MAX_QUOTA_BACKOFF_SECONDS=3600

# More aggressive recovery
export AWS_BEDROCK_REGION_ROUTING_MAX_QUOTA_BACKOFF_SECONDS=600

AWS_BEDROCK_REGION_ROUTING_QUOTA_STALE_FACTOR

Purpose : Multiplier applied to the max quota backoff to compute the stale-error reset threshold

Type : Integer (must be > 0)

Default : 2 (threshold = 2 × max quota backoff = 2 hours with defaults)

Behavior : If the most recent quota error on a region occurred more than max_quota_backoff × factor seconds ago, the consecutive-error counter is reset and the next error is treated as a fresh start rather than an escalation. A higher value keeps memory of past errors for longer before resetting the counter.

# Default: reset counter after 2× the max backoff window
export AWS_BEDROCK_REGION_ROUTING_QUOTA_STALE_FACTOR=2

# Longer memory of past errors
export AWS_BEDROCK_REGION_ROUTING_QUOTA_STALE_FACTOR=4

AWS_BEDROCK_MAX_RETRIES

Purpose : Total number of retries per Bedrock invocation, cycling through available regions in order

Type : Integer (must be > 0)

Default : 9

Behavior : Controls the total retry budget for each Bedrock API call. When region routing is enabled, retries cycle through the available regions in priority order — after exhausting all regions the cycle starts again. For example, with 3 regions and 9 retries the attempt sequence is r1, r2, r3, r1, r2, r3, r1, r2, r3, r1 (1 initial + 9 retries = 10 total attempts). When routing is disabled, retries are performed against the single configured region.

# Default: 9 retries (10 total attempts)
export AWS_BEDROCK_MAX_RETRIES=9

# Fail faster (e.g. low-latency interactive use cases)
export AWS_BEDROCK_MAX_RETRIES=3

# More resilient (e.g. batch workloads tolerant of longer waits)
export AWS_BEDROCK_MAX_RETRIES=18

Related setting

See AWS_BEDROCK_REGION_ROUTING and Region Routing for the full retry and cycling behavior.

AWS_BEDROCK_MODEL_REGION_RESTRICT

Purpose : Restrict a model to specific region(s) only, useful when a model provides important features only in certain regions

Type : JSON object (keys: Bedrock model IDs or prefixes, values: ordered lists of allowed regions)

Default : {} (empty — no model-specific region restriction)

Behavior : When set, the model is made available only in the listed regions. No fallback to other regions occurs. The order of the list determines routing priority when multiple regions are listed. Keys can be exact model IDs or prefixes that match the beginning of a model ID

# Restrict Nova Pro to us-east-1 for grounding support
export AWS_BEDROCK_MODEL_REGION_RESTRICT='{"amazon.nova-pro-v1:0": ["us-east-1"]}'
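
Keys may also be prefixes, which restrict every model ID starting with that string. A sketch using a hypothetical prefix and region list:

# Prefix key: applies to all model IDs beginning with "anthropic.claude-3-5"
export AWS_BEDROCK_MODEL_REGION_RESTRICT='{"anthropic.claude-3-5": ["us-west-2", "us-east-1"]}'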

Use Case: Region-Specific Features

Some model features are only available in specific regions. For example, Nova grounding is only available in us-east-1. Restricting the model to that region ensures the feature is always available.

See Region Routing — Model Region Restrict for more details.

Startup Warning

If a key has no matching available model, a warning is logged at startup. This can happen for two reasons:

  • Typo or unknown model — the key (exact ID or prefix) does not match any model ID returned by Bedrock.
  • No matching region — the model exists but is not available in any of the regions listed in AWS_BEDROCK_REGIONS (e.g. the model is not enabled in those regions, or the restricted regions are not configured).

AWS_BEDROCK_LEGACY

Purpose : Allow usage of legacy/deprecated Bedrock models

Type : Boolean

Default : false

export AWS_BEDROCK_LEGACY=true

AWS_BEDROCK_DEPRECATED_MODEL_FALLBACK

Purpose : Transparently reroute requests using a deprecated model ID to its recommended replacement

Type : Boolean

Default : true

Behavior : When true, any request that specifies a deprecated model ID (as listed in the server's deprecation registry) is silently retried with the recommended replacement model. The replacement is fully re-evaluated — alias resolution, modality checks, and region routing all apply to the new model ID. When false, deprecated model IDs return a 404 error with a message indicating the replacement, forcing clients to migrate explicitly.

# Transparent fallback (default) — clients using old model IDs keep working
export AWS_BEDROCK_DEPRECATED_MODEL_FALLBACK=true

# Strict mode — deprecated model IDs return 404, clients must update their code
export AWS_BEDROCK_DEPRECATED_MODEL_FALLBACK=false

AWS_BEDROCK_DEPRECATED_MODELS

Purpose : Extend or override the built-in deprecated model registry with custom mappings

Type : JSON object — dict[str, str]

Default : {}

Behavior : Merged with the built-in registry at startup. User-provided entries take precedence over built-in ones — this means it can be used both to add new deprecated model mappings and to override the fallback target of an already-defined deprecated model. Effective only when AWS_BEDROCK_DEPRECATED_MODEL_FALLBACK is true.

Reference : AWS Bedrock model lifecycle

# Add a custom deprecated model and override an existing built-in mapping
export AWS_BEDROCK_DEPRECATED_MODELS='{"my-old-model-v1": "my-new-model-v2", "amazon.titan-text-lite-v1": "amazon.nova-lite-v1:0"}'

AWS_BEDROCK_MARKETPLACE_AUTO_SUBSCRIBE

Purpose : Control automatic subscription to new models in AWS Marketplace

Type : Boolean

Default : true

Behavior : When true, the server automatically subscribes to new models discovered in the AWS Marketplace, making them immediately available through the API. When false, only models with existing marketplace subscriptions are visible and accessible

IAM Permissions Required : aws-marketplace:Subscribe, aws-marketplace:ViewSubscriptions

# Allow automatic subscription (default)
export AWS_BEDROCK_MARKETPLACE_AUTO_SUBSCRIBE=true

# Restrict to pre-subscribed models only
export AWS_BEDROCK_MARKETPLACE_AUTO_SUBSCRIBE=false

What is Marketplace Auto-Subscribe?

AWS Bedrock requires marketplace subscription before certain models can be used. This setting controls whether stdapi.ai automatically handles the subscription process:

  • true (default): Models are automatically subscribed when discovered, providing seamless access to new models as they become available
  • false: Only models that have already been subscribed through the AWS Marketplace are visible, providing explicit control over model access

When to Disable

Set to false when:

  • You need explicit control over which models are accessible
  • You want to prevent automatic marketplace subscriptions that may incur costs
  • Your organization requires manual approval for new AI model usage
  • Compliance policies require pre-authorization of AI models

IAM Permission Requirements

This feature requires the following IAM permissions to automatically subscribe to models:

  • aws-marketplace:Subscribe - Subscribe to marketplace offerings
  • aws-marketplace:ViewSubscriptions - View existing marketplace subscriptions

See Bedrock Marketplace Auto-Subscribe section for the complete IAM policy configuration.

AWS Documentation

For more information about Bedrock model access and marketplace registration, see the AWS Bedrock Model Access documentation.

AWS_BEDROCK_ALLOW_CROSS_REGION_INFERENCE_PROFILE_ARN

Purpose : Allow users to pass cross-region inference profile ARNs directly as model IDs in API requests

Type : Boolean

Default : false

Behavior : When enabled, users can use cross-region inference profile ARNs instead of model IDs in the model parameter. Cross-region inference profiles enable routing to multiple regions for better availability

IAM Permissions Required : bedrock:GetInferenceProfile (see IAM Permissions)

# Disabled (default) - users can only use standard model IDs
# No environment variable needed

# Enable cross-region inference profile ARN support
export AWS_BEDROCK_ALLOW_CROSS_REGION_INFERENCE_PROFILE_ARN=true

Additional IAM Permissions Required

Enabling this setting requires adding the bedrock:GetInferenceProfile IAM permission to your role/user. Without this permission, API requests using inference profile ARNs will fail with authorization errors.

See the Bedrock Inference Profiles and Prompt Routers IAM section for the complete policy configuration.

Example ARN

arn:aws:bedrock:us-east-1:123456789012:inference-profile/us.anthropic.claude-3-5-sonnet-20241022-v2:0

What are Cross-Region Inference Profiles?

Cross-region inference profiles are AWS-managed routing configurations that automatically distribute requests across multiple AWS regions to improve availability and reduce latency. When a model is unavailable in one region, the request is automatically routed to another region where the model is available.

Automatic Cross-Region Routing (Default Behavior)

By default, stdapi.ai automatically determines and uses the best cross-region inference profile for each model. You don't need to manually specify cross-region inference profile ARNs in most cases.

The automatic behavior is controlled by the AWS_BEDROCK_CROSS_REGION_INFERENCE and AWS_BEDROCK_CROSS_REGION_INFERENCE_GLOBAL settings.

When using standard model IDs, the application automatically:

  • Selects the optimal AWS-managed cross-region inference profile for each model
  • Routes requests across your configured regions for best availability
  • Optimizes for latency and regional availability

Manually specifying cross-region inference profile ARNs should only be done in rare cases when you need to override the automatic selection for specific requirements.

When to Enable

Enable this setting only in rare cases when:

  • You need to override automatic cross-region profile selection
  • You have specific cross-region routing requirements that differ from defaults
  • You're testing or comparing different inference profile configurations

For most deployments, leave this disabled and let the application handle cross-region routing automatically.

AWS_BEDROCK_ALLOW_APPLICATION_INFERENCE_PROFILE_ARN

Purpose : Allow users to pass application inference profile ARNs directly as model IDs in API requests

Type : Boolean

Default : false

Behavior : When enabled, users can use application inference profile ARNs instead of model IDs in the model parameter. Application inference profiles are custom routing configurations for specific use cases

IAM Permissions Required : bedrock:GetInferenceProfile (see IAM Permissions)

# Disabled (default) - users can only use standard model IDs
# No environment variable needed

# Enable application inference profile ARN support
export AWS_BEDROCK_ALLOW_APPLICATION_INFERENCE_PROFILE_ARN=true

Additional IAM Permissions Required

Enabling this setting requires adding the bedrock:GetInferenceProfile IAM permission to your role/user. Without this permission, API requests using application inference profile ARNs will fail with authorization errors.

See the Bedrock Inference Profiles and Prompt Routers IAM section for the complete policy configuration.

Example ARN

arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/abc123xyz

What are Application Inference Profiles?

Application inference profiles are custom routing configurations that you create in your AWS account. They allow you to define specific routing behavior, region preferences, and failover strategies tailored to your application's needs.

When to Enable

Enable this setting when:

  • You have custom application inference profiles configured in your AWS account
  • You need application-specific routing configurations
  • You want to give users access to custom profiles you've created

AWS_BEDROCK_ALLOW_PROMPT_ROUTER_ARN

Purpose : Allow users to pass prompt router ARNs directly as model IDs in API requests

Type : Boolean

Default : false

Behavior : When enabled, users can use prompt router ARNs instead of model IDs in the model parameter. Prompt routers enable dynamic model selection based on prompt characteristics

IAM Permissions Required : bedrock:GetPromptRouter (see IAM Permissions)

# Disabled (default) - users can only use standard model IDs
# No environment variable needed

# Enable prompt router ARN support
export AWS_BEDROCK_ALLOW_PROMPT_ROUTER_ARN=true

Additional IAM Permissions Required

Enabling this setting requires adding the bedrock:GetPromptRouter IAM permission to your role/user. Without this permission, API requests using prompt router ARNs will fail with authorization errors.

See the Bedrock Inference Profiles and Prompt Routers IAM section for the complete policy configuration.

Example ARN

arn:aws:bedrock:us-east-1:123456789012:default-prompt-router/my-router

What are Prompt Routers?

Prompt routers are intelligent routing systems that analyze prompt characteristics (length, complexity, language) and dynamically select the most appropriate model. This enables cost optimization and performance tuning based on request patterns.

When to Enable

Enable this setting when:

  • You have prompt routers configured in your AWS account
  • You want intelligent cost optimization through dynamic model selection
  • You need automatic model selection based on prompt complexity

AWS_BEDROCK_MODEL_ARN_MAPPING

Purpose : Map standard model IDs to custom inference profile or prompt router ARNs for server-controlled routing

Format : JSON object with model IDs as keys and ARNs as values

Default : {} (empty, no mappings)

Behavior : When configured, the mapped ARN is used instead of the default cross-region inference profile when clients request the model by its standard ID. This provides centralized control over model routing without requiring client changes

export AWS_BEDROCK_MODEL_ARN_MAPPING='{
  "anthropic.claude-3-5-sonnet-20241022-v2:0": "arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/my-custom-profile",
  "anthropic.claude-3-5-haiku-20241022-v1:0": "arn:aws:bedrock:us-east-1:123456789012:default-prompt-router/my-router"
}'

What is Model ARN Mapping?

Model ARN mapping allows server administrators to override the default routing behavior for specific models. When a client requests a model using its standard ID (e.g., anthropic.claude-3-5-sonnet-20241022-v2:0), the server automatically uses the mapped ARN for routing instead.

Supported ARN Types:

  • Cross-region inference profiles - AWS-managed multi-region routing
  • Application inference profiles - Custom routing configurations
  • Prompt routers - Intelligent dynamic model selection

Key Benefits

  • Centralized Control - Change routing behavior without modifying client code
  • Transparent to Clients - Clients use standard model IDs, server handles routing
  • Easy Migration - Switch between routing strategies by updating server config
  • Environment-Specific - Different mappings for dev/staging/production environments

Use Cases

Cost Optimization with Prompt Router:

export AWS_BEDROCK_MODEL_ARN_MAPPING='{
  "anthropic.claude-3-5-sonnet-20241022-v2:0": "arn:aws:bedrock:us-east-1:123456789012:default-prompt-router/cost-optimizer"
}'

Automatically route simple prompts to cheaper models, complex prompts to premium models.

Custom Application Profile:

export AWS_BEDROCK_MODEL_ARN_MAPPING='{
  "anthropic.claude-3-5-sonnet-20241022-v2:0": "arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/production-profile"
}'

Use your custom inference profile with specific region preferences and failover behavior.

Environment-Specific Routing:

# Production: Use cost-optimized prompt router
export AWS_BEDROCK_MODEL_ARN_MAPPING='{"anthropic.claude-3-5-sonnet-20241022-v2:0": "arn:aws:bedrock:us-east-1:123456789012:default-prompt-router/prod-router"}'

# Development: Use standard cross-region profile
export AWS_BEDROCK_MODEL_ARN_MAPPING='{}'

Best Practices

  • Test mappings in development before deploying to production
  • Document your ARN mappings and their purposes
  • Keep ARN mappings in version control alongside other configuration
  • Monitor routing behavior after updating mappings

Startup Warning

If any model IDs in AWS_BEDROCK_MODEL_ARN_MAPPING are not found among available Bedrock models, a warning listing the affected entries is logged at startup. This typically means the model is not enabled in your configured regions or the model ID contains a typo.

Other AWS Services

Optional Configuration

Each service region is optional and defaults to the first region in AWS_BEDROCK_REGIONS if not specified.

AWS_POLLY_REGION

Purpose : Region for Amazon Polly text-to-speech service

Default : First region in AWS_BEDROCK_REGIONS

export AWS_POLLY_REGION=us-east-1

Amazon Polly Engine Availability

Not all Polly engines (Standard, Neural, Long-form, Generative) are available in all AWS regions. Verify engine and voice availability in your target region. See Amazon Polly feature and region compatibility for detailed information.
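
If a specific engine is required, the default can be pinned via DEFAULT_TTS_MODEL (see the Application Behavior settings). A sketch assuming the model ID follows the amazon.polly-<engine> pattern of the documented default (amazon.polly-standard); verify the exact IDs for your deployment:

# Assumes the neural-engine model ID follows the amazon.polly-<engine> pattern
export AWS_POLLY_REGION=us-east-1
export DEFAULT_TTS_MODEL=amazon.polly-neural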

AWS_COMPREHEND_REGION

Purpose : Region for Amazon Comprehend language detection service

Default : First region in AWS_BEDROCK_REGIONS

export AWS_COMPREHEND_REGION=us-east-1

Amazon Comprehend Regional Availability

Amazon Comprehend is not available in all AWS regions. stdapi.ai uses the detect_dominant_language feature for language detection. Verify service and feature availability in your target region. See Amazon Comprehend supported regions for regional availability.

AWS_TRANSCRIBE_REGION

Purpose : Region for Amazon Transcribe speech-to-text service

Default : First region in AWS_BEDROCK_REGIONS

export AWS_TRANSCRIBE_REGION=us-east-1

AWS_TRANSLATE_REGION

Purpose : Region for Amazon Translate text translation service

Default : First region in AWS_BEDROCK_REGIONS

export AWS_TRANSLATE_REGION=us-east-1

Compliance and Latency Optimization

Strategic region configuration is critical for both regulatory compliance and performance optimization. This section provides best practice configurations for common scenarios.

AWS AI Services Data Privacy

Amazon Bedrock: Does not store or use user prompts and responses, and does not share them with third parties by default. Your content remains private and is not used to train models.

Other AI Services: AWS collects telemetry data from other AI services (Polly, Comprehend, Transcribe, Translate) by default. For enhanced data privacy and compliance, you can opt out of AWS using your content to improve AI services. Configure AI services opt-out policies at the AWS Organizations level to prevent your data from being used for service improvement.
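
A minimal sketch of an Organizations AI services opt-out policy that opts all accounts out of content use for every AI service (verify the exact syntax against the AWS Organizations documentation before applying):

{
  "services": {
    "default": {
      "opt_out_policy": {
        "@@assign": "optOut"
      }
    }
  }
}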

GDPR and Data Residency Compliance

For applications serving European users, data residency regulations like GDPR may require that data processing occurs within specific geographic boundaries.

EU-Only Configuration (Strict GDPR)

# Use only European regions
export AWS_S3_BUCKET=my-stdapi-eu-bucket
export AWS_BEDROCK_REGIONS=eu-west-1,eu-west-3,eu-central-1

# Disable global cross-region inference to prevent data routing outside Europe
export AWS_BEDROCK_CROSS_REGION_INFERENCE_GLOBAL=false

# Keep cross-region inference enabled for failover within EU regions
export AWS_BEDROCK_CROSS_REGION_INFERENCE=true

Key Compliance Settings

  • AWS_BEDROCK_CROSS_REGION_INFERENCE_GLOBAL=false: Prevents requests from being routed to regions outside your specified list
  • AWS_BEDROCK_CROSS_REGION_INFERENCE=true: Enables cross-region inference within your specified EU regions
  • All services in EU regions: Ensures all data processing stays within European boundaries

Important Considerations

  • Not all Bedrock models are available in all EU regions - verify model availability
  • Some newer models may be available in US regions first; this configuration prioritizes compliance over immediate access to latest models
  • S3 buckets must be created in EU regions and configured appropriately for data residency

Latency Optimization

For applications prioritizing low latency and high performance, configure regions closest to your users and application infrastructure.

🇺🇸 North America:

# Primary region for lowest latency, with fallbacks
export AWS_S3_BUCKET=my-stdapi-us-east-1-bucket
export AWS_BEDROCK_REGIONS=us-east-1,us-west-2,us-east-2

# Enable all cross-region inference for maximum model availability
export AWS_BEDROCK_CROSS_REGION_INFERENCE=true
export AWS_BEDROCK_CROSS_REGION_INFERENCE_GLOBAL=true

🇯🇵 Asia-Pacific:

# Use Asia-Pacific regions for lowest latency to APAC users
export AWS_S3_BUCKET=my-stdapi-ap-southeast-1-bucket
export AWS_BEDROCK_REGIONS=ap-southeast-1,ap-northeast-1,us-west-2

# Enable global inference for fallback to US regions if needed
export AWS_BEDROCK_CROSS_REGION_INFERENCE=true
export AWS_BEDROCK_CROSS_REGION_INFERENCE_GLOBAL=true

🌍 Global Multi-Region:

# Balanced configuration with worldwide coverage
export AWS_S3_BUCKET=my-stdapi-us-east-1-bucket
export AWS_BEDROCK_REGIONS=us-east-1,eu-west-1,ap-southeast-1,us-west-2

# Enable global inference for best availability
export AWS_BEDROCK_CROSS_REGION_INFERENCE=true
export AWS_BEDROCK_CROSS_REGION_INFERENCE_GLOBAL=true

Latency Optimization Tips

  • Server and S3 co-location: Deploy stdapi.ai and your AWS_S3_BUCKET in the first region specified in AWS_BEDROCK_REGIONS (your primary region)
  • Network proximity: Choose the first region based on low latency to your application servers and end users
  • Data transfer costs: Cross-region data transfer incurs costs; co-locating server and S3 in the same region minimizes these
  • Model availability: While us-east-1 often has the most models, check specific model availability in your target regions

Hybrid Approach: Compliance with Performance

Balance compliance requirements with performance needs:

EU Primary with US Fallback

# EU primary with US fallback (for model availability)
export AWS_S3_BUCKET=my-stdapi-eu-bucket
export AWS_BEDROCK_REGIONS=eu-west-1,eu-central-1,us-east-1

# Allow cross-region but restrict to specific regions only
export AWS_BEDROCK_CROSS_REGION_INFERENCE=true
export AWS_BEDROCK_CROSS_REGION_INFERENCE_GLOBAL=false

Legal Compliance Notice

Including us-east-1 as a fallback region provides access to more models but may not comply with strict data residency requirements. Consult your legal and compliance teams before using this configuration.


Configuration Order

When deploying stdapi.ai, configure settings in this recommended order:

  1. IAM Permissions - Set up AWS access first
  2. AWS Services and Regions - Configure S3 buckets and Bedrock regions
  3. Authentication - Secure your API with authentication
  4. Optional Features - Add observability, guardrails, and other features as needed

IAM Permissions

stdapi.ai requires specific AWS IAM permissions to access Bedrock models and other AWS services. The exact permissions needed depend on which features you enable.

Building Your Policy

Combine the permission statements below based on the features you need. At minimum, you need the Bedrock permissions. Add statements for S3, TTS, STT, and other features as required by your deployment.

Bedrock (Required)

Environment Variables: Always required

These permissions are mandatory for stdapi.ai to discover and invoke Bedrock models:

Bedrock IAM Policy Statements
{
  "Sid": "BedrockModelInvoke",
  "Effect": "Allow",
  "Action": [
    "bedrock:CountTokens",
    "bedrock:GetAsyncInvoke",
    "bedrock:InvokeModel",
    "bedrock:InvokeModelWithResponseStream",
    "bedrock:InvokeTool"
  ],
  "Resource": "*"
},
{
  "Sid": "BedrockAsyncInvokeTagging",
  "Effect": "Allow",
  "Action": [
    "bedrock:TagResource"
  ],
  "Resource": "arn:aws:bedrock:*:*:async-invoke/*"
},
{
  "Sid": "BedrockModelDiscovery",
  "Effect": "Allow",
  "Action": [
    "bedrock:ListFoundationModels",
    "bedrock:GetFoundationModelAvailability",
    "bedrock:ListProvisionedModelThroughputs",
    "bedrock:ListInferenceProfiles"
  ],
  "Resource": "*"
}

Bedrock Marketplace Auto-Subscribe (Optional)

Environment Variables: AWS_BEDROCK_MARKETPLACE_AUTO_SUBSCRIBE

Required only if you want to enable automatic subscription to new models in the AWS Marketplace (AWS_BEDROCK_MARKETPLACE_AUTO_SUBSCRIBE=true, which is the default). When enabled, the server can automatically subscribe to marketplace offerings for newly discovered models.

Bedrock Marketplace Auto-Subscribe IAM Policy Statement
{
  "Sid": "BedrockMarketplaceAutoSubscribe",
  "Effect": "Allow",
  "Action": [
    "aws-marketplace:Subscribe",
    "aws-marketplace:ViewSubscriptions"
  ],
  "Resource": "*"
}

Cost Consideration

Automatic marketplace subscriptions may incur costs. Review AWS Marketplace pricing for individual models before enabling this feature, or set AWS_BEDROCK_MARKETPLACE_AUTO_SUBSCRIBE=false to require manual marketplace subscription.

Bedrock Inference Profiles and Prompt Routers (Optional)

Environment Variables: AWS_BEDROCK_ALLOW_CROSS_REGION_INFERENCE_PROFILE_ARN, AWS_BEDROCK_ALLOW_APPLICATION_INFERENCE_PROFILE_ARN, AWS_BEDROCK_ALLOW_PROMPT_ROUTER_ARN, AWS_BEDROCK_MODEL_ARN_MAPPING

Required only if you enable ARN-based routing features that allow users to pass inference profile or prompt router ARNs directly as model IDs, or if you configure server-side ARN mappings.

Bedrock Inference Profiles and Prompt Routers IAM Policy Statement
{
  "Sid": "BedrockInferenceProfilesAndPromptRouters",
  "Effect": "Allow",
  "Action": [
    "bedrock:GetInferenceProfile",
    "bedrock:GetPromptRouter"
  ],
  "Resource": "*"
}

When to Include

Add these permissions when:

  • AWS_BEDROCK_ALLOW_CROSS_REGION_INFERENCE_PROFILE_ARN=true
  • AWS_BEDROCK_ALLOW_APPLICATION_INFERENCE_PROFILE_ARN=true
  • AWS_BEDROCK_ALLOW_PROMPT_ROUTER_ARN=true
  • AWS_BEDROCK_MODEL_ARN_MAPPING is configured with any mappings

Bedrock Guardrails (Optional)

Environment Variables: AWS_BEDROCK_GUARDRAIL_IDENTIFIER, AWS_BEDROCK_GUARDRAIL_VERSION

Required only if you configure Bedrock Guardrails for content filtering. See Bedrock Guardrails configuration section.

Bedrock Guardrails IAM Policy Statement
{
  "Sid": "BedrockGuardrails",
  "Effect": "Allow",
  "Action": [
    "bedrock:ApplyGuardrail"
  ],
  "Resource": "arn:aws:bedrock:*:*:guardrail/*"
}

S3 File Storage (Optional)

Environment Variables: AWS_S3_BUCKET

Required for storing generated images, audio files, and documents. See Storage Configuration for bucket setup details.

S3 File Storage IAM Policy Statements
{
  "Sid": "S3FileStorage",
  "Effect": "Allow",
  "Action": [
    "s3:PutObject",
    "s3:PutObjectTagging",
    "s3:GetObject",
    "s3:DeleteObject",
    "s3:CreateMultipartUpload",
    "s3:UploadPart",
    "s3:CompleteMultipartUpload",
    "s3:AbortMultipartUpload",
    "s3:ListMultipartUploadParts"
  ],
  "Resource": "arn:aws:s3:::AWS_S3_BUCKET_VALUE/*"
},
{
  "Sid": "S3FileStorageList",
  "Effect": "Allow",
  "Action": [
    "s3:ListBucket",
    "s3:ListBucketMultipartUploads"
  ],
  "Resource": "arn:aws:s3:::AWS_S3_BUCKET_VALUE"
}

Replace Bucket Name

Replace AWS_S3_BUCKET_VALUE with the value of your AWS_S3_BUCKET environment variable.

If your S3 bucket uses KMS encryption, also add:

{
  "Sid": "KMSEncryptedBucket",
  "Effect": "Allow",
  "Action": [
    "kms:Decrypt",
    "kms:GenerateDataKey"
  ],
  "Resource": "arn:aws:kms:REGION:ACCOUNT_ID:key/YOUR_KMS_KEY_ID",
  "Condition": {
    "StringEquals": {
      "kms:ViaService": "s3.REGION.amazonaws.com"
    }
  }
}

KMS Security

The kms:ViaService condition restricts KMS key usage to S3 service calls only, following AWS security best practices.

Text-to-Speech (Optional)

Environment Variables: AWS_POLLY_REGION, DEFAULT_TTS_MODEL, DEFAULT_TTS_LANGUAGE

Required for generating speech from text using Amazon Polly. See Audio and Text-to-Speech configuration section.

Optimize Performance

Set DEFAULT_TTS_LANGUAGE to skip language detection and avoid AWS Comprehend API calls, improving response times and reducing costs.

Polly Text-to-Speech IAM Policy Statement
{
  "Sid": "PollyTextToSpeech",
  "Effect": "Allow",
  "Action": [
    "polly:SynthesizeSpeech",
    "polly:DescribeVoices"
  ],
  "Resource": "*"
}

Speech-to-Text (Optional)

Environment Variables: AWS_TRANSCRIBE_REGION, AWS_TRANSCRIBE_S3_BUCKET

Required for transcribing audio files using Amazon Transcribe.

Transcribe Speech-to-Text IAM Policy Statements
{
  "Sid": "TranscribeSpeechToText",
  "Effect": "Allow",
  "Action": [
    "transcribe:StartTranscriptionJob",
    "transcribe:GetTranscriptionJob",
    "transcribe:DeleteTranscriptionJob"
  ],
  "Resource": "*"
},
{
  "Sid": "TranscribeTagging",
  "Effect": "Allow",
  "Action": [
    "transcribe:TagResource"
  ],
  "Resource": "arn:aws:transcribe:*:*:transcription-job/*"
},
{
  "Sid": "TranscribeS3Storage",
  "Effect": "Allow",
  "Action": [
    "s3:PutObject",
    "s3:GetObject",
    "s3:DeleteObject"
  ],
  "Resource": "arn:aws:s3:::AWS_TRANSCRIBE_S3_BUCKET_VALUE/*"
}

Replace Bucket Name

Replace AWS_TRANSCRIBE_S3_BUCKET_VALUE with the value of your AWS_TRANSCRIBE_S3_BUCKET environment variable (or AWS_S3_BUCKET if using the same bucket).

If your transcribe S3 bucket uses KMS encryption, also add the KMS permissions with the appropriate bucket ARN.

Language Detection (Optional)

Environment Variables: AWS_COMPREHEND_REGION

Required for automatic language detection (used by TTS for voice selection).

Comprehend Language Detection IAM Policy Statement
{
  "Sid": "ComprehendLanguageDetection",
  "Effect": "Allow",
  "Action": [
    "comprehend:DetectDominantLanguage"
  ],
  "Resource": "*"
}

Text Translation (Optional)

Environment Variables: AWS_TRANSLATE_REGION

Required for text translation features.

Translate Text Translation IAM Policy Statement
{
  "Sid": "TranslateTextTranslation",
  "Effect": "Allow",
  "Action": [
    "translate:TranslateText"
  ],
  "Resource": "*"
}

API Key Authentication (Optional)

Required if you configure API authentication. See Authentication configuration section.

SSM Parameter Store

Environment Variables: API_KEY_SSM_PARAMETER

SSM Parameter Store IAM Policy Statements
{
  "Sid": "SSMParameterAccess",
  "Effect": "Allow",
  "Action": [
    "ssm:GetParameter"
  ],
  "Resource": "arn:aws:ssm:REGION:ACCOUNT_ID:parameter/API_KEY_SSM_PARAMETER_VALUE"
}

Replace Parameter Path

Replace API_KEY_SSM_PARAMETER_VALUE with the value of your API_KEY_SSM_PARAMETER environment variable (e.g., /stdapi/prod/api-key).

If using encrypted SSM parameters, also add:

{
  "Sid": "KMSDecryptionForSSM",
  "Effect": "Allow",
  "Action": [
    "kms:Decrypt"
  ],
  "Resource": "arn:aws:kms:REGION:ACCOUNT_ID:key/YOUR_KMS_KEY_ID",
  "Condition": {
    "StringEquals": {
      "kms:ViaService": "ssm.REGION.amazonaws.com"
    }
  }
}

KMS Security

The kms:ViaService condition restricts KMS key usage to SSM service calls only.

Secrets Manager

Environment Variables: API_KEY_SECRETSMANAGER_SECRET

Secrets Manager IAM Policy Statement
{
  "Sid": "SecretsManagerAccess",
  "Effect": "Allow",
  "Action": [
    "secretsmanager:GetSecretValue"
  ],
  "Resource": "arn:aws:secretsmanager:REGION:ACCOUNT_ID:secret:API_KEY_SECRETSMANAGER_SECRET_VALUE"
}

Replace Secret Name

Replace API_KEY_SECRETSMANAGER_SECRET_VALUE with the value of your API_KEY_SECRETSMANAGER_SECRET environment variable (e.g., stdapi-api-key). Note that Secrets Manager ARNs end with a hyphen and six random characters, so you may need to append -* to the resource (e.g., ...secret:stdapi-api-key-*) or use the secret's full ARN.

Complete Policy Examples

Minimal Policy (Bedrock Only)
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "BedrockModelInvoke",
      "Effect": "Allow",
      "Action": [
        "bedrock:CountTokens",
        "bedrock:GetAsyncInvoke",
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream",
        "bedrock:InvokeTool"
      ],
      "Resource": "*"
    },
    {
      "Sid": "BedrockAsyncInvokeTagging",
      "Effect": "Allow",
      "Action": [
        "bedrock:TagResource"
      ],
      "Resource": "arn:aws:bedrock:*:*:async-invoke/*"
    },
    {
      "Sid": "BedrockModelDiscovery",
      "Effect": "Allow",
      "Action": [
        "bedrock:ListFoundationModels",
        "bedrock:GetFoundationModelAvailability",
        "bedrock:ListProvisionedModelThroughputs",
        "bedrock:ListInferenceProfiles"
      ],
      "Resource": "*"
    },
    {
      "Sid": "BedrockMarketplaceAutoSubscribe",
      "Effect": "Allow",
      "Action": [
        "aws-marketplace:Subscribe",
        "aws-marketplace:ViewSubscriptions"
      ],
      "Resource": "*"
    }
  ]
}

Marketplace Auto-Subscribe (Default Enabled)

The marketplace permissions are included because AWS_BEDROCK_MARKETPLACE_AUTO_SUBSCRIBE defaults to true. If you set it to false, you can remove the BedrockMarketplaceAutoSubscribe statement.

Production Policy (Bedrock + S3 + Authentication)
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "BedrockModelInvoke",
      "Effect": "Allow",
      "Action": [
        "bedrock:CountTokens",
        "bedrock:GetAsyncInvoke",
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream",
        "bedrock:InvokeTool"
      ],
      "Resource": "*"
    },
    {
      "Sid": "BedrockAsyncInvokeTagging",
      "Effect": "Allow",
      "Action": [
        "bedrock:TagResource"
      ],
      "Resource": "arn:aws:bedrock:*:*:async-invoke/*"
    },
    {
      "Sid": "BedrockModelDiscovery",
      "Effect": "Allow",
      "Action": [
        "bedrock:ListFoundationModels",
        "bedrock:GetFoundationModelAvailability",
        "bedrock:ListProvisionedModelThroughputs",
        "bedrock:ListInferenceProfiles"
      ],
      "Resource": "*"
    },
    {
      "Sid": "BedrockMarketplaceAutoSubscribe",
      "Effect": "Allow",
      "Action": [
        "aws-marketplace:Subscribe",
        "aws-marketplace:ViewSubscriptions"
      ],
      "Resource": "*"
    },
    {
      "Sid": "S3FileStorage",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:PutObjectTagging",
        "s3:GetObject",
        "s3:DeleteObject",
        "s3:CreateMultipartUpload",
        "s3:UploadPart",
        "s3:CompleteMultipartUpload",
        "s3:AbortMultipartUpload",
        "s3:ListMultipartUploadParts"
      ],
      "Resource": "arn:aws:s3:::my-stdapi-bucket/*"
    },
    {
      "Sid": "S3FileStorageList",
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:ListBucketMultipartUploads"
      ],
      "Resource": "arn:aws:s3:::my-stdapi-bucket"
    },
    {
      "Sid": "SSMParameterAccess",
      "Effect": "Allow",
      "Action": [
        "ssm:GetParameter"
      ],
      "Resource": "arn:aws:ssm:us-east-1:123456789012:parameter/stdapi/prod/api-key"
    }
  ]
}

Marketplace Auto-Subscribe (Default Enabled)

The marketplace permissions are included because AWS_BEDROCK_MARKETPLACE_AUTO_SUBSCRIBE defaults to true. If you set it to false, you can remove the BedrockMarketplaceAutoSubscribe statement to follow the principle of least privilege.

Permission Notes

Least Privilege Principle

Only include the permission statements you need for your specific deployment. Start with Bedrock permissions and add others as required.

Feature-Specific Permission Requirements

Feature Required Permissions Configuration
Bedrock Models (Invoke) bedrock:CountTokens
bedrock:InvokeModel
bedrock:InvokeModelWithResponseStream
bedrock:InvokeTool
bedrock:TagResource (on arn:aws:bedrock:*:*:async-invoke/*)
Always required
Bedrock Models (Discovery) bedrock:ListFoundationModels
bedrock:GetFoundationModelAvailability
bedrock:ListProvisionedModelThroughputs
bedrock:ListInferenceProfiles
Always required
Bedrock Marketplace Auto-Subscribe aws-marketplace:Subscribe
aws-marketplace:ViewSubscriptions
AWS_BEDROCK_MARKETPLACE_AUTO_SUBSCRIBE=true (default)
Bedrock Inference Profiles & Prompt Routers bedrock:GetInferenceProfile
bedrock:GetPromptRouter
AWS_BEDROCK_ALLOW_*_ARN=true or AWS_BEDROCK_MODEL_ARN_MAPPING configured
Bedrock Guardrails bedrock:ApplyGuardrail AWS_BEDROCK_GUARDRAIL_IDENTIFIER
File Storage s3:PutObject
s3:PutObjectTagging
s3:GetObject
s3:DeleteObject
s3:CreateMultipartUpload
s3:UploadPart
s3:CompleteMultipartUpload
s3:AbortMultipartUpload
s3:ListMultipartUploadParts
s3:ListBucket
s3:ListBucketMultipartUploads
AWS_S3_BUCKET
KMS Encrypted S3 Buckets kms:Decrypt
kms:GenerateDataKey
with kms:ViaService condition
If S3 buckets use KMS encryption
Text-to-Speech polly:SynthesizeSpeech
polly:DescribeVoices
AWS_POLLY_REGION
Speech-to-Text transcribe:StartTranscriptionJob
transcribe:GetTranscriptionJob
transcribe:DeleteTranscriptionJob
transcribe:TagResource (on arn:aws:transcribe:*:*:transcription-job/*)
s3:PutObject (transcribe bucket)
AWS_TRANSCRIBE_REGION
AWS_TRANSCRIBE_S3_BUCKET
Language Detection comprehend:DetectDominantLanguage AWS_COMPREHEND_REGION
Translation translate:TranslateText AWS_TRANSLATE_REGION
SSM Parameter Store ssm:GetParameter
kms:Decrypt (if encrypted)
API_KEY_SSM_PARAMETER
Secrets Manager secretsmanager:GetSecretValue API_KEY_SECRETSMANAGER_SECRET

IAM Role vs. IAM User

stdapi.ai supports both IAM roles and IAM users:

  • IAM Role (Recommended): Use when running on EC2, ECS, Lambda, or other AWS compute services. Attach the policy to the instance/task role.
  • IAM User: Use when running outside AWS or for development. Create an IAM user with the required permissions and configure AWS credentials via environment variables (as shown below) or AWS CLI configuration.
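
For the IAM user case, credentials can be supplied through the standard AWS environment variables; a minimal sketch (the key values below are AWS's documented placeholders, not real credentials):

# Standard AWS SDK credential environment variables (placeholder values)
export AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
export AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
export AWS_DEFAULT_REGION=us-east-1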

Best Practice: Use IAM Roles

When deploying on AWS infrastructure, always prefer IAM roles over IAM users with access keys. IAM roles provide automatic credential rotation and better security.


Authentication

stdapi.ai supports three methods for API key authentication.

Authentication Methods

Configure only one method. If more than one is set, they are resolved in the following precedence order:

  1. SSM Parameter Store (highest precedence)
  2. Secrets Manager
  3. Direct API key (lowest precedence)

No Authentication Warning

If no authentication method is configured, the API accepts all requests without authentication and a security warning is logged at startup. This is suitable only for internal/private deployments.

Method 1: SSM Parameter Store (Recommended)

Use AWS Systems Manager Parameter Store for secure key storage with encryption, access control, and auditing. This method should be used only with already existing parameters.

API_KEY_SSM_PARAMETER

Purpose : Name of the SSM parameter containing the API key. The parameter is retrieved from the current region detected by the running container, or defaults to the first region in AWS_BEDROCK_REGIONS.

Recommendation : Use SecureString type for encryption at rest

IAM Permissions Required : ssm:GetParameter, kms:Decrypt (if encrypted)

export API_KEY_SSM_PARAMETER=/stdapi/prod/api-key
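
If the parameter does not exist yet, it can be created as an encrypted SecureString with the AWS CLI; a minimal sketch (parameter name and key value are illustrative):

# Create the API key as an encrypted SSM parameter (SecureString)
aws ssm put-parameter \
  --name /stdapi/prod/api-key \
  --type SecureString \
  --value "sk-your-api-key-here"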

Method 2: Secrets Manager

Use AWS Secrets Manager for secure key storage with automatic rotation support. This method should be used only with already existing secrets.

API_KEY_SECRETSMANAGER_SECRET

Purpose : Name of the Secrets Manager secret containing the API key. The secret is retrieved from the current region detected by the running container, or defaults to the first region in AWS_BEDROCK_REGIONS.

Format : Can be a plain string or JSON object

IAM Permissions Required : secretsmanager:GetSecretValue

API_KEY_SECRETSMANAGER_KEY

Purpose : JSON key name within the secret (if the secret is a JSON object)

Default : api_key

Plain String Secret:

export API_KEY_SECRETSMANAGER_SECRET=stdapi-api-key

JSON Secret:

export API_KEY_SECRETSMANAGER_SECRET=stdapi-credentials
export API_KEY_SECRETSMANAGER_KEY=api_key

Example JSON secret structure:

{
  "api_key": "sk-1234567890abcdef...",
  "other_config": "value"
}
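
If the secret does not exist yet, a JSON secret matching this structure can be created with the AWS CLI; a minimal sketch (secret name and key value are illustrative):

# Create a JSON secret containing the API key under the default api_key key
aws secretsmanager create-secret \
  --name stdapi-credentials \
  --secret-string '{"api_key": "sk-your-api-key-here"}'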

Method 3: Direct API Key

Provide the API key directly via environment variable.

API_KEY

Purpose : Static API key value

Security Warning : Avoid hardcoding in configuration files; use environment variables only

Client Usage : Clients must include this key in the Authorization: Bearer <key> header or X-API-Key header

export API_KEY=sk-1234567890abcdef...

API Compatibility

Configure the base URL paths for OpenAI and Anthropic-compatible API routes.

OPENAI_ROUTES_PREFIX

Purpose : Base path prefix for OpenAI-compatible API routes

Default : `` (empty, routes mounted at root)

Effect : All OpenAI-compatible endpoints will be mounted under this prefix

export OPENAI_ROUTES_PREFIX=/api

Example Endpoints

With the prefix /api, endpoints are available at:

  • /api/v1/chat/completions
  • /api/v1/models
  • /api/v1/embeddings

ANTHROPIC_ROUTES_PREFIX

Purpose : Base path prefix for Anthropic-compatible API routes

Default : /anthropic

Effect : All Anthropic-compatible endpoints will be mounted under this prefix

export ANTHROPIC_ROUTES_PREFIX=/anthropic

Example Endpoints

With the default prefix /anthropic, endpoints are available at:

  • /anthropic/v1/messages

Custom Prefix

You can change the prefix to match your organization's API structure:

export ANTHROPIC_ROUTES_PREFIX=/api/anthropic

This would mount the Messages API at /api/anthropic/v1/messages
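
A client would then call the Messages API at the prefixed path; a minimal sketch (hostname, API key, and model ID are illustrative):

curl -X POST https://api.example.com/api/anthropic/v1/messages \
  -H "Authorization: Bearer sk-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic.claude-sonnet-4-5-20250929-v1:0",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Hello!"}]
  }'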


CORS Configuration

Configure Cross-Origin Resource Sharing (CORS) to control which web origins can access your API from browsers.

CORS_ALLOW_ORIGINS

Purpose : List of origins allowed to make cross-origin requests

Format : JSON array of origin URLs

Default : None (CORS not enabled)

Best Practice : Only enable if your API is accessed from web browsers; specify exact origins in production

# Not configured (default) - CORS middleware not enabled
# Browser cross-origin requests will be blocked
# No environment variable needed

# Development: Allow all origins
export CORS_ALLOW_ORIGINS='["*"]'

# Production: Specific origins only
export CORS_ALLOW_ORIGINS='["https://myapp.com", "https://app.example.com"]'

# Multiple environments
export CORS_ALLOW_ORIGINS='["https://app.example.com", "https://staging.example.com"]'

What is CORS?

Cross-Origin Resource Sharing (CORS) is a browser security mechanism that restricts web pages from making requests to a different domain than the one serving the web page.

Without CORS enabled:

  • Browser requests from web applications will fail due to missing CORS headers
  • Non-browser clients (curl, SDKs, mobile apps, server-to-server) work normally
  • Most secure default - no cross-origin access from browsers

With CORS enabled:

  • Browsers can make requests from allowed origins
  • Preflight OPTIONS requests are handled automatically
  • Non-browser clients continue to work normally

Security Consideration

  • Default (not configured): CORS is disabled. Browser cross-origin requests will fail. This is the most secure default.
  • ["*"]: Allows requests from any web origin. Convenient for development but not recommended for production.
  • Specific origins: Only allows requests from listed origins. Recommended for production.

CORS Behavior

  • When CORS_ALLOW_ORIGINS is not configured (default), CORS is not enabled
  • When configured with specific origins or ["*"], CORS is enabled with:
    • Authorization headers with credentials allowed
    • All HTTP methods allowed
    • All request headers allowed

When to Configure

Configure CORS_ALLOW_ORIGINS when:

  • Your API is accessed from browser-based web applications (React, Vue, Angular, etc.)
  • Building a web frontend that calls your API from a different domain
  • Developing locally with web apps (browser at localhost:3000 calling API at localhost:8000)

When NOT to Configure

Do not configure CORS when:

  • Your API is only accessed from server-to-server integrations
  • Your API is only accessed from mobile apps or desktop clients
  • Your API is only accessed from CLI tools or SDKs
  • Your API is only accessed from non-browser HTTP clients

Non-browser clients don't enforce CORS, so enabling it is unnecessary overhead.


Trusted Host Configuration

Configure Host header validation to protect against Host header injection attacks.

TRUSTED_HOSTS

Purpose : List of trusted Host header values for validation

Format : JSON array of hostnames (supports wildcards)

Default : None (no Host header validation)

Best Practice : Use AWS ALB host-based routing rules instead when possible for better performance and management

# Not configured (default) - no Host header validation
# No environment variable needed

# Production: Specific hosts only
export TRUSTED_HOSTS='["api.example.com", "www.example.com"]'

# With wildcard subdomains
export TRUSTED_HOSTS='["*.example.com", "api.myapp.com"]'

# Multiple environments including localhost
export TRUSTED_HOSTS='["api.example.com", "staging.example.com", "localhost"]'

What is Host Header Validation?

The Host header in HTTP requests specifies the domain name of the server. Host header validation ensures that requests are only processed when they target your legitimate domains, preventing:

  • Host header injection attacks - Malicious manipulation of Host headers to generate poisoned cache entries or exploit application logic
  • Web cache poisoning - Attacks that exploit Host header handling in caching layers

Security Consideration

By default, no Host header validation is performed. For production deployments exposed to the internet, configure host validation.

Recommended approach for AWS deployments:

  • Use AWS ALB host-based routing rules to restrict which Host headers reach your application
  • Configure ALB listener rules to only forward traffic for approved hostnames
  • This provides better performance and centralized management compared to application-level validation

Use TRUSTED_HOSTS setting when:

  • You cannot configure host-based routing at the load balancer level
  • You need application-level defense-in-depth
  • You're not using AWS ALB or similar services

Wildcard Support

Wildcard subdomains are supported using the * prefix:

  • *.example.com - Matches any subdomain of example.com (api.example.com, app.example.com, etc.)
  • example.com - Matches only the exact domain
  • * - Not recommended, but matches all hosts (equivalent to no validation)

Common Configurations

Single Domain Production:

export TRUSTED_HOSTS='["api.example.com"]'

Multi-Domain with Subdomains:

export TRUSTED_HOSTS='["*.example.com", "*.myapp.com", "api.production.com"]'

Development and Production:

export TRUSTED_HOSTS='["api.example.com", "localhost", "127.0.0.1"]'

Host Validation Behavior

  • When TRUSTED_HOSTS is not configured (default), Host header validation is not enabled
  • When configured, requests with non-matching Host headers are rejected with HTTP 400 Bad Request
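
You can verify this behavior by sending requests with matching and non-matching Host headers; a minimal sketch (hostnames are illustrative):

# Allowed host: expect a normal response
curl -i https://api.example.com/v1/models -H "Host: api.example.com"

# Untrusted host: expect HTTP 400 Bad Request when TRUSTED_HOSTS is configured
curl -i https://api.example.com/v1/models -H "Host: evil.example.net"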

When to Configure

Configure TRUSTED_HOSTS when:

  • You need defense-in-depth beyond load balancer rules
  • You cannot configure host-based routing at the load balancer level
  • Deploying without AWS ALB or similar load balancer with host validation

AWS ALB Host-Based Routing (Recommended)

Instead of using TRUSTED_HOSTS, configure AWS ALB listener rules to validate Host headers:

Via AWS Console:

  1. Navigate to EC2 → Load Balancers → Your ALB → Listeners
  2. Add rules to listener on port 443 (HTTPS)
  3. Add condition: "Host header" is "api.example.com"
  4. Forward to target group only if Host header matches

Via AWS CLI:

aws elbv2 create-rule \
  --listener-arn arn:aws:elasticloadbalancing:... \
  --priority 1 \
  --conditions Field=host-header,Values=api.example.com \
  --actions Type=forward,TargetGroupArn=arn:aws:elasticloadbalancing:...

Benefits of ALB host validation:

  • Better performance (rejected at load balancer, not application)
  • Centralized security policy management
  • ALB metrics and logging for rejected requests
  • Reduced load on application servers

Proxy Headers Configuration

Configure X-Forwarded-* header processing when running behind reverse proxies or load balancers.

ENABLE_PROXY_HEADERS

Purpose : Enable trusting X-Forwarded-* headers from reverse proxies

Type : Boolean

Default : false (disabled)

Best Practice : Only enable when running behind a trusted reverse proxy

# Disabled (default) - do not trust X-Forwarded-* headers
# No environment variable needed

# Enable when behind reverse proxy
export ENABLE_PROXY_HEADERS=true

What are X-Forwarded Headers?

When your application runs behind a reverse proxy (nginx, Apache, AWS ALB, CloudFront, etc.), the proxy sits between clients and your application. Without proxy header processing:

  • The application sees the proxy's IP address instead of the client's real IP
  • The application sees the proxy-to-app connection (e.g., HTTP) instead of the original client connection (e.g., HTTPS)
  • The application cannot distinguish between different clients behind the proxy

Reverse proxies add X-Forwarded-* headers to preserve the original request information:

  • X-Forwarded-For - Client's real IP address (and chain of proxies)
  • X-Forwarded-Proto - Original protocol (http/https)
  • X-Forwarded-Port - Original port number

Security Warning

CRITICAL: Only enable ENABLE_PROXY_HEADERS when running behind a trusted reverse proxy that properly sets X-Forwarded-* headers.

If enabled without a trusted proxy:

  • Clients can spoof their IP address by sending fake X-Forwarded-For headers
  • Security controls based on client IP (rate limiting, allowlists) can be bypassed
  • Logging and monitoring will record incorrect client information
  • Authentication and authorization decisions may be affected

Never enable this setting if your application is directly exposed to the internet without a reverse proxy.

Common Deployment Scenarios

Scenario 1: Direct to Internet (No Proxy)

# Do NOT enable proxy headers
# ENABLE_PROXY_HEADERS should remain false (default)

Your application receives requests directly from clients.

Scenario 2: Behind AWS ALB/CloudFront

export ENABLE_PROXY_HEADERS=true

AWS load balancer or CDN forwards requests to your application.

Scenario 3: Multiple AWS Proxy Layers

export ENABLE_PROXY_HEADERS=true

Example: CloudFront → ALB → Your Application

Proxy Headers Behavior

  • When ENABLE_PROXY_HEADERS is false (default), X-Forwarded-* headers are not trusted
  • When enabled, the server processes X-Forwarded-For, X-Forwarded-Proto, and X-Forwarded-Port headers to determine client information
  • All proxies are trusted - ensure your network architecture prevents untrusted sources from reaching the application

When to Enable

Enable ENABLE_PROXY_HEADERS when:

  • Deployed behind AWS ALB, NLB, API Gateway, or CloudFront
  • Running behind any reverse proxy that sets X-Forwarded-* headers

AWS Proxy Configuration

AWS ALB, NLB, and CloudFront automatically set X-Forwarded-* headers - no additional configuration needed.

When you enable ENABLE_PROXY_HEADERS=true, your application will trust these headers to determine:

  • Client's real IP address (from X-Forwarded-For)
  • Original protocol (from X-Forwarded-Proto: http/https)
  • Original port (from X-Forwarded-Port)

TLS / SSL Configuration

Configure end-to-end TLS encryption within the container. These are native Granian environment variables and are available with the provided container images.

GRANIAN_SSL_CERTIFICATE

Purpose : Path to the SSL certificate file

Type : File path

GRANIAN_SSL_KEYFILE

Purpose : Path to the SSL private key file (PKCS#8 format only)

Type : File path

GRANIAN_SSL_KEYFILE_PASSWORD

Purpose : Password for the private key file

Type : String

GRANIAN_SSL_PROTOCOL_MIN

Purpose : Minimum supported TLS version (tls1.2 or tls1.3)

Type : Enum

Default : tls1.3

GRANIAN_SSL_CA

Purpose : Path to the CA certificate bundle used to verify client certificates (mTLS)

Type : File path

GRANIAN_SSL_CLIENT_VERIFY

Purpose : Enable client certificate verification (mTLS)

Type : Boolean

Default : false (disabled)


GZip Compression

Configure automatic GZip compression for HTTP responses to reduce bandwidth usage and improve response times.

ENABLE_GZIP

Purpose : Enable GZip compression for HTTP responses

Type : Boolean

Default : false (disabled)

Best Practice : Use AWS ALB or CloudFront compression instead when available for better performance

# Disabled (default) - no response compression
# No environment variable needed

# Enable GZip compression (responses larger than 1 KiB will be compressed)
export ENABLE_GZIP=true

How GZip Compression Works

When enabled, the server automatically:

  1. Checks if the response size exceeds 1 KiB (1024 bytes)
  2. Verifies the client supports compression (via Accept-Encoding: gzip header)
  3. Compresses the response body using gzip
  4. Adds Content-Encoding: gzip header to the response

Typical compression ratios for JSON responses: 60-80% size reduction
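
To confirm compression is being applied, you can request a response with Accept-Encoding: gzip and check the response headers; a minimal sketch (hostname and key are illustrative):

# With ENABLE_GZIP=true and a response larger than 1 KiB,
# the response should include Content-Encoding: gzip
curl -s -o /dev/null -D - https://api.example.com/v1/models \
  -H "Authorization: Bearer sk-..." \
  -H "Accept-Encoding: gzip" | grep -i content-encoding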

Recommended: Use AWS Compression Services

Instead of enabling application-level compression, use AWS services for better performance:

AWS ALB (Application Load Balancer):

  • Enable compression in ALB target group attributes
  • ALB compresses responses before sending to clients
  • Reduces CPU load on your application servers
  • AWS ALB Compression Documentation

AWS CloudFront (CDN):

  • Enable automatic compression in CloudFront distribution settings
  • Compresses and caches responses at edge locations globally
  • Best performance for geographically distributed users
  • CloudFront Compression Documentation

Benefits of AWS-managed compression:

  • No CPU overhead on application servers
  • Offloads compression to AWS infrastructure
  • Better performance with CloudFront edge locations
  • Centralized configuration and management

When to Enable Application-Level Compression

Enable ENABLE_GZIP only when:

  • You're not using AWS ALB or CloudFront
  • Your API returns large JSON responses and you want to reduce bandwidth
  • Local development or non-AWS deployments

When NOT to Enable

Do not enable when:

  • You're behind AWS ALB with compression enabled
  • You're using CloudFront with compression enabled
  • CPU usage is a concern (compression adds CPU overhead)

Enabling compression at multiple layers is redundant and wastes CPU resources.

Compression Behavior

  • When ENABLE_GZIP is false (default), compression is not enabled
  • When enabled, only responses meeting these criteria are compressed:
    • Response size ≥ 1 KiB (1024 bytes)
    • Client sends Accept-Encoding: gzip header
    • Response does not already have Content-Encoding header
  • Streaming responses are compressed on-the-fly

Configuring AWS Compression

AWS ALB Compression:

Enable via AWS Console:

  1. Navigate to EC2 → Target Groups → Your Target Group
  2. Edit target group attributes
  3. Enable "Compression" attribute

Enable via AWS CLI:

aws elbv2 modify-target-group-attributes \
  --target-group-arn arn:aws:elasticloadbalancing:region:account:targetgroup/... \
  --attributes Key=compression.enabled,Value=true

AWS CloudFront Compression:

Enable via AWS Console:

  1. Navigate to CloudFront → Distributions → Your Distribution
  2. Edit behavior settings
  3. Enable "Compress Objects Automatically"

Enable via AWS CLI:

aws cloudfront update-distribution \
  --id YOUR_DISTRIBUTION_ID \
  --distribution-config file://config.json
# Set "Compress": true in the distribution config

Performance Impact

Application-level compression costs:

  • Increased CPU usage on application servers
  • Memory overhead for compression buffers
  • Small latency increase (1-5ms per request)

AWS-managed compression benefits:

  • No CPU impact on application servers
  • Better overall performance
  • Lower costs (compression offloaded to AWS infrastructure)

SSRF Protection

Configure Server-Side Request Forgery (SSRF) protection to prevent unauthorized access to internal networks.

SSRF_PROTECTION_BLOCK_PRIVATE_NETWORKS

Purpose : Enable SSRF protection by blocking requests to private/local networks

Type : Boolean

Default : true (enabled for security)

Best Practice : Keep enabled in production to protect against SSRF attacks

# Enabled (default) - block private networks
# No environment variable needed

# Disable only in controlled environments that need local network access
export SSRF_PROTECTION_BLOCK_PRIVATE_NETWORKS=false

What is SSRF Protection?

Server-Side Request Forgery (SSRF) is an attack where an attacker can make the server send requests to unintended destinations, including internal network resources.

SSRF protection has two layers:

  1. Baseline Protection (Always Enabled) - Cannot be disabled:

    • Loopback Addresses - 127.0.0.0/8, ::1
    • Unspecified Addresses - 0.0.0.0, ::
    • Link-Local Addresses - 169.254.0.0/16, fe80::/10
    • Reserved IP Ranges - IETF reserved addresses
    • Multicast Addresses - Multicast IP ranges
  2. Private Network Protection (Controlled by this setting):

    • RFC 1918 Private Networks - 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16
    • Other Private Address Ranges - Including IPv6 unique local addresses (fc00::/7)

Security Warning

CRITICAL: Only disable SSRF_PROTECTION_BLOCK_PRIVATE_NETWORKS in controlled environments where accessing internal networks is explicitly required and safe.

If disabled, private network protection is removed:

  • Attackers may be able to access RFC 1918 private network resources (10.x.x.x, 172.16-31.x.x, 192.168.x.x) through your API
  • Internal services on private networks (databases, admin panels, internal APIs) may be exposed
  • Internal APIs without authentication may be exploited

Important: Even when disabled, baseline protection remains active and prevents access to:

  • Loopback addresses (127.0.0.1, localhost) - always blocked
  • Link-local addresses (169.254.x.x) including AWS EC2 metadata endpoint - always blocked
  • Reserved and multicast addresses - always blocked

When to Disable

Disable SSRF_PROTECTION_BLOCK_PRIVATE_NETWORKS only when:

  • Your application legitimately needs to access internal network resources
  • Local development environment where accessing localhost services is required
  • You have other security controls in place (network segmentation, firewall rules)
  • Running in isolated Docker/container environments with restricted network access

Defense in Depth

Even with SSRF protection enabled, implement additional security measures:

  • Network Segmentation - Isolate application servers from sensitive internal networks
  • Firewall Rules - Restrict outbound connections from application servers
  • Security Groups - Use AWS security groups to limit network access
  • Monitoring - Log and monitor outbound requests for suspicious patterns

Observability (OpenTelemetry)

Configure distributed tracing for debugging and performance monitoring. stdapi.ai integrates with AWS X-Ray, Jaeger, DataDog, and other OTLP-compatible systems.

OTEL_ENABLED

Purpose : Enable or disable OpenTelemetry tracing

Type : Boolean

Default : false

export OTEL_ENABLED=true

Performance Consideration

Disable in performance-critical deployments where observability is not needed.

OTEL_SERVICE_NAME

Purpose : Service identifier in trace visualizations

Default : stdapi

Best Practice : Use descriptive names with environment information

export OTEL_SERVICE_NAME=stdapi-production-us-east-1

OTEL_EXPORTER_ENDPOINT

Purpose : OTLP HTTP endpoint URL for sending traces

Default : http://127.0.0.1:4318/v1/traces

Protocol : Must support OTLP HTTP format

AWS X-Ray (via ADOT):

export OTEL_EXPORTER_ENDPOINT=http://127.0.0.1:4318/v1/traces

Jaeger:

# Jaeger's built-in OTLP HTTP receiver listens on port 4318
export OTEL_EXPORTER_ENDPOINT=http://jaeger:4318/v1/traces

Cloud Provider OTLP:

# Use provider-specific OTLP endpoints
export OTEL_EXPORTER_ENDPOINT=https://your-provider-otlp-endpoint.com/v1/traces

OTEL_SAMPLE_RATE

Purpose : Percentage of requests to trace (controls cost vs. observability)

Type : Float (0.0 to 1.0)

Default : 1.0 (100%)

Development:

# Trace everything for debugging
export OTEL_SAMPLE_RATE=1.0

Production (Moderate Traffic):

# Sample 10% of requests
export OTEL_SAMPLE_RATE=0.1

Production (High Traffic):

# Sample 1% of requests
export OTEL_SAMPLE_RATE=0.01

Sampling Recommendations

Sample Rate Use Case
1.0 (100%) Development, debugging, low-traffic services
0.1 (10%) Production with moderate traffic
0.01 (1%) High-traffic production services
0.0 (0%) Equivalent to disabling tracing
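
Putting these settings together, a typical moderate-traffic production configuration might look like this (service name and endpoint are illustrative):

export OTEL_ENABLED=true
export OTEL_SERVICE_NAME=stdapi-production-us-east-1
export OTEL_EXPORTER_ENDPOINT=http://127.0.0.1:4318/v1/traces
export OTEL_SAMPLE_RATE=0.1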

API Documentation Routes

stdapi.ai provides automatic API documentation routes, which are disabled by default for security in production environments.

Security Consideration

Exposing API documentation routes in production can reveal internal API structure, available endpoints, and request/response schemas to potential attackers. Only enable these routes in development/testing environments or when absolutely necessary.

ENABLE_DOCS

Purpose : Enable interactive Swagger UI documentation at /docs

Type : Boolean

Default : false (disabled)

# Enable for development
export ENABLE_DOCS=true

Interactive Documentation Features

The /docs endpoint provides an interactive interface to:

  • Browse all available API endpoints
  • Test API requests directly from the browser
  • View request/response schemas
  • Understand parameter requirements

ENABLE_REDOC

Purpose : Enable ReDoc documentation UI at /redoc

Type : Boolean

Default : false (disabled)

# Enable for development
export ENABLE_REDOC=true

ReDoc Features

The /redoc endpoint provides a clean, responsive documentation interface with:

  • Three-panel layout for easy navigation
  • Enhanced schema visualization
  • Better rendering for complex APIs
  • Export to OpenAPI specification

Static Documentation Available

ReDoc API documentation is also available as static documentation at API Reference without requiring this endpoint to be enabled.

ENABLE_OPENAPI_JSON

Purpose : Enable OpenAPI schema JSON endpoint at /openapi.json

Type : Boolean

Default : false (disabled)

# Enable for development
export ENABLE_OPENAPI_JSON=true

OpenAPI Schema

The /openapi.json endpoint provides the raw OpenAPI 3.0 specification, useful for:

  • Generating API clients in various languages
  • Import into API testing tools (Postman, Insomnia)
  • API documentation generation
  • Contract testing and validation
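
For example, with ENABLE_OPENAPI_JSON=true you can download the schema for use with client generators or API testing tools; a minimal sketch (assuming the service listens locally on port 8000):

# Download the OpenAPI schema for client generation or import into API tools
curl -s http://localhost:8000/openapi.json -o openapi.json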

Automatic Enablement

If either ENABLE_DOCS or ENABLE_REDOC is set to true, the /openapi.json endpoint will be automatically enabled since both documentation UIs require the OpenAPI schema to function. You only need to explicitly set ENABLE_OPENAPI_JSON=true if you want to expose the schema endpoint without enabling the documentation UIs.

Development Configuration

Enable all documentation routes for local development:

export ENABLE_DOCS=true
export ENABLE_REDOC=true
# ENABLE_OPENAPI_JSON is automatically enabled when ENABLE_DOCS or ENABLE_REDOC is true

Or enable only Swagger UI:

export ENABLE_DOCS=true
# ENABLE_OPENAPI_JSON is automatically enabled

Or enable only ReDoc:

export ENABLE_REDOC=true
# ENABLE_OPENAPI_JSON is automatically enabled

Production Best Practice

# Keep all routes disabled in production (default)
# No environment variables needed - defaults to false

Production Warning

Never enable these routes in production unless you have specific security controls in place (e.g., IP allowlisting, VPN-only access, or additional authentication layer).


Validation and Logging

For comprehensive logging and monitoring information, see the Logging and Monitoring guide.

Variable Type Default Description
STRICT_INPUT_VALIDATION Boolean false Reject API requests containing unknown/extra fields
LOG_LEVEL String info Minimum log level to output (see Logging Level)
LOG_REQUEST_PARAMS Boolean false Include request/response parameters in logs
TIMEZONE String UTC IANA timezone identifier for request timestamps

Strict Validation:

# Returns HTTP 400 for requests with unexpected fields
export STRICT_INPUT_VALIDATION=true

Logging Level

LOG_LEVEL

Purpose : Control the minimum severity of log events written to STDOUT

Default : info

Options : info, warning, error, critical, disabled

Behavior : Only log events at or above the configured level are output. Log levels are ordered by severity: info < warning < error < critical

# Default: Output all log events
export LOG_LEVEL=info

# Production: Suppress info logs, show only warnings and higher
export LOG_LEVEL=warning

# Critical only: Show only critical errors
export LOG_LEVEL=critical

# Disable logging: Suppress all log output (not recommended)
export LOG_LEVEL=disabled

Log Level Examples

Level Outputs Use Case
info info, warning, error, critical Development, debugging, full visibility
warning warning, error, critical Production (recommended for most deployments)
error error, critical High-traffic production, reduce log volume
critical critical only Minimal logging, only show fatal errors
disabled none Not recommended - disables all logging

Production Recommendation

For production deployments, warning is recommended to reduce log volume while maintaining visibility into issues. The info level can generate significant log volume in high-traffic environments.

For detailed information about log events, structure, and monitoring strategies, see the Logging and Monitoring guide.

Debug Logging:

# Enable for debugging (NOT recommended for production)
export LOG_REQUEST_PARAMS=true

Security and cost warning

Enabling LOG_REQUEST_PARAMS may expose sensitive data in logs. Use only in development/debugging environments.

Logging full request/response payloads can also significantly increase log ingestion and storage costs, especially for large LLM prompts, tool calls, and generated outputs. If you must enable it, prefer short log retention, targeted sampling, and temporary use only.

Client IP Logging

LOG_CLIENT_IP

Purpose : Enable logging of client IP addresses for each request and add IP to OpenTelemetry spans

Type : Boolean

Default : false (disabled for privacy)

# Disabled (default) - no client IP logging
# No environment variable needed

# Enable client IP logging
export LOG_CLIENT_IP=true

Client IP Behavior

When enabled, client IP addresses are:

  • Included in log output for each request
  • Added as the client.address attribute to OpenTelemetry spans (when OTEL_ENABLED=true)

The IP address depends on your proxy configuration:

With ENABLE_PROXY_HEADERS=true (behind reverse proxy):

  • Logs the real client IP address from the X-Forwarded-For header
  • Shows the actual end-user IP, not the proxy IP
  • Requires your reverse proxy (ALB, CloudFront, etc.) to set the header correctly

With ENABLE_PROXY_HEADERS=false (default):

  • Logs the direct connection IP address
  • Typically shows your reverse proxy or load balancer IP, not the end-user IP
  • Limited usefulness unless application is directly exposed to clients

When to Enable

Enable LOG_CLIENT_IP when:

  • You need client IP addresses for security auditing or compliance
  • Analyzing traffic patterns and geographic distribution
  • Investigating abuse, fraud, or suspicious activity
  • Debugging client-specific issues

Important: Also enable ENABLE_PROXY_HEADERS=true when behind AWS ALB, CloudFront, or other reverse proxies to log the real client IP instead of the proxy IP.

Privacy Consideration

Client IP addresses are considered personal data under privacy regulations like GDPR. When logging IP addresses:

  • Consider shorter log retention periods
  • Document the purpose in your privacy policy
  • Ensure logs are stored securely
  • Implement log deletion procedures aligned with your data retention policy

Configuration for AWS Deployments

Behind AWS ALB or CloudFront:

# Enable proxy headers to get real client IPs
export ENABLE_PROXY_HEADERS=true
# Enable client IP logging
export LOG_CLIENT_IP=true

Direct exposure (not recommended for production):

# Only enable client IP logging
export LOG_CLIENT_IP=true
# ENABLE_PROXY_HEADERS remains false (default)

Timezone Configuration:

# UTC (default)
export TIMEZONE=UTC

# North America
export TIMEZONE=America/New_York

# Europe
export TIMEZONE=Europe/London

Bedrock Guardrails

Amazon Bedrock Guardrails add content filtering and safety controls to model inputs and outputs.

Configuration Options

Guardrails can be configured in three ways:

  1. Global - Via environment variables
  2. Per-request - Via HTTP headers
  3. Request body - Via amazon-bedrock-guardrailConfig object

Global Configuration

AWS_BEDROCK_GUARDRAIL_IDENTIFIER

Purpose : ID of the Bedrock Guardrail to apply

Required : Yes (together with AWS_BEDROCK_GUARDRAIL_VERSION)

export AWS_BEDROCK_GUARDRAIL_IDENTIFIER=abc123def456

AWS_BEDROCK_GUARDRAIL_VERSION

Purpose : Version of the Bedrock Guardrail

Required : Yes (together with AWS_BEDROCK_GUARDRAIL_IDENTIFIER)

export AWS_BEDROCK_GUARDRAIL_VERSION=1

AWS_BEDROCK_GUARDRAIL_TRACE

Purpose : Trace level for guardrail evaluation

Options : disabled, enabled, enabled_full

Default : None (optional)

export AWS_BEDROCK_GUARDRAIL_TRACE=enabled

AWS_BEDROCK_ALLOW_GUARDRAIL_OVERRIDE

Purpose : Control whether users can override the global guardrail configuration at request level via HTTP headers

Default : false (disabled for security)

Security Consideration : When set to false (default) and a global guardrail is configured, only the global configuration is enforced, preventing users from bypassing or modifying safety controls. Set to true if you need to allow per-request guardrail customization to override the global configuration.

Auto-Enable Behavior : If no global guardrail configuration is set (both AWS_BEDROCK_GUARDRAIL_IDENTIFIER and AWS_BEDROCK_GUARDRAIL_VERSION are unset), this setting is automatically set to true at startup, allowing per-request guardrails when no global policy is enforced.

export AWS_BEDROCK_ALLOW_GUARDRAIL_OVERRIDE=true

Complete Guardrail Configuration

export AWS_BEDROCK_GUARDRAIL_IDENTIFIER=abc123def456
export AWS_BEDROCK_GUARDRAIL_VERSION=1
export AWS_BEDROCK_GUARDRAIL_TRACE=enabled
export AWS_BEDROCK_ALLOW_GUARDRAIL_OVERRIDE=false  # Default: prevent overrides

Per-Request Configuration

Header Usage Behavior

Request headers can be used when AWS_BEDROCK_ALLOW_GUARDRAIL_OVERRIDE is true:

  • No global guardrail configured: Setting is automatically true at startup, enabling per-request guardrails
  • Global guardrail configured: Setting defaults to false for security; set to true to allow overrides

This prevents users from bypassing configured safety controls while still allowing flexibility when no global policy exists.

Use HTTP headers to specify guardrail settings per request:

Header Purpose Valid Values
X-Amzn-Bedrock-GuardrailIdentifier Guardrail ID Your guardrail identifier
X-Amzn-Bedrock-GuardrailVersion Guardrail version Version number (e.g., 1)
X-Amzn-Bedrock-Trace Trace level disabled, enabled, enabled_full
Example cURL Request
curl -X POST https://api.example.com/v1/chat/completions \
  -H "Authorization: Bearer sk-..." \
  -H "X-Amzn-Bedrock-GuardrailIdentifier: abc123def456" \
  -H "X-Amzn-Bedrock-GuardrailVersion: 1" \
  -H "X-Amzn-Bedrock-Trace: enabled" \
  -d '{"model": "anthropic.claude-3-sonnet", "messages": [...]}'

Request Body Configuration

The amazon-bedrock-guardrailConfig object in the request body is supported for OpenAI Chat Completions compatibility.

Compatibility Note

Only fields compatible with Bedrock Converse API are honored. The tagSuffix field is documented in AWS but not supported in this implementation.


Bedrock Service Tier and Performance Configuration

Amazon Bedrock service tiers and performance configurations allow you to optimize AI workload performance and cost trade-offs. Configure latency optimization and throughput priority for your inference requests.

AWS Documentation

For detailed information about service tiers, see the Amazon Bedrock documentation.

Service Tiers

Service tiers help you match AI workload performance with cost by selecting the appropriate throughput and latency characteristics:

  • priority - Highest priority processing with guaranteed capacity and fastest response times. Best for latency-sensitive applications.
  • default - Standard processing with balanced performance and cost. Suitable for most production workloads.
  • flex - Cost-optimized processing with flexible scheduling. Best for batch jobs and non-time-sensitive workloads.

Performance Configuration

Performance configuration allows you to optimize for latency:

  • standard - Standard latency profile with balanced performance
  • optimized - Optimized for lowest possible latency

Per-Request Configuration

Configure service tier and performance settings per request using HTTP headers. These headers are available on all Bedrock-based routes (Chat Completions, Embeddings, Images):

Header Purpose Valid Values
X-Amzn-Bedrock-Service-Tier Service tier selection priority, default, flex
X-Amzn-Bedrock-PerformanceConfig-Latency Latency optimization standard, optimized
Example: Chat Completions with Priority Tier and Optimized Latency
curl -X POST https://api.example.com/v1/chat/completions \
  -H "Authorization: Bearer sk-..." \
  -H "X-Amzn-Bedrock-Service-Tier: priority" \
  -H "X-Amzn-Bedrock-PerformanceConfig-Latency: optimized" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic.claude-sonnet-4-5-20250929-v1:0",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
Example: Embeddings with Flex Tier for Batch Processing
curl -X POST https://api.example.com/v1/embeddings \
  -H "Authorization: Bearer sk-..." \
  -H "X-Amzn-Bedrock-Service-Tier: flex" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "amazon.nova-2-multimodal-embeddings-v1:0",
    "input": ["text 1", "text 2", "text 3"]
  }'
Example: Image Generation with Default Tier
curl -X POST https://api.example.com/v1/images/generations \
  -H "Authorization: Bearer sk-..." \
  -H "X-Amzn-Bedrock-Service-Tier: default" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "amazon.nova-canvas-v1:0",
    "prompt": "A serene mountain landscape"
  }'

When to Use Each Tier

Priority Tier:

  • Real-time customer-facing applications
  • Interactive chatbots and assistants
  • Applications requiring guaranteed low latency
  • Production workloads with strict SLAs

Default Tier:

  • Standard production workloads
  • General-purpose API usage
  • Applications with moderate latency requirements

Flex Tier:

  • Batch processing and bulk operations
  • Offline content generation
  • Data processing pipelines
  • Non-time-sensitive workloads
  • Cost-optimized inference at scale

Audio and Text-to-Speech

DEFAULT_TTS_MODEL

Purpose : Default text-to-speech model when not specified in requests

Default : amazon.polly-standard

Model Description Quality
amazon.polly-standard Standard Polly voices Classic quality
amazon.polly-neural Neural Polly voices Higher quality, more natural
amazon.polly-long-form Long-form content Optimized for long content
amazon.polly-generative Generative AI voices Latest technology
export DEFAULT_TTS_MODEL=amazon.polly-neural

DEFAULT_TTS_LANGUAGE

Purpose : Default language code for text-to-speech synthesis when using OpenAI voice names

Default : None (automatic language detection via AWS Comprehend)

Behavior : When specified, this language is used instead of automatic detection. When not set, AWS Comprehend detects the language automatically from the input text.

Valid Language Codes: Any AWS Polly language code (e.g., en-US, fr-FR, es-ES, de-DE, ja-JP)

# Use English (US) for all TTS requests
export DEFAULT_TTS_LANGUAGE=en-US

# Use French for all TTS requests
export DEFAULT_TTS_LANGUAGE=fr-FR

Performance Benefits

Setting a default language improves performance by:

  • Faster responses: Skips language detection API call to AWS Comprehend
  • Reduced costs: No AWS Comprehend charges for language detection
  • Predictable voice selection: Always uses voices from the specified language

When to Use

Consider setting a default language when:

  • Your application primarily serves content in a single language
  • You want to optimize response times and reduce AWS service calls
  • You prefer predictable voice selection over automatic language matching

Interaction with Voice Selection

This setting only affects automatic language detection when using OpenAI voice names (like alloy, echo, nova). If you specify a Polly voice ID directly (like Joanna, Matthew), language detection is already skipped.
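
For instance, a request that names a Polly voice directly bypasses language detection regardless of DEFAULT_TTS_LANGUAGE; a minimal sketch using the OpenAI-compatible speech endpoint (hostname, key, and voice are illustrative):

# Using a Polly voice ID directly (Joanna) skips Comprehend language detection
curl -X POST https://api.example.com/v1/audio/speech \
  -H "Authorization: Bearer sk-..." \
  -H "Content-Type: application/json" \
  -d '{"model": "amazon.polly-neural", "voice": "Joanna", "input": "Hello from stdapi.ai!"}' \
  --output speech.mp3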


Token Counting

Control how token usage is calculated and reported in API responses.

TOKENS_ESTIMATION

Purpose : Estimate token counts using a tokenizer when the model doesn't return them directly

Type : Boolean

Default : false

export TOKENS_ESTIMATION=true

Use Case

Enable for consistent token reporting across all models.

TOKENS_ESTIMATION_DEFAULT_ENCODING

Purpose : Tiktoken encoding algorithm for token estimation

Default : o200k_base

Encoding Models
o200k_base GPT-4o and newer models
cl100k_base GPT-3.5-turbo, GPT-4
p50k_base Older GPT-3 models
export TOKENS_ESTIMATION_DEFAULT_ENCODING=o200k_base

Model Cache

stdapi.ai automatically discovers and caches available Bedrock models from configured regions. The cache is refreshed on-demand when expired, not via background tasks.

MODEL_CACHE_SECONDS

Purpose : Cache lifetime for the Bedrock models list before refresh

Type : Integer (seconds)

Default : 900 (15 minutes)

Behavior : When a request needs the model list (e.g., model lookup, /models endpoint) and the cache has expired, the server queries AWS Bedrock to discover newly available models, check for model access changes, and update inference profile configurations. This cache also applies to application inference profile and prompt router information when users pass ARNs directly (if enabled via AWS_BEDROCK_ALLOW_APPLICATION_INFERENCE_PROFILE_ARN or AWS_BEDROCK_ALLOW_PROMPT_ROUTER_ARN)

# Default: 15 minutes
export MODEL_CACHE_SECONDS=900

# More frequent updates (5 minutes)
export MODEL_CACHE_SECONDS=300

# Less frequent updates (1 hour)
export MODEL_CACHE_SECONDS=3600

Lazy Refresh Behavior

The model cache uses lazy (on-demand) refresh, not background tasks:

  • Cache is refreshed only when a request needs it and the cache has expired
  • Common triggers: model lookup failures, /v1/models API calls, inference requests with unknown models
  • The first request after expiration will experience additional latency while the cache refreshes (typically 2-5 seconds depending on number of regions)
  • All AWS API requests are executed in parallel across regions to minimize latency penalty
  • Subsequent requests use the fresh cache until it expires again

Tuning Recommendations

Interval       Use Case                              Trade-offs
300 (5 min)    Development, testing new models       More frequent refresh latency, faster model discovery
900 (15 min)   Production (default, balanced)        Balanced refresh frequency and latency impact
3600 (1 hour)  Stable production, cost optimization  Rare refresh latency, slower model discovery

Performance Considerations

  • Latency Impact: The first request after cache expiration will experience 2-5 seconds additional latency. All AWS API calls are parallelized to minimize this penalty, so latency scales with the slowest region rather than the sum of all regions.
  • API Calls: Each refresh makes parallel calls to ListFoundationModels, GetFoundationModelAvailability, and ListInferenceProfiles across all configured regions. Lower cache lifetimes increase the frequency of these calls.
  • Rate Limits: Very frequent refreshes in high-traffic deployments may approach API rate limits, though parallel execution doesn't increase the per-region request rate.
  • Multi-Region: Refresh latency is determined by the slowest responding region, not the total number of regions, thanks to parallel execution.

AI_RESPONSE_TIMEOUT

Purpose : Maximum time in seconds to wait for an AI model to complete a response

Type : Integer (seconds, must be greater than 0)

Default : 600 (10 minutes)

Behavior : Applies to both streaming and non-streaming requests. The timer starts from the moment the model begins generating and covers the full duration until the last token is received. If the model does not complete within this limit, the connection is closed and the request fails with a timeout error

# Default (10 minutes) - suitable for extended thinking models
export AI_RESPONSE_TIMEOUT=600

# Shorter timeout for standard models (2 minutes)
export AI_RESPONSE_TIMEOUT=120

# Longer timeout for very long documents or high reasoning budgets (15 minutes)
export AI_RESPONSE_TIMEOUT=900

When to Adjust

  • Increase if you see timeout errors with models that use extended thinking/reasoning, large document analysis, or high token budgets
  • Decrease to fail fast and free resources if your workload only uses standard models where long waits indicate a problem

Extended Thinking Models

Models with extended reasoning capabilities (such as Claude with thinking enabled or high reasoning_effort) may spend significant time generating internal reasoning steps before producing output. The default of 600 seconds accommodates these use cases. Standard models without extended thinking typically respond within 60 seconds.
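
For example, a deployment serving long reasoning workloads might pair a raised timeout with requests that ask for a high reasoning budget. The host, key, and the reasoning_effort field below are shown purely as an illustration, not as a prescribed request shape:

# Allow up to 15 minutes for models to finish extended reasoning
export AI_RESPONSE_TIMEOUT=900

curl https://api.example.com/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic.claude-sonnet-4-5-20250929-v1:0",
    "reasoning_effort": "high",
    "messages": [{"role": "user", "content": "Summarize this 200-page report..."}]
  }'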


Default Model Parameters

Configure default inference parameters applied automatically to specific models.

What You Can Do

  • Set consistent temperature/creativity levels per model
  • Enable provider-specific features (e.g., Anthropic beta features)
  • Configure default token limits for cost control
  • Apply model-specific stop sequences

Parameter Precedence

Request parameters always take precedence over defaults.

DEFAULT_MODEL_PARAMS

Purpose : Per-model default parameters

Format : JSON object with model IDs as keys

Supported Parameters:

Parameter          Type          Range    Description
temperature        Float         ≥ 0      Sampling temperature
top_p              Float         0.0-1.0  Nucleus sampling
max_tokens         Integer       ≥ 1      Maximum response tokens
stop_sequences     String/Array  -        Stop generation tokens
Provider-specific  Various       -        e.g., anthropic_beta

Configuration Examples

Basic Parameters:

export DEFAULT_MODEL_PARAMS='{
  "amazon.nova-micro-v1:0": {
    "temperature": 0.3,
    "max_tokens": 800
  }
}'

Provider-Specific Features:

export DEFAULT_MODEL_PARAMS='{
  "anthropic.claude-sonnet-4-5-20250929-v1:0": {
    "anthropic_beta": ["Interleaved-thinking-2025-05-14"]
  }
}'

Multiple Models:

export DEFAULT_MODEL_PARAMS='{
  "amazon.nova-micro-v1:0": {
    "temperature": 0.3,
    "max_tokens": 500
  },
  "amazon.nova-lite-v1:0": {
    "temperature": 0.7,
    "max_tokens": 2000
  },
  "anthropic.claude-sonnet-4-5-20250929-v1:0": {
    "temperature": 0.5,
    "top_p": 0.9,
    "anthropic_beta": ["Interleaved-thinking-2025-05-14"]
  }
}'

Advanced Configuration:

export DEFAULT_MODEL_PARAMS='{
  "amazon.nova-pro-v1:0": {
    "temperature": 0.7,
    "top_p": 0.95,
    "max_tokens": 4096,
    "stop_sequences": ["Human:", "Assistant:"]
  }
}'

Parameter Merging

graph LR
    A[Default Parameters] --> B[Merged Config]
    C[Request Parameters] --> B
    B --> D[Final Configuration]

  1. Default parameters are applied first (from DEFAULT_MODEL_PARAMS)
  2. Request parameters override defaults if both are specified (see the example after this list)
  3. Provider-specific fields are forwarded to Bedrock as additional model request fields
  4. Unsupported fields that would change the output cause an HTTP 400 error; other unsupported fields are ignored
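
For instance, with the Basic Parameters defaults shown above for amazon.nova-micro-v1:0 (temperature 0.3, max_tokens 800), a request that sets its own temperature overrides the default while still inheriting max_tokens; the host and key are placeholders:

curl https://api.example.com/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "amazon.nova-micro-v1:0",
    "temperature": 0.9,
    "messages": [{"role": "user", "content": "Write a haiku"}]
  }'
# Effective configuration: temperature 0.9 (from the request), max_tokens 800 (from DEFAULT_MODEL_PARAMS)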

Model Aliases

Configure custom aliases to map user-friendly model names to actual model IDs. This enables OpenAI API compatibility and simplifies model references.

What You Can Do

  • Create custom aliases for frequently used models
  • Enable OpenAI-compatible model names by default
  • Simplify model ID references in API requests
  • Seamlessly migrate between model versions

Default Aliases

stdapi.ai includes default aliases for OpenAI compatibility:

  • tts-1 → amazon.polly-standard
  • tts-1-hd → amazon.polly-neural
  • whisper-1 → amazon.transcribe

stdapi.ai also supports dynamic model name aliases matching official provider APIs (OpenAI, Anthropic). You can use model names from provider documentation (e.g., claude-sonnet-4-6, gpt-oss-20b), which are automatically resolved to their corresponding AWS Bedrock model identifiers.
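
For example, a client can pass a provider model name directly and stdapi.ai resolves it to the corresponding Bedrock model; the host and key are placeholders, and the model name is taken from the example above:

curl https://api.example.com/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "messages": [{"role": "user", "content": "Hello"}]
  }'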

MODEL_ALIASES

Purpose : Map alias names to actual model IDs or ARNs

Format : JSON object with alias names as keys and model IDs or ARNs as values

Default : {} (empty, uses built-in defaults only)

Advanced Routing with ARNs

Model aliases can also reference ARNs for Application Inference Profiles or Prompt Routers, enabling advanced routing strategies through friendly alias names. See Using Inference Profile and Prompt Router ARNs for more details.

Configuration Examples

Basic Alias:

export MODEL_ALIASES='{
  "my-tts": "amazon.polly-neural",
  "my-stt": "amazon.transcribe"
}'

Override Default Aliases:

# Override the default tts-1 mapping
export MODEL_ALIASES='{
  "tts-1": "amazon.polly-generative"
}'

Multiple Custom Aliases:

export MODEL_ALIASES='{
  "fast-model": "amazon.nova-micro-v1:0",
  "balanced-model": "amazon.nova-lite-v1:0",
  "quality-model": "amazon.nova-pro-v1:0",
  "claude": "anthropic.claude-sonnet-4-5-20250929-v1:0"
}'

Map OpenAI Models to Bedrock:

# Make OpenAI model names work with AWS Bedrock models
export MODEL_ALIASES='{
  "gpt-5": "anthropic.claude-sonnet-4-5-20250929-v1:0",
  "gpt-4o": "anthropic.claude-3-5-sonnet-20241022-v2:0",
  "gpt-4o-mini": "anthropic.claude-3-5-haiku-20241022-v1:0",
  "dall-e-3": "amazon.nova-canvas-v1:0",
  "dall-e-2": "stability.stable-image-ultra-v1:0"
}'

Override Deprecated Models:

# Redirect deprecated model IDs to their newer replacements
export MODEL_ALIASES='{
  "amazon.titan-image-generator-v1": "amazon.nova-canvas-v1:0",
  "amazon.titan-text-express-v1": "amazon.nova-lite-v1:0",
  "anthropic.claude-3-5-sonnet-20240620-v1:0": "anthropic.claude-sonnet-4-5-20250929-v1:0",
  "stability.stable-image-ultra-v1:0": "stability.stable-image-ultra-v1:1"
}'

Advanced Routing with ARNs:

# Map friendly names to Application Inference Profiles or Prompt Routers
export MODEL_ALIASES='{
  "my-router": "arn:aws:bedrock:us-east-1:123456789012:default-prompt-router/cost-optimizer",
  "my-profile": "arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/abc123xyz"
}'

Using Aliases in API Requests

Once configured, aliases can be used anywhere a model ID is expected:

# Using the default tts-1 alias
curl https://api.example.com/v1/audio/speech \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tts-1",
    "input": "Hello world",
    "voice": "alloy"
  }'

# Using a custom alias
curl https://api.example.com/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "fast-model",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

Alias Resolution

graph LR
    A[API Request] --> B{Alias Exists?}
    B -->|Yes| C[Resolve to Model ID]
    B -->|No| D[Use as Model ID]
    C --> E[Model Validation]
    D --> E
    E --> F[Execute Request]

  1. User-configured aliases override default aliases
  2. Default aliases apply if not overridden
  3. Non-aliased names pass through unchanged
  4. Resolved model ID is validated and used for the request

System Prompt Handling

Control how system prompts are handled for models that don't support them.

DROP_UNSUPPORTED_SYSTEM_PROMPT

Purpose : Control system prompt behavior for models that don't support system prompts

Type : Boolean

Default : true

# Default: silently drop system prompts for unsupported models
export DROP_UNSUPPORTED_SYSTEM_PROMPT=true

# Strict mode: return error when system prompt is used with unsupported model
export DROP_UNSUPPORTED_SYSTEM_PROMPT=false
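
To illustrate the difference between the two modes, consider a request that includes a system message for one of the models listed below (the host and key are placeholders):

curl https://api.example.com/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistral.mistral-7b-instruct-v0:2",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Hello"}
    ]
  }'
# DROP_UNSUPPORTED_SYSTEM_PROMPT=true (default): the system message is silently dropped and the request succeeds
# DROP_UNSUPPORTED_SYSTEM_PROMPT=false: the request is rejected with an error instead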

Models Without System Prompt Support

Some Bedrock models don't support system prompts, including:

  • mistral.mistral-7b-instruct-v0:2
  • mistral.mixtral-8x7b-instruct-v0:1
  • Other older or specialized models

Use Cases

Enable (true, default) for:

  • Backward compatibility - Existing applications continue working
  • Model flexibility - Switch between models without code changes
  • Graceful degradation - System prompts are ignored instead of failing
  • Global system prompts - Applications that set system prompts globally for all models work seamlessly

Disable (false) for:

  • Strict validation - Catch configuration errors early
  • Debugging - Identify when system prompts aren't being used
  • Security requirements - Ensure system prompts are always applied

Anthropic Beta Flag Filtering

Anthropic-compatible clients like Claude Code send anthropic-beta headers with experimental beta flags. Many of these flags (such as files-api-2025-04-14, prompt-caching-2024-07-31) are not supported by AWS Bedrock and cause ValidationException errors (HTTP 400).

stdapi.ai automatically filters out unsupported flags while preserving supported ones, so clients work without any special configuration. Previously, the workaround was to set CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS=1 on the client side, but this also disabled Bedrock-supported flags like Interleaved-thinking-2025-05-14 and token-efficient-tools-2025-02-19, degrading capabilities. This workaround is no longer needed.

Filtering is controlled by two settings: ANTHROPIC_BETA_FILTER to enable or disable it, and ANTHROPIC_BETA_ALLOWLIST to extend the built-in set of allowed flags.

ANTHROPIC_BETA_FILTER

Purpose : Enable or disable filtering of unsupported anthropic_beta flags for Anthropic Claude models

Type : Boolean

Default : true

Behavior : When enabled, anthropic_beta flags not in the allowlist are silently removed from requests before they reach Bedrock. A warning is logged when flags are filtered. When disabled, all flags are passed through to Bedrock as-is

# Enabled (default) - filter unsupported flags automatically
# No environment variable needed

# Disable filtering entirely (pass all flags through to Bedrock)
export ANTHROPIC_BETA_FILTER=false

When to Disable

Set to false only when:

  • Testing - You want to verify Bedrock behavior with specific flags directly
  • Custom setups - You manage flag compatibility at the client level

ANTHROPIC_BETA_ALLOWLIST

Purpose : Add extra anthropic_beta flags to the built-in set of Bedrock-supported flags

Format : Comma-separated string of additional beta flag names

Default : Empty (only the built-in Bedrock defaults are used)

Behavior : The flags specified here are merged with the built-in set of Bedrock-supported flags. You only need to specify extra flags beyond the defaults (e.g., newly added Bedrock flags). Only effective when ANTHROPIC_BETA_FILTER is true

# Use built-in defaults only (recommended) - no environment variable needed

# Add newly supported Bedrock flags without waiting for a stdapi.ai update
export ANTHROPIC_BETA_ALLOWLIST='new-feature-2026-03-01,another-flag-2026-04-01'

Built-in Allowed Flags:

Flag                              Feature
computer-use-2024-10-22           Computer use (Claude 3.5)
computer-use-2025-01-24           Computer use (Claude 3.7)
computer-use-2025-11-24           Computer use (Claude 4.5/4.6)
token-efficient-tools-2025-02-19  Token efficient tools
Interleaved-thinking-2025-05-14   Interleaved thinking
output-128k-2025-02-19            128K output
dev-full-thinking-2025-05-14      Raw thinking dev mode
context-1m-2025-08-07             1M context
context-management-2025-06-27     Context management (memory)
effort-2025-11-24                 Effort control
tool-search-tool-2025-10-19       Tool search
tool-examples-2025-10-29          Tool use examples

Use Cases

Filtering enabled (default) for:

  • Claude Code via Bedrock - Clients work without CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS=1
  • Production stability - Prevent unsupported flags from causing request failures
  • Drop-in compatibility - Clients configured for direct Anthropic API work through stdapi.ai without changes

Using Inference Profile and Prompt Router ARNs

stdapi.ai supports passing ARNs directly as model IDs in API requests, enabling advanced routing capabilities beyond standard model selection.

Simplify ARNs with Model Aliases

Instead of using long ARNs directly in API requests, you can create Model Aliases that map friendly names to ARNs. This provides shorter, easier-to-use naming for your API users.

Overview

Instead of using standard model IDs like anthropic.claude-3-5-sonnet-20241022-v2:0, you can pass ARNs that reference:

  • Cross-Region Inference Profiles - AWS-managed multi-region routing
  • Application Inference Profiles - Your custom routing configurations
  • Prompt Routers - Intelligent dynamic model selection

Automatic Cross-Region Routing

stdapi.ai automatically handles cross-region routing by default. When you use standard model IDs, the application automatically selects and uses the optimal AWS-managed cross-region inference profile based on your configured AWS_BEDROCK_REGIONS.

You typically do not need to manually pass cross-region inference profile ARNs. The automatic selection handles routing across your configured regions for best availability and latency.
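
In practice, this means a plain model ID in the request is all that is needed; the routing below happens server-side (the host and key are placeholders):

# Request uses a standard model ID...
curl https://api.example.com/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic.claude-3-5-sonnet-20241022-v2:0",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
# ...and stdapi.ai invokes it through the matching AWS-managed cross-region
# inference profile (e.g., us.anthropic.claude-3-5-sonnet-20241022-v2:0 when
# AWS_BEDROCK_REGIONS lists US regions), so no ARN needs to be passed.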

Manual ARN passing is primarily useful for:

  • Application inference profiles - Your custom routing configurations
  • Prompt routers - Intelligent cost optimization and dynamic model selection
  • Rare cases - When you need to override automatic cross-region profile selection

Enabling ARN Support

By default, users can only pass standard model IDs. To allow ARN usage, enable the appropriate settings:

# Allow cross-region inference profile ARNs
export AWS_BEDROCK_ALLOW_CROSS_REGION_INFERENCE_PROFILE_ARN=true

# Allow application inference profile ARNs
export AWS_BEDROCK_ALLOW_APPLICATION_INFERENCE_PROFILE_ARN=true

# Allow prompt router ARNs
export AWS_BEDROCK_ALLOW_PROMPT_ROUTER_ARN=true

Security Consideration

These settings are disabled by default. Only enable them when you want to give users explicit control over ARN-based routing. For centralized server-controlled routing, use AWS_BEDROCK_MODEL_ARN_MAPPING instead.

Using ARNs in API Requests

Once enabled, users can pass ARNs directly in the model parameter:

Cross-Region Inference Profile Example:

curl -X POST https://api.example.com/v1/chat/completions \
  -H "Authorization: Bearer sk-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "arn:aws:bedrock:us-east-1:123456789012:inference-profile/us.anthropic.claude-3-5-sonnet-20241022-v2:0",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'

Application Inference Profile Example:

curl -X POST https://api.example.com/v1/chat/completions \
  -H "Authorization: Bearer sk-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/my-custom-profile",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'

Prompt Router Example:

curl -X POST https://api.example.com/v1/chat/completions \
  -H "Authorization: Bearer sk-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "arn:aws:bedrock:us-east-1:123456789012:default-prompt-router/my-router",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'

Use Case Comparison

Approach                 Best For                                      Configuration
Standard Model IDs       Most common use case, simple routing          No special configuration needed
Server-Side ARN Mapping  Centralized control, transparent to clients   AWS_BEDROCK_MODEL_ARN_MAPPING
Client-Side ARN Passing  User-controlled routing, advanced use cases   Enable AWS_BEDROCK_ALLOW_*_ARN settings

Best Practices

Recommended Approach

For most deployments, use server-side ARN mapping (AWS_BEDROCK_MODEL_ARN_MAPPING):

  • Centralized control over routing behavior
  • Transparent to API clients
  • Easy to change routing without modifying client code
  • Better security (server controls which ARNs are used)

When to Allow Client-Side ARNs

Enable AWS_BEDROCK_ALLOW_*_ARN settings when:

  • Clients need fine-grained control over routing
  • Different clients require different routing strategies
  • Advanced users managing their own inference profiles
  • Testing and comparing different routing configurations

Security and Governance

When enabling client-side ARN passing:

  • Clients can bypass server-configured routing
  • Monitor usage to prevent unexpected costs
  • Ensure appropriate IAM permissions are in place
  • Track ARN usage through logs and monitoring

Required IAM Permissions

When using ARN-based routing, ensure your IAM role/user has the appropriate permissions:

{
  "Sid": "BedrockARNRouting",
  "Effect": "Allow",
  "Action": [
    "bedrock:GetInferenceProfile",
    "bedrock:GetPromptRouter"
  ],
  "Resource": "*"
}

See the IAM Permissions section for complete policy examples.