3 posts tagged with "anthropic"

View All Tags

Claude Code - Managing Anthropic Beta Headers

February 16, 2026

Sameer Kankute

SWE @ LiteLLM (LLM Translation)

Ishaan Jaff

CTO, LiteLLM

Krrish Dholakia

CEO, LiteLLM

When using Claude Code with LiteLLM and non-Anthropic providers (Bedrock, Azure AI, Vertex AI), you need to ensure that only supported beta headers are sent to each provider. This guide explains how to add support for new beta headers or fix invalid beta header errors.

What Are Beta Headers?

Anthropic uses beta headers to enable experimental features in Claude. When you use Claude Code, it may send beta headers like:

anthropic-beta: prompt-caching-scope-2026-01-05,advanced-tool-use-2025-11-20

However, not all providers support all Anthropic beta features. LiteLLM uses anthropic_beta_headers_config.json to manage which beta headers are supported by each provider.

Common Error Message

Error: The model returned the following errors: invalid beta flag

How LiteLLM Handles Beta Headers

LiteLLM uses a strict validation approach with a configuration file:

litellm/litellm/anthropic_beta_headers_config.json

This JSON file contains a mapping of beta headers for each provider:

Keys: Input beta header names (from Anthropic)
Values: Provider-specific header names (or null if unsupported)
Validation: Only headers present in the mapping with non-null values are forwarded

This enforces stricter validation than just filtering unsupported headers - headers must be explicitly defined to be allowed.

Adding Support for a New Beta Header

When Anthropic releases a new beta feature, you need to add it to the configuration file for each provider.

Step 1: Add the New Beta Header

Open anthropic_beta_headers_config.json and add the new header to each provider's mapping:

anthropic_beta_headers_config.json
{
  "description": "Mapping of Anthropic beta headers for each provider. Keys are input header names, values are provider-specific header names (or null if unsupported). Only headers present in mapping keys with non-null values can be forwarded.",
  "anthropic": {
    "advanced-tool-use-2025-11-20": "advanced-tool-use-2025-11-20",
    "new-feature-2026-03-01": "new-feature-2026-03-01",
    ...
  },
  "azure_ai": {
    "advanced-tool-use-2025-11-20": "advanced-tool-use-2025-11-20",
    "new-feature-2026-03-01": "new-feature-2026-03-01",
    ...
  },
  "bedrock_converse": {
    "advanced-tool-use-2025-11-20": "tool-search-tool-2025-10-19",
    "new-feature-2026-03-01": null,
    ...
  },
  "bedrock": {
    "advanced-tool-use-2025-11-20": "tool-search-tool-2025-10-19",
    "new-feature-2026-03-01": null,
    ...
  },
  "vertex_ai": {
    "advanced-tool-use-2025-11-20": "tool-search-tool-2025-10-19",
    "new-feature-2026-03-01": null,
    ...
  }
}

Key Points:

Supported headers: Set the value to the provider-specific header name (often the same as the key)
Unsupported headers: Set the value to null
Header transformations: Some providers use different header names (e.g., Bedrock maps advanced-tool-use-2025-11-20 to tool-search-tool-2025-10-19)
Alphabetical order: Keep headers sorted alphabetically for maintainability

Step 2: Reload Configuration (No Restart Required!)

Option 1: Dynamic Reload Without Restart

Instead of restarting your application, you can dynamically reload the beta headers configuration using environment variables and API endpoints:

# Set environment variable to fetch from remote URL (Do this if you want to point it to some other URL)
export LITELLM_ANTHROPIC_BETA_HEADERS_URL="https://raw.githubusercontent.com/BerriAI/litellm/main/litellm/anthropic_beta_headers_config.json"

# Manually trigger reload via API (no restart needed!)
curl -X POST "https://your-proxy-url/reload/anthropic_beta_headers" \
  -H "Authorization: Bearer YOUR_ADMIN_TOKEN"

Option 2: Schedule Automatic Reloads

Set up automatic reloading to always stay up-to-date with the latest beta headers:

# Reload configuration every 24 hours
curl -X POST "https://your-proxy-url/schedule/anthropic_beta_headers_reload?hours=24" \
  -H "Authorization: Bearer YOUR_ADMIN_TOKEN"

Option 3: Traditional Restart

If you prefer the traditional approach, restart your LiteLLM proxy or application:

# If using LiteLLM proxy
litellm --config config.yaml

# If using Python SDK
# Just restart your Python application

Zero-Downtime Updates

With dynamic reloading, you can fix invalid beta header errors without restarting your service! This is especially useful in production environments where downtime is costly.

See Auto Sync Anthropic Beta Headers for complete documentation.

Fixing Invalid Beta Header Errors

If you encounter an "invalid beta flag" error, it means a beta header is being sent that the provider doesn't support.

Step 1: Identify the Problematic Header

Check your logs to see which header is causing the issue:

Error: The model returned the following errors: invalid beta flag: new-feature-2026-03-01

Step 2: Update the Config

Set the header value to null for that provider:

anthropic_beta_headers_config.json
{
  "bedrock_converse": {
    "new-feature-2026-03-01": null
  }
}

Step 3: Restart and Test

Restart your application and verify the header is now filtered out.

Contributing a Fix to LiteLLM

Help the community by contributing your fix!

What to Include in Your PR

Update the config file: Add the new beta header to litellm/anthropic_beta_headers_config.json
Test your changes: Verify the header is correctly filtered/mapped for each provider
Documentation: Include provider documentation links showing which headers are supported

Example PR Description

## Add support for new-feature-2026-03-01 beta header

### Changes
- Added `new-feature-2026-03-01` to anthropic_beta_headers_config.json
- Set to `null` for bedrock_converse (unsupported)
- Set to header name for anthropic, azure_ai (supported)

### Testing
Tested with:
- ✅ Anthropic: Header passed through correctly
- ✅ Azure AI: Header passed through correctly  
- ✅ Bedrock Converse: Header filtered out (returns error without fix)

### References
- Anthropic docs: [link]
- AWS Bedrock docs: [link]

How Beta Header Filtering Works

When you make a request through LiteLLM:

Filtering Rules

Header must exist in mapping: Unknown headers are filtered out
Header must have non-null value: Headers with null values are filtered out
Header transformation: Headers are mapped to provider-specific names (e.g., advanced-tool-use-2025-11-20 → tool-search-tool-2025-10-19 for Bedrock)

Example

Request with headers:

anthropic-beta: advanced-tool-use-2025-11-20,computer-use-2025-01-24,unknown-header

For Bedrock Converse:

✅ computer-use-2025-01-24 → computer-use-2025-01-24 (supported, passed through)
❌ advanced-tool-use-2025-11-20 → filtered out (null value in config)
❌ unknown-header → filtered out (not in config)

Result sent to Bedrock:

anthropic-beta: computer-use-2025-01-24

Dynamic Configuration Management (No Restart Required!)

Environment Variables

Control how LiteLLM loads the beta headers configuration:

Variable	Description	Default
`LITELLM_ANTHROPIC_BETA_HEADERS_URL`	URL to fetch config from	GitHub main branch
`LITELLM_LOCAL_ANTHROPIC_BETA_HEADERS`	Set to `True` to use local config only	`False`

Example: Use Custom Config URL

export LITELLM_ANTHROPIC_BETA_HEADERS_URL="https://your-company.com/custom-beta-headers.json"

Example: Use Local Config Only (No Remote Fetching)

export LITELLM_LOCAL_ANTHROPIC_BETA_HEADERS=True

Day 0 Support: Claude Opus 4.6

February 5, 2026

Sameer Kankute

SWE @ LiteLLM (LLM Translation)

Ishaan Jaff

CTO, LiteLLM

Krrish Dholakia

CEO, LiteLLM

LiteLLM now supports Claude Opus 4.6 on Day 0. Use it across Anthropic, Azure, Vertex AI, and Bedrock through the LiteLLM AI Gateway.

Docker Image

docker pull ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.80.0-stable.opus-4-6

Usage - Anthropic

LiteLLM Proxy

1. Setup config.yaml

model_list:
  - model_name: claude-opus-4-6
    litellm_params:
      model: anthropic/claude-opus-4-6
      api_key: os.environ/ANTHROPIC_API_KEY

2. Start the proxy

docker run -d \
  -p 4000:4000 \
  -e ANTHROPIC_API_KEY=$ANTHROPIC_API_KEY \
  -v $(pwd)/config.yaml:/app/config.yaml \
  ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.80.0-stable.opus-4-6 \
  --config /app/config.yaml

3. Test it!

curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer $LITELLM_KEY' \
--data '{
  "model": "claude-opus-4-6",
  "messages": [
    {
      "role": "user",
      "content": "what llm are you"
    }
  ]
}'

Usage - Azure

LiteLLM Proxy

1. Setup config.yaml

model_list:
  - model_name: claude-opus-4-6
    litellm_params:
      model: azure_ai/claude-opus-4-6
      api_key: os.environ/AZURE_AI_API_KEY
      api_base: os.environ/AZURE_AI_API_BASE  # https://<resource>.services.ai.azure.com

2. Start the proxy

docker run -d \
  -p 4000:4000 \
  -e AZURE_AI_API_KEY=$AZURE_AI_API_KEY \
  -e AZURE_AI_API_BASE=$AZURE_AI_API_BASE \
  -v $(pwd)/config.yaml:/app/config.yaml \
  ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.80.0-stable.opus-4-6 \
  --config /app/config.yaml

3. Test it!

curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer $LITELLM_KEY' \
--data '{
  "model": "claude-opus-4-6",
  "messages": [
    {
      "role": "user",
      "content": "what llm are you"
    }
  ]
}'

Usage - Vertex AI

LiteLLM Proxy

1. Setup config.yaml

model_list:
  - model_name: claude-opus-4-6
    litellm_params:
      model: vertex_ai/claude-opus-4-6
      vertex_project: os.environ/VERTEX_PROJECT
      vertex_location: us-east5

2. Start the proxy

docker run -d \
  -p 4000:4000 \
  -e VERTEX_PROJECT=$VERTEX_PROJECT \
  -e GOOGLE_APPLICATION_CREDENTIALS=/app/credentials.json \
  -v $(pwd)/config.yaml:/app/config.yaml \
  -v $(pwd)/credentials.json:/app/credentials.json \
  ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.80.0-stable.opus-4-6 \
  --config /app/config.yaml

3. Test it!

curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer $LITELLM_KEY' \
--data '{
  "model": "claude-opus-4-6",
  "messages": [
    {
      "role": "user",
      "content": "what llm are you"
    }
  ]
}'

Usage - Bedrock

LiteLLM Proxy

1. Setup config.yaml

model_list:
  - model_name: claude-opus-4-6
    litellm_params:
      model: bedrock/anthropic.claude-opus-4-6-v1:0
      aws_access_key_id: os.environ/AWS_ACCESS_KEY_ID
      aws_secret_access_key: os.environ/AWS_SECRET_ACCESS_KEY
      aws_region_name: us-east-1

2. Start the proxy

docker run -d \
  -p 4000:4000 \
  -e AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID \
  -e AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY \
  -v $(pwd)/config.yaml:/app/config.yaml \
  ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.80.0-stable.opus-4-6 \
  --config /app/config.yaml

3. Test it!

curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer $LITELLM_KEY' \
--data '{
  "model": "claude-opus-4-6",
  "messages": [
    {
      "role": "user",
      "content": "what llm are you"
    }
  ]
}'

Advanced Features

Compaction

/chat/completions
/v1/messages

Litellm supports enabling compaction for the new claude-opus-4-6.

Enabling Compaction

To enable compaction, add the context_management parameter with the compact_20260112 edit type:

curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer $LITELLM_KEY' \
--data '{
  "model": "claude-opus-4-6",
  "messages": [
    {
      "role": "user",
      "content": "What is the weather in San Francisco?"
    }
  ],
  "context_management": {
    "edits": [
      {
        "type": "compact_20260112"
      }
    ]
  },
  "max_tokens": 100
}'

All the parameters supported for context_management by anthropic are supported and can be directly added. Litellm automatically adds the compact-2026-01-12 beta header in the request.

Enable compaction to reduce context size while preserving key information. LiteLLM automatically adds the compact-2026-01-12 beta header when compaction is enabled.

info

Provider Support: Compaction is supported on Anthropic, Azure AI, and Vertex AI. It is not supported on Bedrock (Invoke or Converse APIs).

curl --location 'http://0.0.0.0:4000/v1/messages' \
--header 'x-api-key: sk-12345' \
--header 'content-type: application/json' \
--data '{
    "model": "claude-opus-4-6",
    "max_tokens": 4096,
    "messages": [
        {
            "role": "user",
            "content": "Hi"
        }
    ],
    "context_management": {
        "edits": [
            {
                "type": "compact_20260112"
            }
        ]
    }
}'

Response with Compaction Block

The response will include the compaction summary in provider_specific_fields.compaction_blocks:

{
  "id": "chatcmpl-a6c105a3-4b25-419e-9551-c800633b6cb2",
  "created": 1770357619,
  "model": "claude-opus-4-6",
  "object": "chat.completion",
  "choices": [
    {
      "finish_reason": "length",
      "index": 0,
      "message": {
        "content": "I don't have access to real-time data, so I can't provide the current weather in San Francisco. To get up-to-date weather information, I'd recommend checking:\n\n- **Weather websites** like weather.com, accuweather.com, or wunderground.com\n- **Search engines** – just Google \"San Francisco weather\"\n- **Weather apps** on your phone (e.g., Apple Weather, Google Weather)\n- **National",
        "role": "assistant",
        "provider_specific_fields": {
          "compaction_blocks": [
            {
              "type": "compaction",
              "content": "Summary of the conversation: The user requested help building a web scraper..."
            }
          ]
        }
      }
    }
  ],
  "usage": {
    "completion_tokens": 100,
    "prompt_tokens": 86,
    "total_tokens": 186
  }
}

Using Compaction Blocks in Follow-up Requests

To continue the conversation with compaction, include the compaction block in the assistant message's provider_specific_fields:

curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer $LITELLM_KEY' \
--data '{
  "model": "claude-opus-4-6",
  "messages": [
    {
      "role": "user",
      "content": "How can I build a web scraper?"
    },
    {
      "role": "assistant",
      "content": [
        {
          "type": "text",
          "text": "Certainly! To build a basic web scraper, you'll typically use a programming language like Python along with libraries such as `requests` (for fetching web pages) and `BeautifulSoup` (for parsing HTML). Here's a basic example:\n\n```python\nimport requests\nfrom bs4 import BeautifulSoup\n\nurl = 'https://example.com'\nresponse = requests.get(url)\nsoup = BeautifulSoup(response.text, 'html.parser')\n\n# Extract and print all text\ntext = soup.get_text()\nprint(text)\n```\n\nLet me know what you're interested in scraping or if you need help with a specific website!"
        }
      ],
      "provider_specific_fields": {
        "compaction_blocks": [
          {
            "type": "compaction",
            "content": "Summary of the conversation: The user asked how to build a web scraper, and the assistant gave an overview using Python with requests and BeautifulSoup."
          }
        ]
      }
    },
    {
      "role": "user",
      "content": "How do I use it to scrape product prices?"
    }
  ],
  "context_management": {
    "edits": [
      {
        "type": "compact_20260112"
      }
    ]
  },
  "max_tokens": 100
}'

Streaming Support

Compaction blocks are also supported in streaming mode. You'll receive:

compaction_start event when a compaction block begins
compaction_delta events with the compaction content
The accumulated compaction_blocks in provider_specific_fields

Adaptive Thinking

note

When using reasoning_effort with Claude Opus 4.6, all values (low, medium, high) are mapped to thinking: {type: "adaptive"}. To use explicit thinking budgets with type: "enabled", pass the native thinking parameter directly (see "Native thinking param" tab below).

/chat/completions
/v1/messages
Native thinking param

LiteLLM supports adaptive thinking through the reasoning_effort parameter:

curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer $LITELLM_KEY' \
--data '{
  "model": "claude-opus-4-6",
  "messages": [
    {
      "role": "user",
      "content": "Solve this complex problem: What is the optimal strategy for..."
    }
  ],
  "reasoning_effort": "high"
}'

Use the thinking parameter with type: "adaptive" to enable adaptive thinking mode:

curl --location 'http://0.0.0.0:4000/v1/messages' \
--header 'x-api-key: sk-12345' \
--header 'content-type: application/json' \
--data '{
    "model": "claude-opus-4-6",
    "max_tokens": 16000,
    "thinking": {
        "type": "adaptive"
    },
    "messages": [
        {
            "role": "user",
            "content": "Explain why the sum of two even numbers is always even."
        }
    ]
}'

Use the thinking parameter directly for adaptive thinking via the SDK:

import litellm

response = litellm.completion(
  model="anthropic/claude-opus-4-6",
  messages=[{"role": "user", "content": "Solve this complex problem: What is the optimal strategy for..."}],
  thinking={"type": "adaptive"},
)

Effort Levels

/chat/completions
/v1/messages

Four effort levels available: low, medium, high (default), and max. Pass directly via the output_config parameter:

curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer $LITELLM_KEY' \
--data '{
  "model": "claude-opus-4-6",
  "messages": [
    {
      "role": "user",
      "content": "Explain quantum computing"
    }
  ],
  "output_config": {
        "effort": "medium"
    }
}'

You can use reasoning effort plus output_config to have more control on the model.

Four effort levels available: low, medium, high (default), and max. Pass directly via the output_config parameter:

curl --location 'http://0.0.0.0:4000/v1/messages' \
--header 'x-api-key: sk-12345' \
--header 'content-type: application/json' \
--data '{
    "model": "claude-opus-4-6",
    "max_tokens": 4096,
    "messages": [
        {
            "role": "user",
            "content": "Explain quantum computing"
        }
    ],
    "output_config": {
        "effort": "medium"
    }
}'

1M Token Context (Beta)

Opus 4.6 supports 1M token context. Premium pricing applies for prompts exceeding 200k tokens ($10/$37.50 per million input/output tokens). LiteLLM supports cost calculations for 1M token contexts.

/chat/completions
/v1/messages

To use the 1M token context window, you need to forward the anthropic-beta header from your client to the LLM provider.

Step 1: Enable header forwarding in your config

general_settings:
  forward_client_headers_to_llm_api: true

Step 2: Send requests with the beta header

curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer $LITELLM_KEY' \
--header 'anthropic-beta: context-1m-2025-08-07' \
--data '{
  "model": "claude-opus-4-6",
  "messages": [
    {
      "role": "user",
      "content": "Analyze this large document..."
    }
  ]
}'

To use the 1M token context window, you need to forward the anthropic-beta header from your client to the LLM provider.

Step 1: Enable header forwarding in your config

general_settings:
  forward_client_headers_to_llm_api: true

Step 2: Send requests with the beta header

curl --location 'http://0.0.0.0:4000/v1/messages' \
--header 'x-api-key: sk-12345' \
--header 'anthropic-beta: context-1m-2025-08-07' \
--header 'content-type: application/json' \
--data '{
    "model": "claude-opus-4-6",
    "max_tokens": 16000,
    "messages": [
        {
            "role": "user",
            "content": "Analyze this large document..."
        }
    ]
}'

tip

You can combine multiple beta headers by separating them with commas:

--header 'anthropic-beta: context-1m-2025-08-07,compact-2026-01-12'

US-Only Inference

Available at 1.1× token pricing. LiteLLM automatically tracks costs for US-only inference.

/chat/completions
/v1/messages

Use the inference_geo parameter to specify US-only inference:

curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer $LITELLM_KEY' \
--data '{
  "model": "claude-opus-4-6",
  "messages": [
    {
      "role": "user",
      "content": "What is the capital of France?"
    }
  ],
  "inference_geo": "us"
}'

LiteLLM will automatically apply the 1.1× pricing multiplier for US-only inference in cost tracking.

Use the inference_geo parameter to specify US-only inference:

curl --location 'http://0.0.0.0:4000/v1/messages' \
--header 'x-api-key: sk-12345' \
--header 'content-type: application/json' \
--data '{
    "model": "claude-opus-4-6",
    "max_tokens": 4096,
    "messages": [
        {
            "role": "user",
            "content": "What is the capital of France?"
        }
    ],
    "inference_geo": "us"
}'

LiteLLM will automatically apply the 1.1× pricing multiplier for US-only inference in cost tracking.

Fast Mode

info

Fast mode is only supported on the Anthropic provider (anthropic/claude-opus-4-6). It is not available on Azure AI, Vertex AI, or Bedrock.

Pricing:

Standard: $5 input / $25 output per MTok
Fast: $30 input / $150 output per MTok (6× premium)

/chat/completions
/v1/messages

curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer $LITELLM_KEY' \
--data '{
  "model": "claude-opus-4-6",
  "messages": [
    {
      "role": "user",
      "content": "Refactor this module..."
    }
  ],
  "max_tokens": 4096,
  "speed": "fast"
}'

Using OpenAI SDK:

import openai

client = openai.OpenAI(
    api_key="your-litellm-key",
    base_url="http://0.0.0.0:4000"
)

response = client.chat.completions.create(
    model="claude-opus-4-6",
    messages=[{"role": "user", "content": "Refactor this module..."}],
    max_tokens=4096,
    extra_body={"speed": "fast"}
)

Using LiteLLM SDK:

from litellm import completion

response = completion(
    model="anthropic/claude-opus-4-6",
    messages=[{"role": "user", "content": "Refactor this module..."}],
    max_tokens=4096,
    speed="fast"
)

LiteLLM automatically tracks the higher costs for fast mode in usage and cost calculations.

curl --location 'http://0.0.0.0:4000/v1/messages' \
--header 'x-api-key: sk-12345' \
--header 'content-type: application/json' \
--data '{
    "model": "claude-opus-4-6",
    "max_tokens": 4096,
    "speed": "fast",
    "messages": [
        {
            "role": "user",
            "content": "Refactor this module..."
        }
    ]
}'

LiteLLM automatically:

Adds the fast-mode-2026-02-01 beta header
Tracks the 6× premium pricing in cost calculations

Day 0 Support: Claude 4.5 Opus (+Advanced Features)

November 25, 2025

Sameer Kankute

SWE @ LiteLLM (LLM Translation)

Krrish Dholakia

CEO, LiteLLM

Ishaan Jaff

CTO, LiteLLM

This guide covers Anthropic's latest model (Claude Opus 4.5) and its advanced features now available in LiteLLM: Tool Search, Programmatic Tool Calling, Tool Input Examples, and the Effort Parameter.

Feature	Supported Models
Tool Search	Claude Opus 4.5, Sonnet 4.5
Programmatic Tool Calling	Claude Opus 4.5, Sonnet 4.5
Input Examples	Claude Opus 4.5, Sonnet 4.5
Effort Parameter	Claude Opus 4.5 only

Supported Providers: Anthropic, Bedrock, Vertex AI, Azure AI.

Usage

LiteLLM Python SDK
LiteLLM Proxy

import os
from litellm import completion

# set env - [OPTIONAL] replace with your anthropic key
os.environ["ANTHROPIC_API_KEY"] = "your-api-key"

messages = [{"role": "user", "content": "Hey! how's it going?"}]

## OPENAI /chat/completions API format
response = completion(model="claude-opus-4-5-20251101", messages=messages)
print(response)

1. Setup config.yaml

model_list:
  - model_name: claude-4 ### RECEIVED MODEL NAME ###
    litellm_params: # all params accepted by litellm.completion() - https://docs.litellm.ai/docs/completion/input
      model: claude-opus-4-5-20251101 ### MODEL NAME sent to `litellm.completion()` ###
      api_key: "os.environ/ANTHROPIC_API_KEY" # does os.getenv("ANTHROPIC_API_KEY")

2. Start the proxy

litellm --config /path/to/config.yaml

3. Test it!

OpenAI Chat Completions
Anthropic /v1/messages API

curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer $LITELLM_KEY' \
--data ' {
      "model": "claude-4",
      "messages": [
        {
          "role": "user",
          "content": "what llm are you"
        }
      ]
    }
'

curl --location 'http://0.0.0.0:4000/v1/messages' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer $LITELLM_KEY' \
--data ' {
      "model": "claude-4",
      "max_tokens": 1024,
      "messages": [
        {
          "role": "user",
          "content": "what llm are you"
        }
      ]
    }
'

Usage - Bedrock

info

LiteLLM uses the boto3 library to authenticate with Bedrock.

For more ways to authenticate with Bedrock, see the Bedrock documentation.

LiteLLM Python SDK
LiteLLM Proxy

import os
from litellm import completion

os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""

## OPENAI /chat/completions API format
response = completion(
  model="bedrock/us.anthropic.claude-opus-4-5-20251101-v1:0",
  messages=[{ "content": "Hello, how are you?","role": "user"}]
)

1. Setup config.yaml

model_list:
  - model_name: claude-4 ### RECEIVED MODEL NAME ###
    litellm_params: # all params accepted by litellm.completion() - https://docs.litellm.ai/docs/completion/input
      model: bedrock/us.anthropic.claude-opus-4-5-20251101-v1:0 ### MODEL NAME sent to `litellm.completion()` ###
      aws_access_key_id: os.environ/AWS_ACCESS_KEY_ID
      aws_secret_access_key: os.environ/AWS_SECRET_ACCESS_KEY
      aws_region_name: os.environ/AWS_REGION_NAME

2. Start the proxy

litellm --config /path/to/config.yaml

3. Test it!

OpenAI Chat Completions
Anthropic /v1/messages API
Bedrock /invoke API
Bedrock /converse API

curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer $LITELLM_KEY' \
--data ' {
      "model": "claude-4",
      "messages": [
        {
          "role": "user",
          "content": "what llm are you"
        }
      ]
    }
'

curl --location 'http://0.0.0.0:4000/v1/messages' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer $LITELLM_KEY' \
--data ' {
      "model": "claude-4",
      "max_tokens": 1024,
      "messages": [
        {
          "role": "user",
          "content": "what llm are you"
        }
      ]
    }
'

curl --location 'http://0.0.0.0:4000/bedrock/model/claude-4/invoke' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer $LITELLM_KEY' \
--data ' {
      "max_tokens": 1024,
      "messages": [{"role": "user", "content": "Hello, how are you?"}]
    }'

curl --location 'http://0.0.0.0:4000/bedrock/model/claude-4/converse' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer $LITELLM_KEY' \
--data ' {
      "messages": [{"role": "user", "content": "Hello, how are you?"}]
    }'

Usage - Vertex AI

LiteLLM Python SDK
LiteLLM Proxy

from litellm import completion
import json 

## GET CREDENTIALS 
## RUN ## 
# !gcloud auth application-default login - run this to add vertex credentials to your env
## OR ## 
file_path = 'path/to/vertex_ai_service_account.json'

# Load the JSON file
with open(file_path, 'r') as file:
    vertex_credentials = json.load(file)

# Convert to JSON string
vertex_credentials_json = json.dumps(vertex_credentials)

## COMPLETION CALL 
response = completion(
  model="vertex_ai/claude-opus-4-5@20251101",
  messages=[{ "content": "Hello, how are you?","role": "user"}],
  vertex_credentials=vertex_credentials_json,
  vertex_project="your-project-id",
  vertex_location="us-east5"
)

1. Setup config.yaml

model_list:
  - model_name: claude-4 ### RECEIVED MODEL NAME ###
    litellm_params:
        model: vertex_ai/claude-opus-4-5@20251101
        vertex_credentials: "/path/to/service_account.json"
        vertex_project: "your-project-id"
        vertex_location: "us-east5"

2. Start the proxy

litellm --config /path/to/config.yaml

3. Test it!

OpenAI Chat Completions
Anthropic /v1/messages API

curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer $LITELLM_KEY' \
--data ' {
      "model": "claude-4",
      "messages": [
        {
          "role": "user",
          "content": "what llm are you"
        }
      ]
    }
'

curl --location 'http://0.0.0.0:4000/v1/messages' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer $LITELLM_KEY' \
--data ' {
      "model": "claude-4",
      "max_tokens": 1024,
      "messages": [
        {
          "role": "user",
          "content": "what llm are you"
        }
      ]
    }
'

Usage - Azure Anthropic (Azure Foundry Claude)

LiteLLM funnels Azure Claude deployments through the azure_ai/ provider so Claude Opus models on Azure Foundry keep working with Tool Search, Effort, streaming, and the rest of the advanced feature set. Point AZURE_AI_API_BASE to https://<resource>.services.ai.azure.com/anthropic (LiteLLM appends /v1/messages automatically) and authenticate with AZURE_AI_API_KEY or an Azure AD token.

LiteLLM Python SDK
LiteLLM Proxy

import os
from litellm import completion

# Configure Azure credentials
os.environ["AZURE_AI_API_KEY"] = "your-azure-ai-api-key"
os.environ["AZURE_AI_API_BASE"] = "https://my-resource.services.ai.azure.com/anthropic"

response = completion(
    model="azure_ai/claude-opus-4-1",
    messages=[{"role": "user", "content": "Explain how Azure Anthropic hosts Claude Opus differently from the public Anthropic API."}],
    max_tokens=1200,
    temperature=0.7,
    stream=True,
)

for chunk in response:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

1. Set environment variables

export AZURE_AI_API_KEY="your-azure-ai-api-key"
export AZURE_AI_API_BASE="https://my-resource.services.ai.azure.com/anthropic"

2. Configure the proxy

model_list:
  - model_name: claude-4-azure
    litellm_params:
      model: azure_ai/claude-opus-4-1
      api_key: os.environ/AZURE_AI_API_KEY
      api_base: os.environ/AZURE_AI_API_BASE

3. Start LiteLLM

litellm --config /path/to/config.yaml

4. Test the Azure Claude route

curl --location 'http://0.0.0.0:4000/chat/completions' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer $LITELLM_KEY' \
  --data '{
    "model": "claude-4-azure",
    "messages": [
      {
        "role": "user",
        "content": "How do I use Claude Opus 4 via Azure Anthropic in LiteLLM?"
      }
    ],
    "max_tokens": 1024
  }'

Tool Search

This lets Claude work with thousands of tools, by dynamically loading tools on-demand, instead of loading all tools into the context window upfront.

Usage Example

LiteLLM Python SDK
LiteLLM Proxy

import litellm
import os

# Configure your API key
os.environ["ANTHROPIC_API_KEY"] = "your-api-key"

# Define your tools with defer_loading
tools = [
    # Tool search tool (regex variant)
    {
        "type": "tool_search_tool_regex_20251119",
        "name": "tool_search_tool_regex"
    },
    # Deferred tools - loaded on-demand
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather in a given location. Returns temperature and conditions.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "Temperature unit"
                    }
                },
                "required": ["location"]
            }
        },
        "defer_loading": True  # Load on-demand
    },
    {
        "type": "function",
        "function": {
            "name": "search_files",
            "description": "Search through files in the workspace using keywords",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"},
                    "file_types": {
                        "type": "array",
                        "items": {"type": "string"}
                    }
                },
                "required": ["query"]
            }
        },
        "defer_loading": True
    },
    {
        "type": "function",
        "function": {
            "name": "query_database",
            "description": "Execute SQL queries against the database",
            "parameters": {
                "type": "object",
                "properties": {
                    "sql": {"type": "string"}
                },
                "required": ["sql"]
            }
        },
        "defer_loading": True
    }
]

# Make a request - Claude will search for and use relevant tools
response = litellm.completion(
    model="anthropic/claude-opus-4-5-20251101",
    messages=[{
        "role": "user",
        "content": "What's the weather like in San Francisco?"
    }],
    tools=tools
)

print("Claude's response:", response.choices[0].message.content)
print("Tool calls:", response.choices[0].message.tool_calls)

# Check tool search usage
if hasattr(response.usage, 'server_tool_use'):
    print(f"Tool searches performed: {response.usage.server_tool_use.tool_search_requests}")

Setup config.yaml

model_list:
  - model_name: claude-4
    litellm_params:
      model: anthropic/claude-opus-4-5-20251101
      api_key: os.environ/ANTHROPIC_API_KEY

Start the proxy

litellm --config /path/to/config.yaml

Test it!

curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer $LITELLM_KEY' \
--data ' {
      "model": "claude-4",
      "messages": [{
        "role": "user",
        "content": "What's the weather like in San Francisco?"
       }],
       "tools": [
        # Tool search tool (regex variant)
        {
            "type": "tool_search_tool_regex_20251119",
            "name": "tool_search_tool_regex"
        },
        # Deferred tools - loaded on-demand
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get the current weather in a given location. Returns temperature and conditions.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "The city and state, e.g. San Francisco, CA"
                        },
                        "unit": {
                            "type": "string",
                            "enum": ["celsius", "fahrenheit"],
                            "description": "Temperature unit"
                        }
                    },
                    "required": ["location"]
                }
            },
            "defer_loading": True  # Load on-demand
        },
        {
            "type": "function",
            "function": {
                "name": "search_files",
                "description": "Search through files in the workspace using keywords",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "query": {"type": "string"},
                        "file_types": {
                            "type": "array",
                            "items": {"type": "string"}
                        }
                    },
                    "required": ["query"]
                }
            },
            "defer_loading": True
        },
        {
            "type": "function",
            "function": {
                "name": "query_database",
                "description": "Execute SQL queries against the database",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "sql": {"type": "string"}
                    },
                    "required": ["sql"]
                }
            },
            "defer_loading": True
        }
    ]
}
'

BM25 Variant (Natural Language Search)

For natural language queries instead of regex patterns:

tools = [
    {
        "type": "tool_search_tool_bm25_20251119",  # Natural language variant
        "name": "tool_search_tool_bm25"
    },
    # ... your deferred tools
]

Programmatic Tool Calling

Programmatic tool calling allows Claude to write code that calls your tools programmatically. Learn more

LiteLLM Python SDK
LiteLLM Proxy

import litellm
import json

# Define tools that can be called programmatically
tools = [
    # Code execution tool (required for programmatic calling)
    {
        "type": "code_execution_20250825",
        "name": "code_execution"
    },
    # Tool that can be called from code
    {
        "type": "function",
        "function": {
            "name": "query_database",
            "description": "Execute a SQL query against the sales database. Returns a list of rows as JSON objects.",
            "parameters": {
                "type": "object",
                "properties": {
                    "sql": {
                        "type": "string",
                        "description": "SQL query to execute"
                    }
                },
                "required": ["sql"]
            }
        },
        "allowed_callers": ["code_execution_20250825"]  # Enable programmatic calling
    }
]

# First request
response = litellm.completion(
    model="anthropic/claude-sonnet-4-5-20250929",
    messages=[{
        "role": "user",
        "content": "Query sales data for West, East, and Central regions, then tell me which had the highest revenue"
    }],
    tools=tools
)

print("Claude's response:", response.choices[0].message)

# Handle tool calls
messages = [
    {"role": "user", "content": "Query sales data for West, East, and Central regions, then tell me which had the highest revenue"},
    {"role": "assistant", "content": response.choices[0].message.content, "tool_calls": response.choices[0].message.tool_calls}
]

# Process each tool call
for tool_call in response.choices[0].message.tool_calls:
    # Check if it's a programmatic call
    if hasattr(tool_call, 'caller') and tool_call.caller:
        print(f"Programmatic call to {tool_call.function.name}")
        print(f"Called from: {tool_call.caller}")
    
    # Simulate tool execution
    if tool_call.function.name == "query_database":
        args = json.loads(tool_call.function.arguments)
        # Simulate database query
        result = json.dumps([
            {"region": "West", "revenue": 150000},
            {"region": "East", "revenue": 180000},
            {"region": "Central", "revenue": 120000}
        ])
        
        messages.append({
            "role": "user",
            "content": [{
                "type": "tool_result",
                "tool_use_id": tool_call.id,
                "content": result
            }]
        })

# Get final response
final_response = litellm.completion(
    model="anthropic/claude-sonnet-4-5-20250929",
    messages=messages,
    tools=tools
)

print("\nFinal answer:", final_response.choices[0].message.content)

Setup config.yaml

model_list:
  - model_name: claude-4
    litellm_params:
      model: anthropic/claude-opus-4-5-20251101
      api_key: os.environ/ANTHROPIC_API_KEY

Start the proxy

litellm --config /path/to/config.yaml

Test it!

curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer $LITELLM_KEY' \
--data ' {
      "model": "claude-4",
      "messages": [{
        "role": "user",
        "content": "Query sales data for West, East, and Central regions, then tell me which had the highest revenue"
      }],
      "tools": [
        # Code execution tool (required for programmatic calling)
        {
            "type": "code_execution_20250825",
            "name": "code_execution"
        },
        # Tool that can be called from code
        {
            "type": "function",
            "function": {
                "name": "query_database",
                "description": "Execute a SQL query against the sales database. Returns a list of rows as JSON objects.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "sql": {
                            "type": "string",
                            "description": "SQL query to execute"
                        }
                    },
                    "required": ["sql"]
                }
            },
            "allowed_callers": ["code_execution_20250825"]  # Enable programmatic calling
        }
    ]
}
'

Tool Input Examples

You can now provide Claude with examples of how to use your tools. Learn more

LiteLLM Python SDK
LiteLLM Proxy

import litellm

tools = [
    {
        "type": "function",
        "function": {
            "name": "create_calendar_event",
            "description": "Create a new calendar event with attendees and reminders",
            "parameters": {
                "type": "object",
                "properties": {
                    "title": {"type": "string"},
                    "start_time": {
                        "type": "string",
                        "description": "ISO 8601 format: YYYY-MM-DDTHH:MM:SS"
                    },
                    "duration_minutes": {"type": "integer"},
                    "attendees": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "email": {"type": "string"},
                                "optional": {"type": "boolean"}
                            }
                        }
                    },
                    "reminders": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "minutes_before": {"type": "integer"},
                                "method": {"type": "string", "enum": ["email", "popup"]}
                            }
                        }
                    }
                },
                "required": ["title", "start_time", "duration_minutes"]
            }
        },
        # Provide concrete examples
        "input_examples": [
            {
                "title": "Team Standup",
                "start_time": "2025-01-15T09:00:00",
                "duration_minutes": 30,
                "attendees": [
                    {"email": "alice@company.com", "optional": False},
                    {"email": "bob@company.com", "optional": False}
                ],
                "reminders": [
                    {"minutes_before": 15, "method": "popup"}
                ]
            },
            {
                "title": "Lunch Break",
                "start_time": "2025-01-15T12:00:00",
                "duration_minutes": 60
                # Demonstrates optional fields can be omitted
            }
        ]
    }
]

response = litellm.completion(
    model="anthropic/claude-sonnet-4-5-20250929",
    messages=[{
        "role": "user",
        "content": "Schedule a team meeting for tomorrow at 2pm for 45 minutes with john@company.com and sarah@company.com"
    }],
    tools=tools
)

print("Tool call:", response.choices[0].message.tool_calls[0].function.arguments)

Setup config.yaml

model_list:
  - model_name: claude-4
    litellm_params:
      model: anthropic/claude-opus-4-5-20251101
      api_key: os.environ/ANTHROPIC_API_KEY

Start the proxy

litellm --config /path/to/config.yaml

Test it!

curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer $LITELLM_KEY' \
--data ' {
      "model": "claude-4",
      "messages": [{
        "role": "user",
        "content": "Schedule a team meeting for tomorrow at 2pm for 45 minutes with john@company.com and sarah@company.com"
      }],
      "tools": [
    {
        "type": "function",
        "function": {
            "name": "create_calendar_event",
            "description": "Create a new calendar event with attendees and reminders",
            "parameters": {
                "type": "object",
                "properties": {
                    "title": {"type": "string"},
                    "start_time": {
                        "type": "string",
                        "description": "ISO 8601 format: YYYY-MM-DDTHH:MM:SS"
                    },
                    "duration_minutes": {"type": "integer"},
                    "attendees": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "email": {"type": "string"},
                                "optional": {"type": "boolean"}
                            }
                        }
                    },
                    "reminders": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "minutes_before": {"type": "integer"},
                                "method": {"type": "string", "enum": ["email", "popup"]}
                            }
                        }
                    }
                },
                "required": ["title", "start_time", "duration_minutes"]
            }
        },
        # Provide concrete examples
        "input_examples": [
            {
                "title": "Team Standup",
                "start_time": "2025-01-15T09:00:00",
                "duration_minutes": 30,
                "attendees": [
                    {"email": "alice@company.com", "optional": False},
                    {"email": "bob@company.com", "optional": False}
                ],
                "reminders": [
                    {"minutes_before": 15, "method": "popup"}
                ]
            },
            {
                "title": "Lunch Break",
                "start_time": "2025-01-15T12:00:00",
                "duration_minutes": 60
                # Demonstrates optional fields can be omitted
            }
        ]
    }
]
}
'

Effort Parameter: Control Token Usage

Control how much effort Claude puts into its response using the reasoning_effort parameter. This allows you to trade off between response thoroughness and token efficiency.

info

LiteLLM automatically maps reasoning_effort to Anthropic's output_config format and adds the required effort-2025-11-24 beta header for Claude Opus 4.5.

Potential values for reasoning_effort parameter: "high", "medium", "low".

Usage Example

LiteLLM Python SDK
LiteLLM Proxy

import litellm

message = "Analyze the trade-offs between microservices and monolithic architectures"

# High effort (default) - Maximum capability
response_high = litellm.completion(
    model="anthropic/claude-opus-4-5-20251101",
    messages=[{"role": "user", "content": message}],
    reasoning_effort="high"
)

print("High effort response:")
print(response_high.choices[0].message.content)
print(f"Tokens used: {response_high.usage.completion_tokens}\n")

# Medium effort - Balanced approach
response_medium = litellm.completion(
    model="anthropic/claude-opus-4-5-20251101",
    messages=[{"role": "user", "content": message}],
    reasoning_effort="medium"
)

print("Medium effort response:")
print(response_medium.choices[0].message.content)
print(f"Tokens used: {response_medium.usage.completion_tokens}\n")

# Low effort - Maximum efficiency
response_low = litellm.completion(
    model="anthropic/claude-opus-4-5-20251101",
    messages=[{"role": "user", "content": message}],
    reasoning_effort="low"
)

print("Low effort response:")
print(response_low.choices[0].message.content)
print(f"Tokens used: {response_low.usage.completion_tokens}\n")

# Compare token usage
print("Token Comparison:")
print(f"High:   {response_high.usage.completion_tokens} tokens")
print(f"Medium: {response_medium.usage.completion_tokens} tokens")
print(f"Low:    {response_low.usage.completion_tokens} tokens")

Setup config.yaml

model_list:
  - model_name: claude-4
    litellm_params:
      model: anthropic/claude-opus-4-5-20251101
      api_key: os.environ/ANTHROPIC_API_KEY

Start the proxy

litellm --config /path/to/config.yaml

Test it!

curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer $LITELLM_KEY' \
--data ' {
      "model": "claude-4",
      "messages": [{
        "role": "user",
        "content": "Analyze the trade-offs between microservices and monolithic architectures"
      }],
      "reasoning_effort": "high"
    }
'

What Are Beta Headers?​

Common Error Message​

How LiteLLM Handles Beta Headers​

Adding Support for a New Beta Header​

Step 1: Add the New Beta Header​

Step 2: Reload Configuration (No Restart Required!)​

Fixing Invalid Beta Header Errors​

Step 1: Identify the Problematic Header​

Step 2: Update the Config​

Step 3: Restart and Test​

Contributing a Fix to LiteLLM​

What to Include in Your PR​

Example PR Description​

How Beta Header Filtering Works​

Filtering Rules​

Example​

Dynamic Configuration Management (No Restart Required!)​

Environment Variables​

Docker Image​

Usage - Anthropic​

Usage - Azure​

Usage - Vertex AI​

Usage - Bedrock​

Advanced Features​

Compaction​

Adaptive Thinking​

Effort Levels​

1M Token Context (Beta)​

US-Only Inference​

Fast Mode​

Usage​

Usage - Bedrock​

Usage - Vertex AI​

Usage - Azure Anthropic (Azure Foundry Claude)​

Tool Search​

Usage Example​

BM25 Variant (Natural Language Search)​

Programmatic Tool Calling​

Tool Input Examples​

Effort Parameter: Control Token Usage​

Usage Example​

What Are Beta Headers?

Common Error Message

How LiteLLM Handles Beta Headers

Adding Support for a New Beta Header

Step 1: Add the New Beta Header

Step 2: Reload Configuration (No Restart Required!)

Fixing Invalid Beta Header Errors

Step 1: Identify the Problematic Header

Step 2: Update the Config

Step 3: Restart and Test

Contributing a Fix to LiteLLM

What to Include in Your PR

Example PR Description

How Beta Header Filtering Works

Filtering Rules

Example

Dynamic Configuration Management (No Restart Required!)

Environment Variables

Docker Image

Usage - Anthropic

Usage - Azure

Usage - Vertex AI

Usage - Bedrock

Advanced Features

Compaction

Adaptive Thinking

Effort Levels

1M Token Context (Beta)

US-Only Inference

Fast Mode

Usage

Usage - Bedrock

Usage - Vertex AI

Usage - Azure Anthropic (Azure Foundry Claude)

Tool Search

Usage Example

BM25 Variant (Natural Language Search)

Programmatic Tool Calling

Tool Input Examples

Effort Parameter: Control Token Usage

Usage Example