Skip to main content
Models are referenced as provider/model (e.g. anthropic/claude-sonnet-4-6) and passed as the model field to the chat completions endpoint. The list and pricing are served live from the API, so query the endpoint for the current set rather than relying on a static table.
These are the same Pinata-hosted models an agent can use via the Pinata provider. Calling them directly through Inference and using them inside an agent both draw down the same credit balance.

List models

This endpoint is public — no authentication required.
curl https://agents.pinata.cloud/v0/llm/models
Response
[
  {
    "model_id": "anthropic/claude-sonnet-4-6",
    "input_usd_per_1m_tokens": 3.0,
    "output_usd_per_1m_tokens": 15.0,
    "cache_read_usd_per_1m_tokens": 0.3
  }
]
FieldDescription
model_idThe provider/model id to pass as model
input_usd_per_1m_tokensPrice per 1M input (prompt) tokens
output_usd_per_1m_tokensPrice per 1M output (completion) tokens
cache_read_usd_per_1m_tokensPrice per 1M cached-input tokens read
The example prices above are illustrative — always read the live endpoint for current rates.

Pricing

Usage is metered per token at the rates above and billed against your credit balance. See Credits for how billing works and how to fund usage.