provider/model (e.g. anthropic/claude-sonnet-4-6) and passed as the model field to the chat completions endpoint. The list and pricing are served live from the API, so query the endpoint for the current set rather than relying on a static table.
These are the same Pinata-hosted models an agent can use via the
Pinata provider. Calling them directly through Inference and using them inside an agent both draw down the same credit balance.List models
This endpoint is public — no authentication required.Response
| Field | Description |
|---|---|
model_id | The provider/model id to pass as model |
input_usd_per_1m_tokens | Price per 1M input (prompt) tokens |
output_usd_per_1m_tokens | Price per 1M output (completion) tokens |
cache_read_usd_per_1m_tokens | Price per 1M cached-input tokens read |
The example prices above are illustrative — always read the live endpoint for current rates.