# imgnAI Katana API Compact Integration Guide for LLMs

This document is designed for an LLM or coding agent seeking to integrate text/image/video AI via imgnAI. This llms.txt holds the important integration rules, auth/payment flows, response shapes, media constraints, and compact model catalog in one context-sized file. The documentation site's top-nav `Copy for LLMs` button copies this same `/llms.txt` content. Use this file as the canonical compact integration context for LLMs and coding agents. For full machine-readable current model metadata, call `GET /v1/models`. Individual model-card `Copy for LLMs` buttons prepend the exact selected model details and examples, then include a compact shared API overview rather than duplicating the full catalog.

imgnAI's API is the paid production gateway for generating images, videos, and text/LLM completions from prompts, optional image references, optional video frame inputs, optional audio references, and multimodal chat messages on the models that support them. Public API calls must use HTTPS JSON requests. Do not call public endpoints with `http://` URLs. Image and video generation uses `/v1/generation-requests`. Text models use the OpenAI-compatible `/v1/chat/completions` endpoint. Paid generation endpoints can be called with either an imgnAI API key/credit balance or x402 USDC payment.

## Service URLs and Account Setup

- Base URL for this deployment: `https://kat.imgnai.com`
- Public API calls must use HTTPS. If an integration ever sees an `http://` Katana base URL, replace it with `https://` before making calls.
- imgnAI web app: https://app.imgnai.com
- Buy credits and subscriptions: https://app.imgnai.com
- Get an API key: https://app.imgnai.com/katana-api
- Create or view existing API keys: https://app.imgnai.com/katana-api
- Reference price used for rough USD estimates: $0.0052 per credit, based on Platinum Annual pricing.
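The reference price above supports only rough USD estimates. A minimal sketch, assuming simple multiplication by the $0.0052 reference price (`estimateUsd` is a hypothetical helper, not part of the API):

```javascript
// Hypothetical helper: rough USD estimate for a credit cost, using the
// $0.0052-per-credit Platinum Annual reference price. Estimates only;
// actual billing is in credits (or USDC for x402).
const USD_PER_CREDIT = 0.0052;

function estimateUsd(credits) {
  // Round to 4 decimal places for display.
  return Number((credits * USD_PER_CREDIT).toFixed(4));
}
```

Usage: `estimateUsd(3)` gives roughly `0.0156` USD for a 3-credit generation.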
Credits-based account setup and funding:

- To use a credits-based account, create a new account by selecting `Sign up` on https://app.imgnai.com.
- Users can also sign up or log in with SSO through Google, Discord, Telegram, Twitter/X, Farcaster, or any Web3 wallet.
- After registering or signing in, select `Manage Subscription` or `Get Credits` to top up with PayPal, credit card, debit card, Apple Pay, Google Pay, and other supported payment channels.
- Select `API` from the main menu to create API keys and view prompt and credit usage history.
- To check a standard imgnAI account's current credit balance by API, call `GET /v1/me/balance` with the account API key and secret. Text/OpenAI-compatible clients may use `Authorization: Bearer API_KEY:API_SECRET`; other clients may use `X-API-Key` plus `X-API-Secret`. The response returns `credits` as a decimal string, so a Balance Service value of `2000` means `200.0` imgnAI credits.
- Prompt/result history can be switched off from the API page. Historical prompts and results are retained for a maximum of 72 hours after generation.

x402 setup and funding:

- x402 does not require an imgnAI web account or API key. Use a Base or Solana wallet funded with USDC.
- For one-request x402 payment, call the API endpoint without credentials, decode `PAYMENT-REQUIRED`, pay the exact returned Base or Solana USDC requirement, then retry with `PAYMENT-SIGNATURE`.
- For repeated calls, top up an imgnAI x402 wallet balance with `POST /v1/x402/top-up` and an `amount_usdc`, complete the same x402 payment flow, then send `X-Sign-In-With-X` on future requests.
- x402 wallet balance is tied to the settled payer wallet, network, and USDC asset. Read it with `GET /v1/x402/balance/{walletAddress}`.

## Authentication and x402 Payment

Standard account billing uses API key headers. Do not place these credentials in the JSON body.
```http
X-API-Key: API_KEY
X-API-Secret: API_SECRET
```

Text/OpenAI-compatible clients may use one combined bearer token:

```http
Authorization: Bearer API_KEY:API_SECRET
```

Anonymous paid calls use x402 instead of API keys. x402 follows the HTTP 402 flow: first call the endpoint without credentials, receive a `402 Payment Required` response, build and sign one of the returned USDC payment options, then retry the exact same request with the Base64 JSON payment payload in the `PAYMENT-SIGNATURE` header. imgnAI only starts processing after the x402 payment is verified and settled.

x402 headers:

```http
PAYMENT-REQUIRED: BASE64_PAYMENT_REQUIREMENTS
PAYMENT-SIGNATURE: BASE64_PAYMENT_PAYLOAD
X-402-Payment: BASE64_PAYMENT_PAYLOAD
PAYMENT-RESPONSE: BASE64_SETTLEMENT_RECEIPT
X-Sign-In-With-X: BASE64_WALLET_PROOF
```

x402 details:

- x402 version: `2`.
- Payment scheme: `exact`.
- Currency: USDC, atomic units with 6 decimals.
- Supported networks: Base (`eip155:8453`) and Solana (`solana:5eykt4UsFv8P8NJdTREpY1vzqKqZKvdp`).
- Base is listed first and should be the default client choice.
- The returned `accepts` entries include `scheme`, `network`, `asset`, `payTo`, `amount`, `maxAmountRequired`, `maxTimeoutSeconds`, and explanatory `extra` metadata.
- `amount` is the exact atomic USDC amount for that network, including any usually small settlement gas/network cost. The user pays the single returned USDC amount; no separate gas field is returned.
- `maxTimeoutSeconds` is the paid HTTP response budget. Image/video calls should still submit with `wait=false` and poll by `request_id` for final assets.
- Image/video x402 prices use the model's USDC price metadata. Text x402 prices are prepaid in USDC from the reserve estimate because final token usage is only known after inference.
- Optional wallet balance is available with `POST /v1/x402/top-up`, `GET /v1/x402/balance/{walletAddress}`, and `GET /v1/x402/transactions/{walletAddress}`.
- Wallet-balance requests use `X-Sign-In-With-X` to prove wallet ownership.
The backend ties balance to the settled payer wallet/network/asset and never trusts wallet identifiers in the generation JSON body.
- Use wallet balance for non-deterministic endpoints such as LLM/text completions. Otherwise one-request pay-as-you-go x402 text calls are charged for the full token allowance/reserve supplied by the caller.
- With wallet balance, text/LLM requests debit the reserve and then credit unused reserve back to the same wallet balance after actual token usage is known.
- The server generates the canonical `request_id` after payment and returns it in the successful response. Do not send caller-defined request IDs.
- For async image/video x402 calls, use the returned `request_id` to poll `GET /v1/generation-requests/{request_id}`. The request id is unguessable; no API key is required for x402 polling.

x402 wallet balance top-up and use:

```bash
curl -i -X POST 'https://kat.imgnai.com/v1/x402/top-up' \
  -H 'Content-Type: application/json' \
  -d '{"amount_usdc":"1.00"}'
```

The top-up endpoint follows the normal x402 402 discovery and `PAYMENT-SIGNATURE` retry flow. The credited balance is the requested `amount_usdc`; the paid amount can be higher because it includes settlement gas/network cost. After top-up, send `X-Sign-In-With-X: BASE64_WALLET_PROOF` on future generation/text calls to spend from that wallet balance. For Base, sign the proof message with EIP-191/personal-sign. For Solana, sign the UTF-8 message bytes. The message must include `imgnAI` and the wallet address.
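The proof-signing step above can be sketched for Base. This is a minimal illustration: `signPersonalMessage` is an assumed external EIP-191/personal-sign function, and the message layout follows the `X-Sign-In-With-X` shape shown later in this guide (a production proof should also carry a nonce):

```javascript
// Sketch: build an X-Sign-In-With-X header value for Base. Assumes an
// external `signPersonalMessage(message)` EIP-191/personal-sign function.
function buildSignInWithX(address, signPersonalMessage) {
  const issuedAt = Math.floor(Date.now() / 1000);
  // The message must include `imgnAI` and the wallet address.
  const message = `imgnAI X-Sign-In-With-X\nWallet: ${address}\nNetwork: eip155:8453\nIssued At: ${issuedAt}`;
  const proof = {
    network: 'eip155:8453',
    address,
    message,
    signature: signPersonalMessage(message),
    issued_at: issuedAt,
  };
  // The header value is Base64-encoded JSON.
  return Buffer.from(JSON.stringify(proof), 'utf8').toString('base64');
}
```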
Balance-funded text example:

```bash
curl -i -X POST 'https://kat.imgnai.com/v1/chat/completions' \
  -H 'Content-Type: application/json' \
  -H 'X-Sign-In-With-X: BASE64_WALLET_PROOF' \
  -d '{"model":"qwen3-6-flash","messages":[{"role":"user","content":"Reply with exactly: wallet balance ok"}],"max_tokens":64}'
```

x402 discovery endpoint:

```http
GET /v1/x402
```

x402 image/video discovery call:

```bash
curl -i -X POST 'https://kat.imgnai.com/v1/generation-requests?wait=false' \
  -H 'Content-Type: application/json' \
  -d '{"requests":[{"type":"image","model":"gen","prompt":"A clean product photo of a mug","aspect_ratio":"1:1","output_format":"png"}]}'
```

The response status will be `402`. Decode `PAYMENT-REQUIRED` as Base64 JSON, choose an `accepts` item, sign/pay it using an x402-compatible client, then retry with the exact same body:

```bash
curl -i -X POST 'https://kat.imgnai.com/v1/generation-requests?wait=false' \
  -H 'Content-Type: application/json' \
  -H 'PAYMENT-SIGNATURE: BASE64_PAYMENT_PAYLOAD' \
  -d '{"requests":[{"type":"image","model":"gen","prompt":"A clean product photo of a mug","aspect_ratio":"1:1","output_format":"png"}]}'
```

x402 text/LLM calls use the same flow against `/v1/chat/completions`; omit `Authorization`, send a non-streaming body, receive `402`, then retry with `PAYMENT-SIGNATURE`.

## x402 Zero-Context Integration Guide

This section assumes the reader has no previous context about imgnAI or x402. x402 is an HTTP-native payment flow for paid API requests. The client first sends the ordinary API request without an API key. imgnAI replies with HTTP `402 Payment Required` and a `PAYMENT-REQUIRED` response header. That header is Base64-encoded JSON containing exact USDC payment instructions. The client signs or constructs one of the returned payment options, retries the exact same HTTP request with `PAYMENT-SIGNATURE`, and imgnAI verifies and settles payment before starting image, video, or text inference.
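Decoding the `PAYMENT-REQUIRED` header can be sketched as follows (`decodePaymentRequired` and `pickDefaultAccept` are illustrative helper names, not API surface):

```javascript
// Decode the Base64 PAYMENT-REQUIRED response header into the
// requirements object described in this guide.
function decodePaymentRequired(headerValue) {
  return JSON.parse(Buffer.from(headerValue, 'base64').toString('utf8'));
}

// Base is listed first and is the default client choice.
function pickDefaultAccept(requirements) {
  return requirements.accepts[0];
}
```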
The same x402 flow works for:

- Image and video generation: `POST https://kat.imgnai.com/v1/generation-requests?wait=false`, then poll `GET https://kat.imgnai.com/v1/generation-requests/{request_id}`
- Text/LLM chat completions: `POST https://kat.imgnai.com/v1/chat/completions`
- Capability discovery: `GET https://kat.imgnai.com/v1/x402`
- Optional wallet balance top-up: `POST https://kat.imgnai.com/v1/x402/top-up`
- Optional wallet balance/transaction checks: `GET https://kat.imgnai.com/v1/x402/balance/{walletAddress}` and `GET https://kat.imgnai.com/v1/x402/transactions/{walletAddress}`

Do not send `X-API-Key`, `X-API-Secret`, or `Authorization` when using x402. The payment itself authorizes that one request. imgnAI generates the canonical `request_id`; ordinary integrations should not ask users to invent request IDs or submit client-side request IDs.

Gas is included in the USDC amount. imgnAI estimates the Base or Solana settlement gas/network fee, converts that estimate to USDC, and includes that usually small settlement overhead in `amount` / `maxAmountRequired`. The user pays a single exact USDC amount. imgnAI pays the actual settlement gas from the oracle/facilitator wallet, so the service does not start work until the payment covers both the model price and the estimated settlement cost. For Solana, if the imgnAI payTo associated token account does not exist yet, the quoted gas can also include the rent/setup cost needed to create it.

### x402 setup and funding

x402 does not require an imgnAI web account or API key. The user needs a wallet with USDC on Base or Solana. For a one-request payment, send the API request without credentials, decode the returned `PAYMENT-REQUIRED` header, pay the exact Base or Solana `accepts` entry, and retry with `PAYMENT-SIGNATURE`.
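The returned amounts are atomic USDC units with 6 decimals, gas included. A display-only formatter, sketched (`atomicUsdcToDecimal` is a hypothetical helper):

```javascript
// Format an atomic USDC string (6 decimals) as a decimal USDC string,
// e.g. "13938" -> "0.013938". Purely for display; never edit the atomic
// `amount` you actually pay.
function atomicUsdcToDecimal(atomic) {
  const padded = String(atomic).padStart(7, '0'); // keep >= 1 integer digit
  return `${padded.slice(0, -6)}.${padded.slice(-6)}`;
}
```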
For repeated calls or non-deterministic text/LLM usage, first top up an imgnAI x402 wallet balance by calling `POST https://kat.imgnai.com/v1/x402/top-up` with an `amount_usdc`, completing the same x402 payment flow, and then sending `X-Sign-In-With-X` on future requests. The topped-up balance is tied to the settled payer wallet, network, and USDC asset. Check it with `GET https://kat.imgnai.com/v1/x402/balance/{walletAddress}`.

x402 has two billing modes:

- Pay-as-you-go: use the normal 402 discovery and paid retry for one request. This remains supported for image, video, and text.
- Wallet balance: top up once with `POST /v1/x402/top-up`, then prove wallet ownership with `X-Sign-In-With-X` on future requests. imgnAI debits the stored USDC balance tied to that wallet. For text/LLM calls, imgnAI debits the reserve first and credits unused reserve back to the same wallet balance after actual token usage is known.

Use wallet balance for non-deterministic endpoints such as LLM/text completions. If you use one-request pay-as-you-go x402 for text, the request is charged for the full token allowance/reserve you give it, because the final token usage cannot be known before the payment is settled. Wallet balance lets the backend handle the reserve and refund internally without trusting caller-supplied wallet metadata.

### x402 request sequence

1. Build the normal JSON body for the image, video, or text request.
2. Send it without API credentials and without `PAYMENT-SIGNATURE`.
3. Expect HTTP `402 Payment Required`.
4. Read `PAYMENT-REQUIRED`. Decode it as Base64, then parse the decoded UTF-8 string as JSON.
5. Read `accepts`. Each entry is one supported payment option. imgnAI returns Base USDC first and Solana USDC second.
6. Choose exactly one `accepts` entry. Do not edit the amount, recipient, asset, network, or timeout.
7. Build the network-specific payment payload for that exact `accepts` entry.
8. Base64-encode the payment payload JSON.
9.
Retry the exact same endpoint, method, query string, and JSON body with `PAYMENT-SIGNATURE: BASE64_PAYMENT_PAYLOAD`.
10. If payment verifies and settles, imgnAI starts inference and returns the normal API response. Successful responses include `PAYMENT-RESPONSE`, a Base64 JSON settlement receipt.

### PAYMENT-REQUIRED shape

The decoded requirement is a JSON object similar to:

```json
{
  "x402Version": 2,
  "error": "Payment required",
  "accepts": [
    {
      "scheme": "exact",
      "network": "eip155:8453",
      "asset": "0xBASE_USDC_CONTRACT",
      "payTo": "0xIMGNAI_BASE_ORACLE",
      "amount": "13938",
      "maxAmountRequired": "13938",
      "maxTimeoutSeconds": 3600,
      "extra": {
        "currency": "USDC",
        "decimals": 6,
        "chain": "base",
        "subtotalUsd": "0.010000",
        "estimatedGasUsd": "0.003938",
        "totalUsd": "0.013938"
      }
    },
    {
      "scheme": "exact",
      "network": "solana:5eykt4UsFv8P8NJdTREpY1vzqKqZKvdp",
      "asset": "EPjFWdd5AufqSSqeM2qN1xzybapC8G4wEGGkZwyTDt1v",
      "payTo": "IMGNAI_SOLANA_ORACLE_PUBKEY",
      "amount": "11500",
      "maxAmountRequired": "11500",
      "maxTimeoutSeconds": 3600
    }
  ]
}
```

All USDC values are atomic units with 6 decimals. For example, `"13938"` means `0.013938` USDC.

### Base x402 payment payload

Base uses network `eip155:8453`. imgnAI expects USDC payment using a signed EIP-3009-style `transferWithAuthorization` authorization. This is the smooth USDC flow where the payer signs typed data and imgnAI submits the transfer onchain from its oracle wallet.

To build the Base signature:

- Use the exact Base `accepts` entry returned in `PAYMENT-REQUIRED`.
- Use the payer wallet address as `authorization.from`.
- Use `accept.payTo` as `authorization.to`.
- Use `accept.amount` as `authorization.value`.
- Set `validAfter` to the current Unix timestamp or `0`.
- Set `validBefore` far enough ahead for the paid retry and settlement confirmation, commonly 5-10 minutes. `accept.maxTimeoutSeconds` is the paid HTTP response budget, not a generation completion guarantee.
- Generate a fresh 32-byte random `nonce` for every payment. Do not reuse it.
- Sign EIP-712 typed data for USDC `TransferWithAuthorization`.
  - Domain: `name: "USD Coin"`, `version: "2"`, `chainId: 8453`, `verifyingContract: accept.asset`.
  - Primary type: `TransferWithAuthorization`.
  - Fields: `from`, `to`, `value`, `validAfter`, `validBefore`, `nonce`.

Base payment payload JSON before Base64 encoding:

```json
{
  "x402Version": 2,
  "accepted": {
    "scheme": "exact",
    "network": "eip155:8453",
    "asset": "0xBASE_USDC_CONTRACT",
    "payTo": "0xIMGNAI_BASE_ORACLE",
    "amount": "13938",
    "maxAmountRequired": "13938",
    "maxTimeoutSeconds": 3600
  },
  "payload": {
    "authorization": {
      "from": "0xPAYER_WALLET",
      "to": "0xIMGNAI_BASE_ORACLE",
      "value": "13938",
      "validAfter": 1778430000,
      "validBefore": 1778430120,
      "nonce": "0x32_BYTES_RANDOM_HEX"
    },
    "signature": "0xEIP712_SIGNATURE"
  }
}
```

PAYMENT-SIGNATURE for Base is:

```text
base64(json_utf8_string_of_the_payload_above)
```

Base retry example:

```bash
curl -i -X POST 'https://kat.imgnai.com/v1/generation-requests?wait=false' \
  -H 'Content-Type: application/json' \
  -H 'PAYMENT-SIGNATURE: BASE64_PAYMENT_PAYLOAD' \
  -d '{"requests":[{"type":"image","model":"gen","prompt":"A precise studio image of a brushed steel cube","aspect_ratio":"1:1","output_format":"png"}]}'
```

### Solana x402 payment payload

Solana uses network `solana:5eykt4UsFv8P8NJdTREpY1vzqKqZKvdp` and native Solana USDC mint `EPjFWdd5AufqSSqeM2qN1xzybapC8G4wEGGkZwyTDt1v`. imgnAI expects a serialized VersionedTransaction that transfers the exact USDC amount from the payer's associated token account to the imgnAI oracle/payTo associated token account.

To build the Solana transaction:

- Use the exact Solana `accepts` entry returned in `PAYMENT-REQUIRED`.
- The USDC mint must be `accept.asset`.
- The token recipient owner must be `accept.payTo`.
- The token amount must equal `accept.amount` exactly.
- The transaction fee payer must be the imgnAI oracle/payTo public key. imgnAI co-signs the fee payer after validating the transfer.
- The customer signs as the USDC token owner.
imgnAI will reject transactions that do not contain the required customer token-owner signature.
- If the recipient associated token account does not exist, include the associated-token-account creation instruction with the oracle/payTo as fee payer.
- Use a recent blockhash. If a public RPC reports `BlockhashNotFound`, rebuild and re-sign with a fresh finalized blockhash.
- Serialize the complete VersionedTransaction bytes and encode those bytes as Base64.

Solana payment payload JSON before Base64 encoding:

```json
{
  "x402Version": 2,
  "accepted": {
    "scheme": "exact",
    "network": "solana:5eykt4UsFv8P8NJdTREpY1vzqKqZKvdp",
    "asset": "EPjFWdd5AufqSSqeM2qN1xzybapC8G4wEGGkZwyTDt1v",
    "payTo": "IMGNAI_SOLANA_ORACLE_PUBKEY",
    "amount": "11500",
    "maxAmountRequired": "11500",
    "maxTimeoutSeconds": 3600
  },
  "payload": {
    "payer": "CUSTOMER_SOLANA_WALLET",
    "transaction": "BASE64_SERIALIZED_VERSIONED_TRANSACTION"
  }
}
```

PAYMENT-SIGNATURE for Solana is:

```text
base64(json_utf8_string_of_the_payload_above)
```

Solana retry example:

```bash
curl -i -X POST 'https://kat.imgnai.com/v1/generation-requests?wait=false' \
  -H 'Content-Type: application/json' \
  -H 'PAYMENT-SIGNATURE: BASE64_PAYMENT_PAYLOAD' \
  -d '{"requests":[{"type":"image","model":"gen","prompt":"A precise studio image of a brushed steel cube","aspect_ratio":"1:1","output_format":"png"}]}'
```

### Text x402 example

Text x402 uses the same 402 discovery and paid retry flow as image/video. The only difference is the endpoint and body shape. Text calls are prepaid from a reserve estimate because the final prompt/completion token usage is only known after the model returns. x402 text calls must be non-streaming: omit `stream` or send `stream: false`. Streaming text is supported only with API-key account billing.
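Given the non-streaming rule above, a chat body can be normalized before the x402 discovery call. A minimal sketch (`toX402TextBody` is an illustrative helper name, not API surface):

```javascript
// Force a chat completions body into the non-streaming form required for
// x402 text payments: drop any caller-supplied `stream` flag and send
// `stream: false` explicitly, leaving all other fields untouched.
function toX402TextBody(body) {
  const { stream, ...rest } = body;
  return { ...rest, stream: false };
}
```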
Discovery:

```bash
curl -i -X POST 'https://kat.imgnai.com/v1/chat/completions' \
  -H 'Content-Type: application/json' \
  -d '{"model":"claude-haiku-4-5","messages":[{"role":"user","content":"Reply with exactly: text x402 ok"}],"max_tokens":16}'
```

Paid retry on Base or Solana:

```bash
curl -i -X POST 'https://kat.imgnai.com/v1/chat/completions' \
  -H 'Content-Type: application/json' \
  -H 'PAYMENT-SIGNATURE: BASE64_PAYMENT_PAYLOAD' \
  -d '{"model":"claude-haiku-4-5","messages":[{"role":"user","content":"Reply with exactly: text x402 ok"}],"max_tokens":16}'
```

### x402 wallet balance top-up

Wallet balance is optional and exists alongside one-request pay-as-you-go x402. Use it when a wallet will make multiple calls, or when a text/LLM call has a non-deterministic final cost.

Top-up discovery:

```bash
curl -i -X POST 'https://kat.imgnai.com/v1/x402/top-up' \
  -H 'Content-Type: application/json' \
  -d '{"amount_usdc":"1.00"}'
```

Decode `PAYMENT-REQUIRED`, choose Base or Solana, pay the exact returned amount, then retry the same top-up request with `PAYMENT-SIGNATURE`. The credited wallet balance is the requested `amount_usdc`; the paid amount can be higher because it includes settlement gas/network cost.

Top-up paid retry:

```bash
curl -i -X POST 'https://kat.imgnai.com/v1/x402/top-up' \
  -H 'Content-Type: application/json' \
  -H 'PAYMENT-SIGNATURE: BASE64_PAYMENT_PAYLOAD' \
  -d '{"amount_usdc":"1.00"}'
```

Successful top-up returns `wallet_address`, `network`, `asset`, `credited_usdc`, `paid_usdc`, `balance_usdc`, and transaction identifiers. The balance is keyed entirely by the payer wallet, network, and USDC asset observed from the settled payment.

### X-Sign-In-With-X wallet proof

Balance and transaction reads, and balance-funded generation/text calls, require `X-Sign-In-With-X`. This header proves ownership of the wallet without asking the caller to put wallet metadata in the JSON body.
`X-Sign-In-With-X` is Base64 JSON:

```json
{
  "network": "eip155:8453",
  "address": "0xPAYER_WALLET",
  "message": "imgnAI X-Sign-In-With-X\nWallet: 0xPAYER_WALLET\nNetwork: eip155:8453\nNonce: RANDOM_NONCE\nIssued At: ISSUED_AT_TIMESTAMP",
  "signature": "0xPERSONAL_SIGN_SIGNATURE",
  "issued_at": 1778711400
}
```

For Base, sign the UTF-8 message with ordinary EIP-191/personal-sign semantics. For Solana, sign the UTF-8 message bytes and send the Solana signature as base58, base64, or hex. The message must include `imgnAI` and the wallet address. `issued_at` must be a recent Unix timestamp; stale wallet proofs are rejected.

Balance read:

```bash
curl 'https://kat.imgnai.com/v1/x402/balance/0xPAYER_WALLET' \
  -H 'X-Sign-In-With-X: BASE64_WALLET_PROOF'
```

Transactions read:

```bash
curl 'https://kat.imgnai.com/v1/x402/transactions/0xPAYER_WALLET' \
  -H 'X-Sign-In-With-X: BASE64_WALLET_PROOF'
```

Balance-funded text request:

```bash
curl -i -X POST 'https://kat.imgnai.com/v1/chat/completions' \
  -H 'Content-Type: application/json' \
  -H 'X-Sign-In-With-X: BASE64_WALLET_PROOF' \
  -d '{"model":"qwen3-6-flash","messages":[{"role":"user","content":"Reply with exactly: wallet balance ok"}],"max_tokens":64}'
```

If the wallet balance is too low, imgnAI returns HTTP 402 with a top-up payment requirement for the shortfall. The caller can top up and then retry the same request with `X-Sign-In-With-X`, or omit `X-Sign-In-With-X` and use the ordinary one-request pay-as-you-go x402 flow.

### Video x402 example

Video x402 uses `/v1/generation-requests` and the ordinary video request body. Model media rules still apply, so do not mix first-frame/last-frame/reference/audio slots in combinations rejected by the selected video model.
Discovery:

```bash
curl -i -X POST 'https://kat.imgnai.com/v1/generation-requests?wait=false' \
  -H 'Content-Type: application/json' \
  -d '{"requests":[{"type":"video","model":"seedance-2-0-fast","prompt":"A five second smooth dolly shot of a glass sculpture on a table","duration_seconds":5,"aspect_ratio":"16:9"}]}'
```

Paid retry on Base or Solana:

```bash
curl -i -X POST 'https://kat.imgnai.com/v1/generation-requests?wait=false' \
  -H 'Content-Type: application/json' \
  -H 'PAYMENT-SIGNATURE: BASE64_PAYMENT_PAYLOAD' \
  -d '{"requests":[{"type":"video","model":"seedance-2-0-fast","prompt":"A five second smooth dolly shot of a glass sculpture on a table","duration_seconds":5,"aspect_ratio":"16:9"}]}'
```

### x402 implementation notes for agents

- Decode `PAYMENT-REQUIRED` from the response header, not from scraped HTML.
- Preserve the exact request body between the unpaid discovery call and the paid retry.
- Use the returned amount from the selected `accepts` entry. Never hard-code prices from examples.
- Choose Base by default if the user does not request a network, because it is listed first.
- Choose Solana when the user wants to pay from a Solana wallet or when a Solana x402 client is already available.
- A 402 response with `PAYMENT-REQUIRED` means no inference work has started yet.
- A successful paid response includes `PAYMENT-RESPONSE`; decode it for settlement metadata such as transaction hash/signature, network, and paid amount.
- Image/video responses should be treated as asynchronous. Submit with `wait=false`, then poll `GET /v1/generation-requests/{request_id}` until the status is `completed`, `partial_failure`, or `failed`.
- x402 payments are one-request payments. Do not reuse a Base nonce or Solana transaction for another request.
- x402 wallet balance is optional and additive. It does not replace one-request pay-as-you-go x402.
- Use wallet balance for LLM/text calls when the final token usage may be lower than the caller's maximum token allowance, because unused reserve can be credited back to the wallet balance.
- If a payment is invalid, expired, for the wrong amount, wrong recipient, wrong asset, or wrong network, imgnAI returns a 402-style payment error and does not start inference.

## Billing Path Examples

imgnAI supports three independent billing paths. Pick exactly one per request:

- Credits/API key: standard account billing. Use `Authorization: Bearer API_KEY:API_SECRET` for text, or `X-API-Key` plus `X-API-Secret` for image/video.
- Direct x402 pay-as-you-go: no API key, no wallet balance. Send the request once without credentials, receive `402` + `PAYMENT-REQUIRED`, pay the selected `accepts` entry, then retry immediately with `PAYMENT-SIGNATURE` or the compatibility alias `X-402-Payment`.
- x402 wallet balance: top up with `POST /v1/x402/top-up`, then send `X-Sign-In-With-X` on inference calls. This debits the wallet balance tied to the settled payer wallet and can refund unused text reserve back to that balance.
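The three paths differ only in which headers carry billing. A sketch mapping each path to its headers (the path names and `creds` fields are illustrative; credential values are placeholders):

```javascript
// Map a billing path to the HTTP headers that fund it. Exactly one
// billing mechanism should be present per request.
function billingHeaders(path, creds) {
  switch (path) {
    case 'credits-text': // API key, text endpoint
      return { Authorization: `Bearer ${creds.apiKey}:${creds.apiSecret}` };
    case 'credits-media': // API key, image/video endpoint
      return { 'X-API-Key': creds.apiKey, 'X-API-Secret': creds.apiSecret };
    case 'x402-direct': // one-request pay-as-you-go
      return { 'PAYMENT-SIGNATURE': creds.paymentPayloadBase64 };
    case 'x402-balance': // topped-up wallet balance
      return { 'X-Sign-In-With-X': creds.walletProofBase64 };
    default:
      throw new Error(`unknown billing path: ${path}`);
  }
}
```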
### Credits / API key examples

Text with credits:

```bash
curl -i -X POST 'https://kat.imgnai.com/v1/chat/completions' \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer API_KEY:API_SECRET' \
  -d '{"model":"grok-4-3","messages":[{"role":"user","content":"Reply with exactly: imgnAI credits ok"}],"max_tokens":64}'
```

Image/video with credits:

```bash
curl -i -X POST 'https://kat.imgnai.com/v1/generation-requests?wait=false' \
  -H 'Content-Type: application/json' \
  -H 'X-API-Key: API_KEY' \
  -H 'X-API-Secret: API_SECRET' \
  -d '{"requests":[{"type":"image","model":"gpt-image-2","prompt":"A clean product photo of a matte black espresso cup","aspect_ratio":"1:1","output_format":"png"}]}'
```

Current account credit balance:

```bash
curl -i 'https://kat.imgnai.com/v1/me/balance' \
  -H 'Authorization: Bearer API_KEY:API_SECRET'
```

`GET /v1/me/balance` returns the current API-account credit balance for the authenticated user:

```json
{ "credits": "1234.5" }
```

Balance Service stores account balance in 10x service credits. The API returns visible imgnAI credits as a decimal string, so a Balance Service balance of `2000` is returned as `"200.0"`. This is separate from x402 wallet balance. Use `/v1/me/balance` for API-key/credit accounts, and `/v1/x402/balance/{walletAddress}` for topped-up x402 wallet balances.

### Direct x402 pay-as-you-go function

Use this when each request should pay immediately and independently. This flow does not read or spend wallet balance even if the same wallet has previously topped up.

```js
async function callKatanaWithDirectX402(url, body, buildPaymentHeader) {
  const discovery = await fetch(url, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body),
  });
  if (discovery.status !== 402) return discovery;
  const encoded = discovery.headers.get('PAYMENT-REQUIRED');
  const requirements = JSON.parse(Buffer.from(encoded, 'base64').toString('utf8'));
  const accept = requirements.accepts[0]; // Base is listed first by default.
  const paymentHeader = await buildPaymentHeader(accept);
  return fetch(url, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'PAYMENT-SIGNATURE': paymentHeader,
    },
    body: JSON.stringify(body),
  });
}
```

`buildPaymentHeader(accept)` must build the Base EIP-3009 `transferWithAuthorization` payload or Solana serialized transaction described in the x402 section, then Base64-encode the payment JSON. Do not edit `accept.amount`, `accept.payTo`, `accept.asset`, or `accept.network`.

Direct x402 text example (non-streaming only):

```js
const response = await callKatanaWithDirectX402(
  'https://kat.imgnai.com/v1/chat/completions',
  { model: 'grok-4-3', messages: [{ role: 'user', content: 'Reply with exactly: imgnAI credits ok' }], max_tokens: 64 },
  buildPaymentHeader,
);
const data = await response.json();
```

### x402 wallet-balance functions

Use this for non-deterministic text/LLM calls or for wallets that will make repeated calls. The top-up itself uses the same direct x402 payment header. Later inference calls use only `X-Sign-In-With-X` and spend from the wallet balance.
```js
async function topUpKatanaX402Balance(baseUrl, amountUsdc, buildPaymentHeader) {
  const body = { amount_usdc: String(amountUsdc) };
  const discovery = await fetch(`${baseUrl}/v1/x402/top-up`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body),
  });
  if (discovery.status !== 402) return discovery;
  const requirements = JSON.parse(Buffer.from(discovery.headers.get('PAYMENT-REQUIRED'), 'base64').toString('utf8'));
  const paymentHeader = await buildPaymentHeader(requirements.accepts[0]);
  return fetch(`${baseUrl}/v1/x402/top-up`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json', 'PAYMENT-SIGNATURE': paymentHeader },
    body: JSON.stringify(body),
  });
}

async function callKatanaWithWalletBalance(url, body, signInWithXHeader) {
  return fetch(url, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'X-Sign-In-With-X': signInWithXHeader,
    },
    body: JSON.stringify(body),
  });
}

async function readKatanaX402Balance(baseUrl, walletAddress, signInWithXHeader) {
  return fetch(`${baseUrl}/v1/x402/balance/${walletAddress}`, {
    headers: { 'X-Sign-In-With-X': signInWithXHeader },
  });
}
```

Wallet-funded text example:

```js
await topUpKatanaX402Balance('https://kat.imgnai.com', '1.00', buildPaymentHeader);
const response = await callKatanaWithWalletBalance(
  'https://kat.imgnai.com/v1/chat/completions',
  { model: 'grok-4-3', messages: [{ role: 'user', content: 'Reply with exactly: imgnAI credits ok' }], max_tokens: 64 },
  signInWithXHeader,
);
const data = await response.json();
```

## Retargeting Venice-Style x402 Scripts to imgnAI

If a script currently targets Venice's API and the user wants to retarget it to imgnAI, do not blindly replace only the hostname. Adjust endpoints, authentication, model ids, request bodies, and response parsing.

Common direct replacements:

- Base URL: replace `https://api.venice.ai/api/v1` with `https://kat.imgnai.com/v1`.
- Chat completions: Venice `/api/v1/chat/completions` maps to imgnAI `/v1/chat/completions` with OpenAI-style `messages` and imgnAI model ids such as `qwen3-6-flash`.
- x402 top-up: Venice `/api/v1/x402/top-up` maps to imgnAI `/v1/x402/top-up`. imgnAI accepts a JSON body such as `{ "amount_usdc": "1.00" }`.
- x402 balance: Venice `/api/v1/x402/balance/{walletAddress}` maps to imgnAI `/v1/x402/balance/{walletAddress}`.
- x402 transactions: Venice `/api/v1/x402/transactions/{walletAddress}` maps to imgnAI `/v1/x402/transactions/{walletAddress}`.
- Wallet auth: Venice-style scripts that already produce `X-Sign-In-With-X` can keep that header concept, but the message must be scoped to imgnAI and include the wallet address plus a recent `issued_at`/Issued At timestamp.
- x402 payment header: imgnAI prefers `PAYMENT-SIGNATURE`. The compatibility alias `X-402-Payment` is accepted for clients already using that name.
- Response wrappers: Venice balance/top-up responses commonly wrap values under `success`/`data`. imgnAI returns direct JSON fields such as `wallet_address`, `balance_usdc`, `credited_usdc`, `paid_usdc`, and `transactions`.
- Billing behavior: Venice-style wallet-balance inference maps to imgnAI requests with `X-Sign-In-With-X`. imgnAI also supports direct one-request x402 by omitting `X-Sign-In-With-X` and using the 402 discovery + paid retry flow.
- Image/video generation: imgnAI uses `/v1/generation-requests?wait=false` with a `requests` array, then polling by `request_id`. Do not reuse Venice image/video request bodies without reshaping them.
- Embeddings: do not retarget Venice `/api/v1/embeddings` calls unless imgnAI exposes an embeddings model/endpoint in the current `/v1/models` output. This Katana API primarily exposes chat completions plus image/video generation.
Minimal retarget helper:

```js
function retargetVeniceUrlToKatana(url) {
  return url.replace('https://api.venice.ai/api/v1', 'https://kat.imgnai.com/v1');
}

function normalizeVeniceX402HeadersForKatana(headers) {
  const next = { ...headers };
  if (next['X-402-Payment'] && !next['PAYMENT-SIGNATURE']) {
    next['PAYMENT-SIGNATURE'] = next['X-402-Payment'];
  }
  return next;
}
```

## Main Generation Endpoint

Submit generation jobs to:

```http
POST /v1/generation-requests?wait=false
```

Use `wait=false` as the default integration pattern for image and video. It returns quickly with `request_id`, `status`, and `poll_after_seconds`, avoiding long-held HTTP connections through proxies such as Cloudflare. Then poll:

```http
GET /v1/generation-requests/{request_id}
```

Continue polling until `status` is `completed`, `partial_failure`, or `failed`. Include the same API credentials on polling requests when using account billing. x402 polling may use the unguessable `request_id` without API credentials.

`wait=true` remains available as a convenience for short direct-origin image/video calls, but production image/video integrations should prefer polling. API-key text/LLM calls on `/v1/chat/completions` are normally short enough to hold the connection or use streaming. x402 text calls must be non-streaming.

If the submission response itself is terminal (`status: "failed"`, `status: "rejected"`, or all response items rejected), do not keep polling. Report the returned `responses[].error` or top-level error to the user. Rejected validation items are not charged.

The top-level request body is always a JSON object with:

- `requests`: array of generation items. Each item is either `type: "image"` or `type: "video"`.

Do not send a caller-defined request id. imgnAI generates and returns `request_id`. Use top-level `output_format` for image output format; do not send an `output` object for ordinary integrations.
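The submit-then-poll pattern above can be sketched as a loop. This is an illustration, not an official client; `fetchImpl` and `sleep` are injected so the loop can be exercised without network access:

```javascript
// Poll an async image/video job until it reaches a terminal status.
async function pollGenerationRequest(baseUrl, requestId, headers, fetchImpl, sleep) {
  const terminal = new Set(['completed', 'partial_failure', 'failed']);
  for (;;) {
    const res = await fetchImpl(`${baseUrl}/v1/generation-requests/${requestId}`, { headers });
    const body = await res.json();
    if (terminal.has(body.status)) return body;
    // Respect the server's polling hint when present; assume 2s otherwise.
    await sleep((body.poll_after_seconds ?? 2) * 1000);
  }
}
```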
## Image / Video Response Format Image and video results are returned under `responses[].output_assets[]`. Do not read `results[].url`; that is not the Katana response shape. Completed poll response example: ```json { "request_id": "0f3f0ab9-2d2c-4d99-a7cc-76cf82735246", "status": "completed", "created_at": "2026-05-13T12:59:15+00:00", "updated_at": "2026-05-13T13:00:02+00:00", "requests": [ { "type": "image", "model": "gpt2image", "prompt": "A clean product photo of a ceramic mug", "aspect_ratio": "1:1", "output_format": "png" } ], "responses": [ { "job_id": "8c92b087-8d6d-4ad2-9b18-3f296568dcfd", "type": "image", "status": "completed", "output_assets": [ { "kind": "image", "mime_type": "image/png", "url": "https://k.imgnai.com/preview-or-default.png", "original_data_url": "https://k.imgnai.com/full-size-original.png", "thumbnail_image_url": "https://k.imgnai.com/thumbnail-preview.jpg", "width": 1536, "height": 1024, "metadata": { "tags": [ { "tag": "ceramic_mug", "confidence": 0.94 }, { "tag": "table", "confidence": 0.81 } ] }, "expires_at": "2026-05-16T13:00:02+00:00" } ], "started_at": "2026-05-13T12:59:16+00:00", "completed_at": "2026-05-13T13:00:02+00:00", "metadata": { "model": "gpt2image", "credits_spent": "3" } } ], "poll_after_seconds": null, "metadata": { "credits_spent": "3" } } ``` Response handling rules: - For final user delivery, prefer `responses[].output_assets[].original_data_url` when present. It is the full-size/full-quality asset URL. - Use `responses[].output_assets[].url` for previews and as the fallback when `original_data_url` is absent. - If present, `responses[].output_assets[].thumbnail_image_url` is a smaller thumbnail/preview image for the completed image or video asset. Use it for gallery previews, not as the full-quality deliverable. - `responses[].output_assets[].metadata.tags`, when present, contains description keywords for the generated content derived from a CLIP tagger. Each tag includes a confidence score. 
This is only available on in-house imgnAI image/video models; external/provider-hosted models should be treated as returning no CLIP-tag metadata.
- For video outputs, `responses[].output_assets[].thumbnail_silent_video_mp4_url` may be present. This is a short/silent lightweight MP4 preview thumbnail suited to galleries and hover previews. It is not the full video; always use `original_data_url` for the real/full-quality video.
- For video outputs, `responses[].output_assets[].final_frame_image_url` is returned when the backend provides a final frame still image. `thumbnail_silent_video_mp4_url` and `final_frame_image_url` are returned as blank strings when unavailable.
- Use `final_frame_image_url` to extend a generated video: send it as the first-frame input for the next video job, then stitch the original and continuation clips together outside the API.
- Report actual output dimensions from completed assets: `responses[].output_assets[].width`, `height`, and for video `duration_seconds`. Do not treat submission or preview dimensions as authoritative final dimensions.
- In completed media responses, `requests[].model` and `responses[].metadata.model` use canonical model keys, not necessarily the public model name sent by the caller. Use `/v1/models` to map `model_key` values to public names and display names.
- `expires_at` tells you when the temporary API asset URL is expected to expire. Download or store the asset elsewhere if the user needs permanence.
- `responses[].started_at` and `responses[].completed_at` are item-level processing timestamps.

## Text / LLM Chat Completion Basics

Text models are exposed through an OpenAI-compatible chat completions endpoint. Existing OpenAI SDK integrations can point their base URL at this service and use a single combined bearer credential in the form `<api_key>:<api_secret>`.
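For example, the combined credential can be assembled into OpenAI-SDK client options as sketched below; `katanaClientOptions` is an illustrative helper, and the commented usage assumes the official `openai` npm package:

```js
// Build OpenAI-compatible SDK client options for this service. The single
// combined bearer credential is the API key and secret joined by a colon.
function katanaClientOptions(apiKey, apiSecret) {
  return {
    baseURL: 'https://kat.imgnai.com/v1',
    apiKey: `${apiKey}:${apiSecret}`,
  };
}

// Assumed usage with the official `openai` npm package:
//   import OpenAI from 'openai';
//   const client = new OpenAI(katanaClientOptions(myKey, mySecret));
//   const res = await client.chat.completions.create({
//     model: 'grok-4-3',
//     messages: [{ role: 'user', content: 'Hello' }],
//     max_tokens: 100,
//   });
```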
```http
POST /v1/chat/completions
Authorization: Bearer <api_key>:<api_secret>
Content-Type: application/json
```

Minimal text chat example:

```json
{
  "model": "grok-4-3",
  "messages": [
    { "role": "system", "content": "You are a concise assistant." },
    { "role": "user", "content": "Write a three-bullet launch checklist for a product API." }
  ],
  "max_tokens": 16000,
  "temperature": 1
}
```

Compatibility notes:
- Preferred request shape is OpenAI-style `messages`. For clients that send a single string `prompt`, the API converts it into one user message when `messages` is omitted.
- `stream: true` is supported with Server-Sent Events for API-key account billing. x402 text calls currently must omit `stream` or send `stream: false`.
- For streaming API-key calls, the API forwards chunks as `data: ...`, rewrites the chunk model name to the public imgnAI model name, requests usage in the final stream chunk, then finalizes billing before `data: [DONE]`.
- If a streaming provider does not return usage in the stream, imgnAI keeps the pre-charge reserve instead of refunding based on an unknown cost. Set `max_tokens` or `max_completion_tokens` to keep the reserve predictable.

Streaming text chat example:

```json
{
  "model": "grok-4-3",
  "messages": [
    { "role": "user", "content": "Write one concise sentence about API integration." }
  ],
  "max_tokens": 16000,
  "stream": true,
  "stream_options": { "include_usage": true }
}
```

Vision chat example with a Base64 data URL. The API accepts HTTPS image URLs, full `data:image/...;base64,...` data URLs, or raw base64 image strings in `image_url.url`. Do not send local file paths or `file://` URLs. Base64 image inputs are converted to JPEG, capped to a maximum side of 4096px, and kept at the original aspect ratio before inference.

```json
{
  "model": "grok-4-3",
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "text", "text": "Describe this image for an accessibility caption."
        },
        { "type": "image_url", "image_url": { "url": "data:image/png;base64,<base64_data>" } }
      ]
    }
  ],
  "max_tokens": 16000
}
```

Text billing rules:
- With API-key account billing and x402 wallet-balance billing, the final text cost is computed from actual provider usage after the response returns, using prompt/input tokens and completion/output tokens.
- Direct one-request x402 text calls are prepaid from the quoted reserve because final token usage is not known before settlement.
- Published token prices include imgnAI's 10% service markup and are shown in credits using the Platinum Annual reference price.
- Every call has a minimum final charge of 0.1 credits, and final charges are rounded up to the nearest 0.1 credit.
- For API-key account billing, the API pre-charges a reserve before dispatch so users cannot consume compute they cannot pay for. The normal minimum reserve is 10 credits; requests with a large `max_tokens` or `max_completion_tokens` can reserve more.
- After the provider returns usage for API-key or x402 wallet-balance calls, the API refunds unused reserve or charges overage before recording final usage. Wallet-balance refunds return to the same x402 wallet balance.
- If neither `max_tokens` nor `max_completion_tokens` is supplied and the selected model supports output caps, the API defaults to 16000 max output tokens so the reserve is meaningful.
- For API-key billing, `usage.imgnai` includes credits reserved, credits charged, credits refunded, privacy mode, public model name, and `billing_source`.
- For x402 billing, `usage.imgnai` includes payment or wallet metadata such as `paid_usdc`, `wallet_balance_after_usdc`, and `billing_source`; account credit reserve fields are omitted.

Text privacy labels:
- `Anonymized`: customer account identity is not sent with the inference request. The model operator may process prompt content.
- `E2EE Private`: the request is routed through hardware-protected confidential computing.
Private model metadata lists Intel TDX and NVIDIA Confidential Computing where applicable, and these models expose privacy proof metadata through `/v1/text/attestation` where supported.
- Public imgnAI docs intentionally do not expose the internal inference provider used for a given text model.

Privacy proof example for private text models:

```bash
curl 'https://kat.imgnai.com/v1/text/attestation?model=kimi-k2-6-private&nonce=<64_hex_nonce>' \
  -H 'Authorization: Bearer <api_key>:<api_secret>'
```

## Image Generation Basics

Image generation items use `type: "image"`. Required fields are normally `model`, `prompt`, and `aspect_ratio`. Some edit/reference models also accept input images. Use the public model name listed in this document, not the internal backend model name.

Minimal text-to-image example:

```json
{
  "requests": [
    {
      "type": "image",
      "model": "gpt-image-2",
      "prompt": "A clean product image of a ceramic mug on a table",
      "aspect_ratio": "1:1",
      "output_format": "png"
    }
  ]
}
```

Image edit/reference example with HTTPS image URLs:

```json
{
  "requests": [
    {
      "type": "image",
      "model": "gpt-image-2",
      "prompt": "Use the reference image as the product and place it on a clean studio table with soft daylight.",
      "aspect_ratio": "1:1",
      "output_format": "png",
      "image_urls": [ "https://example.com/reference-image.png" ]
    }
  ]
}
```

Image edit/reference example with Base64 data URLs:

```json
{
  "requests": [
    {
      "type": "image",
      "model": "gpt-image-2",
      "prompt": "Use the reference image as the product and place it on a clean studio table with soft daylight.",
      "aspect_ratio": "1:1",
      "output_format": "png",
      "image_urls": [ "data:image/png;base64,<base64_data>" ]
    }
  ]
}
```

Image attachment rules:
- `image_urls` is the preferred compatibility field for source/reference images. `input_images` and `input_image_urls` are accepted aliases.
- `image_urls`, `input_images`, and `input_image_urls` may be a list of HTTPS URLs, data URLs, or raw base64 strings.
- `image_url`, `input_image_url`, `input_image`, and `input_image_b64` may provide a single source image.
- Do not send local filesystem paths, `file://` URLs, Telegram attachment paths, or private machine paths. Convert local files to `data:image/...;base64,...` before submitting.
- Do not send `http://` media URLs. Use HTTPS URLs or Base64 data URLs.
- Data URL format should look like `data:image/png;base64,<base64_data>` or `data:image/jpeg;base64,<base64_data>`.
- Raw base64 image strings are also accepted; the API wraps them as image data internally.
- Corrupt or unreadable base64 image data is rejected before dispatch.
- For large Base64 payloads, write the JSON body to a temporary file and call curl with `--data-binary @payload.json` or `-d @payload.json` instead of putting megabytes of Base64 directly in shell arguments.
- `aspect_ratio` accepts strings such as `1:1`, `16:9`, `9:16`, and `21:9`; `aspect_ratio: "auto"` inspects the first image input and chooses the closest model-supported ratio. If no image is supplied, `auto` defaults to `1:1` when the selected model supports it.
- `output_format` accepts `png`, `jpeg`, or `webp`. `jpg` is accepted as an alias for `jpeg`.
- Use only aspect ratios listed for the selected model. External API models often support a smaller aspect set than imgnAI-hosted models.
- On imgnAI-hosted image models only, `is_fast`/`fast_mode` request lower-cost half-resolution generation and `is_uhd`/`uhd_mode` request UHD generation. `is_uhd` takes precedence when both are sent.
- On supported tag/booru-based image models only, prompt assist fields such as `use_assistant`, `prompt_assist`, or `use_prompt_assist` let users write natural language that is translated to tag-style prompts before dispatch.

## Video Generation Basics

Video generation items use `type: "video"`. Required fields are normally `model`, `prompt`, `duration_seconds`, and `aspect_ratio`.
Media inputs go in `video_image_data`; older compatibility fields such as `image_url` or `input_image_url` can map to first-frame image input, but new integrations should prefer `video_image_data`.

Minimal text-to-video example:

```json
{
  "requests": [
    {
      "type": "video",
      "model": "seedance-2-0",
      "prompt": "A calm cinematic shot of a ceramic mug on a table",
      "duration_seconds": 5,
      "aspect_ratio": "16:9"
    }
  ]
}
```

Video with first-frame image URL:

```json
{
  "requests": [
    {
      "type": "video",
      "model": "seedance-2-0",
      "prompt": "Animate the scene from the first frame with a slow cinematic camera push and natural motion.",
      "duration_seconds": 5,
      "aspect_ratio": "16:9",
      "video_image_data": { "first_frame_image_url": "https://example.com/first-frame.png" }
    }
  ]
}
```

Video with first-frame Base64 data URL:

```json
{
  "requests": [
    {
      "type": "video",
      "model": "seedance-2-0",
      "prompt": "Animate the scene from the first frame with a slow cinematic camera push and natural motion.",
      "duration_seconds": 5,
      "aspect_ratio": "16:9",
      "video_image_data": { "first_frame_image_url": "data:image/png;base64,<base64_data>" }
    }
  ]
}
```

Video with first frame and last frame, for models where `supports_first_frame` and `supports_last_frame` are both true:

```json
{
  "requests": [
    {
      "type": "video",
      "model": "seedance-2-0",
      "prompt": "Create a smooth transformation from the first frame to the last frame with consistent subject identity.",
      "duration_seconds": 5,
      "aspect_ratio": "16:9",
      "video_image_data": {
        "first_frame_image_url": "https://example.com/first-frame.png",
        "last_frame_image_url": "https://example.com/last-frame.png"
      }
    }
  ]
}
```

Video with reference images, for models where `supports_reference_images` is true:

```json
{
  "requests": [
    {
      "type": "video",
      "model": "seedance-2-0",
      "prompt": "Create a video using the uploaded references for subject identity, wardrobe, and setting.",
      "duration_seconds": 5,
      "aspect_ratio": "16:9",
      "video_image_data": {
        "reference_image_urls": [
          "https://example.com/character-reference.png",
          "data:image/png;base64,<base64_data>"
        ]
      }
    }
  ]
}
```

Video with reference image assets, using `reference_assets` instead of directly writing `video_image_data.reference_image_urls`:

```json
{
  "requests": [
    {
      "type": "video",
      "model": "seedance-2-0",
      "prompt": "Generate a short ad using the character reference and the environment reference.",
      "duration_seconds": 5,
      "aspect_ratio": "16:9",
      "reference_assets": [
        { "kind": "reference_image", "url": "https://example.com/person.png" },
        { "kind": "image", "base64_data": "data:image/png;base64,<base64_data>" }
      ]
    }
  ]
}
```

Video with audio references, for models where `supports_audio_input` is true:

```json
{
  "requests": [
    {
      "type": "video",
      "model": "seedance-2-0",
      "prompt": "Create a cinematic video and use the attached audio as the voice or sound reference.",
      "duration_seconds": 5,
      "aspect_ratio": "16:9",
      "video_image_data": {
        "reference_image_urls": [ "https://example.com/speaker-reference.png" ],
        "audio_input_urls": [ "https://example.com/reference-voice.mp3" ]
      }
    }
  ]
}
```

## Complete HTTP cURL Examples

Image edit/reference cURL example:

```bash
curl -X POST 'https://kat.imgnai.com/v1/generation-requests?wait=false' \
  -H 'Content-Type: application/json' \
  -H 'X-API-Key: <api_key>' \
  -H 'X-API-Secret: <api_secret>' \
  -d '{ "requests": [ { "type": "image", "model": "gpt-image-2", "prompt": "Use the reference image as the product and place it on a clean studio table with soft daylight.", "aspect_ratio": "1:1", "output_format": "png", "image_urls": [ "https://example.com/reference-image.png" ] } ] }'

# Then poll using the request_id returned above:
curl 'https://kat.imgnai.com/v1/generation-requests/<request_id>' \
  -H 'X-API-Key: <api_key>' \
  -H 'X-API-Secret: <api_secret>'
```

Video with reference images cURL example:

```bash
curl -X POST 'https://kat.imgnai.com/v1/generation-requests?wait=false' \
  -H 'Content-Type: application/json' \
  -H 'X-API-Key: <api_key>' \
  -H 'X-API-Secret: <api_secret>' \
  -d '{ "requests": [ { "type": "video", "model": "seedance-2-0", "prompt": "Create a
video using the uploaded references for subject identity, wardrobe, and setting.", "duration_seconds": 5, "aspect_ratio": "16:9", "video_image_data": { "reference_image_urls": [ "https://example.com/character-reference.png", "data:image/png;base64,<base64_data>" ] } } ] }'

# Then poll using the request_id returned above:
curl 'https://kat.imgnai.com/v1/generation-requests/<request_id>' \
  -H 'X-API-Key: <api_key>' \
  -H 'X-API-Secret: <api_secret>'
```

Base64 note for curl examples: replace placeholders such as `data:image/png;base64,<base64_data>` with a real data URL. If creating the request from a local file, read the bytes, base64 encode them, and prefix with the correct media type, for example `data:image/png;base64,<base64_data>`. For large images, write the JSON payload to a temp file and submit with `curl --data-binary @payload.json` or `curl -d @payload.json`.

Video media input rules:
- Put video media inputs in `video_image_data`.
- `video_image_data.first_frame_image_url`: first/source frame image. Accepted values are HTTPS URL, data URL, or raw base64 image string. Top-level `image_url`, `input_image_url`, `input_image`, and `input_image_b64` are accepted compatibility aliases for this slot.
- `video_image_data.mid_frame_image_url`: mid-frame image, only if the model supports mid-frame input.
- `video_image_data.last_frame_image_url`: last/end frame image, only if the model supports last-frame input.
- `video_image_data.reference_image_urls`: array of reference images, only if the model supports reference images. Top-level `reference_image_urls` is accepted as a compatibility alias. Obey `maximum_reference_images`.
- `video_image_data.audio_input_urls`: array of audio reference URLs, only if the model supports audio input. Obey `maximum_reference_audio_files` and the global cap of 4.
- `reference_assets` can also provide media. Image kinds `style_reference`, `reference_image`, and `image` map to video reference images. Audio kinds `audio`, `source_audio`, `reference_audio`, and `audio_reference` map to audio reference inputs.
- Do not send local filesystem paths, `file://` URLs, Telegram attachment paths, private machine paths, or `http://` media URLs in video media fields. Use HTTPS URLs or Base64 data URLs. - Do not send audio references to models where `supports_audio_input` is false. The API rejects unsupported audio input before dispatch. - Do not mix first/last frame inputs with reference images unless the selected model's custom rules allow that combination. - Use only durations listed in `video_lengths_and_costs` for the selected model. - Use only aspect ratios listed in `supported_aspects` for the selected model. `aspect_ratio: "auto"` uses the first frame first, then the first reference image. If neither exists, AUTO defaults to `1:1` when supported by the selected model. - `audio_gen_model: false` means the generated video is silent/no-audio. Most modern video models generate audio by default, so this document calls out silent models as the exception. ## Video Custom Rule Glossary Always inspect each selected video model's `custom_rules` before composing a request. These rules are model-specific compatibility constraints enforced by the API before dispatch. - `audio_15s_max`: Combined selected audio input is limited to 15 seconds. Keep audio references short. - `audio_drives_duration`: The requested video duration follows the selected audio duration. Prefer matching the prompt and expected motion to the audio clip. - `audio_ff_only`: Audio input can only be used with first-frame conditioning. Do not combine audio input with mid-frame, last-frame, or reference-image conditioning. - `audio_needs_reference_image`: Audio input requires at least one reference image. - `audio_or_fflf_exclusive`: Audio input cannot be combined with first-frame or last-frame inputs. Choose audio conditioning or frame conditioning. - `lf_needs_ff`: A last-frame image requires a first-frame image. Do not send a last frame by itself. 
- `reference_ff_only`: Reference images may be combined with first-frame input, but not with last-frame input. - `reference_is_voice_timbre`: When reference images are present, reference audio is interpreted as voice timbre rather than a full audio track. - `reference_no_ff_or_lf`: Reference images cannot be combined with first-frame or last-frame inputs. Use either reference images, or first/last frame conditioning, but not both. ## Common Failure Cases and How to Avoid Them - Unsupported aspect ratio: choose a value from the selected model's supported aspect list. - Unsupported duration: choose one of the selected video model's `video_lengths_and_costs` keys. - Corrupt base64 image: validate that the data decodes to an actual image before submitting. - Local file path supplied as media: convert it to a Base64 data URL before submission; the API cannot fetch paths on the caller's machine. - Audio reference on unsupported model: only send `audio_input_urls` when `supports_audio_input` is true. - Reference images mixed with first/last frame on incompatible video model: check `custom_rules`, especially `reference_no_ff_or_lf`. - Last frame without first frame: prohibited on models with `lf_needs_ff`. - Too many reference images: clamp to `multi_image_inputs_allowed` for image models or `maximum_reference_images` for video models. ## Model Selection Checklist for an LLM 1. Decide whether the user needs text, an image, or a video. 2. For text, use `/v1/chat/completions`; for image/video, use `/v1/generation-requests`. 3. Pick a model whose public model name, supported input types/aspects, privacy mode, and media capabilities match the requested workflow. 4. Use the public model name shown as `Model name to send`. 5. For image/video, choose an allowed `aspect_ratio`. 6. For video, choose an allowed `duration_seconds`. 7. Add only media fields supported by that model. 8. Apply every listed `custom_rule` for video models. 9. 
If using local image files, convert them to data URLs (`data:image/...;base64,...`) before sending. 10. Submit text to `/v1/chat/completions`; submit image/video to `/v1/generation-requests?wait=false`, save the returned `request_id`, then poll `GET /v1/generation-requests/{request_id}` until completion. ## Compact Model Catalog Use this catalog to choose model IDs, pricing, aspect ratios, durations, media slots, and privacy capabilities without embedding every per-model sample. Fetch `GET /v1/models` for the full current machine-readable catalog and model-card metadata. ## Text Models ### Grok 4.3 (`grok-4-3`) - Model name to send: `grok-4-3` - Publisher: xAI - Release date: 2026-04-30 - Privacy: Anonymized - Context length: 1000000 tokens - Max output tokens: not listed - Supported input types: text, image - Cost: 264.5 credits / 1M input tokens (US$1.38); 528.9 credits / 1M output tokens (US$2.75) - Cache Read/Write Cost: Read: 42.4 credits / 1M cached input tokens (US$0.2200) - x402: supported for non-streaming calls. Call without Authorization to receive a 402 payment requirement, then retry with PAYMENT-SIGNATURE. Text x402 calls are prepaid from the reserve estimate because final token usage is only known after inference. - Input price: 264.5 credits per 1M input tokens. - Output price: 528.9 credits per 1M output tokens. - Billing: API-key calls reserve/refund from actual usage with a 0.1-credit rounded minimum. x402 wallet balance can refund unused reserve; direct x402 pay-as-you-go charges the quoted reserve. - Features: Anonymized, Vision, Tool calling, Structured output, Long context - Description: Grok 4.3 is a reasoning model from xAI. It accepts text and image inputs with text output, and is suited for agentic workflows, instruction-following tasks, and applications requiring high factual accuracy. Reasoning can be configured between none/low/medium/high (default low) effort levels. 
It supports a 1 million token context window with no output token limit, making it well-suited for long-document analysis, deep research, and multi-step agentic tasks. Pricing is tiered: requests exceeding 200k total tokens are billed at a higher rate. ### Qwen3.6 35B A3B (`qwen3-6-35b-a3b`) - Model name to send: `qwen3-6-35b-a3b` - Publisher: Qwen - Release date: 2026-04-27 - Privacy: Anonymized - Context length: 262144 tokens - Max output tokens: 262144 - Supported input types: text, image, video - Cost: 31.8 credits / 1M input tokens (US$0.1650); 211.6 credits / 1M output tokens (US$1.10) - Cache Read/Write Cost: Read: 10.6 credits / 1M cached input tokens (US$0.0550) - x402: supported for non-streaming calls. Call without Authorization to receive a 402 payment requirement, then retry with PAYMENT-SIGNATURE. Text x402 calls are prepaid from the reserve estimate because final token usage is only known after inference. - Input price: 31.8 credits per 1M input tokens. - Output price: 211.6 credits per 1M output tokens. - Billing: API-key calls reserve/refund from actual usage with a 0.1-credit rounded minimum. x402 wallet balance can refund unused reserve; direct x402 pay-as-you-go charges the quoted reserve. - Features: Anonymized, Vision, Video input, Tool calling, Structured output, Long context - Description: Qwen3.6-35B-A3B is an open-weight multimodal model from Alibaba Cloud with 35 billion total parameters and 3 billion active parameters per token. It uses a hybrid sparse mixture-of-experts architecture combining Gated DeltaNet linear attention with standard gated attention layers, enabling efficient inference at a fraction of the compute cost. The model supports a 262K token native context window (extensible to 1M via YaRN) and accepts text, image, and video inputs. It includes integrated thinking mode with reasoning traces preserved across multi-turn conversations, function calling, and structured output. Released under the Apache 2.0 license. 
### Qwen3.6 Flash (`qwen3-6-flash`) - Model name to send: `qwen3-6-flash` - Publisher: Qwen - Release date: 2026-04-27 - Privacy: Anonymized - Context length: 1000000 tokens - Max output tokens: 65536 - Supported input types: text, image, video - Cost: 52.9 credits / 1M input tokens (US$0.2750); 317.4 credits / 1M output tokens (US$1.65) - Cache Read/Write Cost: Write: 66.2 credits / 1M cache write tokens (US$0.3438) - x402: supported for non-streaming calls. Call without Authorization to receive a 402 payment requirement, then retry with PAYMENT-SIGNATURE. Text x402 calls are prepaid from the reserve estimate because final token usage is only known after inference. - Input price: 52.9 credits per 1M input tokens. - Output price: 317.4 credits per 1M output tokens. - Billing: API-key calls reserve/refund from actual usage with a 0.1-credit rounded minimum. x402 wallet balance can refund unused reserve; direct x402 pay-as-you-go charges the quoted reserve. - Features: Anonymized, Vision, Video input, Tool calling, Structured output, Long context - Description: Qwen3.6 Flash is a fast, efficient language model from Alibaba's Qwen 3.6 series. It supports text, image, and video input with a 1M token context window. Tiered pricing kicks in above 256K tokens. Prompt caching is supported, with both explicit cache read and cache creation pricing. ### Qwen3.6 Max Preview (`qwen3-6-max-preview`) - Model name to send: `qwen3-6-max-preview` - Publisher: Qwen - Release date: 2026-04-27 - Privacy: Anonymized - Context length: 262144 tokens - Max output tokens: 65536 - Supported input types: text - Cost: 220.0 credits / 1M input tokens (US$1.14); 1320.0 credits / 1M output tokens (US$6.86) - Cache Read/Write Cost: Write: 275.0 credits / 1M cache write tokens (US$1.43) - x402: supported for non-streaming calls. Call without Authorization to receive a 402 payment requirement, then retry with PAYMENT-SIGNATURE. 
Text x402 calls are prepaid from the reserve estimate because final token usage is only known after inference. - Input price: 220.0 credits per 1M input tokens. - Output price: 1320.0 credits per 1M output tokens. - Billing: API-key calls reserve/refund from actual usage with a 0.1-credit rounded minimum. x402 wallet balance can refund unused reserve; direct x402 pay-as-you-go charges the quoted reserve. - Features: Anonymized, Tool calling, Structured output, Long context - Description: Qwen3.6-Max-Preview is a proprietary frontier model from Alibaba Cloud built on a sparse mixture-of-experts architecture with approximately 1 trillion total parameters. It is optimized for agentic coding, tool use, and long-context reasoning, supporting a 262K token context window. The model includes an integrated thinking mode that preserves reasoning traces across multi-turn conversations and supports structured output and function calling. Access is available exclusively through the Alibaba Cloud Model Studio and Qwen Studio APIs; no open weights are provided. ### DeepSeek V4 Flash (`deepseek-v4-flash`) - Model name to send: `deepseek-v4-flash` - Publisher: DeepSeek - Release date: 2026-04-24 - Privacy: Anonymized - Context length: 1048576 tokens - Max output tokens: 384000 - Supported input types: text - Cost: 29.7 credits / 1M input tokens (US$0.1540); 59.3 credits / 1M output tokens (US$0.3080) - Cache Read/Write Cost: Read: 0.6 credits / 1M cached input tokens (US$0.0031) - x402: supported for non-streaming calls. Call without Authorization to receive a 402 payment requirement, then retry with PAYMENT-SIGNATURE. Text x402 calls are prepaid from the reserve estimate because final token usage is only known after inference. - Input price: 29.7 credits per 1M input tokens. - Output price: 59.3 credits per 1M output tokens. - Billing: API-key calls reserve/refund from actual usage with a 0.1-credit rounded minimum. 
x402 wallet balance can refund unused reserve; direct x402 pay-as-you-go charges the quoted reserve. - Features: Anonymized, Tool calling, Structured output, Long context - Description: DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B activated parameters, supporting a 1M-token context window. It is designed for fast inference and high-throughput workloads, while maintaining strong reasoning and coding performance. The model includes hybrid attention for efficient long-context processing. Reasoning efforts high and xhigh are supported; xhigh maps to max reasoning. It is well suited for applications such as coding assistants, chat systems, and agent workflows where responsiveness and cost efficiency are important. ### DeepSeek V4 Pro (`deepseek-v4-pro`) - Model name to send: `deepseek-v4-pro` - Publisher: DeepSeek - Release date: 2026-04-24 - Privacy: Anonymized - Context length: 1048576 tokens - Max output tokens: 384000 - Supported input types: text - Cost: 92.1 credits / 1M input tokens (US$0.4785); 184.1 credits / 1M output tokens (US$0.9570) - Cache Read/Write Cost: Read: 0.8 credits / 1M cached input tokens (US$0.0040) - x402: supported for non-streaming calls. Call without Authorization to receive a 402 payment requirement, then retry with PAYMENT-SIGNATURE. Text x402 calls are prepaid from the reserve estimate because final token usage is only known after inference. - Input price: 92.1 credits per 1M input tokens. - Output price: 184.1 credits per 1M output tokens. - Billing: API-key calls reserve/refund from actual usage with a 0.1-credit rounded minimum. x402 wallet balance can refund unused reserve; direct x402 pay-as-you-go charges the quoted reserve. 
- Features: Anonymized, Tool calling, Structured output, Long context - Description: DeepSeek V4 Pro is a large-scale Mixture-of-Experts model from DeepSeek with 1.6T total parameters and 49B activated parameters, supporting a 1M-token context window. It is designed for advanced reasoning, coding, and long-horizon agent workflows, with strong performance across knowledge, math, and software engineering benchmarks. Built on the same architecture as DeepSeek V4 Flash, it introduces a hybrid attention system for efficient long-context processing. Reasoning efforts high and xhigh are supported; xhigh maps to max reasoning. It is well suited for complex workloads such as full-codebase analysis, multi-step automation, and large-scale information synthesis, where both capability and efficiency are critical. ### GPT-5.5 (`gpt-5-5`) - Model name to send: `gpt-5-5` - Publisher: OpenAI - Release date: 2026-04-24 - Privacy: Anonymized - Context length: 1050000 tokens - Max output tokens: 128000 - Supported input types: file, image, text - Cost: 1057.7 credits / 1M input tokens (US$5.50); 6346.2 credits / 1M output tokens (US$33.00) - Cache Read/Write Cost: Read: 105.8 credits / 1M cached input tokens (US$0.5500) - x402: supported for non-streaming calls. Call without Authorization to receive a 402 payment requirement, then retry with PAYMENT-SIGNATURE. Text x402 calls are prepaid from the reserve estimate because final token usage is only known after inference. - Input price: 1057.7 credits per 1M input tokens. - Output price: 6346.2 credits per 1M output tokens. - Billing: API-key calls reserve/refund from actual usage with a 0.1-credit rounded minimum. x402 wallet balance can refund unused reserve; direct x402 pay-as-you-go charges the quoted reserve. 
- Features: Anonymized, Vision, File input, Tool calling, Structured output, Long context - Description: GPT-5.5 is OpenAI’s frontier model designed for complex professional workloads, building on GPT-5.4 with stronger reasoning, higher reliability, and improved token efficiency on hard tasks. It features a 1M+ token context window (922K input, 128K output) with support for text and image inputs, enabling large-scale reasoning, coding, and multimodal workflows within a single system. ### Kimi K2.6 Private (`kimi-k2-6-private`) - Model name to send: `kimi-k2-6-private` - Publisher: MoonshotAI - Release date: 2026-04-21 - Privacy: E2EE Private - Context length: 262144 tokens - Max output tokens: 262144 - Supported input types: text, image - Cost: 230.6 credits / 1M input tokens (US$1.20); 973.1 credits / 1M output tokens (US$5.06) - x402: supported for non-streaming calls. Call without Authorization to receive a 402 payment requirement, then retry with PAYMENT-SIGNATURE. Text x402 calls are prepaid from the reserve estimate because final token usage is only known after inference. - Input price: 230.6 credits per 1M input tokens. - Output price: 973.1 credits per 1M output tokens. - Billing: API-key calls reserve/refund from actual usage with a 0.1-credit rounded minimum. x402 wallet balance can refund unused reserve; direct x402 pay-as-you-go charges the quoted reserve. - Features: E2EE Private, Vision, Tool calling, Structured output, Long context - Description: Kimi K2.6 is Moonshot AI's next-generation multimodal model, designed for long-horizon coding, coding-driven UI/UX generation, and multi-agent orchestration. It handles complex end-to-end coding tasks across Python, Rust, and Go, and can convert prompts and visual inputs into production-ready interfaces. Its agent swarm architecture scales to hundreds of parallel sub-agents for autonomous task decomposition - delivering documents, websites, and spreadsheets in a single run without human oversight. 
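Image-capable models such as Kimi K2.6 Private take multimodal messages on the OpenAI-compatible `/v1/chat/completions` endpoint. The sketch below is a minimal example, assuming the standard OpenAI multimodal content-part shape (`image_url` parts) and the `Authorization: Bearer <key>:<secret>` header form described in the account-setup section; verify the current schema against `GET /v1/models`.

```python
import json
import urllib.request

BASE_URL = "https://kat.imgnai.com"  # always HTTPS, never http://


def build_vision_payload(prompt: str, image_url: str) -> dict:
    # OpenAI-style multimodal message; the content-part shape is an
    # assumption based on the endpoint's OpenAI compatibility.
    return {
        "model": "kimi-k2-6-private",
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }],
    }


def send_chat(payload: dict, api_key: str, api_secret: str) -> dict:
    # Credentials travel in headers, never in the JSON body.
    req = urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}:{api_secret}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

`build_vision_payload` is pure, so an integration can unit-test its message shape before spending credits on a live call.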
### Qwen3 Coder Next Private (`qwen3-coder-next-private`) - Model name to send: `qwen3-coder-next-private` - Publisher: Qwen - Release date: 2026-04-21 - Privacy: E2EE Private - Context length: 262144 tokens - Max output tokens: 262144 - Supported input types: text - Cost: 38.1 credits / 1M input tokens (US$0.1980); 253.9 credits / 1M output tokens (US$1.32) - x402: supported for non-streaming calls. Call without Authorization to receive a 402 payment requirement, then retry with PAYMENT-SIGNATURE. Text x402 calls are prepaid from the reserve estimate because final token usage is only known after inference. - Input price: 38.1 credits per 1M input tokens. - Output price: 253.9 credits per 1M output tokens. - Billing: API-key calls reserve/refund from actual usage with a 0.1-credit rounded minimum. x402 wallet balance can refund unused reserve; direct x402 pay-as-you-go charges the quoted reserve. - Features: E2EE Private, Tool calling, Structured output, Long context - Description: Qwen3-Coder-Next is an open-weight causal language model optimized for coding agents and local development workflows. It uses a sparse MoE design with 80B total parameters and only 3B activated per token, delivering performance comparable to models with 10 to 20x higher active compute, which makes it well suited for cost-sensitive, always-on agent deployment. The model is trained with a strong agentic focus and performs reliably on long-horizon coding tasks, complex tool usage, and recovery from execution failures. With a native 256k context window, it integrates cleanly into real-world CLI and IDE environments and adapts well to common agent scaffolds used by modern coding tools. The model operates exclusively in non-thinking mode and does not emit thinking blocks, simplifying integration for production coding agents. 
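Each card above repeats the same x402 pay-per-request sequence: call without credentials, receive HTTP 402 with a `PAYMENT-REQUIRED` value, settle the exact USDC requirement, then retry with `PAYMENT-SIGNATURE`. The control flow can be sketched generically as below; the `fetch` and `pay` callables are caller-supplied assumptions, because the exact encoding of the payment headers is defined by the x402 specification, not by this file.

```python
def x402_call(fetch, pay, url, payload):
    """Generic x402 pay-per-request flow, as described on each model card.

    `fetch(url, payload, headers)` -> (status, headers, body) performs the
    HTTPS request; `pay(requirement)` -> signature settles the quoted USDC
    amount and returns the value for the PAYMENT-SIGNATURE header. Both are
    illustrative stand-ins for a real HTTP client and x402 wallet library.
    """
    status, headers, body = fetch(url, payload, {})
    if status != 402:
        return body  # credentialed call (or cached balance) succeeded outright
    requirement = headers["PAYMENT-REQUIRED"]  # decode per the x402 spec
    signature = pay(requirement)               # pay the exact quoted amount
    status, headers, body = fetch(url, payload, {"PAYMENT-SIGNATURE": signature})
    if status != 200:
        raise RuntimeError(f"x402 retry failed with HTTP {status}")
    return body
```

Because text x402 calls are prepaid from the reserve estimate, the retried request is charged the quoted reserve; only an x402 wallet balance can refund the unused portion.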
### GLM 5.1 Private (`glm-5-1-private`) - Model name to send: `glm-5-1-private` - Publisher: Z.ai - Release date: 2026-04-20 - Privacy: E2EE Private - Context length: 202752 tokens - Max output tokens: 202752 - Supported input types: text - Cost: 256.0 credits / 1M input tokens (US$1.33); 888.5 credits / 1M output tokens (US$4.62) - x402: supported for non-streaming calls. Call without Authorization to receive a 402 payment requirement, then retry with PAYMENT-SIGNATURE. Text x402 calls are prepaid from the reserve estimate because final token usage is only known after inference. - Input price: 256.0 credits per 1M input tokens. - Output price: 888.5 credits per 1M output tokens. - Billing: API-key calls reserve/refund from actual usage with a 0.1-credit rounded minimum. x402 wallet balance can refund unused reserve; direct x402 pay-as-you-go charges the quoted reserve. - Features: E2EE Private, Tool calling, Structured output, Long context - Description: GLM-5.1 delivers a major leap in coding capability, with particularly significant gains in handling long-horizon tasks. Unlike previous models built around minute-level interactions, GLM-5.1 can work independently and continuously on a single task for more than 8 hours, autonomously planning, executing, and improving itself throughout the process, ultimately delivering complete, engineering-grade results. ### Kimi K2.6 (`kimi-k2-6`) - Model name to send: `kimi-k2-6` - Publisher: MoonshotAI - Release date: 2026-04-20 - Privacy: Anonymized - Context length: 262144 tokens - Max output tokens: 16384 - Supported input types: text, image - Cost: 158.7 credits / 1M input tokens (US$0.8250); 740.4 credits / 1M output tokens (US$3.85) - Cache Read/Write Cost: Read: 52.9 credits / 1M cached input tokens (US$0.2750) - x402: supported for non-streaming calls. Call without Authorization to receive a 402 payment requirement, then retry with PAYMENT-SIGNATURE. 
Text x402 calls are prepaid from the reserve estimate because final token usage is only known after inference. - Input price: 158.7 credits per 1M input tokens. - Output price: 740.4 credits per 1M output tokens. - Billing: API-key calls reserve/refund from actual usage with a 0.1-credit rounded minimum. x402 wallet balance can refund unused reserve; direct x402 pay-as-you-go charges the quoted reserve. - Features: Anonymized, Vision, Tool calling, Structured output, Long context - Description: Kimi K2.6 is Moonshot AI's next-generation multimodal model, designed for long-horizon coding, coding-driven UI/UX generation, and multi-agent orchestration. It handles complex end-to-end coding tasks across Python, Rust, and Go, and can convert prompts and visual inputs into production-ready interfaces. Its agent swarm architecture scales to hundreds of parallel sub-agents for autonomous task decomposition - delivering documents, websites, and spreadsheets in a single run without human oversight. ### MiMo-V2-Flash Private (`mimo-v2-flash-private`) - Model name to send: `mimo-v2-flash-private` - Publisher: Xiaomi - Release date: 2026-04-20 - Privacy: E2EE Private - Context length: 262144 tokens - Max output tokens: 262144 - Supported input types: text - Cost: 21.2 credits / 1M input tokens (US$0.1100); 63.5 credits / 1M output tokens (US$0.3300) - x402: supported for non-streaming calls. Call without Authorization to receive a 402 payment requirement, then retry with PAYMENT-SIGNATURE. Text x402 calls are prepaid from the reserve estimate because final token usage is only known after inference. - Input price: 21.2 credits per 1M input tokens. - Output price: 63.5 credits per 1M output tokens. - Billing: API-key calls reserve/refund from actual usage with a 0.1-credit rounded minimum. x402 wallet balance can refund unused reserve; direct x402 pay-as-you-go charges the quoted reserve. 
- Features: E2EE Private, Tool calling, Structured output, Long context - Description: MiMo-V2-Flash is an open-source foundation language model developed by Xiaomi. It is a Mixture-of-Experts model with 309B total parameters and 15B active parameters, adopting hybrid attention architecture. MiMo-V2-Flash supports a hybrid-thinking toggle and a 256K context window, and excels at reasoning, coding, and agent scenarios. On SWE-bench Verified and SWE-bench Multilingual, MiMo-V2-Flash ranks as the top #1 open-source model globally, delivering performance comparable to Claude Sonnet 4.5 while costing only about 3.5% as much. Users can control the reasoning behaviour with the reasoning enabled boolean. ### Claude Opus 4.7 (`claude-opus-4-7`) - Model name to send: `claude-opus-4-7` - Publisher: Anthropic - Release date: 2026-04-16 - Privacy: Anonymized - Context length: 1000000 tokens - Max output tokens: 128000 - Supported input types: text, image - Cost: 1057.7 credits / 1M input tokens (US$5.50); 5288.5 credits / 1M output tokens (US$27.50) - Cache Read/Write Cost: Read: 105.8 credits / 1M cached input tokens (US$0.5500); Write: 1322.2 credits / 1M cache write tokens (US$6.88) - x402: supported for non-streaming calls. Call without Authorization to receive a 402 payment requirement, then retry with PAYMENT-SIGNATURE. Text x402 calls are prepaid from the reserve estimate because final token usage is only known after inference. - Input price: 1057.7 credits per 1M input tokens. - Output price: 5288.5 credits per 1M output tokens. - Billing: API-key calls reserve/refund from actual usage with a 0.1-credit rounded minimum. x402 wallet balance can refund unused reserve; direct x402 pay-as-you-go charges the quoted reserve. - Features: Anonymized, Vision, Tool calling, Structured output, Long context - Description: Opus 4.7 is the next generation of Anthropic's Opus family, built for long-running, asynchronous agents. 
Building on the coding and agentic strengths of Opus 4.6, it delivers stronger performance on complex, multi-step tasks and more reliable agentic execution across extended workflows. It is especially effective for asynchronous agent pipelines where tasks unfold over time - large codebases, multi-stage debugging, and end-to-end project orchestration. Beyond coding, Opus 4.7 brings improved knowledge work capabilities - from drafting documents and building presentations to analyzing data. It maintains coherence across very long outputs and extended sessions, making it a strong default for tasks that require persistence, judgment, and follow-through. ### GLM 5.1 (`glm-5-1`) - Model name to send: `glm-5-1` - Publisher: Z.ai - Release date: 2026-04-07 - Privacy: Anonymized - Context length: 202752 tokens - Max output tokens: 65535 - Supported input types: text - Cost: 222.2 credits / 1M input tokens (US$1.16); 740.4 credits / 1M output tokens (US$3.85) - Cache Read/Write Cost: Read: 111.1 credits / 1M cached input tokens (US$0.5775) - x402: supported for non-streaming calls. Call without Authorization to receive a 402 payment requirement, then retry with PAYMENT-SIGNATURE. Text x402 calls are prepaid from the reserve estimate because final token usage is only known after inference. - Input price: 222.2 credits per 1M input tokens. - Output price: 740.4 credits per 1M output tokens. - Billing: API-key calls reserve/refund from actual usage with a 0.1-credit rounded minimum. x402 wallet balance can refund unused reserve; direct x402 pay-as-you-go charges the quoted reserve. - Features: Anonymized, Tool calling, Structured output, Long context - Description: GLM-5.1 delivers a major leap in coding capability, with particularly significant gains in handling long-horizon tasks. 
Unlike previous models built around minute-level interactions, GLM-5.1 can work independently and continuously on a single task for more than 8 hours, autonomously planning, executing, and improving itself throughout the process, ultimately delivering complete, engineering-grade results. ### Gemma 4 26B A4B (`gemma-4-26b-a4b`) - Model name to send: `gemma-4-26b-a4b` - Publisher: Google - Release date: 2026-04-03 - Privacy: Anonymized - Context length: 262144 tokens - Max output tokens: not listed - Supported input types: image, text, video - Cost: 12.7 credits / 1M input tokens (US$0.0660); 69.9 credits / 1M output tokens (US$0.3630) - x402: supported for non-streaming calls. Call without Authorization to receive a 402 payment requirement, then retry with PAYMENT-SIGNATURE. Text x402 calls are prepaid from the reserve estimate because final token usage is only known after inference. - Input price: 12.7 credits per 1M input tokens. - Output price: 69.9 credits per 1M output tokens. - Billing: API-key calls reserve/refund from actual usage with a 0.1-credit rounded minimum. x402 wallet balance can refund unused reserve; direct x402 pay-as-you-go charges the quoted reserve. - Features: Anonymized, Vision, Video input, Tool calling, Structured output, Long context - Description: Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind. Despite 25.2B total parameters, only 3.8B activate per token during inference — delivering near-31B quality at a fraction of the compute cost. Supports multimodal input including text, images, and video (up to 60s at 1fps). Features a 256K token context window, native function calling, configurable thinking/reasoning mode, and structured output support. Released under Apache 2.0. 
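The per-card credit rates combine with the reference price of $0.0052 per credit (from the account-setup section) into a simple pre-call cost estimate. A minimal sketch, using the Gemma 4 26B A4B rates above as an example; actual billing still reserves and refunds from real usage with a 0.1-credit rounded minimum, so this is a rough planning figure only.

```python
CREDIT_USD = 0.0052  # reference Platinum Annual price from the setup section


def estimate_cost(tokens_in: int, tokens_out: int,
                  in_rate: float, out_rate: float) -> tuple[float, float]:
    # Rates are credits per 1M tokens, exactly as listed on each model card.
    credits = (tokens_in / 1_000_000) * in_rate + (tokens_out / 1_000_000) * out_rate
    return credits, credits * CREDIT_USD


# Gemma 4 26B A4B: 12.7 credits/1M input, 69.9 credits/1M output.
# A 200K-token prompt with a 4K-token reply:
credits, usd = estimate_cost(200_000, 4_000, 12.7, 69.9)
# 0.2 * 12.7 + 0.004 * 69.9 = 2.8196 credits, about US$0.0147
```

The same function works for any card in this catalog; only the two rates change.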
### Gemma 4 31B (`gemma-4-31b`) - Model name to send: `gemma-4-31b` - Publisher: Google - Release date: 2026-04-02 - Privacy: Anonymized - Context length: 262144 tokens - Max output tokens: 16384 - Supported input types: image, text, video - Cost: 27.5 credits / 1M input tokens (US$0.1430); 80.4 credits / 1M output tokens (US$0.4180) - x402: supported for non-streaming calls. Call without Authorization to receive a 402 payment requirement, then retry with PAYMENT-SIGNATURE. Text x402 calls are prepaid from the reserve estimate because final token usage is only known after inference. - Input price: 27.5 credits per 1M input tokens. - Output price: 80.4 credits per 1M output tokens. - Billing: API-key calls reserve/refund from actual usage with a 0.1-credit rounded minimum. x402 wallet balance can refund unused reserve; direct x402 pay-as-you-go charges the quoted reserve. - Features: Anonymized, Vision, Video input, Tool calling, Structured output, Long context - Description: Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supporting text and image input with text output. Features a 256K token context window, configurable thinking/reasoning mode, native function calling, and multilingual support across 140+ languages. Strong on coding, reasoning, and document understanding tasks. Apache 2.0 license. ### Qwen3.6 Plus (`qwen3-6-plus`) - Model name to send: `qwen3-6-plus` - Publisher: Qwen - Release date: 2026-04-02 - Privacy: Anonymized - Context length: 1000000 tokens - Max output tokens: 65536 - Supported input types: text, image, video - Cost: 68.8 credits / 1M input tokens (US$0.3575); 412.5 credits / 1M output tokens (US$2.15) - Cache Read/Write Cost: Write: 86.0 credits / 1M cache write tokens (US$0.4469) - x402: supported for non-streaming calls. Call without Authorization to receive a 402 payment requirement, then retry with PAYMENT-SIGNATURE. 
Text x402 calls are prepaid from the reserve estimate because final token usage is only known after inference. - Input price: 68.8 credits per 1M input tokens. - Output price: 412.5 credits per 1M output tokens. - Billing: API-key calls reserve/refund from actual usage with a 0.1-credit rounded minimum. x402 wallet balance can refund unused reserve; direct x402 pay-as-you-go charges the quoted reserve. - Features: Anonymized, Vision, Video input, Tool calling, Structured output, Long context - Description: Qwen 3.6 Plus builds on a hybrid architecture that combines efficient linear attention with sparse mixture-of-experts routing, enabling strong scalability and high-performance inference. Compared to the 3.5 series, it delivers major gains in agentic coding, front-end development, and overall reasoning, with a significantly improved “vibe coding” experience. The model excels at complex tasks such as 3D scenes, games, and repository-level problem solving, achieving a 78.8 score on SWE-bench Verified. It represents a substantial leap in both pure-text and multimodal capabilities, performing at the level of leading state-of-the-art models. ### Grok 4.20 (`grok-4-20`) - Model name to send: `grok-4-20` - Publisher: xAI - Release date: 2026-03-31 - Privacy: Anonymized - Context length: 2000000 tokens - Max output tokens: not listed - Supported input types: text, image, file - Cost: 264.5 credits / 1M input tokens (US$1.38); 528.9 credits / 1M output tokens (US$2.75) - Cache Read/Write Cost: Read: 42.4 credits / 1M cached input tokens (US$0.2200) - x402: supported for non-streaming calls. Call without Authorization to receive a 402 payment requirement, then retry with PAYMENT-SIGNATURE. Text x402 calls are prepaid from the reserve estimate because final token usage is only known after inference. - Input price: 264.5 credits per 1M input tokens. - Output price: 528.9 credits per 1M output tokens. 
- Billing: API-key calls reserve/refund from actual usage with a 0.1-credit rounded minimum. x402 wallet balance can refund unused reserve; direct x402 pay-as-you-go charges the quoted reserve. - Features: Anonymized, Vision, File input, Tool calling, Structured output, Long context - Description: Grok 4.20 is a reasoning model from xAI with industry-leading speed and agentic tool calling capabilities. It combines the lowest hallucination rate on the market with strict prompt adherence, delivering consistently precise and truthful responses. Reasoning can be enabled or disabled using the reasoning enabled parameter in the API. ### Grok 4.20 Multi-Agent (`grok-4-20-multi-agent`) - Model name to send: `grok-4-20-multi-agent` - Publisher: xAI - Release date: 2026-03-31 - Privacy: Anonymized - Context length: 2000000 tokens - Max output tokens: not listed - Supported input types: text, image, file - Cost: 423.1 credits / 1M input tokens (US$2.20); 1269.3 credits / 1M output tokens (US$6.60) - Cache Read/Write Cost: Read: 42.4 credits / 1M cached input tokens (US$0.2200) - x402: supported for non-streaming calls. Call without Authorization to receive a 402 payment requirement, then retry with PAYMENT-SIGNATURE. Text x402 calls are prepaid from the reserve estimate because final token usage is only known after inference. - Input price: 423.1 credits per 1M input tokens. - Output price: 1269.3 credits per 1M output tokens. - Billing: API-key calls reserve/refund from actual usage with a 0.1-credit rounded minimum. x402 wallet balance can refund unused reserve; direct x402 pay-as-you-go charges the quoted reserve. - Features: Anonymized, Vision, File input, Structured output, Long context - Description: Grok 4.20 Multi-Agent is a variant of xAI’s Grok 4.20 designed for collaborative, agent-based workflows. Multiple agents operate in parallel to conduct deep research, coordinate tool use, and synthesize information across complex tasks. 
Reasoning effort behavior: low/medium runs 4 agents; high/xhigh runs 16 agents. ### MiniMax M2.7 (`minimax-m2-7`) - Model name to send: `minimax-m2-7` - Publisher: MiniMax - Release date: 2026-03-18 - Privacy: Anonymized - Context length: 196608 tokens - Max output tokens: 131072 - Supported input types: text - Cost: 63.3 credits / 1M input tokens (US$0.3289); 253.9 credits / 1M output tokens (US$1.32) - x402: supported for non-streaming calls. Call without Authorization to receive a 402 payment requirement, then retry with PAYMENT-SIGNATURE. Text x402 calls are prepaid from the reserve estimate because final token usage is only known after inference. - Input price: 63.3 credits per 1M input tokens. - Output price: 253.9 credits per 1M output tokens. - Billing: API-key calls reserve/refund from actual usage with a 0.1-credit rounded minimum. x402 wallet balance can refund unused reserve; direct x402 pay-as-you-go charges the quoted reserve. - Features: Anonymized, Tool calling, Structured output - Description: MiniMax-M2.7 is a next-generation large language model designed for autonomous, real-world productivity and continuous improvement. Built to actively participate in its own evolution, M2.7 integrates advanced agentic capabilities through multi-agent collaboration, enabling it to plan, execute, and refine complex tasks across dynamic environments. Trained for production-grade performance, M2.7 handles workflows such as live debugging, root cause analysis, financial modeling, and full document generation across Word, Excel, and PowerPoint. It delivers strong results on benchmarks including 56.2% on SWE-Pro and 57.0% on Terminal Bench 2, while achieving a 1495 ELO on GDPval-AA, setting a new standard for multi-agent systems operating in real-world digital workflows. 
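For models flagged with the Tool calling feature, such as MiniMax M2.7, requests can attach function definitions in the OpenAI-compatible `tools` array. A minimal sketch, assuming the standard OpenAI function-tool schema; the `get_weather` tool here is purely illustrative and not part of the imgnAI API.

```python
def build_tool_call_payload(question: str) -> dict:
    # OpenAI-style `tools` array; the shape is assumed from the
    # endpoint's OpenAI compatibility, and the tool is a placeholder.
    return {
        "model": "minimax-m2-7",
        "messages": [{"role": "user", "content": question}],
        "tools": [{
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }],
    }
```

When the model elects to call the tool, the response's `tool_calls` entries should be executed client-side and their results appended as `tool`-role messages in the next request, per the usual OpenAI-compatible loop.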
### GPT-5.4 Mini (`gpt-5-4-mini`) - Model name to send: `gpt-5-4-mini` - Publisher: OpenAI - Release date: 2026-03-17 - Privacy: Anonymized - Context length: 400000 tokens - Max output tokens: 128000 - Supported input types: file, image, text - Cost: 158.7 credits / 1M input tokens (US$0.8250); 952.0 credits / 1M output tokens (US$4.95) - Cache Read/Write Cost: Read: 15.9 credits / 1M cached input tokens (US$0.0825) - x402: supported for non-streaming calls. Call without Authorization to receive a 402 payment requirement, then retry with PAYMENT-SIGNATURE. Text x402 calls are prepaid from the reserve estimate because final token usage is only known after inference. - Input price: 158.7 credits per 1M input tokens. - Output price: 952.0 credits per 1M output tokens. - Billing: API-key calls reserve/refund from actual usage with a 0.1-credit rounded minimum. x402 wallet balance can refund unused reserve; direct x402 pay-as-you-go charges the quoted reserve. - Features: Anonymized, Vision, File input, Tool calling, Structured output, Long context - Description: GPT-5.4 mini brings the core capabilities of GPT-5.4 to a faster, more efficient model optimized for high-throughput workloads. It supports text and image inputs with strong performance across reasoning, coding, and tool use, while reducing latency and cost for large-scale deployments. The model is designed for production environments that require a balance of capability and efficiency, making it well suited for chat applications, coding assistants, and agent workflows that operate at scale. GPT-5.4 mini delivers reliable instruction following, solid multi-step reasoning, and consistent performance across diverse tasks with improved cost efficiency. 
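Cards that list a Cache Read cost bill cached input tokens at the discounted rate and the remainder at the full input rate. A worked sketch using the GPT-5.4 Mini rates above (158.7 credits/1M fresh vs 15.9 credits/1M cached); the cached-token count in practice comes back in the response's usage details.

```python
def input_cost_with_cache(total_tokens: int, cached_tokens: int,
                          fresh_rate: float, cached_rate: float) -> float:
    # Cached input tokens bill at the cache-read rate; the rest at full price.
    fresh = (total_tokens - cached_tokens) / 1_000_000 * fresh_rate
    cached = cached_tokens / 1_000_000 * cached_rate
    return fresh + cached


# GPT-5.4 Mini: a 300K-token prompt where 250K tokens hit the cache.
credits = input_cost_with_cache(300_000, 250_000, 158.7, 15.9)
# 0.05 * 158.7 + 0.25 * 15.9 = 11.91 credits, versus 47.61 fully uncached
```

For agents that resend a large shared prefix every turn, this discount dominates input cost, which is why cache-friendly prompt ordering (stable prefix first) is worth preserving.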
### GLM 5 Turbo (`glm-5-turbo`) - Model name to send: `glm-5-turbo` - Publisher: Z.ai - Release date: 2026-03-15 - Privacy: Anonymized - Context length: 202752 tokens - Max output tokens: 131072 - Supported input types: text - Cost: 253.9 credits / 1M input tokens (US$1.32); 846.2 credits / 1M output tokens (US$4.40) - Cache Read/Write Cost: Read: 50.8 credits / 1M cached input tokens (US$0.2640) - x402: supported for non-streaming calls. Call without Authorization to receive a 402 payment requirement, then retry with PAYMENT-SIGNATURE. Text x402 calls are prepaid from the reserve estimate because final token usage is only known after inference. - Input price: 253.9 credits per 1M input tokens. - Output price: 846.2 credits per 1M output tokens. - Billing: API-key calls reserve/refund from actual usage with a 0.1-credit rounded minimum. x402 wallet balance can refund unused reserve; direct x402 pay-as-you-go charges the quoted reserve. - Features: Anonymized, Tool calling, Structured output, Long context - Description: GLM-5 Turbo is a new model from Z.ai designed for fast inference and strong performance in agent-driven environments such as OpenClaw scenarios. It is deeply optimized for real-world agent workflows involving long execution chains, with improved complex instruction decomposition, tool use, scheduled and persistent execution, and overall stability across extended tasks. ### Qwen3.5-27B Private (`qwen3-5-27b-private`) - Model name to send: `qwen3-5-27b-private` - Publisher: Qwen - Release date: 2026-03-13 - Privacy: E2EE Private - Context length: 262144 tokens - Max output tokens: 262144 - Supported input types: text, image, video - Cost: 63.5 credits / 1M input tokens (US$0.3300); 507.7 credits / 1M output tokens (US$2.64) - x402: supported for non-streaming calls. Call without Authorization to receive a 402 payment requirement, then retry with PAYMENT-SIGNATURE. 
Text x402 calls are prepaid from the reserve estimate because final token usage is only known after inference. - Input price: 63.5 credits per 1M input tokens. - Output price: 507.7 credits per 1M output tokens. - Billing: API-key calls reserve/refund from actual usage with a 0.1-credit rounded minimum. x402 wallet balance can refund unused reserve; direct x402 pay-as-you-go charges the quoted reserve. - Features: E2EE Private, Vision, Video input, Tool calling, Structured output, Long context - Description: The Qwen3.5 27B native vision-language Dense model incorporates a linear attention mechanism, delivering fast response times while balancing inference speed and performance. Its overall capabilities are comparable to those of the Qwen3.5-122B-A10B. ### GPT-5.4 (`gpt-5-4`) - Model name to send: `gpt-5-4` - Publisher: OpenAI - Release date: 2026-03-05 - Privacy: Anonymized - Context length: 1050000 tokens - Max output tokens: 128000 - Supported input types: text, image, file - Cost: 528.9 credits / 1M input tokens (US$2.75); 3173.1 credits / 1M output tokens (US$16.50) - Cache Read/Write Cost: Read: 52.9 credits / 1M cached input tokens (US$0.2750) - x402: supported for non-streaming calls. Call without Authorization to receive a 402 payment requirement, then retry with PAYMENT-SIGNATURE. Text x402 calls are prepaid from the reserve estimate because final token usage is only known after inference. - Input price: 528.9 credits per 1M input tokens. - Output price: 3173.1 credits per 1M output tokens. - Billing: API-key calls reserve/refund from actual usage with a 0.1-credit rounded minimum. x402 wallet balance can refund unused reserve; direct x402 pay-as-you-go charges the quoted reserve. - Features: Anonymized, Vision, File input, Tool calling, Structured output, Long context - Description: GPT-5.4 is OpenAI’s latest frontier model, unifying the Codex and GPT lines into a single system. 
It features a 1M+ token context window (922K input, 128K output) with support for text and image inputs, enabling high-context reasoning, coding, and multimodal analysis within the same workflow. The model delivers improved performance in coding, document understanding, tool use, and instruction following. It is designed as a strong default for both general-purpose tasks and software engineering, capable of generating production-quality code, synthesizing information across multiple sources, and executing complex multi-step workflows with fewer iterations and greater token efficiency. ### Gemini 3.1 Flash Lite Preview (`gemini-3-1-flash-lite-preview`) - Model name to send: `gemini-3-1-flash-lite-preview` - Publisher: Google - Release date: 2026-03-03 - Privacy: Anonymized - Context length: 1048576 tokens - Max output tokens: 65536 - Supported input types: text, image, video, file, audio - Cost: 52.9 credits / 1M input tokens (US$0.2750); 317.4 credits / 1M output tokens (US$1.65) - Cache Read/Write Cost: Read: 5.3 credits / 1M cached input tokens (US$0.0275); Write: 17.7 credits / 1M cache write tokens (US$0.0917) - x402: supported for non-streaming calls. Call without Authorization to receive a 402 payment requirement, then retry with PAYMENT-SIGNATURE. Text x402 calls are prepaid from the reserve estimate because final token usage is only known after inference. - Input price: 52.9 credits per 1M input tokens. - Output price: 317.4 credits per 1M output tokens. - Billing: API-key calls reserve/refund from actual usage with a 0.1-credit rounded minimum. x402 wallet balance can refund unused reserve; direct x402 pay-as-you-go charges the quoted reserve. - Features: Anonymized, Vision, File input, Audio input, Video input, Tool calling, Structured output, Long context - Description: Gemini 3.1 Flash Lite Preview is Google's high-efficiency model optimized for high-volume use cases. 
It outperforms Gemini 2.5 Flash Lite on overall quality and approaches Gemini 2.5 Flash performance across key capabilities. Improvements span audio input/ASR, RAG snippet ranking, translation, data extraction, and code completion. Supports full thinking levels (minimal, low, medium, high) for fine-grained cost/performance trade-offs. Priced at half the cost of Gemini 3 Flash. ### Qwen3.5 397B A17B Private (`qwen3-5-397b-a17b-private`) - Model name to send: `qwen3-5-397b-a17b-private` - Publisher: Qwen - Release date: 2026-02-28 - Privacy: E2EE Private - Context length: 262144 tokens - Max output tokens: 262144 - Supported input types: text, image, video - Cost: 116.4 credits / 1M input tokens (US$0.6050); 740.4 credits / 1M output tokens (US$3.85) - x402: supported for non-streaming calls. Call without Authorization to receive a 402 payment requirement, then retry with PAYMENT-SIGNATURE. Text x402 calls are prepaid from the reserve estimate because final token usage is only known after inference. - Input price: 116.4 credits per 1M input tokens. - Output price: 740.4 credits per 1M output tokens. - Billing: API-key calls reserve/refund from actual usage with a 0.1-credit rounded minimum. x402 wallet balance can refund unused reserve; direct x402 pay-as-you-go charges the quoted reserve. - Features: E2EE Private, Vision, Video input, Tool calling, Structured output, Long context - Description: The Qwen3.5 series 397B-A17B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. It delivers state-of-the-art performance comparable to leading-edge models across a wide range of tasks, including language understanding, logical reasoning, code generation, agent-based tasks, image understanding, video understanding, and graphical user interface (GUI) interactions. 
With its robust code-generation and agent capabilities, the model exhibits strong generalization across diverse agent tasks. ### MiniMax M2.5 Private (`minimax-m2-5-private`) - Model name to send: `minimax-m2-5-private` - Publisher: MiniMax - Release date: 2026-02-21 - Privacy: E2EE Private - Context length: 196608 tokens - Max output tokens: 196608 - Supported input types: text - Cost: 42.4 credits / 1M input tokens (US$0.2200); 292.0 credits / 1M output tokens (US$1.52) - x402: supported for non-streaming calls. Call without Authorization to receive a 402 payment requirement, then retry with PAYMENT-SIGNATURE. Text x402 calls are prepaid from the reserve estimate because final token usage is only known after inference. - Input price: 42.4 credits per 1M input tokens. - Output price: 292.0 credits per 1M output tokens. - Billing: API-key calls reserve/refund from actual usage with a 0.1-credit rounded minimum. x402 wallet balance can refund unused reserve; direct x402 pay-as-you-go charges the quoted reserve. - Features: E2EE Private, Tool calling, Structured output - Description: MiniMax-M2.5 is a SOTA large language model designed for real-world productivity. Trained in a diverse range of complex real-world digital working environments, M2.5 builds upon the coding expertise of M2.1 to extend into general office work, reaching fluency in generating and operating Word, Excel, and PowerPoint files, context switching between diverse software environments, and working across different agent and human teams. Scoring 80.2% on SWE-Bench Verified, 51.3% on Multi-SWE-Bench, and 76.3% on BrowseComp, M2.5 is also more token efficient than previous generations, having been trained to optimize its actions and output through planning. 
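Models flagged with Structured output, such as MiniMax M2.5 Private, can constrain responses to a JSON schema. A minimal sketch, assuming the OpenAI `response_format: json_schema` convention; the invoice schema here is illustrative only, not part of the imgnAI documentation.

```python
def build_structured_payload(text: str) -> dict:
    # `response_format` is assumed to follow the OpenAI json_schema
    # convention, matching the endpoint's stated OpenAI compatibility.
    return {
        "model": "minimax-m2-5-private",
        "messages": [{"role": "user", "content": f"Extract the invoice fields: {text}"}],
        "response_format": {
            "type": "json_schema",
            "json_schema": {
                "name": "invoice",
                "schema": {
                    "type": "object",
                    "properties": {
                        "total": {"type": "number"},
                        "currency": {"type": "string"},
                    },
                    "required": ["total", "currency"],
                },
            },
        },
    }
```

With a schema attached, the model's `message.content` should parse as JSON matching the declared shape, which removes the need for fragile regex extraction on the client side.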
### Gemini 3.1 Pro Preview (`gemini-3-1-pro-preview`) - Model name to send: `gemini-3-1-pro-preview` - Publisher: Google - Release date: 2026-02-19 - Privacy: Anonymized - Context length: 1048576 tokens - Max output tokens: 65536 - Supported input types: audio, file, image, text, video - Cost: 423.1 credits / 1M input tokens (US$2.20); 2538.5 credits / 1M output tokens (US$13.20) - Cache Read/Write Cost: Read: 42.4 credits / 1M cached input tokens (US$0.2200); Write: 79.4 credits / 1M cache write tokens (US$0.4125) - x402: supported for non-streaming calls. Call without Authorization to receive a 402 payment requirement, then retry with PAYMENT-SIGNATURE. Text x402 calls are prepaid from the reserve estimate because final token usage is only known after inference. - Input price: 423.1 credits per 1M input tokens. - Output price: 2538.5 credits per 1M output tokens. - Billing: API-key calls reserve/refund from actual usage with a 0.1-credit rounded minimum. x402 wallet balance can refund unused reserve; direct x402 pay-as-you-go charges the quoted reserve. - Features: Anonymized, Vision, File input, Audio input, Video input, Tool calling, Structured output, Long context - Description: Gemini 3.1 Pro Preview is Google's frontier reasoning model, delivering enhanced software engineering performance, improved agentic reliability, and more efficient token usage across complex workflows. Building on the multimodal foundation of the Gemini 3 series, it combines high-precision reasoning across text, image, video, audio, and code with a 1M-token context window. Reasoning details must be preserved when using multi-turn tool calling. The 3.1 update introduces measurable gains in SWE benchmarks and real-world coding environments, along with stronger autonomous task execution in structured domains such as finance and spreadsheet-based workflows. 
Designed for advanced development and agentic systems, Gemini 3.1 Pro Preview improves long-horizon stability and tool orchestration while increasing token efficiency. It introduces a new medium thinking level to better balance cost, speed, and performance. The model excels in agentic coding, structured planning, multimodal analysis, and workflow automation, making it well-suited for autonomous agents, financial modeling, spreadsheet automation, and high-context enterprise tasks. ### Claude Sonnet 4.6 (`claude-sonnet-4-6`) - Model name to send: `claude-sonnet-4-6` - Publisher: Anthropic - Release date: 2026-02-17 - Privacy: Anonymized - Context length: 1000000 tokens - Max output tokens: 128000 - Supported input types: text, image - Cost: 634.7 credits / 1M input tokens (US$3.30); 3173.1 credits / 1M output tokens (US$16.50) - Cache Read/Write Cost: Read: 63.5 credits / 1M cached input tokens (US$0.3300); Write: 793.3 credits / 1M cache write tokens (US$4.13) - x402: supported for non-streaming calls. Call without Authorization to receive a 402 payment requirement, then retry with PAYMENT-SIGNATURE. Text x402 calls are prepaid from the reserve estimate because final token usage is only known after inference. - Input price: 634.7 credits per 1M input tokens. - Output price: 3173.1 credits per 1M output tokens. - Billing: API-key calls reserve/refund from actual usage with a 0.1-credit rounded minimum. x402 wallet balance can refund unused reserve; direct x402 pay-as-you-go charges the quoted reserve. - Features: Anonymized, Vision, Tool calling, Structured output, Long context - Description: Sonnet 4.6 is Anthropic's most capable Sonnet-class model yet, with frontier performance across coding, agents, and professional work. It excels at iterative development, complex codebase navigation, end-to-end project management with memory, polished document creation, and confident computer use for web QA and workflow automation. 
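The per-1M-token rates above convert to a per-call credit estimate by simple proportion. A worked example for `claude-sonnet-4-6` using its listed rates and the $0.0052/credit reference price; the exact rounding behind the "0.1-credit rounded minimum" is an assumption here (round to one decimal, floor at 0.1 credits):

```python
# Listed rates for claude-sonnet-4-6 (credits per 1M tokens).
IN_RATE = 634.7
OUT_RATE = 3173.1
USD_PER_CREDIT = 0.0052  # reference price from the setup section

def estimate_credits(input_tokens: int, output_tokens: int) -> float:
    raw = input_tokens / 1e6 * IN_RATE + output_tokens / 1e6 * OUT_RATE
    # Assumption: "0.1-credit rounded minimum" means charges round to one
    # decimal place and never drop below 0.1 credits.
    return max(round(raw, 1), 0.1)

# 12,000 input + 800 output tokens -> 7.6164 + 2.53848 = 10.15488 raw credits.
credits = estimate_credits(12_000, 800)  # rounds to 10.2 credits
usd = credits * USD_PER_CREDIT           # rough USD estimate
```

Note that API-key billing reserves up front and refunds down to actual usage, so this is the settled cost, not the initial reserve.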
### GLM 5 (`glm-5`) - Model name to send: `glm-5` - Publisher: Z.ai - Release date: 2026-02-11 - Privacy: Anonymized - Context length: 202752 tokens - Max output tokens: not listed - Supported input types: text - Cost: 127.0 credits / 1M input tokens (US$0.6600); 406.2 credits / 1M output tokens (US$2.11) - Cache Read/Write Cost: Read: 25.4 credits / 1M cached input tokens (US$0.1320) - x402: supported for non-streaming calls. Call without Authorization to receive a 402 payment requirement, then retry with PAYMENT-SIGNATURE. Text x402 calls are prepaid from the reserve estimate because final token usage is only known after inference. - Input price: 127.0 credits per 1M input tokens. - Output price: 406.2 credits per 1M output tokens. - Billing: API-key calls reserve/refund from actual usage with a 0.1-credit rounded minimum. x402 wallet balance can refund unused reserve; direct x402 pay-as-you-go charges the quoted reserve. - Features: Anonymized, Tool calling, Structured output, Long context - Description: GLM-5 is Z.ai’s flagship open-source foundation model engineered for complex systems design and long-horizon agent workflows. Built for expert developers, it delivers production-grade performance on large-scale programming tasks, rivaling leading closed-source models. With advanced agentic planning, deep backend reasoning, and iterative self-correction, GLM-5 moves beyond code generation to full-system construction and autonomous execution. ### GLM 5 Private (`glm-5-private`) - Model name to send: `glm-5-private` - Publisher: Z.ai - Release date: 2026-02-10 - Privacy: E2EE Private - Context length: 202752 tokens - Max output tokens: 202752 - Supported input types: text - Cost: 253.9 credits / 1M input tokens (US$1.32); 740.4 credits / 1M output tokens (US$3.85) - x402: supported for non-streaming calls. Call without Authorization to receive a 402 payment requirement, then retry with PAYMENT-SIGNATURE. 
Text x402 calls are prepaid from the reserve estimate because final token usage is only known after inference. - Input price: 253.9 credits per 1M input tokens. - Output price: 740.4 credits per 1M output tokens. - Billing: API-key calls reserve/refund from actual usage with a 0.1-credit rounded minimum. x402 wallet balance can refund unused reserve; direct x402 pay-as-you-go charges the quoted reserve. - Features: E2EE Private, Tool calling, Structured output, Long context - Description: GLM-5 is an open-source foundation model built for complex systems engineering and long-horizon agent workflows. It delivers production-grade productivity for large-scale programming tasks, with performance aligned to top closed-source models, and is designed for expert developers building at the system level. ### Claude Opus 4.6 (`claude-opus-4-6`) - Model name to send: `claude-opus-4-6` - Publisher: Anthropic - Release date: 2026-02-04 - Privacy: Anonymized - Context length: 1000000 tokens - Max output tokens: 128000 - Supported input types: text, image - Cost: 1057.7 credits / 1M input tokens (US$5.50); 5288.5 credits / 1M output tokens (US$27.50) - Cache Read/Write Cost: Read: 105.8 credits / 1M cached input tokens (US$0.5500); Write: 1322.2 credits / 1M cache write tokens (US$6.88) - x402: supported for non-streaming calls. Call without Authorization to receive a 402 payment requirement, then retry with PAYMENT-SIGNATURE. Text x402 calls are prepaid from the reserve estimate because final token usage is only known after inference. - Input price: 1057.7 credits per 1M input tokens. - Output price: 5288.5 credits per 1M output tokens. - Billing: API-key calls reserve/refund from actual usage with a 0.1-credit rounded minimum. x402 wallet balance can refund unused reserve; direct x402 pay-as-you-go charges the quoted reserve. 
- Features: Anonymized, Vision, Tool calling, Structured output, Long context - Description: Opus 4.6 is Anthropic's strongest model for coding and long-running professional tasks. It is built for agents that operate across entire workflows rather than single prompts, making it especially effective for large codebases, complex refactors, and multi-step debugging that unfolds over time. The model shows deeper contextual understanding, stronger problem decomposition, and greater reliability on hard engineering tasks than prior generations. Beyond coding, Opus 4.6 excels at sustained knowledge work. It produces near-production-ready documents, plans, and analyses in a single pass, and maintains coherence across very long outputs and extended sessions. This makes it a strong default for tasks that require persistence, judgment, and follow-through, such as technical design, migration planning, and end-to-end project execution. ### Kimi K2.5 Private (`kimi-k2-5-private`) - Model name to send: `kimi-k2-5-private` - Publisher: MoonshotAI - Release date: 2026-01-29 - Privacy: E2EE Private - Context length: 262144 tokens - Max output tokens: 262144 - Supported input types: text, image - Cost: 127.0 credits / 1M input tokens (US$0.6600); 634.7 credits / 1M output tokens (US$3.30) - x402: supported for non-streaming calls. Call without Authorization to receive a 402 payment requirement, then retry with PAYMENT-SIGNATURE. Text x402 calls are prepaid from the reserve estimate because final token usage is only known after inference. - Input price: 127.0 credits per 1M input tokens. - Output price: 634.7 credits per 1M output tokens. - Billing: API-key calls reserve/refund from actual usage with a 0.1-credit rounded minimum. x402 wallet balance can refund unused reserve; direct x402 pay-as-you-go charges the quoted reserve. 
- Features: E2EE Private, Vision, Tool calling, Structured output, Long context - Description: Kimi K2.5 is Moonshot AI's native multimodal model, delivering state-of-the-art visual coding capability and a self-directed agent swarm paradigm. Built on Kimi K2 with continued pretraining over approximately 15T mixed visual and text tokens, it delivers strong performance in general reasoning, visual coding, and agentic tool-calling. ### Gemini 3 Flash Preview (`gemini-3-flash-preview`) - Model name to send: `gemini-3-flash-preview` - Publisher: Google - Release date: 2025-12-17 - Privacy: Anonymized - Context length: 1048576 tokens - Max output tokens: 65536 - Supported input types: text, image, file, audio, video - Cost: 105.8 credits / 1M input tokens (US$0.5500); 634.7 credits / 1M output tokens (US$3.30) - Cache Read/Write Cost: Read: 10.6 credits / 1M cached input tokens (US$0.0550); Write: 17.7 credits / 1M cache write tokens (US$0.0917) - x402: supported for non-streaming calls. Call without Authorization to receive a 402 payment requirement, then retry with PAYMENT-SIGNATURE. Text x402 calls are prepaid from the reserve estimate because final token usage is only known after inference. - Input price: 105.8 credits per 1M input tokens. - Output price: 634.7 credits per 1M output tokens. - Billing: API-key calls reserve/refund from actual usage with a 0.1-credit rounded minimum. x402 wallet balance can refund unused reserve; direct x402 pay-as-you-go charges the quoted reserve. - Features: Anonymized, Vision, File input, Audio input, Video input, Tool calling, Structured output, Long context - Description: Gemini 3 Flash Preview is a high-speed, high-value thinking model designed for agentic workflows, multi-turn chat, and coding assistance.
It delivers near-Pro-level reasoning and tool-use performance with substantially lower latency than larger Gemini variants, making it well suited for interactive development, long-running agent loops, and collaborative coding tasks. Compared to Gemini 2.5 Flash, it provides broad quality improvements across reasoning, multimodal understanding, and reliability. The model supports a 1M-token context window and multimodal inputs including text, images, audio, video, and PDFs, with text output. It includes configurable reasoning via thinking levels (minimal, low, medium, high), structured output, tool use, and automatic context caching. Gemini 3 Flash Preview is optimized for users who want strong reasoning and agentic behavior without the cost or latency of full-scale frontier models. ### DeepSeek V3.2 Private (`deepseek-v3-2-private`) - Model name to send: `deepseek-v3-2-private` - Publisher: DeepSeek - Release date: 2025-12-03 - Privacy: E2EE Private - Context length: 163840 tokens - Max output tokens: 163840 - Supported input types: text - Cost: 67.7 credits / 1M input tokens (US$0.3520); 101.6 credits / 1M output tokens (US$0.5280) - x402: supported for non-streaming calls. Call without Authorization to receive a 402 payment requirement, then retry with PAYMENT-SIGNATURE. Text x402 calls are prepaid from the reserve estimate because final token usage is only known after inference. - Input price: 67.7 credits per 1M input tokens. - Output price: 101.6 credits per 1M output tokens. - Billing: API-key calls reserve/refund from actual usage with a 0.1-credit rounded minimum. x402 wallet balance can refund unused reserve; direct x402 pay-as-you-go charges the quoted reserve. - Features: E2EE Private, Tool calling, Structured output - Description: DeepSeek-V3.2 is a large language model designed to harmonize high computational efficiency with strong reasoning and agentic tool-use performance.
It introduces DeepSeek Sparse Attention (DSA), a fine-grained sparse attention mechanism that reduces training and inference cost while preserving quality in long-context scenarios. A scalable reinforcement learning post-training framework further improves reasoning, with reported performance in the GPT-5 class, and the model has demonstrated gold-medal results on the 2025 IMO and IOI. V3.2 also uses a large-scale agentic task synthesis pipeline to better integrate reasoning into tool-use settings, boosting compliance and generalization in interactive environments. Users can control the reasoning behaviour with the reasoning-enabled boolean. ### Qwen3 Coder 480B A35B Private (`qwen3-coder-480b-a35b-private`) - Model name to send: `qwen3-coder-480b-a35b-private` - Publisher: Qwen - Release date: 2025-11-28 - Privacy: E2EE Private - Context length: 262000 tokens - Max output tokens: 262000 - Supported input types: text - Cost: 423.1 credits / 1M input tokens (US$2.20); 423.1 credits / 1M output tokens (US$2.20) - x402: supported for non-streaming calls. Call without Authorization to receive a 402 payment requirement, then retry with PAYMENT-SIGNATURE. Text x402 calls are prepaid from the reserve estimate because final token usage is only known after inference. - Input price: 423.1 credits per 1M input tokens. - Output price: 423.1 credits per 1M output tokens. - Billing: API-key calls reserve/refund from actual usage with a 0.1-credit rounded minimum. x402 wallet balance can refund unused reserve; direct x402 pay-as-you-go charges the quoted reserve. - Features: E2EE Private, Long context - Description: Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model developed by the Qwen team. It is optimized for agentic coding tasks such as function calling, tool use, and long-context reasoning over repositories. The model features 480 billion total parameters, with 35 billion active per forward pass (8 out of 160 experts).
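Models whose feature list includes Tool calling (for example `glm-5` or `deepseek-v3-2-private`) sit behind the OpenAI-compatible endpoint, so the standard OpenAI-style `tools` array is the assumed request shape. A sketch under that assumption; the `get_weather` function is purely hypothetical and not part of the API:

```python
import json

# Hedged sketch: "Tool calling" models are assumed to accept the standard
# OpenAI-style `tools` array on /v1/chat/completions.
body = {
    "model": "glm-5",  # listed above with the Tool calling feature
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool for illustration
                "description": "Look up current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
}
payload = json.dumps(body)
```

If the model decides to call the tool, the response's `tool_calls` should be echoed back as a `tool` role message on the next turn, per the OpenAI convention.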
### Qwen3 VL 30B A3B Instruct Private (`qwen3-vl-30b-a3b-instruct-private`) - Model name to send: `qwen3-vl-30b-a3b-instruct-private` - Publisher: Qwen - Release date: 2025-11-28 - Privacy: E2EE Private - Context length: 128000 tokens - Max output tokens: 128000 - Supported input types: text, image - Cost: 42.4 credits / 1M input tokens (US$0.2200); 148.1 credits / 1M output tokens (US$0.7700) - x402: supported for non-streaming calls. Call without Authorization to receive a 402 payment requirement, then retry with PAYMENT-SIGNATURE. Text x402 calls are prepaid from the reserve estimate because final token usage is only known after inference. - Input price: 42.4 credits per 1M input tokens. - Output price: 148.1 credits per 1M output tokens. - Billing: API-key calls reserve/refund from actual usage with a 0.1-credit rounded minimum. x402 wallet balance can refund unused reserve; direct x402 pay-as-you-go charges the quoted reserve. - Features: E2EE Private, Vision, Tool calling, Structured output - Description: Qwen3-VL-30B-A3B-Instruct is a multimodal model that unifies strong text generation with visual understanding for images and videos. Its Instruct variant optimizes instruction-following for general multimodal tasks. It excels in perception of real-world/synthetic categories, 2D/3D spatial grounding, and long-form visual comprehension, achieving competitive multimodal benchmark results. For agentic use, it handles multi-image multi-turn instructions, video timeline alignments, GUI automation, and visual coding from sketches to debugged UI. Text performance matches flagship Qwen3 models, suiting document AI, OCR, UI assistance, spatial tasks, and agent research. 
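Vision-capable models such as `qwen3-vl-30b-a3b-instruct-private` take image inputs through the OpenAI-compatible multimodal message format; a sketch assuming standard `image_url` content parts, with a stub data URL standing in for real image bytes:

```python
import base64
import json

# Hedged sketch: vision models are assumed to take OpenAI-style multimodal
# content parts. The data-URL form keeps the example self-contained; replace
# the stub bytes with a real image before sending.
png_stub = base64.b64encode(b"\x89PNG stub bytes").decode()
body = {
    "model": "qwen3-vl-30b-a3b-instruct-private",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this UI screenshot."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{png_stub}"}},
            ],
        }
    ],
}
payload = json.dumps(body)
```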
### Claude Haiku 4.5 (`claude-haiku-4-5`) - Model name to send: `claude-haiku-4-5` - Publisher: Anthropic - Release date: 2025-10-15 - Privacy: Anonymized - Context length: 200000 tokens - Max output tokens: 64000 - Supported input types: image, text - Cost: 211.6 credits / 1M input tokens (US$1.10); 1057.7 credits / 1M output tokens (US$5.50) - Cache Read/Write Cost: Read: 21.2 credits / 1M cached input tokens (US$0.1100); Write: 264.5 credits / 1M cache write tokens (US$1.38) - x402: supported for non-streaming calls. Call without Authorization to receive a 402 payment requirement, then retry with PAYMENT-SIGNATURE. Text x402 calls are prepaid from the reserve estimate because final token usage is only known after inference. - Input price: 211.6 credits per 1M input tokens. - Output price: 1057.7 credits per 1M output tokens. - Billing: API-key calls reserve/refund from actual usage with a 0.1-credit rounded minimum. x402 wallet balance can refund unused reserve; direct x402 pay-as-you-go charges the quoted reserve. - Features: Anonymized, Vision, Tool calling, Structured output, Long context - Description: Claude Haiku 4.5 is Anthropic’s fastest and most efficient model, delivering near-frontier intelligence at a fraction of the cost and latency of larger Claude models. Matching Claude Sonnet 4’s performance across reasoning, coding, and computer-use tasks, Haiku 4.5 brings frontier-level capability to real-time and high-volume applications. It introduces extended thinking to the Haiku line, enabling controllable reasoning depth, summarized or interleaved thought output, and tool-assisted workflows with full support for coding, bash, web search, and computer-use tools. Scoring >73% on SWE-bench Verified, Haiku 4.5 ranks among the world’s best coding models while maintaining exceptional responsiveness for sub-agents, parallelized execution, and scaled deployment.
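Models listing Structured output are assumed to follow the OpenAI-style `response_format` / JSON-schema convention on `/v1/chat/completions`; a sketch for `claude-haiku-4-5`, where the `invoice_total` schema is a made-up illustration:

```python
import json

# Hedged sketch: "Structured output" is assumed to use the OpenAI
# response_format convention; the schema below is hypothetical.
body = {
    "model": "claude-haiku-4-5",
    "messages": [{"role": "user", "content": "Extract the invoice total."}],
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "invoice_total",  # illustrative schema name
            "schema": {
                "type": "object",
                "properties": {"total_usd": {"type": "number"}},
                "required": ["total_usd"],
            },
        },
    },
}
payload = json.dumps(body)
```

With this shape, the model's reply content should parse as JSON matching the schema, so `json.loads` on the returned message content is the expected consumption path.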
### Gemma 3 27B Private (`gemma-3-27b-private`) - Model name to send: `gemma-3-27b-private` - Publisher: Google - Release date: 2025-10-03 - Privacy: E2EE Private - Context length: 53920 tokens - Max output tokens: 53920 - Supported input types: text, image - Cost: 23.3 credits / 1M input tokens (US$0.1210); 84.7 credits / 1M output tokens (US$0.4400) - x402: supported for non-streaming calls. Call without Authorization to receive a 402 payment requirement, then retry with PAYMENT-SIGNATURE. Text x402 calls are prepaid from the reserve estimate because final token usage is only known after inference. - Input price: 23.3 credits per 1M input tokens. - Output price: 84.7 credits per 1M output tokens. - Billing: API-key calls reserve/refund from actual usage with a 0.1-credit rounded minimum. x402 wallet balance can refund unused reserve; direct x402 pay-as-you-go charges the quoted reserve. - Features: E2EE Private, Vision, Tool calling, Structured output - Description: Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities, including structured outputs and function calling. Gemma 3 27B is Google's latest open source model, successor to Gemma 2 (`google/gemma-2-27b-it`). ## Image Models ### GPT Image 2 (`gpt-image-2`) - Model name to send: `gpt-image-2` - Creator: OpenAI - Cost: 14 credits per image (~$0.0728) - x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements. - Reference images supported: 16 - Supported aspects: 1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9 - Features: text prompts, supports reference images, edit/reference workflow - Output format: send `output_format` as `png`, `jpeg`, or `webp`.
- Image inputs: send `image_urls` with HTTPS URLs, data URLs, or raw base64 image strings; never send local file paths or file:// URLs - Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported. - Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/gptimage2.webp - Description: OpenAI's latest and greatest Image Generation & Editing model, delivering a massive leap forward in prompt adherence, stylistic control, and text rendering! ### Nano Banana 2 (`nano-banana-2`) - Model name to send: `nano-banana-2` - Creator: Google - Cost: 32 credits per image (~$0.1664) - x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements. - Reference images supported: 14 - Supported aspects: 21:9, 16:9, 3:2, 4:3, 5:4, 1:1, 4:5, 3:4, 2:3, 9:16 - Features: text prompts, supports reference images, edit/reference workflow - Output format: send `output_format` as `png`, `jpeg`, or `webp`. - Image inputs: send `image_urls` with HTTPS URLs, data URLs, or raw base64 image strings; never send local file paths or file:// URLs - Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported. - Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/nanobanana2.webp - Description: Google's frontier image creation and edit model, showcasing unprecedented quality and prompt adherence, along with in-built full web search capabilities! ### Gen (`gen`) - Model name to send: `gen` - Creator: imgnAI - Cost: 3 credits per image (~$0.0156) - x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements. - Reference images supported: 0 - Supported aspects: 1:1, 16:9, 21:9, 3:4, 9:16, 5:2, 4:3, 5:4, 4:5, 9:21, 4:7 - Features: text prompts - Output format: send `output_format` as `png`, `jpeg`, or `webp`. 
- Image inputs: text prompt is the normal workflow; `image_url` or `image_urls` may be accepted where the backend supports it - Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported. - Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/flux1x.webp - Description: Our flexible, general-purpose model for any style or subject. ### Seedream 5.0 Lite (`seedream-5-0-lite`) - Model name to send: `seedream-5-0-lite` - Creator: ByteDance - Cost: 7 credits per image (~$0.0364) - x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements. - Reference images supported: 10 - Supported aspects: 1:1, 2:3, 3:2, 3:4, 4:3, 9:16, 16:9, 21:9 - Features: text prompts, supports reference images, edit/reference workflow - Output format: send `output_format` as `png`, `jpeg`, or `webp`. - Image inputs: send `image_urls` with HTTPS URLs, data URLs, or raw base64 image strings; never send local file paths or file:// URLs - Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported. - Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/seedream5lite.webp - Description: ByteDance's lightweight, next-gen Image model, with full edit and reference capabilities! ### WAN 2.7 Image Pro (`wan-2-7-image-pro`) - Model name to send: `wan-2-7-image-pro` - Creator: Alibaba - Cost: 14 credits per image (~$0.0728) - x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements. - Reference images supported: 9 - Supported aspects: 1:1, 3:4, 4:3, 1:8, 8:1, 9:16, 16:9, 21:9 - Features: text prompts, supports reference images, edit/reference workflow - Output format: send `output_format` as `png`, `jpeg`, or `webp`.
- Image inputs: send `image_urls` with HTTPS URLs, data URLs, or raw base64 image strings; never send local file paths or file:// URLs - Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported. - Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/wan27proimage.webp - Description: Alibaba's WAN 2.7 Image Pro model, merging true-to-life realism with full reference & edit capabilities. ### Ani (`ani`) - Model name to send: `ani` - Creator: imgnAI - Cost: 2 credits per image (~$0.0104) - x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements. - Reference images supported: 0 - Supported aspects: 1:1, 16:9, 21:9, 3:4, 9:16, 5:2, 4:3, 5:4, 4:5, 9:21, 4:7 - Features: standard image generation - Output format: send `output_format` as `png`, `jpeg`, or `webp`. - Image inputs: text prompt is the normal workflow; `image_url` or `image_urls` may be accepted where the backend supports it - Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported. - Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/ani.webp - Description: High-fidelity, classic styled anime model. ### Nano Banana Pro (`nano-banana-pro`) - Model name to send: `nano-banana-pro` - Creator: Google - Cost: 28 credits per image (~$0.1456) - x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements. - Reference images supported: 8 - Supported aspects: 1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9 - Features: text prompts, supports reference images, edit/reference workflow - Output format: send `output_format` as `png`, `jpeg`, or `webp`. 
- Image inputs: send `image_urls` with HTTPS URLs, data URLs, or raw base64 image strings; never send local file paths or file:// URLs - Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported. - Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/nanobananapro.webp - Description: Google's Pro-tier offering built on their first-generation Nano Banana model, raising the bar for high-quality image creation and high-precision editing workloads. ### Qwen 2.0 (`qwen-2-0`) - Model name to send: `qwen-2-0` - Creator: Alibaba - Cost: 8 credits per image (~$0.0416) - x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements. - Reference images supported: 3 - Supported aspects: 1:1, 2:3, 3:2, 3:4, 4:3, 9:16, 16:9, 21:9 - Features: text prompts, supports reference images, edit/reference workflow - Output format: send `output_format` as `png`, `jpeg`, or `webp`. - Image inputs: send `image_urls` with HTTPS URLs, data URLs, or raw base64 image strings; never send local file paths or file:// URLs - Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported. - Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/qwen2image.webp - Description: Qwen 2 by Alibaba packs a punch at entry-level pricing, with detailed text rendering, flexible editing, and great style control. ### Fur (`fur`) - Model name to send: `fur` - Creator: imgnAI - Cost: 2 credits per image (~$0.0104) - x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements. - Reference images supported: 0 - Supported aspects: 1:1, 16:9, 21:9, 3:4, 9:16, 5:2, 4:3, 5:4, 4:5, 9:21, 4:7 - Features: standard image generation - Output format: send `output_format` as `png`, `jpeg`, or `webp`.
- Image inputs: text prompt is the normal workflow; `image_url` or `image_urls` may be accepted where the backend supports it - Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported. - Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/fur.webp - Description: Anime-style Furry model. ### Synth (`synth`) - Model name to send: `synth` - Creator: imgnAI - Cost: 2 credits per image (~$0.0104) - x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements. - Reference images supported: 0 - Supported aspects: 1:1, 16:9, 21:9, 3:4, 9:16, 5:2, 4:3, 5:4, 4:5, 9:21, 4:7 - Features: standard image generation - Output format: send `output_format` as `png`, `jpeg`, or `webp`. - Image inputs: text prompt is the normal workflow; `image_url` or `image_urls` may be accepted where the backend supports it - Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported. - Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/synth.webp - Description: Join the Cyberpunk revolution, with a touch of Synthwave to boot! ### Seedream 4.5 (`seedream-4-5`) - Model name to send: `seedream-4-5` - Creator: ByteDance - Cost: 12 credits per image (~$0.0624) - x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements. - Reference images supported: 10 - Supported aspects: 1:1, 16:9, 9:16, 4:3, 3:4, 4:5, 5:4, 4:7, 7:4, 5:2, 2:5 - Features: text prompts, supports reference images, edit/reference workflow - Output format: send `output_format` as `png`, `jpeg`, or `webp`. - Image inputs: send `image_urls` with HTTPS URLs, data URLs, or raw base64 image strings; never send local file paths or file:// URLs - Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported. 
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/seedream45.webp - Description: Cutting-edge Native 4K model, with full edit capabilities. ### Noob (`noob`) - Model name to send: `noob` - Creator: imgnAI - Cost: 2 credits per image (~$0.0104) - x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements. - Reference images supported: 0 - Supported aspects: 1:1, 16:9, 21:9, 3:4, 9:16, 5:2, 4:3, 5:4, 4:5, 9:21, 4:7 - Features: standard image generation - Output format: send `output_format` as `png`, `jpeg`, or `webp`. - Image inputs: text prompt is the normal workflow; `image_url` or `image_urls` may be accepted where the backend supports it - Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported. - Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/noob.webp - Description: Anime-style model with extreme style range and character recognition - prompt assist highly recommended. ### Aura (`aura`) - Model name to send: `aura` - Creator: imgnAI - Cost: 2 credits per image (~$0.0104) - x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements. - Reference images supported: 0 - Supported aspects: 1:1, 16:9, 21:9, 3:4, 9:16, 5:2, 4:3, 5:4, 4:5, 9:21, 4:7 - Features: standard image generation - Output format: send `output_format` as `png`, `jpeg`, or `webp`. - Image inputs: text prompt is the normal workflow; `image_url` or `image_urls` may be accepted where the backend supports it - Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported. - Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/aura.webp - Description: Create gorgeous artwork in a professional graphic novel illustration style! Prompt assist recommended. 
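Image models are called with `POST /v1/generation-requests` using the `X-API-Key`/`X-API-Secret` headers described in the authentication section. A minimal payload sketch for the `aura` model; `aspect_ratio` and `output_format` are documented above, while the `prompt` field name is an assumption carried over from the shared API overview:

```python
import json

URL = "https://kat.imgnai.com/v1/generation-requests"  # HTTPS only
headers = {
    "X-API-Key": "YOUR_KEY",        # placeholder credentials
    "X-API-Secret": "YOUR_SECRET",
    "Content-Type": "application/json",
}
body = {
    "model": "aura",
    "prompt": "a lighthouse at dawn, graphic-novel style",  # field name assumed
    "aspect_ratio": "16:9",   # must be one of the model's supported aspects
    "output_format": "png",   # png, jpeg, or webp per the card above
}
payload = json.dumps(body)
# POST `payload` with `headers` to URL; omit the API-key headers instead to
# receive x402 payment requirements for this same request.
```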
### Pixel (`pixel`) - Model name to send: `pixel` - Creator: imgnAI - Cost: 2 credits per image (~$0.0104) - x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements. - Reference images supported: 0 - Supported aspects: 1:1, 16:9, 21:9, 3:4, 9:16, 5:2, 4:3, 5:4, 4:5, 9:21, 4:7 - Features: standard image generation - Output format: send `output_format` as `png`, `jpeg`, or `webp`. - Image inputs: text prompt is the normal workflow; `image_url` or `image_urls` may be accepted where the backend supports it - Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported. - Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/pixel.webp - Description: Create artwork from the golden age of the pixel art era! [New Pixel Core] ### FLUX.2 Klein 9b (`flux-2-klein-9b`) - Model name to send: `flux-2-klein-9b` - Creator: Black Forest Labs - Cost: 10 credits per image (~$0.0520) - x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements. - Reference images supported: 4 - Supported aspects: 1:1, 16:9, 9:16, 4:3, 3:4, 4:5, 5:4, 4:7, 7:4, 5:2, 2:5 - Features: text prompts, supports reference images, edit/reference workflow - Output format: send `output_format` as `png`, `jpeg`, or `webp`. - Image inputs: send `image_urls` with HTTPS URLs, data URLs, or raw base64 image strings; never send local file paths or file:// URLs - Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported. - Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/klein9b.webp - Description: FLUX.2's Klein flagship, delivering HD outputs and full image edit support at an excellent price/performance ratio.
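For edit/reference models, `image_urls` carries the reference inputs (HTTPS URLs, data URLs, or raw base64 strings; never local paths or `file://` URLs). A sketch for `flux-2-klein-9b`, which accepts up to 4 references; the example URLs and the `prompt` field name are illustrative assumptions:

```python
import json

MAX_REFS = 4  # per-model limit listed on the flux-2-klein-9b card
refs = [
    "https://example.com/source.png",       # hypothetical reference URLs
    "https://example.com/style-guide.png",
]
assert len(refs) <= MAX_REFS, "too many reference images for this model"

body = {
    "model": "flux-2-klein-9b",
    "prompt": "match the style of the second image",  # field name assumed
    "image_urls": refs,
    "aspect_ratio": "auto",   # infer the aspect from the first image input
    "output_format": "webp",
}
payload = json.dumps(body)
```

With `aspect_ratio: "auto"`, the first entry in `image_urls` drives the output aspect, falling back to `1:1` when no image input is present.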
### Nano Banana (`nano-banana`)
- Model name to send: `nano-banana`
- Creator: Google
- Cost: 12 credits per image (~$0.0624)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 6
- Supported aspects: 1:1, 16:9, 9:16, 4:3, 3:4, 3:2, 2:3, 21:9
- Features: text prompts, requires image input, supports reference images, edit/reference workflow
- Output format: send `output_format` as `png`, `jpeg`, or `webp`.
- Image inputs: send `image_urls` with HTTPS URLs, data URLs, or raw base64 image strings; never send local file paths or file:// URLs
- Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/alt/image/nanobananaapi_modify.webp
- Description: Google's first-generation image creation and edit model.

### Hyper CGI (`hyper-cgi`)
- Model name to send: `hyper-cgi`
- Creator: imgnAI
- Cost: 2 credits per image (~$0.0104)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 0
- Supported aspects: 1:1, 16:9, 21:9, 3:4, 9:16, 5:2, 4:3, 5:4, 4:5, 9:21, 4:7
- Features: standard image generation
- Output format: send `output_format` as `png`, `jpeg`, or `webp`.
- Image inputs: text prompt is the normal workflow; `image_url` or `image_urls` may be accepted where the backend supports it
- Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/hypercgi.webp
- Description: Polished CGI characters, in the style of modern animations.

### FLUX.2 FLEX (`flux-2-flex`)
- Model name to send: `flux-2-flex`
- Creator: Black Forest Labs
- Cost: 28 credits per image (~$0.1456)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 8
- Supported aspects: 1:1, 2:3, 3:2, 3:4, 4:3, 9:16, 16:9
- Features: text prompts, supports reference images, edit/reference workflow
- Output format: send `output_format` as `png`, `jpeg`, or `webp`.
- Image inputs: send `image_urls` with HTTPS URLs, data URLs, or raw base64 image strings; never send local file paths or file:// URLs
- Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/flux2flex.webp
- Description: FLUX.2's flagship model, built to compete directly with Google's Nano Banana Pro.

### Imagine Art 1.5 Pro (`imagine-art-1-5-pro`)
- Model name to send: `imagine-art-1-5-pro`
- Creator: Imagine Art
- Cost: 14 credits per image (~$0.0728)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 0
- Supported aspects: 1:1, 16:9, 9:16, 4:3, 3:4, 3:1, 1:3, 3:2, 2:3
- Features: text prompts
- Output format: send `output_format` as `png`, `jpeg`, or `webp`.
- Image inputs: text prompt is the normal workflow; `image_url` or `image_urls` may be accepted where the backend supports it
- Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/imagineart.webp
- Description: Imagine Art's high-fidelity native 4K model with lifelike realism and refined aesthetics.

### Volt (`volt`)
- Model name to send: `volt`
- Creator: imgnAI
- Cost: 2 credits per image (~$0.0104)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 0
- Supported aspects: 1:1, 16:9, 21:9, 3:4, 9:16, 5:2, 4:3, 5:4, 4:5, 9:21, 4:7
- Features: standard image generation
- Output format: send `output_format` as `png`, `jpeg`, or `webp`.
- Image inputs: text prompt is the normal workflow; `image_url` or `image_urls` may be accepted where the backend supports it
- Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/volt.webp
- Description: Anime style characters with a retro palette.

### WAN 2.7 Image (`wan-2-7-image`)
- Model name to send: `wan-2-7-image`
- Creator: Alibaba
- Cost: 6 credits per image (~$0.0312)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 9
- Supported aspects: 1:1, 3:4, 4:3, 1:8, 8:1, 9:16, 16:9, 21:9
- Features: text prompts, supports reference images, edit/reference workflow
- Output format: send `output_format` as `png`, `jpeg`, or `webp`.
- Image inputs: send `image_urls` with HTTPS URLs, data URLs, or raw base64 image strings; never send local file paths or file:// URLs
- Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/wan27image.webp
- Description: Alibaba's base WAN 2.7 Image model, bringing WAN's signature best-in-class natural realism at a budget cost!

### GPT Image 1.5 (`gpt-image-1-5`)
- Model name to send: `gpt-image-1-5`
- Creator: OpenAI
- Cost: 32 credits per image (~$0.1664)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 0
- Supported aspects: 1:1
- Features: text prompts
- Output format: send `output_format` as `png`, `jpeg`, or `webp`.
- Image inputs: text prompt is the normal workflow; `image_url` or `image_urls` may be accepted where the backend supports it
- Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/gptimage15.webp
- Description: OpenAI's flagship image model.

### Muse (`muse`)
- Model name to send: `muse`
- Creator: imgnAI
- Cost: 3 credits per image (~$0.0156)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 0
- Supported aspects: 1:1, 16:9, 21:9, 3:4, 9:16, 5:2, 4:3, 5:4, 4:5, 9:21, 4:7
- Features: text prompts
- Output format: send `output_format` as `png`, `jpeg`, or `webp`.
- Image inputs: text prompt is the normal workflow; `image_url` or `image_urls` may be accepted where the backend supports it
- Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/muse.webp
- Description: Flux-based model fine-tuned on Goths, Emos, and e-girls. Yes, really.

### FLUX.2 PRO (`flux-2-pro`)
- Model name to send: `flux-2-pro`
- Creator: Black Forest Labs
- Cost: 8 credits per image (~$0.0416)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 4
- Supported aspects: 1:1, 2:3, 3:2, 3:4, 4:3, 9:16, 16:9
- Features: text prompts, supports reference images, edit/reference workflow
- Output format: send `output_format` as `png`, `jpeg`, or `webp`.
- Image inputs: send `image_urls` with HTTPS URLs, data URLs, or raw base64 image strings; never send local file paths or file:// URLs
- Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/flux2pro.webp
- Description: FLUX.2's entry-level model, offering great performance at budget pricing.

### Gothic (`gothic`)
- Model name to send: `gothic`
- Creator: imgnAI
- Cost: 2 credits per image (~$0.0104)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 0
- Supported aspects: 1:1, 16:9, 21:9, 3:4, 9:16, 5:2, 4:3, 5:4, 4:5, 9:21, 4:7
- Features: standard image generation
- Output format: send `output_format` as `png`, `jpeg`, or `webp`.
- Image inputs: text prompt is the normal workflow; `image_url` or `image_urls` may be accepted where the backend supports it
- Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/gothic.webp
- Description: Dark clothes, dark hair, dark eyeliner. All the best things in an image model.

### Z-Image Base (`z-image-base`)
- Model name to send: `z-image-base`
- Creator: Z.ai
- Cost: 7 credits per image (~$0.0364)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 0
- Supported aspects: 1:1, 16:9, 9:16, 4:3, 3:4, 4:5, 5:4, 4:7, 7:4, 5:2, 2:5
- Features: text prompts
- Output format: send `output_format` as `png`, `jpeg`, or `webp`.
- Image inputs: text prompt is the normal workflow; `image_url` or `image_urls` may be accepted where the backend supports it
- Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/zimagebase.webp
- Description: Z.ai's foundational image model, built for high quality and generative diversity at budget pricing.
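Because the supported aspect list varies per model, a client should validate a requested aspect against the card (or the live `GET /v1/models` response) before submitting. A small sketch, using the aspect list copied from the Z-Image Base card above:

```python
# Supported aspects transcribed verbatim from the Z-Image Base card.
Z_IMAGE_BASE_ASPECTS = {
    "1:1", "16:9", "9:16", "4:3", "3:4", "4:5",
    "5:4", "4:7", "7:4", "5:2", "2:5",
}

def pick_aspect(requested, supported, fallback="1:1"):
    """Return `requested` if the model's card lists it, else the fallback.

    `1:1` is used as the fallback because the docs name it as the default
    when `aspect_ratio: "auto"` has no image input to inspect.
    """
    return requested if requested in supported else fallback

print(pick_aspect("16:9", Z_IMAGE_BASE_ASPECTS))  # listed -> kept as-is
print(pick_aspect("21:9", Z_IMAGE_BASE_ASPECTS))  # not listed -> "1:1"
```

The same helper works for any card: substitute that model's aspect set.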
### Z-Image Turbo (`z-image-turbo`)
- Model name to send: `z-image-turbo`
- Creator: Z.ai
- Cost: 4 credits per image (~$0.0208)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 0
- Supported aspects: 1:1, 16:9, 9:16, 4:3, 3:4, 4:5, 5:4, 4:7, 7:4, 5:2, 2:5
- Features: text prompts
- Output format: send `output_format` as `png`, `jpeg`, or `webp`.
- Image inputs: text prompt is the normal workflow; `image_url` or `image_urls` may be accepted where the backend supports it
- Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/zimageturbo.webp
- Description: Z.ai's Turbo image model, a lightning-fast distillation of Z-Image Base.

### Rend (`rend`)
- Model name to send: `rend`
- Creator: imgnAI
- Cost: 2 credits per image (~$0.0104)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 0
- Supported aspects: 1:1, 16:9, 21:9, 3:4, 9:16, 5:2, 4:3, 5:4, 4:5, 9:21, 4:7
- Features: standard image generation
- Output format: send `output_format` as `png`, `jpeg`, or `webp`.
- Image inputs: text prompt is the normal workflow; `image_url` or `image_urls` may be accepted where the backend supports it
- Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/rend.webp
- Description: A high-quality blended mix of path-traced rendering and anime stylization.

### Retro (`retro`)
- Model name to send: `retro`
- Creator: imgnAI
- Cost: 2 credits per image (~$0.0104)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 0
- Supported aspects: 1:1, 16:9, 21:9, 3:4, 9:16, 5:2, 4:3, 5:4, 4:5, 9:21, 4:7
- Features: standard image generation
- Output format: send `output_format` as `png`, `jpeg`, or `webp`.
- Image inputs: text prompt is the normal workflow; `image_url` or `image_urls` may be accepted where the backend supports it
- Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/retro.webp
- Description: Retro-styled anime, straight from the 90s!

### FLUX.2 Klein 4b (`flux-2-klein-4b`)
- Model name to send: `flux-2-klein-4b`
- Creator: Black Forest Labs
- Cost: 4 credits per image (~$0.0208)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 4
- Supported aspects: 1:1, 16:9, 9:16, 4:3, 3:4, 4:5, 5:4, 4:7, 7:4, 5:2, 2:5
- Features: text prompts, supports reference images, edit/reference workflow
- Output format: send `output_format` as `png`, `jpeg`, or `webp`.
- Image inputs: send `image_urls` with HTTPS URLs, data URLs, or raw base64 image strings; never send local file paths or file:// URLs
- Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/klein4b.webp
- Description: FLUX.2's small but powerful Klein 4b model, further extending the cost savings offered with Klein 9b.

### Neo (`neo`)
- Model name to send: `neo`
- Creator: imgnAI
- Cost: 2 credits per image (~$0.0104)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 0
- Supported aspects: 1:1, 16:9, 21:9, 3:4, 9:16, 5:2, 4:3, 5:4, 4:5, 9:21, 4:7
- Features: standard image generation
- Output format: send `output_format` as `png`, `jpeg`, or `webp`.
- Image inputs: text prompt is the normal workflow; `image_url` or `image_urls` may be accepted where the backend supports it
- Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/neo.webp
- Description: High quality realism & render model optimized for UHD Mode.

### Pony (`pony`)
- Model name to send: `pony`
- Creator: imgnAI
- Cost: 2 credits per image (~$0.0104)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 0
- Supported aspects: 1:1, 16:9, 21:9, 3:4, 9:16, 5:2, 4:3, 5:4, 4:5, 9:21, 4:7
- Features: standard image generation
- Output format: send `output_format` as `png`, `jpeg`, or `webp`.
- Image inputs: text prompt is the normal workflow; `image_url` or `image_urls` may be accepted where the backend supports it
- Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/pony.webp
- Description: Expressive sketch-like anime model with excellent character recognition.

### Nai (`nai`)
- Model name to send: `nai`
- Creator: imgnAI
- Cost: 2 credits per image (~$0.0104)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 0
- Supported aspects: 1:1, 16:9, 21:9, 3:4, 9:16, 5:2, 4:3, 5:4, 4:5, 9:21, 4:7
- Features: standard image generation
- Output format: send `output_format` as `png`, `jpeg`, or `webp`.
- Image inputs: text prompt is the normal workflow; `image_url` or `image_urls` may be accepted where the backend supports it
- Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/nai.webp
- Description: Modern, alternative anime style model.

### FLUX 1.1 Ultra (`flux-1-1-ultra`)
- Model name to send: `flux-1-1-ultra`
- Creator: Black Forest Labs
- Cost: 18 credits per image (~$0.0936)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 0
- Supported aspects: 1:1, 16:9, 9:16, 4:3, 3:4, 3:2, 2:3, 4:5, 5:4, 21:9, 9:21, 2:1, 1:2
- Features: text prompts
- Output format: send `output_format` as `png`, `jpeg`, or `webp`.
- Image inputs: text prompt is the normal workflow; `image_url` or `image_urls` may be accepted where the backend supports it
- Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/flux1ultra.webp
- Description: BFL's final FLUX.1 iteration, extremely high quality 4MP outputs.

### Glitch (`glitch`)
- Model name to send: `glitch`
- Creator: imgnAI
- Cost: 2 credits per image (~$0.0104)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 0
- Supported aspects: 1:1, 16:9, 21:9, 3:4, 9:16, 5:2, 4:3, 5:4, 4:5, 9:21, 4:7
- Features: standard image generation
- Output format: send `output_format` as `png`, `jpeg`, or `webp`.
- Image inputs: text prompt is the normal workflow; `image_url` or `image_urls` may be accepted where the backend supports it
- Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/glitch.webp
- Description: Load up on glitch-burn aesthetics, with this compelling and unique art style!

### Qwen Image (`qwen-image`)
- Model name to send: `qwen-image`
- Creator: Alibaba Cloud
- Cost: 8 credits per image (~$0.0416)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 0
- Supported aspects: 1:1, 16:9, 9:16, 4:3, 3:4
- Features: text prompts
- Output format: send `output_format` as `png`, `jpeg`, or `webp`.
- Image inputs: text prompt is the normal workflow; `image_url` or `image_urls` may be accepted where the backend supports it
- Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/qwen.webp
- Description: Versatile model with excellent text capabilities.

### Seedream 4 (`seedream-4`)
- Model name to send: `seedream-4`
- Creator: ByteDance
- Cost: 12 credits per image (~$0.0624)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 10
- Supported aspects: 1:1, 2:3, 3:2, 3:4, 4:3, 9:16, 16:9, 21:9
- Features: text prompts, supports reference images, edit/reference workflow
- Output format: send `output_format` as `png`, `jpeg`, or `webp`.
- Image inputs: send `image_urls` with HTTPS URLs, data URLs, or raw base64 image strings; never send local file paths or file:// URLs
- Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/seedream4.webp
- Description: Excellent quality 4K model, with full edit capabilities.
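For edit/reference models such as Seedream 4, two documented rules are worth enforcing client-side before the request leaves the machine: stay within the model's reference-image limit, and send only HTTPS URLs, data URLs, or raw base64 strings — never local paths, `file://`, or `http://` URLs. A sketch (the `model` and `prompt` field names are illustrative assumptions):

```python
def build_edit_request(model, prompt, image_urls, max_refs):
    """Sketch of an edit/reference body for POST /v1/generation-requests.

    Enforces the documented per-model reference-image limit and rejects
    local paths, file:// URLs, and http:// URLs up front.
    """
    if len(image_urls) > max_refs:
        raise ValueError(f"{model} accepts at most {max_refs} reference images")
    for ref in image_urls:
        # HTTPS URLs, data:image/... URLs, and raw base64 strings all pass;
        # local paths and insecure schemes are rejected.
        if ref.startswith(("file://", "http://", "/")):
            raise ValueError("use HTTPS URLs, data URLs, or raw base64 strings")
    return {"model": model, "prompt": prompt, "image_urls": image_urls}

# Seedream 4 accepts up to 10 reference images (see the card above).
req = build_edit_request(
    "seedream-4",
    "blend both subjects into a single 4K scene",
    ["https://example.com/a.png", "https://example.com/b.png"],
    max_refs=10,
)
```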
### WAN 2.2 Image (`wan-2-2-image`)
- Model name to send: `wan-2-2-image`
- Creator: Alibaba Cloud
- Cost: 8 credits per image (~$0.0416)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 0
- Supported aspects: 1:1, 16:9, 9:16, 4:3, 3:4, 21:9
- Features: text prompts
- Output format: send `output_format` as `png`, `jpeg`, or `webp`.
- Image inputs: text prompt is the normal workflow; `image_url` or `image_urls` may be accepted where the backend supports it
- Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/wan22image.webp
- Description: Stunning still-frame imagery powered by the WAN Video model.

### Flux1.D (`flux1-d`)
- Model name to send: `flux1-d`
- Creator: Black Forest Labs
- Cost: 3 credits per image (~$0.0156)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 0
- Supported aspects: 1:1, 16:9, 21:9, 3:4, 9:16, 5:2, 4:3, 5:4, 4:5, 9:21, 4:7
- Features: text prompts
- Output format: send `output_format` as `png`, `jpeg`, or `webp`.
- Image inputs: text prompt is the normal workflow; `image_url` or `image_urls` may be accepted where the backend supports it
- Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/fluxpro.webp
- Description: BFL's general-purpose Flux1.D model.

### Qwen Image Edit (`qwen-image-edit`)
- Model name to send: `qwen-image-edit`
- Creator: Alibaba Cloud
- Cost: 12 credits per image (~$0.0624)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 0
- Supported aspects: 1:1, 16:9, 9:16, 4:3, 3:4
- Features: text prompts, requires image input, edit/reference workflow
- Output format: send `output_format` as `png`, `jpeg`, or `webp`.
- Image inputs: send `image_urls` with HTTPS URLs, data URLs, or raw base64 image strings; never send local file paths or file:// URLs
- Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/alt/image/qweneditapi_modify.webp
- Description: Super-charged Qwen Image, with full image editing capabilities.

### Supra (`supra`)
- Model name to send: `supra`
- Creator: imgnAI
- Cost: 2 credits per image (~$0.0104)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 0
- Supported aspects: 1:1, 16:9, 21:9, 3:4, 9:16, 5:2, 4:3, 5:4, 4:5, 9:21, 4:7
- Features: standard image generation
- Output format: send `output_format` as `png`, `jpeg`, or `webp`.
- Image inputs: text prompt is the normal workflow; `image_url` or `image_urls` may be accepted where the backend supports it
- Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/supra2.webp
- Description: Digital render style model, similar to MidJourney v6.

### Evo (`evo`)
- Model name to send: `evo`
- Creator: imgnAI
- Cost: 2 credits per image (~$0.0104)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 0
- Supported aspects: 1:1, 16:9, 21:9, 3:4, 9:16, 5:2, 4:3, 5:4, 4:5, 9:21, 4:7
- Features: standard image generation
- Output format: send `output_format` as `png`, `jpeg`, or `webp`.
- Image inputs: text prompt is the normal workflow; `image_url` or `image_urls` may be accepted where the backend supports it
- Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/evo.webp
- Description: Slightly abstract realism, flexible model specializing in texture details.

### Toon (`toon`)
- Model name to send: `toon`
- Creator: imgnAI
- Cost: 2 credits per image (~$0.0104)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 0
- Supported aspects: 1:1, 16:9, 21:9, 3:4, 9:16, 5:2, 4:3, 5:4, 4:5, 9:21, 4:7
- Features: standard image generation
- Output format: send `output_format` as `png`, `jpeg`, or `webp`.
- Image inputs: text prompt is the normal workflow; `image_url` or `image_urls` may be accepted where the backend supports it
- Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/toon.webp
- Description: Western cartoon style model.

### Wassie (`wassie`)
- Model name to send: `wassie`
- Creator: imgnAI
- Cost: 2 credits per image (~$0.0104)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 0
- Supported aspects: 1:1, 16:9, 21:9, 3:4, 9:16, 5:2, 4:3, 5:4, 4:5, 9:21, 4:7
- Features: standard image generation
- Output format: send `output_format` as `png`, `jpeg`, or `webp`.
- Image inputs: text prompt is the normal workflow; `image_url` or `image_urls` may be accepted where the backend supports it
- Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/wassie.webp
- Description: Create Wassies of all shapes and sizes!

### HyperX (`hyperx`)
- Model name to send: `hyperx`
- Creator: imgnAI
- Cost: 2 credits per image (~$0.0104)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 0
- Supported aspects: 1:1, 16:9, 21:9, 3:4, 9:16, 5:2, 4:3, 5:4, 4:5, 9:21, 4:7
- Features: standard image generation
- Output format: send `output_format` as `png`, `jpeg`, or `webp`.
- Image inputs: text prompt is the normal workflow; `image_url` or `image_urls` may be accepted where the backend supports it
- Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/hyperx.webp
- Description: Legacy CG-style renders.

### FurXL Classic (`furxl-classic`)
- Model name to send: `furxl-classic`
- Creator: imgnAI
- Cost: 2 credits per image (~$0.0104)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 0
- Supported aspects: 1:1, 16:9, 21:9, 3:4, 9:16, 5:2, 4:3, 5:4, 4:5, 9:21, 4:7
- Features: standard image generation
- Output format: send `output_format` as `png`, `jpeg`, or `webp`.
- Image inputs: text prompt is the normal workflow; `image_url` or `image_urls` may be accepted where the backend supports it
- Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/furxl.webp
- Description: Furries in all shapes and sizes!

### Flux Kontext Max (`flux-kontext-max`)
- Model name to send: `flux-kontext-max`
- Creator: Black Forest Labs
- Cost: 20 credits per image (~$0.1040)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 0
- Supported aspects: 1:1, 16:9, 9:16, 4:3, 3:4, 3:2, 2:3, 4:5, 5:4, 21:9, 9:21, 2:1, 1:2
- Features: text prompts, requires image input, edit/reference workflow
- Output format: send `output_format` as `png`, `jpeg`, or `webp`.
- Image inputs: send `image_urls` with HTTPS URLs, data URLs, or raw base64 image strings; never send local file paths or file:// URLs
- Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/alt/image/fluxkontextmax_modify.webp
- Description: BFL's legacy high-end image editing model.

### Flux Kontext Pro (`flux-kontext-pro`)
- Model name to send: `flux-kontext-pro`
- Creator: Black Forest Labs
- Cost: 12 credits per image (~$0.0624)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 0
- Supported aspects: 1:1, 16:9, 9:16, 4:3, 3:4, 3:2, 2:3, 4:5, 5:4, 21:9, 9:21, 2:1, 1:2
- Features: text prompts, requires image input, edit/reference workflow
- Output format: send `output_format` as `png`, `jpeg`, or `webp`.
- Image inputs: send `image_urls` with HTTPS URLs, data URLs, or raw base64 image strings; never send local file paths or file:// URLs
- Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/alt/image/fluxkontextpro_modify.webp
- Description: BFL's legacy entry-level image editing model.

### Supra Classic (`supra-classic`)
- Model name to send: `supra-classic`
- Creator: imgnAI
- Cost: 2 credits per image (~$0.0104)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 0
- Supported aspects: 1:1, 16:9, 21:9, 3:4, 9:16, 5:2, 4:3, 5:4, 4:5, 9:21, 4:7
- Features: standard image generation
- Output format: send `output_format` as `png`, `jpeg`, or `webp`.
- Image inputs: text prompt is the normal workflow; `image_url` or `image_urls` may be accepted where the backend supports it
- Aspect AUTO: send `aspect_ratio: "auto"` to inspect the first image input; with no input it defaults to `1:1` when supported.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/image/supra.webp
- Description: High-range digital renders.

## Video Models

### Seedance 2.0 (`seedance-2-0`)
- Model name to send: `seedance-2-0`
- Creator: ByteDance
- Duration costs: 5 seconds: 200 credits (~$1.04); 10 seconds: 375 credits (~$1.95); 15 seconds: 550 credits (~$2.86)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 7
- Supported aspects: 16:9, 9:16, 1:1, 4:3, 3:4
- Features: first frame, last frame, reference images up to 7, audio input up to 3 files
- Audio output: generates audio by default
- Audio references accepted: yes, up to 3 files
- First frame input: supported
- Mid frame input: not supported
- Last frame input: supported
- Reference image input: supported, up to 7 images
- Image input value format: HTTPS URL, `data:image/...;base64,...` data URL, or raw base64 image string for supported image/frame/reference slots. Never send local file paths, file:// URLs, or http:// media URLs.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/video/seedance2video.webp
- Description: The world's frontier video model, offering unprecedented prompt adherence and audio/video quality.
- Custom rules for this model:
  - Reference images cannot be combined with first-frame or last-frame inputs.
  - Audio input cannot be combined with first-frame or last-frame inputs.
  - Last-frame input requires a first-frame input.
  - Combined selected audio input is limited to 15 seconds.

### Seedance 2.0 (480p) (`seedance-2-0-480p`)
- Model name to send: `seedance-2-0-480p`
- Creator: ByteDance
- Duration costs: 5 seconds: 120 credits (~$0.6240); 10 seconds: 230 credits (~$1.20); 15 seconds: 340 credits (~$1.77)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 7
- Supported aspects: 16:9, 9:16, 1:1, 4:3, 3:4
- Features: first frame, last frame, reference images up to 7, audio input up to 3 files
- Audio output: generates audio by default
- Audio references accepted: yes, up to 3 files
- First frame input: supported
- Mid frame input: not supported
- Last frame input: supported
- Reference image input: supported, up to 7 images
- Image input value format: HTTPS URL, `data:image/...;base64,...` data URL, or raw base64 image string for supported image/frame/reference slots. Never send local file paths, file:// URLs, or http:// media URLs.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/video/seedance2480pvideo.webp
- Description: All the power of Seedance 2.0's full model, optimized for mobile resolutions at half the cost!
- Custom rules for this model:
  - Reference images cannot be combined with first-frame or last-frame inputs.
  - Audio input cannot be combined with first-frame or last-frame inputs.
  - Last-frame input requires a first-frame input.
  - Combined selected audio input is limited to 15 seconds.

### Seedance 2.0 (Fast) (`seedance-2-0-fast`)
- Model name to send: `seedance-2-0-fast`
- Creator: ByteDance
- Duration costs: 5 seconds: 120 credits (~$0.6240); 10 seconds: 230 credits (~$1.20); 15 seconds: 340 credits (~$1.77)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 7
- Supported aspects: 16:9, 9:16, 1:1, 4:3, 3:4
- Features: first frame, last frame, reference images up to 7, audio input up to 3 file(s)
- Audio output: generates audio by default
- Audio references accepted: yes, up to 3 file(s)
- First frame input: supported
- Mid frame input: not supported
- Last frame input: supported
- Reference image input: supported, up to 7 image(s)
- Image input value format: HTTPS URL, `data:image/...;base64,...` data URL, or raw base64 image string for supported image/frame/reference slots. Never send local file paths, file:// URLs, or http:// media URLs.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/video/seedance2fastvideo.webp
- Description: Seedance 2.0's Fast model, offering a powerful budget entry for full resolution outputs!
- Custom rules for this model:
  - Reference images cannot be combined with first-frame or last-frame inputs.
  - Audio input cannot be combined with first-frame or last-frame inputs.
  - Last-frame input requires a first-frame input.
  - Combined selected audio input is limited to 15 seconds.

### LTX 2.3 (`ltx-2-3`)
- Model name to send: `ltx-2-3`
- Creator: Lightricks
- Duration costs: 5 seconds: 20 credits (~$0.1040); 10 seconds: 40 credits (~$0.2080); 15 seconds: 60 credits (~$0.3120); 20 seconds: 80 credits (~$0.4160)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 0
- Supported aspects: 16:9, 9:16, 1:1
- Features: first frame, last frame, audio input up to 1 file(s)
- Audio output: generates audio by default
- Audio references accepted: yes, up to 1 file(s)
- First frame input: supported
- Mid frame input: not supported
- Last frame input: supported
- Reference image input: not supported
- Image input value format: HTTPS URL, `data:image/...;base64,...` data URL, or raw base64 image string for supported image/frame/reference slots. Never send local file paths, file:// URLs, or http:// media URLs.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/video/ltx23video.webp
- Description: Powerful open source audio-video model by Lightricks - hosted in-house for premium results at dramatically lower prices.
- Custom rules for this model:
  - Audio input can only be used with first-frame conditioning.
  - Video duration follows the selected audio duration.

### Happy Horse 1.0 (1080p) (`happy-horse-1-0-1080p`)
- Model name to send: `happy-horse-1-0-1080p`
- Creator: Alibaba
- Duration costs: 5 seconds: 300 credits (~$1.56); 10 seconds: 600 credits (~$3.12); 15 seconds: 900 credits (~$4.68)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 9
- Supported aspects: 16:9, 9:16, 1:1, 4:3, 3:4
- Features: first frame, reference images up to 9
- Audio output: generates audio by default
- Audio references accepted: no
- First frame input: supported
- Mid frame input: not supported
- Last frame input: not supported
- Reference image input: supported, up to 9 image(s)
- Image input value format: HTTPS URL, `data:image/...;base64,...` data URL, or raw base64 image string for supported image/frame/reference slots. Never send local file paths, file:// URLs, or http:// media URLs.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/video/happyhorse101080pvideo.webp
- Description: Alibaba's newest frontier model; boasts breathtaking quality and prompt adherence, at full 1080p resolutions.
- Custom rules for this model:
  - Reference images cannot be combined with first-frame or last-frame inputs.

### Happy Horse 1.0 (720p) (`happy-horse-1-0-720p`)
- Model name to send: `happy-horse-1-0-720p`
- Creator: Alibaba
- Duration costs: 5 seconds: 150 credits (~$0.7800); 10 seconds: 300 credits (~$1.56); 15 seconds: 450 credits (~$2.34)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 9
- Supported aspects: 16:9, 9:16, 1:1, 4:3, 3:4
- Features: first frame, reference images up to 9
- Audio output: generates audio by default
- Audio references accepted: no
- First frame input: supported
- Mid frame input: not supported
- Last frame input: not supported
- Reference image input: supported, up to 9 image(s)
- Image input value format: HTTPS URL, `data:image/...;base64,...` data URL, or raw base64 image string for supported image/frame/reference slots. Never send local file paths, file:// URLs, or http:// media URLs.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/video/happyhorse10720pvideo.webp
- Description: Alibaba's powerful Happy Horse 1.0 model - retaining breathtaking quality at 720p resolutions, coupled with discounted pricing!
- Custom rules for this model:
  - Reference images cannot be combined with first-frame or last-frame inputs.

### WAN 2.7 (1080p) (`wan-2-7-1080p`)
- Model name to send: `wan-2-7-1080p`
- Creator: Alibaba
- Duration costs: 5 seconds: 130 credits (~$0.6760); 10 seconds: 260 credits (~$1.35)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 5
- Supported aspects: 16:9, 9:16, 1:1, 4:3, 3:4
- Features: first frame, last frame, reference images up to 5, audio input up to 1 file(s)
- Audio output: generates audio by default
- Audio references accepted: yes, up to 1 file(s)
- First frame input: supported
- Mid frame input: not supported
- Last frame input: supported
- Reference image input: supported, up to 5 image(s)
- Image input value format: HTTPS URL, `data:image/...;base64,...` data URL, or raw base64 image string for supported image/frame/reference slots. Never send local file paths, file:// URLs, or http:// media URLs.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/video/wan271080pvideo.webp
- Description: WAN 2.7 excels in creating videos with breathtaking consistency across reference images and high-fidelity textures.
- Custom rules for this model:
  - Reference audio is interpreted as voice timbre when reference images are present.
  - Reference images cannot be combined with first-frame or last-frame inputs.

### WAN 2.7 (720p) (`wan-2-7-720p`)
- Model name to send: `wan-2-7-720p`
- Creator: Alibaba
- Duration costs: 5 seconds: 90 credits (~$0.4680); 10 seconds: 180 credits (~$0.9360)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 5
- Supported aspects: 16:9, 9:16, 1:1, 4:3, 3:4
- Features: first frame, last frame, reference images up to 5, audio input up to 1 file(s)
- Audio output: generates audio by default
- Audio references accepted: yes, up to 1 file(s)
- First frame input: supported
- Mid frame input: not supported
- Last frame input: supported
- Reference image input: supported, up to 5 image(s)
- Image input value format: HTTPS URL, `data:image/...;base64,...` data URL, or raw base64 image string for supported image/frame/reference slots. Never send local file paths, file:// URLs, or http:// media URLs.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/video/wan27720pvideo.webp
- Description: Alibaba's 720p WAN 2.7 model - retaining excellent quality, at a discounted rate!
- Custom rules for this model:
  - Reference audio is interpreted as voice timbre when reference images are present.
  - Reference images cannot be combined with first-frame or last-frame inputs.

### Kling O3 4K (`kling-o3-4k`)
- Model name to send: `kling-o3-4k`
- Creator: Kling AI
- Duration costs: 5 seconds: 450 credits (~$2.34); 10 seconds: 900 credits (~$4.68); 15 seconds: 1350 credits (~$7.02)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 6
- Supported aspects: 16:9, 9:16, 1:1
- Features: first frame, last frame, reference images up to 6
- Audio output: generates audio by default
- Audio references accepted: no
- First frame input: supported
- Mid frame input: not supported
- Last frame input: supported
- Reference image input: supported, up to 6 image(s)
- Image input value format: HTTPS URL, `data:image/...;base64,...` data URL, or raw base64 image string for supported image/frame/reference slots. Never send local file paths, file:// URLs, or http:// media URLs.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/video/kling304kvideo.webp
- Description: Kling's flagship O3 model, in astonishing 4K quality!
- Custom rules for this model: none

### Kling 3.0 Pro (`kling-3-0-kling30pro`)
- Model name to send: `kling-3-0-kling30pro`
- Creator: Kling AI
- Duration costs: 5 seconds: 350 credits (~$1.82); 10 seconds: 650 credits (~$3.38); 15 seconds: 925 credits (~$4.81)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 6
- Supported aspects: 16:9, 9:16, 1:1
- Features: first frame, last frame, reference images up to 6
- Audio output: generates audio by default
- Audio references accepted: no
- First frame input: supported
- Mid frame input: not supported
- Last frame input: supported
- Reference image input: supported, up to 6 image(s)
- Image input value format: HTTPS URL, `data:image/...;base64,...` data URL, or raw base64 image string for supported image/frame/reference slots. Never send local file paths, file:// URLs, or http:// media URLs.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/video/kling30provideo.webp
- Description: Kling's flagship 3.0 model, pairing excellent prompt adherence with industry-quality audio/video output.
- Custom rules for this model: none

### Kling 3.0 (`kling-3-0-kling30`)
- Model name to send: `kling-3-0-kling30`
- Creator: Kling AI
- Duration costs: 5 seconds: 280 credits (~$1.46); 10 seconds: 550 credits (~$2.86); 15 seconds: 800 credits (~$4.16)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 6
- Supported aspects: 16:9, 9:16, 1:1
- Features: first frame, last frame, reference images up to 6
- Audio output: generates audio by default
- Audio references accepted: no
- First frame input: supported
- Mid frame input: not supported
- Last frame input: supported
- Reference image input: supported, up to 6 image(s)
- Image input value format: HTTPS URL, `data:image/...;base64,...` data URL, or raw base64 image string for supported image/frame/reference slots. Never send local file paths, file:// URLs, or http:// media URLs.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/video/kling30video.webp
- Description: Kling's cutting-edge 3.0 model, with great prompt adherence and stellar-quality audio/video output.
- Custom rules for this model: none

### Veo3.1 (`veo3-1`)
- Model name to send: `veo3-1`
- Creator: Google
- Duration costs: 4 seconds: 380 credits (~$1.98); 8 seconds: 750 credits (~$3.90)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 0
- Supported aspects: 16:9, 9:16
- Features: first frame
- Audio output: generates audio by default
- Audio references accepted: no
- First frame input: supported
- Mid frame input: not supported
- Last frame input: not supported
- Reference image input: not supported
- Image input value format: HTTPS URL, `data:image/...;base64,...` data URL, or raw base64 image string for supported image/frame/reference slots. Never send local file paths, file:// URLs, or http:// media URLs.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/video/veo3video.webp
- Description: Google's updated Veo3.1 model, showcasing 1080p cinematic-quality video with full audio support.
- Custom rules for this model: none

### Veo3.1 Fast (`veo3-1-fast`)
- Model name to send: `veo3-1-fast`
- Creator: Google
- Duration costs: 4 seconds: 160 credits (~$0.8320); 8 seconds: 300 credits (~$1.56)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 0
- Supported aspects: 16:9, 9:16
- Features: first frame
- Audio output: generates audio by default
- Audio references accepted: no
- First frame input: supported
- Mid frame input: not supported
- Last frame input: not supported
- Reference image input: not supported
- Image input value format: HTTPS URL, `data:image/...;base64,...` data URL, or raw base64 image string for supported image/frame/reference slots. Never send local file paths, file:// URLs, or http:// media URLs.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/video/veo3videofast.webp
- Description: Veo 3.1's little brother - bringing high speeds at lower costs, while still offering superb quality audio/video generation.
- Custom rules for this model: none

### Veo3.1 Lite (`veo3-1-lite`)
- Model name to send: `veo3-1-lite`
- Creator: Google
- Duration costs: 8 seconds: 140 credits (~$0.7280)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 0
- Supported aspects: 16:9, 9:16
- Features: first frame
- Audio output: generates audio by default
- Audio references accepted: no
- First frame input: supported
- Mid frame input: not supported
- Last frame input: not supported
- Reference image input: not supported
- Image input value format: HTTPS URL, `data:image/...;base64,...` data URL, or raw base64 image string for supported image/frame/reference slots. Never send local file paths, file:// URLs, or http:// media URLs.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/video/veo3videolite.webp
- Description: Google's latest offering in their Veo3.1 line, packing excellent quality at an incredibly low cost!
- Custom rules for this model: none

### Seedance Pro (`seedance-pro`)
- Model name to send: `seedance-pro`
- Creator: ByteDance
- Duration costs: 5 seconds: 250 credits (~$1.30); 10 seconds: 450 credits (~$2.34)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 0
- Supported aspects: 16:9, 9:16, 1:1, 4:3, 3:4
- Features: requires image input, first frame
- Audio output: silent/no generated audio
- Audio references accepted: no
- First frame input: supported
- Mid frame input: not supported
- Last frame input: not supported
- Reference image input: not supported
- Image input value format: HTTPS URL, `data:image/...;base64,...` data URL, or raw base64 image string for supported image/frame/reference slots. Never send local file paths, file:// URLs, or http:// media URLs.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/video/seedance1provideo.webp
- Description: Excellent-quality HD outputs across a range of styles, with strong prompt adherence.
- Custom rules for this model: none

### WAN 2.5 (`wan-2-5`)
- Model name to send: `wan-2-5`
- Creator: Alibaba Cloud
- Duration costs: 5 seconds: 220 credits (~$1.14); 10 seconds: 400 credits (~$2.08)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 0
- Supported aspects: 16:9, 9:16, 1:1, 4:3, 3:4, 21:9, 9:21
- Features: first frame
- Audio output: generates audio by default
- Audio references accepted: no
- First frame input: supported
- Mid frame input: not supported
- Last frame input: not supported
- Reference image input: not supported
- Image input value format: HTTPS URL, `data:image/...;base64,...` data URL, or raw base64 image string for supported image/frame/reference slots. Never send local file paths, file:// URLs, or http:// media URLs.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/video/wan25video.webp
- Description: Excellent quality HD outputs with full audio generation, at a stunningly low cost.
- Custom rules for this model: none

### Hailuo 2 (MiniMax) (`hailuo-2-minimax`)
- Model name to send: `hailuo-2-minimax`
- Creator: Hailuo AI
- Duration costs: 6 seconds: 200 credits (~$1.04)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 0
- Supported aspects: 16:9, 9:16, 1:1, 4:3, 3:4, 21:9, 9:21
- Features: requires image input, first frame
- Audio output: silent/no generated audio
- Audio references accepted: no
- First frame input: supported
- Mid frame input: not supported
- Last frame input: not supported
- Reference image input: not supported
- Image input value format: HTTPS URL, `data:image/...;base64,...` data URL, or raw base64 image string for supported image/frame/reference slots. Never send local file paths, file:// URLs, or http:// media URLs.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/video/minimax2video.webp
- Description: Great quality at competitive pricing; excels at text rendering.
- Custom rules for this model: none

### Kling 2.1 (`kling-2-1-kling21`)
- Model name to send: `kling-2-1-kling21`
- Creator: Kling AI
- Duration costs: 5 seconds: 160 credits (~$0.8320); 10 seconds: 300 credits (~$1.56)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 0
- Supported aspects: 16:9, 9:16, 1:1, 4:3, 3:4, 21:9, 9:21
- Features: requires image input, first frame
- Audio output: silent/no generated audio
- Audio references accepted: no
- First frame input: supported
- Mid frame input: not supported
- Last frame input: not supported
- Reference image input: not supported
- Image input value format: HTTPS URL, `data:image/...;base64,...` data URL, or raw base64 image string for supported image/frame/reference slots. Never send local file paths, file:// URLs, or http:// media URLs.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/video/kling21video.webp
- Description: Quality close to Kling's master 2.0 model, at budget pricing.
- Custom rules for this model: none

### Kling 2.1 (Loop) (`kling-2-1-kling21loop`)
- Model name to send: `kling-2-1-kling21loop`
- Creator: Kling AI
- Duration costs: 5 seconds: 160 credits (~$0.8320); 10 seconds: 300 credits (~$1.56)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 0
- Supported aspects: 16:9, 9:16, 1:1, 4:3, 3:4, 21:9, 9:21
- Features: requires image input, first frame
- Audio output: silent/no generated audio
- Audio references accepted: no
- First frame input: supported
- Mid frame input: not supported
- Last frame input: not supported
- Reference image input: not supported
- Image input value format: HTTPS URL, `data:image/...;base64,...` data URL, or raw base64 image string for supported image/frame/reference slots. Never send local file paths, file:// URLs, or http:// media URLs.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/video/kling21loopedvideo.webp
- Description: Infinitely looped videos with Kling 2.1!
- Custom rules for this model: none

### Seedance Lite (`seedance-lite-seedancelite`)
- Model name to send: `seedance-lite-seedancelite`
- Creator: ByteDance
- Duration costs: 5 seconds: 120 credits (~$0.6240); 10 seconds: 200 credits (~$1.04)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 0
- Supported aspects: 16:9, 9:16, 1:1, 4:3, 3:4
- Features: requires image input, first frame
- Audio output: silent/no generated audio
- Audio references accepted: no
- First frame input: supported
- Mid frame input: not supported
- Last frame input: not supported
- Reference image input: not supported
- Image input value format: HTTPS URL, `data:image/...;base64,...` data URL, or raw base64 image string for supported image/frame/reference slots. Never send local file paths, file:// URLs, or http:// media URLs.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/video/seedance1litevideo.webp
- Description: Exceptional cost-to-performance video model.
- Custom rules for this model: none

### Seedance Lite (Loop) (`seedance-lite-seedanceliteloop`)
- Model name to send: `seedance-lite-seedanceliteloop`
- Creator: ByteDance
- Duration costs: 5 seconds: 120 credits (~$0.6240); 10 seconds: 200 credits (~$1.04)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 0
- Supported aspects: 16:9, 9:16, 1:1, 4:3, 3:4
- Features: requires image input, first frame
- Audio output: silent/no generated audio
- Audio references accepted: no
- First frame input: supported
- Mid frame input: not supported
- Last frame input: not supported
- Reference image input: not supported
- Image input value format: HTTPS URL, `data:image/...;base64,...` data URL, or raw base64 image string for supported image/frame/reference slots. Never send local file paths, file:// URLs, or http:// media URLs.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/video/seedance1liteloopedvideo.webp
- Description: Infinitely looped videos with Seedance Lite!
- Custom rules for this model: none

### Kling 2.0 (`kling-2-0`)
- Model name to send: `kling-2-0`
- Creator: Kling AI
- Duration costs: 5 seconds: 350 credits (~$1.82); 10 seconds: 650 credits (~$3.38)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 0
- Supported aspects: 16:9, 9:16, 1:1
- Features: requires image input, first frame
- Audio output: silent/no generated audio
- Audio references accepted: no
- First frame input: supported
- Mid frame input: not supported
- Last frame input: not supported
- Reference image input: not supported
- Image input value format: HTTPS URL, `data:image/...;base64,...` data URL, or raw base64 image string for supported image/frame/reference slots. Never send local file paths, file:// URLs, or http:// media URLs.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/video/kling2video.webp
- Description: Kling's raw/master 2.0 video model. Excellent quality, but steeply priced for a video-only model.
- Custom rules for this model: none

### Kling 1.6 (`kling-1-6`)
- Model name to send: `kling-1-6`
- Creator: Kling AI
- Duration costs: 5 seconds: 130 credits (~$0.6760); 10 seconds: 250 credits (~$1.30)
- x402: supported with Base USDC and Solana USDC. Call without API-key headers to receive exact payment requirements.
- Reference images supported: 0
- Supported aspects: 16:9, 9:16, 1:1
- Features: requires image input, first frame
- Audio output: silent/no generated audio
- Audio references accepted: no
- First frame input: supported
- Mid frame input: not supported
- Last frame input: not supported
- Reference image input: not supported
- Image input value format: HTTPS URL, `data:image/...;base64,...` data URL, or raw base64 image string for supported image/frame/reference slots. Never send local file paths, file:// URLs, or http:// media URLs.
- Thumbnail: https://wasmall.imgnai.com/static/thumbnails/video/kling16video.webp
- Description: [Legacy] Kling's early-era 1.6 video generation model.
- Custom rules for this model: none
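Every card above repeats the same image-input rule: values must be an HTTPS URL, a `data:image/...;base64,...` data URL, or a raw base64 string, and never a local path, `file://` URL, or `http://` URL. A client can enforce this before submitting a request. The sketch below is illustrative only: the validation helper and the request-body field names (`model`, `prompt`, `duration`, `aspect_ratio`, `first_frame`) are assumptions for demonstration, not the confirmed `/v1/generation-requests` schema.

```python
import base64
import re

def is_valid_image_value(value: str) -> bool:
    """Accept HTTPS URLs, data:image/...;base64 data URLs, or raw base64.

    Rejects local paths, file:// URLs, and plain http:// URLs, per the
    catalog's image-input rules.
    """
    if value.startswith("https://"):
        return True
    if re.match(r"^data:image/[\w.+-]+;base64,", value):
        return True
    if value.startswith(("http://", "file://", "/", "./", "~")):
        return False
    # Fall back to checking for a plausible raw base64 payload.
    try:
        base64.b64decode(value, validate=True)
        return len(value) > 0
    except Exception:
        return False

def build_request(model, prompt, duration, aspect, first_frame=None):
    """Build a hypothetical generation-request body; field names are
    illustrative assumptions, not the confirmed API schema."""
    body = {"model": model, "prompt": prompt,
            "duration": duration, "aspect_ratio": aspect}
    if first_frame is not None:
        if not is_valid_image_value(first_frame):
            raise ValueError(
                "first_frame must be an HTTPS URL, data URL, or raw base64")
        body["first_frame"] = first_frame
    return body
```

Rejecting `http://` media URLs client-side mirrors the guide's HTTPS-only rule and avoids a round trip that would fail server-side.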
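The duration-cost bullets above map directly to rough USD figures via the guide's reference price of $0.0052 per credit. A small lookup helper makes budgeting programmatic; the credit tables below copy a few cards from this catalog, and the helper name is ours, not part of the API.

```python
# Reference price from this guide: $0.0052 per credit (Platinum Annual).
CREDIT_USD = 0.0052

# Duration costs in credits for a few catalog models (copied from the cards above).
DURATION_CREDITS = {
    "seedance-2-0-480p": {5: 120, 10: 230, 15: 340},
    "ltx-2-3": {5: 20, 10: 40, 15: 60, 20: 80},
    "kling-o3-4k": {5: 450, 10: 900, 15: 1350},
    "veo3-1": {4: 380, 8: 750},
}

def estimate_usd(model: str, seconds: int) -> float:
    """Rough USD estimate for one generation at the reference credit price.

    Raises KeyError for an unknown model or unsupported duration, since
    each model only accepts the specific durations listed on its card.
    """
    credits = DURATION_CREDITS[model][seconds]
    return round(credits * CREDIT_USD, 4)
```

For example, a 5-second Seedance 2.0 (480p) clip is 120 credits, about $0.624, matching the card's "~$0.6240" figure.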
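The Seedance 2.0 cards share several mutual-exclusion rules (reference images vs. frames, audio vs. frames, last frame requires first frame). These can be pre-checked before spending credits. The function below is a local sketch of those rules; its name and parameters are ours, and it cannot check the 15-second combined-audio-duration rule, which needs the actual audio lengths.

```python
def check_seedance_2_inputs(reference_images=(), audio_files=(),
                            first_frame=None, last_frame=None,
                            max_refs=7, max_audio=3):
    """Validate the Seedance 2.0 custom rules listed in the catalog.

    Returns a list of violated rules; an empty list means the input
    combination is allowed. Does not verify the 15-second combined
    audio-duration limit, which requires decoding the audio files.
    """
    errors = []
    if len(reference_images) > max_refs:
        errors.append(f"at most {max_refs} reference images")
    if len(audio_files) > max_audio:
        errors.append(f"at most {max_audio} audio files")
    if reference_images and (first_frame or last_frame):
        errors.append("reference images cannot be combined with first/last frame")
    if audio_files and (first_frame or last_frame):
        errors.append("audio input cannot be combined with first/last frame")
    if last_frame and not first_frame:
        errors.append("last-frame input requires a first-frame input")
    return errors
```

The same shape extends to other cards: WAN 2.7 and Happy Horse 1.0 share the reference-images-vs-frames exclusion, just with different maximum counts.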