POST /v1/chat/completions
Chat completions format
curl --request POST \
  --url https://tokensmind.ai/v1/chat/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: <content-type>' \
  --data '
{
  "model": "<string>",
  "messages": [
    {
      "content": {
        "type": "<string>",
        "text": "<string>",
        "image_url": "<string>",
        "video_url": "<string>"
      },
      "role": "<string>",
      "name": "<string>"
    }
  ],
  "max_tokens": 123,
  "stream": true,
  "stream_options": {
    "include_usage": true
  },
  "n": 1,
  "seed": 123,
  "frequency_penalty": 0,
  "presence_penalty": 0,
  "repetition_penalty": 1,
  "stop": "<string>",
  "temperature": 1,
  "top_p": 1,
  "top_k": 40,
  "min_p": 0,
  "logit_bias": {},
  "logprobs": true,
  "top_logprobs": 5,
  "tools": [
    {
      "type": "<string>",
      "function": {
        "name": "<string>",
        "description": "<string>",
        "parameters": {},
        "strict": true
      }
    }
  ],
  "response_format": {
    "type": "<string>",
    "json_schema": {
      "name": "<string>",
      "description": "<string>",
      "schema": {},
      "strict": true
    }
  },
  "separate_reasoning": true,
  "enable_thinking": true
}
'
{
  "id": "<string>",
  "object": "chat.completion",
  "created": 123,
  "model": "<string>",
  "choices": [
    {
      "index": 123,
      "message": {
        "role": "assistant",
        "content": "<string>",
        "name": "<string>",
        "tool_calls": [
          {
            "id": "<string>",
            "type": "function",
            "function": {
              "name": "<string>",
              "arguments": "<string>"
            }
          }
        ],
        "tool_call_id": "<string>",
        "reasoning_content": "<string>"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 123,
    "completion_tokens": 123,
    "total_tokens": 123,
    "prompt_tokens_details": {
      "cached_tokens": 123,
      "text_tokens": 123,
      "audio_tokens": 123,
      "image_tokens": 123
    },
    "completion_tokens_details": {
      "text_tokens": 123,
      "audio_tokens": 123,
      "reasoning_tokens": 123
    }
  },
  "system_fingerprint": "<string>"
}

Authorization

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Headers

Content-Type
enum<string>
required
Available options:
application/json
Authorization
string
required

Bearer authentication: Bearer {{API key}}.

Body

application/json
model
string
required

Model ID

Example:

"gpt-4"

messages
object[]
required

Conversation messages

temperature
number
Default: 1

Sampling temperature between 0 and 2. Higher values (e.g. 0.8) make output more random; lower values (e.g. 0.2) make it more focused and deterministic.

We generally recommend changing either this or top_p, not both.

Required range: 0 <= x <= 2
top_p
number | null
Default: 1

An alternative to temperature, called nucleus sampling: the model considers tokens whose cumulative probability mass is top_p. So 0.1 means only the top 10% probability mass is considered. We generally recommend changing either this or temperature, not both.

Required range: 0 < x <= 1
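A minimal sketch of what nucleus (top_p) sampling means: keep the smallest set of highest-probability tokens whose cumulative mass reaches top_p, then renormalize. This illustrates the parameter's semantics only; the server's actual sampler may differ in tie-breaking and implementation details.

```python
def nucleus_filter(probs, top_p):
    """probs: token -> probability. Returns the renormalized nucleus."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break  # smallest prefix covering top_p of the mass
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}

# With top_p=0.5, only the tokens covering the top 50% of mass survive.
print(nucleus_filter({"the": 0.5, "a": 0.3, "cat": 0.2}, 0.5))  # → {'the': 1.0}
```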
n
integer
Default: 1

How many completions to generate for each prompt. Note: This can consume your token quota quickly. Use with care and set max_tokens and stop to reasonable values.

Required range: 1 <= x <= 128
stream
boolean
Default: false

Whether to stream partial progress. When enabled, tokens are sent as data-only server-sent events (SSE) as they become available; the stream ends with a data: [DONE] message.
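A client-side sketch of consuming the stream described above: each event arrives as a `data: <json>` line, and `data: [DONE]` terminates the stream. The chunk shape below assumes the standard `chat.completion.chunk` delta format.

```python
import json

def collect_stream(lines):
    """Concatenate delta content from raw SSE lines until [DONE]."""
    text = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break  # stream terminator
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        text.append(delta.get("content") or "")
    return "".join(text)

sample = [
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    "data: [DONE]",
]
print(collect_stream(sample))  # → Hello
```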

stream_options
object

Options for streaming responses. Only set when stream is true.

stop
string | string[]

Up to 4 sequences where the API will stop generating further tokens. Returned text includes the stop sequence.

max_tokens
integer

Maximum number of tokens to generate

If your prompt (prior messages) plus max_tokens would exceed the model context length, behavior depends on context_length_exceeded_behavior. By default, max_tokens is reduced to fit the context window instead of returning an error.

max_completion_tokens
integer

Maximum completion tokens

presence_penalty
number
Default: 0

Positive values penalize new tokens based on whether they appear in the text so far, increasing the chance the model talks about new topics.

For mild repetition reduction, try roughly 0.1 to 1. For strong suppression, you can increase toward 2, but quality may drop. Negative values encourage repetition.

See also frequency_penalty for penalizing tokens by how often they appear.

Required range: -2 <= x <= 2
frequency_penalty
number
Default: 0

Positive values penalize new tokens based on their existing frequency in the text, reducing the chance of repeating the same line verbatim.

For mild repetition reduction, try roughly 0.1 to 1. For strong suppression, you can increase toward 2, but quality may drop. Negative values encourage repetition.

See also presence_penalty for a fixed penalty on tokens that appear at least once.

Required range: -2 <= x <= 2
logit_bias
object

Modifies the likelihood of specified tokens appearing in the completion. Accepts a JSON object mapping tokens to bias values between -100 and 100. Mathematically, the bias is added to the logits before sampling. Exact behavior varies by model. For example, "logit_bias":{"1024": 6} increases the likelihood of token ID 1024.
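A small sketch of building a logit_bias map as described above. In JSON, keys are token IDs serialized as strings, and values are clamped to the documented -100..100 range. The token IDs here are invented for illustration; real IDs depend on the model's tokenizer.

```python
def make_logit_bias(biases):
    """biases: token_id -> raw bias. Returns a JSON-ready logit_bias map."""
    return {str(token_id): max(-100, min(100, bias))
            for token_id, bias in biases.items()}

bias = make_logit_bias({1024: 6, 50256: -1000})  # -1000 clamps to -100
print(bias)  # → {'1024': 6, '50256': -100}
```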

tools
object[] | null

Tools the model may call. Currently only functions are supported. Provide a list of functions the model can generate JSON arguments for.

See the function calling guide for more.
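A sketch of a tools array matching the request schema above (type / function / name / description / parameters / strict). The weather function and its parameter schema are invented for illustration.

```python
import json

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {  # JSON Schema for the function arguments
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
            "strict": True,
        },
    }
]

body = {
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Weather in Paris?"}],
    "tools": tools,
}
print(json.dumps(body)["tools" in body])  # body serializes cleanly to JSON
```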

tool_choice
Available options:
none,
auto,
required
response_format
object
seed
integer

If specified, the system will try to sample deterministically so repeated requests with the same seed and parameters return the same result.

reasoning_effort
enum<string>

Reasoning effort (for models that support reasoning)

Available options:
low,
medium,
high
modalities
enum<string>[]
Available options:
text,
audio
audio
object
repetition_penalty
number

Penalty on repeated tokens to discourage or encourage repetition. 1.0 means no penalty and free repetition. Values above 1.0 penalize repetition. Values between 0.0 and 1.0 reward repetition. A balanced value is often around 1.2. The penalty applies to generated output and, in decoder-only models, to the prompt.

Required range: 0 < x < 2
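One common way this penalty is applied (the scheme used by Hugging Face transformers' repetition-penalty processor; the server's exact method may differ): logits of already-seen tokens are divided by the penalty when positive and multiplied when negative, so values above 1.0 discourage repeats and values below 1.0 encourage them.

```python
def apply_repetition_penalty(logits, seen_tokens, penalty):
    """logits: token -> raw logit. Penalize tokens already generated."""
    out = dict(logits)
    for tok in seen_tokens:
        if tok in out:
            # Divide positive logits, multiply negative ones: both moves
            # reduce the token's score when penalty > 1.0.
            out[tok] = out[tok] / penalty if out[tok] > 0 else out[tok] * penalty
    return out

print(apply_repetition_penalty({"cat": 2.0, "dog": -1.0}, {"cat", "dog"}, 1.2))
# cat ≈ 1.67, dog = -1.2
```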

top_k
integer | null

Top-k sampling keeps only the k most likely next tokens and redistributes probability mass among them. k controls how many candidates are considered per step.

Required range: 0 < x < 128
min_p
number | null

Minimum relative probability for a token to be considered, compared to the most likely token.

Required range: 0 <= x <= 1
logprobs
boolean | null
Default: false

Whether to return log probabilities of output tokens. If true, returns logprobs for each output token in the message content.

top_logprobs
integer | null

Integer between 0 and 20: how many most likely tokens to return at each position, each with an associated log probability. If you use this, set logprobs to true.

Required range: 0 <= x <= 20

Response

Response created successfully

id
string
object
string
Example:

"chat.completion"

created
integer
model
string
choices
object[]
usage
object
system_fingerprint
string