POST
/
v1
/
completions
Native OpenAI format
curl --request POST \
  --url https://tokensmind.ai/v1/completions \
  --header 'Authorization: <authorization>' \
  --header 'Content-Type: <content-type>' \
  --data '{
  "model": "<string>",
  "prompt": "<string>",
  "max_tokens": 123,
  "stream": false,
  "stream_options": {
    "include_usage": true
  },
  "n": 1,
  "seed": 123,
  "frequency_penalty": 0,
  "presence_penalty": 0,
  "repetition_penalty": 1,
  "stop": "<string>",
  "temperature": 1,
  "top_p": 1,
  "top_k": 40,
  "min_p": 0,
  "logit_bias": {},
  "logprobs": 5,
  "best_of": 1
}'
{
  "choices": [
    {
      "finish_reason": "<string>",
      "index": 123,
      "logprobs": {
        "text_offset": [
          {}
        ],
        "token_logprobs": [
          {}
        ],
        "tokens": [
          {}
        ],
        "top_logprobs": [
          {
            "{key}": 123
          }
        ]
      },
      "text": "<string>"
    }
  ],
  "created": 123,
  "id": "<string>",
  "model": "<string>",
  "object": "<string>",
  "usage": {
    "completion_tokens": 123,
    "prompt_tokens": 123,
    "total_tokens": 123
  }
}
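The same request can be issued from Python with only the standard library; a minimal sketch, where the Bearer scheme in the Authorization header and the model name are assumptions, not values confirmed by this reference:

```python
import json
import urllib.request

API_URL = "https://tokensmind.ai/v1/completions"

def build_payload(model, prompt, **params):
    """Assemble a /v1/completions request body; extra sampling
    parameters (temperature, top_p, ...) are passed through as-is."""
    payload = {"model": model, "prompt": prompt}
    payload.update(params)
    return payload

def create_completion(api_key, payload):
    """POST the payload and return the decoded JSON response."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # Bearer scheme is an assumption
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))

payload = build_payload("my-model", "Say hello.", max_tokens=16, temperature=0.2)
```

Calling create_completion with a valid API key returns the JSON object shown above.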

Headers

Content-Type
string
required
Authorization
string
required

Body

application/json
model
string
required

Model name to use. See the model catalog on TokensMind AI for available names.

prompt
string
required

Prompt used to generate the completion (may be a string, array of strings, array of tokens, or array of token arrays).

best_of
integer
required

Defaults to 1. Generates best_of completions on the server and returns the "best" one (highest per-token log probability). Cannot be used with streaming.

When used with n, best_of is the number of candidate completions and n is how many to return. best_of must be greater than n.

Note: This can consume your token quota quickly. Use with care and set max_tokens and stop reasonably.
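The constraints above (best_of incompatible with streaming, and best_of > n when both are set) can be checked client-side before sending a request. A minimal sketch; the function name and error messages are illustrative, not part of the API:

```python
def validate_best_of(best_of=1, n=1, stream=False):
    """Check the documented constraints on best_of before sending a request."""
    if stream and best_of > 1:
        raise ValueError("best_of cannot be used with streaming")
    if n > 1 and best_of <= n:
        raise ValueError("best_of must be greater than n")

# Valid: 3 candidates generated server-side, 2 returned.
validate_best_of(best_of=3, n=2)
```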

max_tokens
integer

Maximum number of tokens to generate. The sum of prompt tokens and max_tokens must not exceed the model's context length.

temperature
number | null

Sampling temperature; default is 1, between 0 and 2. Higher values (e.g. 0.8) make output more random; lower values (e.g. 0.2) make it more focused and deterministic.

We generally recommend changing either this or top_p, not both.

Required range: 0 <= x <= 2
top_p
number | null

An alternative to temperature, called nucleus sampling: the model considers tokens whose cumulative probability mass is top_p. So 0.1 means only the top 10% probability mass is considered. We generally recommend changing either this or temperature, not both.

Required range: 0 < x <= 1
n
integer | null

Number of completions to generate for each prompt.

Note: This parameter can generate many completions and may consume your token quota quickly. Use with care and set max_tokens and stop to reasonable values.

Required range: 0 < x < 128
stream
boolean | null

Whether to stream tokens. Defaults to false. When enabled, tokens are sent as data-only server-sent events (SSE), ending with a data: [DONE] message.
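When stream is true, the response body is a sequence of data-only SSE lines terminated by a data: [DONE] sentinel, as described above. A minimal parsing sketch, assuming each chunk carries the same choices shape as the non-streaming response; the sample lines are canned, not real API output:

```python
import json

def parse_sse_stream(lines):
    """Yield decoded completion chunks from data-only SSE lines,
    stopping at the 'data: [DONE]' sentinel."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return
        yield json.loads(data)

# Canned lines in the documented wire format:
sample = [
    'data: {"choices": [{"text": "Hel", "index": 0}]}',
    'data: {"choices": [{"text": "lo", "index": 0}]}',
    "data: [DONE]",
]
chunks = list(parse_sse_stream(sample))
text = "".join(c["choices"][0]["text"] for c in chunks)
```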

stop
string

Up to 4 sequences where the API will stop generating further tokens. Returned text includes the stop sequence.

stream_options
object

Streaming options. Only set when stream is true.

seed
integer | null

If specified, the system will try to sample deterministically so repeated requests with the same seed and parameters return the same result.

frequency_penalty
number | null
default:0

Default 0. Positive values penalize new tokens based on their frequency in the text so far, reducing repetition.

For mild repetition reduction, try roughly 0.1 to 1. For strong suppression, you can increase toward 2, but quality may drop. Negative values encourage repetition.

See also presence_penalty for a fixed penalty on tokens that appear at least once.

Required range: -2 < x < 2
presence_penalty
number | null

Default 0. Positive values penalize new tokens based on whether they appear in the text so far, encouraging new topics.

For mild repetition reduction, try roughly 0.1 to 1. For strong suppression, you can increase toward 2, but quality may drop significantly. Negative values encourage repetition.

See also frequency_penalty for frequency-based penalization.

Required range: -2 < x < 2
repetition_penalty
number | null

Penalty applied to repeated tokens to discourage or encourage repetition. 1.0 means no penalty and free repetition. Values above 1.0 penalize repetition. Values between 0.0 and 1.0 reward repetition. A balanced value is often around 1.2. On decoder-only models, the penalty applies to both the prompt and the generated output.

Required range: 0 < x < 2
top_k
number | null

Top-k sampling keeps only the k most likely next tokens and redistributes probability mass among them. k controls how many candidates are considered per step.

Required range: 1 < x < 128
min_p
number | null

Minimum relative probability for a token to be considered, compared to the most likely token.

logit_bias
object

Defaults to null. Modifies likelihood of specified tokens in the completion. Accepts a JSON object mapping token IDs to bias values from -100 to 100.

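Because logit_bias is a JSON object, token IDs are sent as string keys. A small sketch that builds such a map and clamps values to the documented [-100, 100] range; the token IDs used here are purely illustrative and depend on the model's tokenizer:

```python
def make_logit_bias(bias_by_token_id):
    """Build a logit_bias object: string token-ID keys mapped to bias
    values clamped to the documented [-100, 100] range."""
    return {
        str(token_id): max(-100, min(100, bias))
        for token_id, bias in bias_by_token_id.items()
    }

# -100 effectively bans a token; out-of-range values are clamped.
bias = make_logit_bias({50256: -100, 1234: 150})
```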
logprobs
integer | null

Returns log probabilities for the top logprobs output tokens at each step, including the probability of the selected token. For example, if logprobs is 5, the API returns the top 5 tokens' logprobs per step.

The maximum value for logprobs is 5.
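The logprobs object in each choice (see the response schema below) holds parallel arrays plus a per-step map of alternatives. A sketch of reading it, using a canned object in the documented shape; the sample tokens and values are illustrative:

```python
def top_tokens_per_step(logprobs_obj):
    """Given a choice's logprobs object, return for each step the
    alternative token with the highest log probability."""
    best = []
    for alternatives in logprobs_obj["top_logprobs"]:
        token, lp = max(alternatives.items(), key=lambda kv: kv[1])
        best.append((token, lp))
    return best

# Canned logprobs object in the documented shape:
sample = {
    "tokens": ["Hello", "!"],
    "token_logprobs": [-0.1, -0.4],
    "text_offset": [0, 5],
    "top_logprobs": [
        {"Hello": -0.1, "Hi": -2.3},
        {"!": -0.4, ".": -1.1},
    ],
}
best = top_tokens_per_step(sample)
```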

Response

200 - application/json

Completion created successfully

id
string
object
string
Example:

"text_completion"

created
integer
model
string
choices
object[]
usage
object