POST /v1/chat/completions
Create chat completion
curl --request POST \
  --url https://api.siray.ai/v1/chat/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "model": "z-ai/glm-4.5v",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Hello!"
    }
  ],
  "max_tokens": 256,
  "temperature": 0.7,
  "top_p": 1,
  "frequency_penalty": 0,
  "presence_penalty": 0,
  "repetition_penalty": 1,
  "stream": false,
  "thinking_enabled": false,
  "web_search": {
    "enable": false
  }
}'
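The same request can be issued from Python with only the standard library. A minimal sketch, assuming the endpoint accepts the OpenAI-compatible payload shown above; `build_chat_request` and `send_chat_request` are illustrative helper names, not part of the API:

```python
import json
import urllib.request


def build_chat_request(model: str, messages: list, **options) -> dict:
    """Assemble an OpenAI-compatible chat completion payload."""
    payload = {"model": model, "messages": messages}
    payload.update(options)  # optional fields: max_tokens, temperature, ...
    return payload


def send_chat_request(payload: dict, token: str) -> dict:
    """POST the payload to the completions endpoint and decode the JSON reply."""
    req = urllib.request.Request(
        "https://api.siray.ai/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


payload = build_chat_request(
    "z-ai/glm-4.5v",
    [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    max_tokens=256,
    temperature=0.7,
    stream=False,
)
```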

Authorizations

Authorization
string
header
required

Bearer authentication using API key

Body

application/json

Request payload

OpenAI-compatible chat completions API request format

messages
object[]
required

Array of conversation messages with roles

Minimum length: 1
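Each element of `messages` pairs a `role` with `content`. A sketch of a multi-turn history (the conversation text is illustrative):

```python
# A multi-turn conversation: the system message sets behavior, then
# user and assistant turns alternate. The array must contain at least
# one message (the documented minimum length).
messages = [
    {"role": "system", "content": "You are a concise technical assistant."},
    {"role": "user", "content": "What is nucleus sampling?"},
    {"role": "assistant", "content": "Sampling from the smallest token set "
                                     "whose probability mass exceeds top_p."},
    {"role": "user", "content": "And temperature?"},
]

assert len(messages) >= 1
```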
model
string
required

Model name to use for the request

frequency_penalty
number

Penalty for frequent tokens

Required range: -2 <= x <= 2
max_tokens
integer

Maximum number of tokens to generate

Required range: 1 <= x <= 4096
presence_penalty
number

Penalty for new topics

Required range: -2 <= x <= 2
repetition_penalty
number

Penalty for repeating tokens (1.0 = no penalty)

Required range: x >= 0
stream
boolean
default:false

Enable streaming response
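When `stream` is true, OpenAI-compatible APIs typically deliver the completion as server-sent events: each event is a `data: {...}` line carrying a delta, and the stream ends with `data: [DONE]`. A minimal consumer sketch under that assumption (the exact event format for this endpoint is not specified here):

```python
import json


def iter_stream_content(lines):
    """Yield content deltas from an OpenAI-style SSE stream.

    Assumes each event is a 'data: {...}' line and the stream
    terminates with 'data: [DONE]'.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines
        body = line[len("data:"):].strip()
        if body == "[DONE]":
            break
        chunk = json.loads(body)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            yield delta["content"]


sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
text = "".join(iter_stream_content(sample))  # "Hello"
```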

temperature
number

Controls randomness in output (higher = more random)

Required range: 0 <= x <= 2
thinking_enabled
boolean
default:false

Enable thinking/reasoning mode

top_p
number

Nucleus sampling parameter (controls diversity)

Required range: 0 <= x <= 1
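The documented ranges can be checked client-side before sending a request, avoiding a round trip for an invalid payload. A minimal sketch (the function name is illustrative):

```python
def validate_sampling_params(params: dict) -> None:
    """Raise ValueError if a parameter falls outside its documented range."""
    ranges = {
        "frequency_penalty": (-2, 2),
        "presence_penalty": (-2, 2),
        "temperature": (0, 2),
        "top_p": (0, 1),
        "max_tokens": (1, 4096),
        "repetition_penalty": (0, float("inf")),
    }
    for name, (low, high) in ranges.items():
        if name in params and not (low <= params[name] <= high):
            raise ValueError(f"{name}={params[name]} outside [{low}, {high}]")


validate_sampling_params({"temperature": 0.7, "top_p": 0.9})  # passes silently
```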

web_search
object

Web search configuration

Response

Successful response

Response object