POST /v1/chat/completions
Create chat completion
curl --request POST \
  --url https://api.siray.ai/v1/chat/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "model": "z-ai/glm-4.5v",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Hello!"
    }
  ],
  "max_tokens": 256,
  "temperature": 0.7,
  "top_p": 1,
  "frequency_penalty": 0,
  "presence_penalty": 0,
  "repetition_penalty": 1,
  "stream": false,
  "thinking_enabled": false,
  "web_search": {
    "enable": false
  }
}'
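The same request can be issued from Python with only the standard library. A minimal sketch, assuming the endpoint accepts the OpenAI-compatible payload shown above; `build_chat_request` and `send_chat_request` are illustrative helper names, not part of the API:

```python
import json
import urllib.request


def build_chat_request(model: str, messages: list, **options) -> dict:
    """Assemble an OpenAI-compatible chat completion payload."""
    payload = {"model": model, "messages": messages}
    payload.update(options)  # optional fields: max_tokens, temperature, ...
    return payload


def send_chat_request(payload: dict, token: str) -> dict:
    """POST the payload to the completions endpoint and decode the JSON reply."""
    req = urllib.request.Request(
        "https://api.siray.ai/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


payload = build_chat_request(
    "z-ai/glm-4.5v",
    [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    max_tokens=256,
    temperature=0.7,
    stream=False,
)
```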

Authorizations

Authorization
string
header
required

Bearer authentication using API key

Body

application/json

Request payload

OpenAI-compatible chat completions API request format

messages
object[]
required

Array of conversation messages with roles

Minimum length: 1
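Each element of `messages` pairs a `role` with `content`. A sketch of a multi-turn history (the conversation text is illustrative):

```python
# A multi-turn conversation: the system message sets behavior, then
# user and assistant turns alternate. The array must contain at least
# one message (the documented minimum length).
messages = [
    {"role": "system", "content": "You are a concise technical assistant."},
    {"role": "user", "content": "What is nucleus sampling?"},
    {"role": "assistant", "content": "Sampling from the smallest token set "
                                     "whose probability mass exceeds top_p."},
    {"role": "user", "content": "And temperature?"},
]

assert len(messages) >= 1
```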
model
string
required

Model name to use for the request

frequency_penalty
number

Penalty for frequent tokens

Required range: -2 <= x <= 2
max_tokens
integer

Maximum number of tokens to generate

Required range: 1 <= x <= 4096
presence_penalty
number

Penalty for new topics

Required range: -2 <= x <= 2
repetition_penalty
number

Penalty for repeating tokens (1.0 = no penalty)

Required range: x >= 0
stream
boolean
default:false

Enable streaming response
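When `stream` is true, OpenAI-compatible APIs typically deliver the completion as server-sent events: each event is a `data: {...}` line carrying a delta, and the stream ends with `data: [DONE]`. A minimal consumer sketch under that assumption (the exact event format for this endpoint is not specified here):

```python
import json


def iter_stream_content(lines):
    """Yield content deltas from an OpenAI-style SSE stream.

    Assumes each event is a 'data: {...}' line and the stream
    terminates with 'data: [DONE]'.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines
        body = line[len("data:"):].strip()
        if body == "[DONE]":
            break
        chunk = json.loads(body)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            yield delta["content"]


sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
text = "".join(iter_stream_content(sample))  # "Hello"
```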

temperature
number

Controls randomness in output (higher = more random)

Required range: 0 <= x <= 2
thinking_enabled
boolean
default:false

Enable thinking/reasoning mode

top_p
number

Nucleus sampling parameter (controls diversity)

Required range: 0 <= x <= 1
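The documented ranges can be checked client-side before sending a request, avoiding a round trip for an invalid payload. A minimal sketch (the function name is illustrative):

```python
def validate_sampling_params(params: dict) -> None:
    """Raise ValueError if a parameter falls outside its documented range."""
    ranges = {
        "frequency_penalty": (-2, 2),
        "presence_penalty": (-2, 2),
        "temperature": (0, 2),
        "top_p": (0, 1),
        "max_tokens": (1, 4096),
        "repetition_penalty": (0, float("inf")),
    }
    for name, (low, high) in ranges.items():
        if name in params and not (low <= params[name] <= high):
            raise ValueError(f"{name}={params[name]} outside [{low}, {high}]")


validate_sampling_params({"temperature": 0.7, "top_p": 0.9})  # passes silently
```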

web_search
object

Web search configuration

Response

Successful response

Response object