Skip to main content

Documentation Index

Fetch the complete documentation index at: https://docs.siray.ai/llms.txt

Use this file to discover all available pages before exploring further.

Model Use Cases

A cutting-edge MoE model achieving SOTA performance across text, image, audio, and video simultaneously. It uses a Thinker–Talker architecture for low-latency, real-time, streaming responses.

Try Qwen3 Omni Flash on Siray.ai

Key Features

  • Natively Omni-Modal: Unifies processing of text, image, audio, and video, ensuring high performance across all modalities.
  • Real-Time Speed: Features ultra-low latency streaming and natural speech output, enabling fluent audio-visual dialogue.
  • SOTA Audio: Achieves state-of-the-art results in audio benchmarks, excelling at speech recognition and sound analysis.
  • Flexible Control: Supports customization via system prompts and function calling for seamless integration with external tools.

Get Started with the API