The flagship multimodal LLM that natively processes text, image, and audio inputs. It delivers superior performance in real-time interaction, coding, and complex reasoning.
Key Features
- Native Multimodality: Processes all inputs (text, image, audio) through a single network for seamless performance.
- Real-Time Speed: Optimized for rapid, low-latency interaction, enabling fluid voice and video chat applications.
- Superior Reasoning: High performance in complex logic, coding, and multi-step problem-solving.