🌙 Moondream 3 Preview - Vision Language Model

Architecture: 9B total parameters, 2B active, with mixture-of-experts
Skills: Query (Q&A), Caption, Point detection, Object detection
Features: 32K context length, multi-crop high resolution processing
Model: moondream/moondream3-preview

Moondream 3 is a vision language model built on a mixture-of-experts architecture. This demo showcases all four of its skills: Query, Caption, Point, and Detect.
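To make the four skills concrete, here is a minimal sketch of how a demo like this might route a request to the model. It assumes the loaded model object exposes `query`, `caption`, `point`, and `detect` methods with the argument shapes shown; those names and signatures are assumptions for illustration and are not confirmed by this page. `run_skill` and `SKILLS` are hypothetical helpers, not part of any library.

```python
from typing import Any, Optional

# The four skills named in the demo description.
SKILLS = ("query", "caption", "point", "detect")


def run_skill(model: Any, skill: str, image: Any, prompt: Optional[str] = None) -> Any:
    """Dispatch one of the four demo skills to a model object.

    Assumption: `model` has query/caption/point/detect methods. This is a
    guess at the interface, not a documented contract.
    """
    skill = skill.lower()
    if skill not in SKILLS:
        raise ValueError(f"unknown skill {skill!r}; expected one of {SKILLS}")

    if skill == "caption":
        # Captioning needs only the image, so no prompt is required.
        return model.caption(image)

    # The remaining skills all need text: a question or an object description.
    if not prompt:
        raise ValueError(f"skill {skill!r} requires a text prompt")

    if skill == "query":
        return model.query(image=image, question=prompt)
    if skill == "point":
        return model.point(image, prompt)
    return model.detect(image, prompt)
```

With a real model loaded, a call such as `run_skill(model, "detect", image, "person")` would forward to the model's detection method. The shape of the returned value depends on the model and is not specified here.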

Example Queries
