Toggle enable_thinking and low_effort live, and dial reasoning_budget to cap how many thinking tokens the model may spend.
Watch reasoning tokens stream in green, then watch the incremental delta.tool_calls arguments build up in orange before the tool executes.
The model is post-trained for long-horizon orchestration. This tiny harness makes the capabilities visible.
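The incremental delta.tool_calls behavior described above can be sketched as follows. This is a minimal illustration, not the code in app.py: it assumes OpenAI-compatible streaming chunks, where each tool-call fragment carries an `index` and a partial `function.arguments` string. The helper name `accumulate_tool_calls`, the sample deltas, the `get_weather` tool, and the `reasoning_content` field name are all hypothetical.

```python
import json
from collections import defaultdict
from typing import Any, Dict, Iterable, List


def accumulate_tool_calls(deltas: Iterable[Dict[str, Any]]) -> List[Dict[str, str]]:
    """Merge streamed tool-call fragments into complete calls, keyed by index."""
    calls: Dict[int, Dict[str, str]] = defaultdict(
        lambda: {"id": "", "name": "", "arguments": ""}
    )
    for delta in deltas:
        # Deltas that carry only reasoning or content text have no tool_calls.
        for fragment in delta.get("tool_calls") or []:
            call = calls[fragment.get("index", 0)]
            if fragment.get("id"):
                call["id"] = fragment["id"]
            function = fragment.get("function") or {}
            if function.get("name"):
                call["name"] = function["name"]
            # Arguments arrive as partial JSON text; concatenate until complete.
            call["arguments"] += function.get("arguments") or ""
    return [calls[i] for i in sorted(calls)]


# Hypothetical deltas shaped like OpenAI-compatible streaming chunks.
# The "reasoning_content" key is an assumed field name for thinking tokens.
example_deltas = [
    {"reasoning_content": "I should look up the weather."},
    {"tool_calls": [{"index": 0, "id": "call_1",
                     "function": {"name": "get_weather", "arguments": ""}}]},
    {"tool_calls": [{"index": 0, "function": {"arguments": '{"city": '}}]},
    {"tool_calls": [{"index": 0, "function": {"arguments": '"Paris"}'}}]},
]

if __name__ == "__main__":
    for call in accumulate_tool_calls(example_deltas):
        # Parse the JSON only after the stream has delivered every fragment.
        print(call["name"], json.loads(call["arguments"]))
```

The arguments string is valid JSON only once the last fragment has arrived, which is why a harness can render the partial text as it builds but should defer parsing and tool execution until the stream signals the call is complete.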
git clone https://github.com/cobusgreyling/nemotron-3-ultra-showcase.git
cd nemotron-3-ultra-showcase
pip install -r requirements.txt
export NVIDIA_API_KEY="nvapi-..."
python app.py
# open http://localhost:7860
See the full analysis and build notes in BLOG.md. The same reasoning and tool-calling patterns are productionized in nemotron-think.