Query AI (Ollama)
Query an Ollama-hosted LLM with streaming or batch responses
Use This When
- Building conversational AI assistants or chatbots
- Creating interactive voice interfaces that need natural language understanding
- Implementing intelligent agents that respond to user prompts
- Adding LLM reasoning or decision-making to pipelines
What It Does
- Sends prompts to Ollama-hosted LLM models with configurable temperature
- Accepts a per-request system prompt as a second input
- Supports streaming mode (token-by-token output chunks) or batch mode (a single full response)
- Optionally displays chain-of-thought reasoning before the final answer
- When enabled, preempts in-progress generation as soon as new input arrives
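The options above map onto Ollama's REST API. A minimal batch-mode sketch, assuming the default endpoint at http://localhost:11434 and the `/api/generate` route with a pulled model tag such as `llama3` (the component's internal wiring may differ):

```python
import json
import urllib.request

# Default Ollama endpoint (assumption; configure to match your deployment)
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(prompt, system=None, temperature=0.7, stream=False):
    """Assemble an Ollama /api/generate request body."""
    payload = {
        "model": "llama3",          # any locally pulled model tag
        "prompt": prompt,
        "stream": stream,           # False = single batch response
        "options": {"temperature": temperature},
    }
    if system is not None:
        payload["system"] = system  # per-request system prompt
    return payload

def query(prompt, **kwargs):
    """Send a batch request and return the generated text."""
    body = json.dumps(build_payload(prompt, **kwargs)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=body,
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

With `stream=True`, Ollama instead returns newline-delimited JSON chunks rather than one body.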
Works Best With
- Voice input → transcribe-audio → this component → TTS for voice assistants
- Document text → this component for Q&A or summarization
- Integration with activate-voice-assistant for wake-word gated conversations
- Multi-modal pipelines combining caption-image-lavis → this component for visual Q&A
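The voice-assistant pairing above is just function composition over pipeline stages. A sketch with hypothetical stub stages standing in for the real components (the stage names and signatures are illustrative assumptions, not the components' actual interfaces):

```python
# Stubs standing in for real pipeline components (hypothetical interfaces):
def transcribe_audio(audio_bytes):
    return "what is the capital of France?"   # stub: real stage runs STT

def query_llm(text):
    return f"Answer to: {text}"               # stub: real stage calls Ollama

def synthesize_speech(text):
    return text.encode()                      # stub: real stage runs TTS

def voice_assistant_pipeline(audio_bytes):
    """Voice input -> transcription -> LLM -> TTS, as in the pairing above."""
    return synthesize_speech(query_llm(transcribe_audio(audio_bytes)))
```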
Caveats
- Requires a running Ollama service with the model already pulled; the component fails without a reachable backend
- Streaming mode emits multiple output messages; downstream components must handle partial fragments
- Preemption stops in-progress generation abruptly and may truncate useful partial output
- Reasoning display adds significant latency and token usage
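Because streaming mode emits many fragment messages, downstream consumers typically need to reassemble them. A sketch, assuming Ollama-style chunks where each carries a `response` fragment and a final `done` flag:

```python
def accumulate_stream(chunks):
    """Join streamed fragments into one response; stop at the done marker."""
    parts = []
    for chunk in chunks:
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break
    return "".join(parts)

# Simulated fragments, as a streaming response would deliver them:
fake_chunks = [
    {"response": "Hel", "done": False},
    {"response": "lo!", "done": True},
]
```

A real consumer would apply the same accumulation incrementally, e.g. to update a UI as tokens arrive.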
Versions
- 74be85e8 (tag: latest, default), linux/amd64, automated release