
Query AI (Ollama)


Query Ollama LLM with streaming or batch responses

Use This When

  • Building conversational AI assistants or chatbots
  • Creating interactive voice interfaces needing natural language understanding
  • Implementing intelligent agents that respond to user prompts
  • Adding LLM reasoning or decision-making to pipelines

What It Does

  • Sends prompts to Ollama-hosted LLM models with configurable temperature
  • Accepts a per-request system prompt as a second input
  • Supports streaming mode, which pushes token-by-token chunks, or batch mode, which returns the full response at once
  • Optionally displays chain-of-thought reasoning before the final answer
  • Optionally preempts in-progress generation when new input arrives
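
A request matching the behavior above can be sketched against Ollama's documented `/api/chat` endpoint. The endpoint URL, JSON fields, and streaming flag are Ollama's own API; the helper function and its defaults are illustrative, not part of this component:

```python
import json

OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"  # Ollama's default endpoint


def build_chat_payload(prompt, system=None, model="llama3",
                       temperature=0.7, stream=True):
    """Assemble the JSON body for Ollama's /api/chat endpoint.

    `system`, when given, becomes a per-request system message,
    mirroring the component's second input.
    """
    messages = []
    if system:
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": prompt})
    return {
        "model": model,
        "messages": messages,
        "stream": stream,                       # True → token-by-token chunks
        "options": {"temperature": temperature},
    }


payload = build_chat_payload("What is RAG?", system="Answer briefly.")
body = json.dumps(payload)  # POST this body to OLLAMA_CHAT_URL
```

With `"stream": true`, Ollama responds with newline-delimited JSON chunks; with `false`, a single JSON object holds the full response.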

Works Best With

  • Voice input → transcribe-audio → this component → TTS for voice assistants
  • Document text → this component for Q&A or summarization
  • Integration with activate-voice-assistant for wake-word gated conversations
  • Multi-modal pipelines combining caption-image-lavis → this component for visual Q&A
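
The first voice-assistant pattern above amounts to a simple function chain. All stage names below are hypothetical stand-ins for the actual components, shown only to illustrate the data flow:

```python
def transcribe_audio(wav_bytes):
    # Hypothetical stand-in for the transcribe-audio component
    return "turn on the lights"


def query_ollama(prompt):
    # Hypothetical stand-in for this component (batch mode)
    return f"Okay: {prompt}"


def text_to_speech(text):
    # Hypothetical stand-in for the TTS stage
    return text.encode("utf-8")


# Voice input → transcribe-audio → this component → TTS
reply_audio = text_to_speech(query_ollama(transcribe_audio(b"...")))
```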

Caveats

  • Requires a running Ollama service with the model already pulled; the component fails without the backend
  • Streaming mode emits multiple output messages; downstream components must handle fragments
  • Preemption stops in-progress generation abruptly and may truncate useful partial responses
  • Displaying reasoning adds significant latency and token usage
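
Because streaming mode emits fragments, a downstream consumer typically reassembles them. In Ollama's streaming format each chunk is a JSON object whose text lives in `message.content`, with `"done": true` on the final chunk; the helper name below is illustrative:

```python
import json


def assemble_stream(ndjson_lines):
    """Join Ollama-style streaming chunks into one string.

    Each line is a JSON object; content lives in message.content and
    the final chunk carries "done": true.
    """
    parts = []
    for line in ndjson_lines:
        chunk = json.loads(line)
        parts.append(chunk.get("message", {}).get("content", ""))
        if chunk.get("done"):
            break
    return "".join(parts)


# Simulated chunks like those the component pushes in streaming mode
chunks = [
    '{"message": {"content": "Hel"}, "done": false}',
    '{"message": {"content": "lo"}, "done": false}',
    '{"message": {"content": ""}, "done": true}',
]
print(assemble_stream(chunks))  # → Hello
```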

Versions

  • 74be85e8 · latest (default) · linux/amd64

    Automated release