Chat API

RAG-based chat with vector search and LLM integration. It uses local embeddings (Ollama), vector search (Qdrant), and either Claude (Anthropic) or Ollama as the LLM.

Endpoint (HTML)	POST /chat/message
Endpoint (JSON)	POST /api/v1/chat
Controller	/src/Controller/ChatController.php
Service	/src/Infrastructure/AI/ChatService.php

Request

POST /api/v1/chat
Content-Type: application/json

{
    "question": "Was ist systemische Therapie?",
    "model": "anthropic",
    "collection": "documents",
    "limit": 5
}

Parameter	Type	Default	Description
question	string	required	User question
model	string	"anthropic"	"anthropic" (claude-opus-4-5-20251101) or "ollama" (gemma3:4b-it-qat)
collection	string	"documents"	Qdrant collection
limit	int	5	Max. number of context chunks
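
A minimal client-side sketch of the request above, assuming a browser or Node 18+ environment with a global `fetch`. The helper names `buildChatRequest` and `sendChat` and the `baseUrl` parameter are hypothetical and not part of the API. The defaults mirror the parameter table; the validation rules (non-empty question, positive integer limit) are assumptions about what the server accepts.

```javascript
// Hypothetical helper: applies the documented defaults and validates input.
const DEFAULTS = { model: "anthropic", collection: "documents", limit: 5 };

function buildChatRequest(params) {
  const body = { ...DEFAULTS, ...params };
  if (typeof body.question !== "string" || body.question.trim() === "") {
    throw new TypeError("question is required and must be a non-empty string");
  }
  if (body.model !== "anthropic" && body.model !== "ollama") {
    throw new RangeError('model must be "anthropic" or "ollama"');
  }
  // Assumption: the server expects a positive integer here.
  if (!Number.isInteger(body.limit) || body.limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  };
}

// Sends the request; baseUrl is an assumption ("" means same-origin).
async function sendChat(params, baseUrl = "") {
  const response = await fetch(`${baseUrl}/api/v1/chat`, buildChatRequest(params));
  if (!response.ok) {
    throw new Error(`Chat request failed with HTTP ${response.status}`);
  }
  return response.json();
}
```

Keeping request construction separate from the network call makes the defaults easy to check without a running server.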

Response

{
    "answer": "Systemische Therapie ist ein psychotherapeutischer Ansatz...",
    "sources": [
        {
            "title": "Einführung in die systemische Therapie",
            "content": "...",
            "score": 0.89,
            "path": "/Documents/therapie/einfuehrung.pdf"
        }
    ],
    "model": "claude-opus-4-5-20251101",
    "tokens": 1234
}
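
As an illustration of how a client might consume this response, the sketch below turns it into printable lines. `formatChatResponse` is a hypothetical helper; it relies only on the fields shown in the example above.

```javascript
// Hypothetical helper: turns a chat response into printable lines.
function formatChatResponse(response) {
  const lines = [response.answer, ""];
  // Tolerate a missing sources array rather than failing.
  const sources = Array.isArray(response.sources) ? response.sources : [];
  sources.forEach((source, index) => {
    const score = Number(source.score).toFixed(2);
    lines.push(`[${index + 1}] ${source.title} (score ${score}) ${source.path}`);
  });
  lines.push(`model: ${response.model}, tokens: ${response.tokens}`);
  return lines;
}
```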

RAG pipeline

The chat functionality uses native PHP services (no more Python calls):

1. Create embedding (OllamaService)
   question → mxbai-embed-large → [1024-dim vector]

2. Vector search (QdrantService)
   vector → Qdrant :6333 → top-k similar chunks

3. Build context (ChatService)
   chunks → formatted context with source references
   max. 3000 tokens (~12000 chars)

4. LLM request (ClaudeService or OllamaService)
   system_prompt + rag_prompt + context → answer

5. Extract sources (ChatService)
   search_results → deduplicated source list

6. Assemble response (ChatService)
   answer + sources + metadata + usage
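
Steps 3 and 5 can be sketched as follows. This is an illustration in JavaScript, not the PHP code in ChatService. The function names, the `[Source n: title]` label format, and the choice of `path` as the deduplication key are all assumptions; only the 12000-character budget comes from the list above.

```javascript
const MAX_CONTEXT_CHARS = 12000; // roughly 3000 tokens at ~4 chars per token

// Step 3: concatenate chunks with source labels until the budget is reached.
function buildContext(chunks, maxChars = MAX_CONTEXT_CHARS) {
  const parts = [];
  let used = 0;
  for (const [index, chunk] of chunks.entries()) {
    const part = `[Source ${index + 1}: ${chunk.title}]\n${chunk.content}`;
    // Count the "\n\n" separator for every part after the first.
    const cost = part.length + (parts.length > 0 ? 2 : 0);
    if (used + cost > maxChars) break; // stop at the first chunk that does not fit
    parts.push(part);
    used += cost;
  }
  return parts.join("\n\n");
}

// Step 5: keep one entry per path (assumed key), preferring the highest score.
function dedupeSources(results) {
  const byPath = new Map();
  for (const result of results) {
    const known = byPath.get(result.path);
    if (known === undefined || result.score > known.score) {
      byPath.set(result.path, {
        title: result.title,
        path: result.path,
        score: result.score,
      });
    }
  }
  // Highest-scoring sources first.
  return [...byPath.values()].sort((a, b) => b.score - a.score);
}
```

Stopping at the first chunk that exceeds the budget keeps the highest-ranked chunks intact instead of truncating one mid-sentence.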

Architecture

Browser/Client
    ↓
ChatController (/src/Controller/)
    ↓
ChatService (/src/Infrastructure/AI/)
    ↓
┌──────────────────┬──────────────────┬──────────────────┐
│ OllamaService    │ QdrantService    │ ClaudeService    │
│ (Embeddings)     │ (Vector search)  │ (LLM)            │
└──────────────────┴──────────────────┴──────────────────┘
    ↓                   ↓                   ↓
┌──────────────────┬──────────────────┬──────────────────┐
│ Ollama API       │ Qdrant API       │ Anthropic API    │
│ :11434           │ :6333            │ api.anthropic... │
└──────────────────┴──────────────────┴──────────────────┘
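
The layering in the diagram can be sketched as a small composition in which the chat service receives its three backends as injected functions. This is a JavaScript illustration only; `createChatService`, `embed`, `search`, and `complete` are hypothetical names, and the real PHP services may be wired differently.

```javascript
// Hypothetical composition: each dependency is an async function wrapping one backend.
function createChatService({ embed, search, complete }) {
  return {
    async chat(question, { collection = "documents", limit = 5 } = {}) {
      // Ollama: question -> embedding vector
      const vector = await embed(question);
      // Qdrant: vector -> top-k chunks
      const chunks = await search(vector, collection, limit);
      if (chunks.length === 0) {
        // Assumption: an empty result is reported as an error.
        throw new Error("No relevant documents found");
      }
      const context = chunks.map((chunk) => chunk.content).join("\n\n");
      // Claude or Ollama: question + context -> answer
      const answer = await complete(question, context);
      const sources = chunks.map(({ title, path, score }) => ({ title, path, score }));
      return { answer, sources };
    },
  };
}
```

Injecting the backends as functions lets each one be replaced by a stub when testing the orchestration in isolation.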

Migration: Python shell_exec → Native PHP (2025-12-20)
Status: Production

Error handling

HTTP Code	Meaning
400	Missing or invalid parameters
500	Service error (Ollama, Qdrant, Claude)
503	Service unavailable
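
A client might translate these status codes into a simple handling policy. In the sketch below, `classifyStatus` is a hypothetical helper, and treating only 503 as retryable is an assumption rather than documented behavior.

```javascript
// Hypothetical client-side policy derived from the status table.
function classifyStatus(status) {
  if (status >= 200 && status < 300) {
    return { ok: true, retry: false, reason: "success" };
  }
  switch (status) {
    case 400:
      return { ok: false, retry: false, reason: "missing or invalid parameters" };
    case 500:
      return { ok: false, retry: false, reason: "service error (Ollama, Qdrant, Claude)" };
    case 503:
      // Assumption: unavailability is temporary, so a retry is reasonable.
      return { ok: false, retry: true, reason: "service unavailable" };
    default:
      return { ok: false, retry: false, reason: `unexpected status ${status}` };
  }
}
```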

Exception handling

All services throw a RuntimeException on error:

try {
    $config = AIConfig::fromCredentialsFile();
    $chat = $config->createChatService();
    $result = $chat->chat($question);
} catch (RuntimeException $e) {
    // Log the error
    error_log('Chat failed: ' . $e->getMessage());

    // JSON response
    http_response_code(500);
    header('Content-Type: application/json');
    echo json_encode([
        'error' => 'Chat request failed',
        'message' => $e->getMessage()
    ]);
}

Common errors

Error	Cause	Solution
Embedding generation failed	Ollama unreachable	Check systemctl status ollama
Vector search failed	Qdrant unreachable	Check systemctl status qdrant
No relevant documents found	No matches in the collection	Check the collection, increase limit
Claude API request failed	Invalid API key or rate limit	Check credentials.md, wait and retry
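
The table can be encoded as a lookup so that a client shows a troubleshooting hint next to an error. This sketch assumes that the `message` field of the error response contains one of the strings from the first column. `TROUBLESHOOTING` and `hintForError` are hypothetical names.

```javascript
// Error text (first column) mapped to a short hint (third column).
const TROUBLESHOOTING = [
  ["Embedding generation failed", "Check: systemctl status ollama"],
  ["Vector search failed", "Check: systemctl status qdrant"],
  ["No relevant documents found", "Check the collection or increase limit"],
  ["Claude API request failed", "Check credentials.md, or wait if rate-limited"],
];

// Returns the hint for the first matching error text, or null if none matches.
function hintForError(message) {
  const match = TROUBLESHOOTING.find(([text]) => String(message).includes(text));
  return match ? match[1] : null;
}
```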

HTMX integration

<form hx-post="/chat/message" hx-target="#chat-messages" hx-swap="beforeend">
    <select name="model" id="chat-model">
        <option value="anthropic">claude-opus-4-5-20251101</option>
        <option value="ollama">gemma3:4b-it-qat (local)</option>
    </select>
    <input type="text" name="message" placeholder="Ask a question...">
    <button type="submit">Send</button>
</form>

<div id="chat-messages"></div>

Model persistence

The model selection is stored in localStorage and restored on page load:

// Save
localStorage.setItem('chat-model-preference', model);

// Load
const saved = localStorage.getItem('chat-model-preference');
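
A slightly fuller sketch of the same idea, wiring the stored preference to the `#chat-model` select from the HTMX form. The function names and the `"anthropic"` fallback are assumptions. The storage object is passed in, so the logic can run outside a browser.

```javascript
const STORAGE_KEY = "chat-model-preference";
const ALLOWED_MODELS = ["anthropic", "ollama"];

// Persist the selection, ignoring values the form does not offer.
function saveModel(storage, model) {
  if (ALLOWED_MODELS.includes(model)) {
    storage.setItem(STORAGE_KEY, model);
  }
}

// Restore the selection; fall back if nothing valid was stored.
function loadModel(storage, fallback = "anthropic") {
  const saved = storage.getItem(STORAGE_KEY);
  return ALLOWED_MODELS.includes(saved) ? saved : fallback;
}

// Browser wiring: set the initial value and save on every change.
function bindModelSelect(select, storage) {
  select.value = loadModel(storage);
  select.addEventListener("change", () => saveModel(storage, select.value));
}
```

In a browser this would be called once on page load, for example `bindModelSelect(document.getElementById("chat-model"), localStorage)`.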