RAG

Created: 2025-12-20 | Updated: 2025-12-20

Retrieval-Augmented Generation for context-based content creation.

Module	generate.py, embed.py
Vector DB	Qdrant
Sources table	content_sources

Architecture

Briefing
    ↓
┌─────────────┐
│  Embedding  │  → vector from the briefing
└─────────────┘
    ↓
┌─────────────┐
│   Qdrant    │  → similarity search
└─────────────┘
    ↓
┌─────────────┐
│  Context    │  → top-N relevant chunks
└─────────────┘
    ↓
┌─────────────┐
│    LLM      │  → content with context
└─────────────┘
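
The four stages can be sketched end to end with stand-ins for the real components. In this minimal sketch, toy_embed, search, and retrieve_context are hypothetical names, a bag-of-words vector replaces the real embedding model, and an in-memory list replaces Qdrant. Only the data flow mirrors the diagram; the LLM call is left out.

```python
import math
from collections import Counter

def toy_embed(text):
    """Hypothetical stand-in for a real embedding model: word counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def search(query_vec, chunks, limit):
    """Stand-in for the Qdrant similarity search: rank chunks by score."""
    scored = [
        {"content": c, "score": cosine(query_vec, toy_embed(c))}
        for c in chunks
    ]
    scored.sort(key=lambda r: r["score"], reverse=True)
    return scored[:limit]

def retrieve_context(briefing, chunks, limit=5):
    """Briefing -> embedding -> similarity search -> top-N chunks."""
    return search(toy_embed(briefing), chunks, limit)

chunks = [
    "Qdrant stores vectors for similarity search",
    "The office coffee machine is broken",
    "Embeddings map text to vectors",
]
context = retrieve_context("similarity search with vectors", chunks, limit=2)
```

The returned list would then be formatted into the prompt and passed to the LLM as the final stage.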

Collections

Collection	Content	Usage
documents	Documents, PDFs	General knowledge base
entities	Entities, concepts	Technical terms, people
mail	Email correspondence	Communication history

get_rag_context() - lines 18-35

def get_rag_context(briefing, collection="documents", limit=5):
    """
    Load relevant context from Qdrant.
    
    Args:
        briefing: Search text for the similarity search
        collection: Qdrant collection to search
        limit: Number of results to return
    
    Returns:
        List[{content, source, score}]
    """
    results = search_similar(briefing, collection, limit)
    
    return [{
        "content": r["payload"]["content"],
        "source": r["payload"]["document_title"],
        "score": round(r["score"], 4)
    } for r in results]

Context limit

Limit	Tokens (approx.)	Recommendation
3	~1500	Fast generation, little context
5	~2500	Default, balanced
10	~5000	Extensive context, slower
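
The table implies roughly 500 tokens per chunk. A minimal sketch of choosing a limit from a token budget, assuming that average holds for the indexed chunks; TOKENS_PER_CHUNK, estimated_tokens, and limit_for_budget are hypothetical names and not part of generate.py.

```python
# Assumption derived from the table: 5 chunks correspond to about 2500 tokens.
TOKENS_PER_CHUNK = 500

def estimated_tokens(limit):
    """Approximate context size in tokens for a given chunk limit."""
    return limit * TOKENS_PER_CHUNK

def limit_for_budget(token_budget, max_limit=10):
    """Largest chunk limit whose estimated context fits the budget.

    Always returns at least 1 so that some context is retrieved.
    """
    return max(1, min(max_limit, token_budget // TOKENS_PER_CHUNK))
```

The result could be passed as the limit argument of get_rag_context(); actual token counts depend on chunk size and the tokenizer in use.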

Prompt integration

The RAG context is embedded in the generation prompt:

## Context from the knowledge base:

[Source 1: Document A]
Lorem ipsum dolor sit amet...

[Source 2: Document B]
Consectetur adipiscing elit...

## Briefing:
Write an article about...
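
A minimal sketch of how the list returned by get_rag_context() could be turned into this layout. build_prompt and the two label constants are hypothetical; the real prompt assembly in generate.py may differ, and the label text should match whatever language the production prompt uses.

```python
# Hypothetical labels mirroring the template; adjust to the real prompt text.
CONTEXT_HEADING = "## Context from the knowledge base:"
BRIEFING_HEADING = "## Briefing:"

def build_prompt(briefing, context):
    """Format retrieved chunks and the briefing into one prompt string.

    context is a list of dicts with "content" and "source" keys,
    as returned by get_rag_context().
    """
    parts = [CONTEXT_HEADING, ""]
    for number, ctx in enumerate(context, start=1):
        parts.append(f"[Source {number}: {ctx['source']}]")
        parts.append(ctx["content"])
        parts.append("")  # blank line between sources
    parts.append(BRIEFING_HEADING)
    parts.append(briefing)
    return "\n".join(parts)

prompt = build_prompt(
    "Write an article",
    [{"content": "Lorem ipsum", "source": "Document A", "score": 0.85}],
)
```

Numbering the sources in the prompt makes it possible for the model to refer back to a specific chunk.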

Source storage

After generation, the sources are stored in content_sources:

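def save_sources(order_id, context):
    """Store the chunks used as context together with their scores."""
    for ctx in context:
        # Find the chunk ID by matching the first 100 characters of content.
        # Caveat: "%" or "_" inside the content act as LIKE wildcards here.
        chunk = db.execute(
            "SELECT id FROM chunks WHERE content LIKE %s",
            (ctx["content"][:100] + "%",)
        )
        # Assumption: db.execute returns None when no row matches.
        if chunk is None:
            continue

        # Store the source with its relevance score.
        # The multi-line statement needs a triple-quoted string.
        db.execute(
            """INSERT INTO content_sources
               (order_id, chunk_id, relevance_score)
               VALUES (%s, %s, %s)""",
            (order_id, chunk.id, ctx["score"])
        )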

Relevance score

The score indicates semantic similarity (0.0 - 1.0):

Score	Interpretation
> 0.85	Very relevant
0.70 - 0.85	Relevant
0.50 - 0.70	Possibly relevant
< 0.50	Low relevance
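
The bands above can be expressed as a small helper. This is a sketch only: interpret_score and filter_relevant are hypothetical names, and treating a score of exactly 0.85 as "Relevant" and exactly 0.70 or 0.50 as the higher band is an assumption, since the table leaves the boundaries open.

```python
def interpret_score(score):
    """Map a similarity score to the interpretation bands of the table."""
    if score > 0.85:
        return "Very relevant"
    if score >= 0.70:
        return "Relevant"
    if score >= 0.50:
        return "Possibly relevant"
    return "Low relevance"

def filter_relevant(context, min_score=0.50):
    """Keep only context entries at or above a minimum score."""
    return [ctx for ctx in context if ctx["score"] >= min_score]
```

Filtering before prompt assembly would keep low-relevance chunks out of the context; whether the pipeline does this is not stated here.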

UI display

Sources are shown in the version view:

<div class="sources">
  <strong>Sources:</strong>
  <ul>
    <li>Document A (85%)</li>
    <li>Document B (72%)</li>
  </ul>
</div>
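
The percentages correspond to the relevance score multiplied by 100 and rounded. A minimal sketch of producing this markup from the stored sources, assuming entries with source and score keys; render_sources is a hypothetical helper, and the actual view may be rendered by a different layer or template engine.

```python
from html import escape

def render_sources(context):
    """Render a list of {source, score} entries as the sources block."""
    items = "\n".join(
        # Escape titles so that characters like "&" or "<" stay valid HTML.
        f"    <li>{escape(ctx['source'])} ({round(ctx['score'] * 100)}%)</li>"
        for ctx in context
    )
    return (
        '<div class="sources">\n'
        "  <strong>Sources:</strong>\n"
        "  <ul>\n"
        f"{items}\n"
        "  </ul>\n"
        "</div>"
    )

markup = render_sources([
    {"source": "Document A", "score": 0.85},
    {"source": "A & B", "score": 0.7234},
])
```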

Qdrant connection

The connection is configured in embed.py:

from qdrant_client import QdrantClient

client = QdrantClient(host="localhost", port=6333)
