{
"event": "PreToolUse",
"tool_name": "mcp__mcp-tasks__tasks_update",
"tool_input": {
"id": 497,
"description": "## Ziel\nPython-Executor um 6 neue step_types erweitern.\n\n---\n\n## ⚠️ PFLICHT VOR ARBEIT\n\n### 1. CLAUDE.md lesen\n```bash\ncat \/var\/www\/CLAUDE.md\n```\nWichtig: Constants-Pflicht (`from constants import *`)\n\n### 2. Bestehenden Python-Code analysieren\n```bash\nls -la \/opt\/scripts\/pipeline\/\ncat \/opt\/scripts\/pipeline\/executor.py\ncat \/opt\/scripts\/pipeline\/steps\/__init__.py\n```\n\n### 3. Bestehende Step-Implementierungen studieren\n```bash\ncat \/opt\/scripts\/pipeline\/steps\/extract.py\ncat \/opt\/scripts\/pipeline\/steps\/entity.py\n```\n\n---\n\n## Code-Prinzipien\n\n| Prinzip | Anwendung |\n|---------|-----------|\n| **DRY** | Basis-Klasse für LLM-Steps |\n| **KISS** | Ein Step = eine Funktion |\n| **SRP** | Analyse und Store getrennt |\n| **SOLID** | Interface für Step-Contract |\n| **Constants** | `from constants import OLLAMA_TIMEOUT, LLM_TIMEOUT` |\n\n---\n\n## Neue Step-Implementierungen\n\n### Datei: \/opt\/scripts\/pipeline\/steps\/semantic.py\n\n```python\n\"\"\"Textsemantik und Wissenssemantik Steps.\"\"\"\nfrom constants import LLM_TIMEOUT, DEFAULT_LIMIT\nimport ollama\nimport json\n\ndef step_duplicate_check(doc_data: dict, config: dict) -> dict:\n \"\"\"Prüft auf Duplikate via Content-Hash.\"\"\"\n content_hash = doc_data.get('content_hash')\n if not content_hash:\n return {'status': 'skip', 'reason': 'no_hash'}\n \n # Via MCP-DB prüfen (kein direkter SQL!)\n existing = db.select(\n f\"SELECT id FROM documents WHERE content_hash = '{content_hash}' AND id != {doc_data['id']}\",\n database='ki_content'\n )\n \n if existing:\n return {\n 'status': 'abort',\n 'reason': 'duplicate',\n 'duplicate_id': existing[0]['id']\n }\n return {'status': 'ok'}\n\n\ndef step_text_semantic_analyze(chunks: list, config: dict) -> list:\n \"\"\"Analysiert WIE etwas gesagt wird.\"\"\"\n model = config.get('model', 'mistral')\n \n prompt = '''Analysiere den Text semantisch. Antworte NUR mit JSON:\n{\n \"statement_form\": \"assertion|question|command|conditional\",\n \"intent\": \"explain|argue|define|compare|exemplify|warn|instruct\",\n \"frame\": \"theoretical|practical|historical|methodological|critical\",\n \"is_negated\": false,\n \"discourse_role\": \"thesis|evidence|example|counter|summary|definition\"\n}\n\nText: {content}'''\n \n for chunk in chunks:\n response = ollama.generate(\n model=model,\n prompt=prompt.format(content=chunk['content']),\n options={'timeout': LLM_TIMEOUT}\n )\n chunk['text_semantics'] = json.loads(response['response'])\n \n return chunks\n\n\ndef step_text_semantic_store(chunks: list, config: dict) -> dict:\n \"\"\"Speichert Textsemantik in DB.\"\"\"\n stored = 0\n for chunk in chunks:\n if 'text_semantics' not in chunk:\n continue\n sem = chunk['text_semantics']\n db.insert('chunk_text_semantics', {\n 'chunk_id': chunk['id'],\n 'statement_form': sem['statement_form'],\n 'intent': sem.get('intent'),\n 'frame': sem.get('frame'),\n 'is_negated': sem.get('is_negated', False),\n 'discourse_role': sem.get('discourse_role'),\n 'model_used': config.get('model', 'mistral')\n }, database='ki_content')\n stored += 1\n return {'stored': stored}\n\n\ndef step_knowledge_semantic_analyze(entities: list, config: dict) -> list:\n \"\"\"Analysiert WAS Entitäten BEDEUTEN.\"\"\"\n model = config.get('model', 'mistral')\n \n prompt = '''Analysiere die Bedeutung dieser Entität im Kontext. 
Antworte NUR mit JSON:\n{\n \"semantic_role\": \"agent|patient|instrument|location|cause|effect\",\n \"properties\": {\"key\": \"value\"},\n \"functional_category\": \"method|tool|concept|actor|outcome|process\",\n \"context_meaning\": \"Kurze Erklärung der Bedeutung\"\n}\n\nEntität: {name} (Typ: {type})\nKontext: {context}'''\n \n for entity in entities:\n response = ollama.generate(\n model=model,\n prompt=prompt.format(\n name=entity['name'],\n type=entity['type'],\n context=entity.get('context', '')\n ),\n options={'timeout': LLM_TIMEOUT}\n )\n entity['knowledge_semantics'] = json.loads(response['response'])\n \n return entities\n\n\ndef step_knowledge_semantic_store(entities: list, config: dict) -> dict:\n \"\"\"Speichert Wissenssemantik in DB.\"\"\"\n stored = 0\n for entity in entities:\n if 'knowledge_semantics' not in entity:\n continue\n sem = entity['knowledge_semantics']\n db.insert('entity_knowledge_semantics', {\n 'entity_id': entity['id'],\n 'chunk_id': entity['chunk_id'],\n 'semantic_role': sem.get('semantic_role'),\n 'properties': json.dumps(sem.get('properties', {})),\n 'functional_category': sem.get('functional_category'),\n 'context_meaning': sem.get('context_meaning'),\n 'model_used': config.get('model', 'mistral')\n }, database='ki_content')\n stored += 1\n return {'stored': stored}\n\n\ndef step_ontology_store(classifications: list, config: dict) -> dict:\n \"\"\"Speichert Ontologie-Klassifikationen.\"\"\"\n stored = 0\n for cls in classifications:\n db.insert('ontology_classes', {\n 'entity_id': cls['entity_id'],\n 'class_name': cls['class_name'],\n 'parent_class': cls.get('parent_class'),\n 'confidence': cls.get('confidence', 1.0)\n }, database='ki_content')\n stored += 1\n return {'stored': stored}\n```\n\n### Step-Registry erweitern\n\n```python\n# In \/opt\/scripts\/pipeline\/steps\/__init__.py\nfrom .semantic import (\n step_duplicate_check,\n step_text_semantic_analyze,\n step_text_semantic_store,\n step_knowledge_semantic_analyze,\n step_knowledge_semantic_store,\n step_ontology_store,\n)\n\nSTEP_REGISTRY = {\n # ... bestehende Steps ...\n 'duplicate_check': step_duplicate_check,\n 'text_semantic_analyze': step_text_semantic_analyze,\n 'text_semantic_store': step_text_semantic_store,\n 'knowledge_semantic_analyze': step_knowledge_semantic_analyze,\n 'knowledge_semantic_store': step_knowledge_semantic_store,\n 'ontology_store': step_ontology_store,\n}\n```\n\n---\n\n## Qualitätssicherung\n\n```bash\n# Syntax prüfen\npython3 -m py_compile \/opt\/scripts\/pipeline\/steps\/semantic.py\n\n# Imports testen\ncd \/opt\/scripts\/pipeline && python3 -c \"from steps.semantic import *; print('OK')\"\n\n# Ruff Check\n\/opt\/scripts\/pipeline\/venv\/bin\/ruff check steps\/semantic.py\n```\n\n---\n\n## Akzeptanzkriterien\n- [ ] CLAUDE.md gelesen\n- [ ] Bestehender Code analysiert (DRY-Prüfung)\n- [ ] Constants importiert (keine Magic Numbers)\n- [ ] 6 neue Step-Funktionen implementiert\n- [ ] Step-Registry erweitert\n- [ ] Syntax-Check bestanden\n- [ ] Ruff-Check bestanden"
}
}
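One pitfall in the task description above is worth calling out: the prompt strings embed a JSON skeleton inside a `str.format` template, so every literal brace has to be doubled (`{{` / `}}`) or `format()` treats the JSON keys as placeholders and raises a `KeyError`. A minimal, self-contained sketch of the pattern (the template text is abbreviated from the task; no Ollama call is involved):

```python
# Inside a str.format template, literal JSON braces must be written as {{ and }};
# only real placeholders such as {content} stay single-braced.
PROMPT = '''Analyze the text semantically. Answer ONLY with JSON:
{{
  "statement_form": "assertion|question|command|conditional",
  "is_negated": false
}}

Text: {content}'''


def render_prompt(content: str) -> str:
    # format() substitutes {content} and collapses {{ }} back to literal braces
    return PROMPT.format(content=content)


if __name__ == '__main__':
    print(render_prompt('Water boils at 100 degrees Celsius at sea level.'))
```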
{
"tool_response": [
{
"type": "text",
"text": "{\n \"success\": true,\n \"task\": {\n \"id\": 497,\n \"uuid\": \"b265c524-ca27-4dbf-9cde-d94f594464aa\",\n \"title\": \"5\/6 Python-Executor: Implementierung der 6 neuen step_types\",\n \"description\": \"## Ziel\\nPython-Executor um 6 neue step_types erweitern.\\n\\n---\\n\\n## ⚠️ PFLICHT VOR ARBEIT\\n\\n### 1. CLAUDE.md lesen\\n```bash\\ncat \/var\/www\/CLAUDE.md\\n```\\nWichtig: Constants-Pflicht (`from constants import *`)\\n\\n### 2. Bestehenden Python-Code analysieren\\n```bash\\nls -la \/opt\/scripts\/pipeline\/\\ncat \/opt\/scripts\/pipeline\/executor.py\\ncat \/opt\/scripts\/pipeline\/steps\/__init__.py\\n```\\n\\n### 3. Bestehende Step-Implementierungen studieren\\n```bash\\ncat \/opt\/scripts\/pipeline\/steps\/extract.py\\ncat \/opt\/scripts\/pipeline\/steps\/entity.py\\n```\\n\\n---\\n\\n## Code-Prinzipien\\n\\n| Prinzip | Anwendung |\\n|---------|-----------|\\n| **DRY** | Basis-Klasse für LLM-Steps |\\n| **KISS** | Ein Step = eine Funktion |\\n| **SRP** | Analyse und Store getrennt |\\n| **SOLID** | Interface für Step-Contract |\\n| **Constants** | `from constants import OLLAMA_TIMEOUT, LLM_TIMEOUT` |\\n\\n---\\n\\n## Neue Step-Implementierungen\\n\\n### Datei: \/opt\/scripts\/pipeline\/steps\/semantic.py\\n\\n```python\\n\\\"\\\"\\\"Textsemantik und Wissenssemantik Steps.\\\"\\\"\\\"\\nfrom constants import LLM_TIMEOUT, DEFAULT_LIMIT\\nimport ollama\\nimport json\\n\\ndef step_duplicate_check(doc_data: dict, config: dict) -> dict:\\n \\\"\\\"\\\"Prüft auf Duplikate via Content-Hash.\\\"\\\"\\\"\\n content_hash = doc_data.get('content_hash')\\n if not content_hash:\\n return {'status': 'skip', 'reason': 'no_hash'}\\n \\n # Via MCP-DB prüfen (kein direkter SQL!)\\n existing = db.select(\\n f\\\"SELECT id FROM documents WHERE content_hash = '{content_hash}' AND id != {doc_data['id']}\\\",\\n database='ki_content'\\n )\\n \\n if existing:\\n return {\\n 'status': 'abort',\\n 'reason': 'duplicate',\\n 'duplicate_id': existing[0]['id']\\n }\\n return {'status': 'ok'}\\n\\n\\ndef step_text_semantic_analyze(chunks: list, config: dict) -> list:\\n \\\"\\\"\\\"Analysiert WIE etwas gesagt wird.\\\"\\\"\\\"\\n model = config.get('model', 'mistral')\\n \\n prompt = '''Analysiere den Text semantisch. 
Antworte NUR mit JSON:\\n{\\n \\\"statement_form\\\": \\\"assertion|question|command|conditional\\\",\\n \\\"intent\\\": \\\"explain|argue|define|compare|exemplify|warn|instruct\\\",\\n \\\"frame\\\": \\\"theoretical|practical|historical|methodological|critical\\\",\\n \\\"is_negated\\\": false,\\n \\\"discourse_role\\\": \\\"thesis|evidence|example|counter|summary|definition\\\"\\n}\\n\\nText: {content}'''\\n \\n for chunk in chunks:\\n response = ollama.generate(\\n model=model,\\n prompt=prompt.format(content=chunk['content']),\\n options={'timeout': LLM_TIMEOUT}\\n )\\n chunk['text_semantics'] = json.loads(response['response'])\\n \\n return chunks\\n\\n\\ndef step_text_semantic_store(chunks: list, config: dict) -> dict:\\n \\\"\\\"\\\"Speichert Textsemantik in DB.\\\"\\\"\\\"\\n stored = 0\\n for chunk in chunks:\\n if 'text_semantics' not in chunk:\\n continue\\n sem = chunk['text_semantics']\\n db.insert('chunk_text_semantics', {\\n 'chunk_id': chunk['id'],\\n 'statement_form': sem['statement_form'],\\n 'intent': sem.get('intent'),\\n 'frame': sem.get('frame'),\\n 'is_negated': sem.get('is_negated', False),\\n 'discourse_role': sem.get('discourse_role'),\\n 'model_used': config.get('model', 'mistral')\\n }, database='ki_content')\\n stored += 1\\n return {'stored': stored}\\n\\n\\ndef step_knowledge_semantic_analyze(entities: list, config: dict) -> list:\\n \\\"\\\"\\\"Analysiert WAS Entitäten BEDEUTEN.\\\"\\\"\\\"\\n model = config.get('model', 'mistral')\\n \\n prompt = '''Analysiere die Bedeutung dieser Entität im Kontext. Antworte NUR mit JSON:\\n{\\n \\\"semantic_role\\\": \\\"agent|patient|instrument|location|cause|effect\\\",\\n \\\"properties\\\": {\\\"key\\\": \\\"value\\\"},\\n \\\"functional_category\\\": \\\"method|tool|concept|actor|outcome|process\\\",\\n \\\"context_meaning\\\": \\\"Kurze Erklärung der Bedeutung\\\"\\n}\\n\\nEntität: {name} (Typ: {type})\\nKontext: {context}'''\\n \\n for entity in entities:\\n response = ollama.generate(\\n model=model,\\n prompt=prompt.format(\\n name=entity['name'],\\n type=entity['type'],\\n context=entity.get('context', '')\\n ),\\n options={'timeout': LLM_TIMEOUT}\\n )\\n entity['knowledge_semantics'] = json.loads(response['response'])\\n \\n return entities\\n\\n\\ndef step_knowledge_semantic_store(entities: list, config: dict) -> dict:\\n \\\"\\\"\\\"Speichert Wissenssemantik in DB.\\\"\\\"\\\"\\n stored = 0\\n for entity in entities:\\n if 'knowledge_semantics' not in entity:\\n continue\\n sem = entity['knowledge_semantics']\\n db.insert('entity_knowledge_semantics', {\\n 'entity_id': entity['id'],\\n 'chunk_id': entity['chunk_id'],\\n 'semantic_role': sem.get('semantic_role'),\\n 'properties': json.dumps(sem.get('properties', {})),\\n 'functional_category': sem.get('functional_category'),\\n 'context_meaning': sem.get('context_meaning'),\\n 'model_used': config.get('model', 'mistral')\\n }, database='ki_content')\\n stored += 1\\n return {'stored': stored}\\n\\n\\ndef step_ontology_store(classifications: list, config: dict) -> dict:\\n \\\"\\\"\\\"Speichert Ontologie-Klassifikationen.\\\"\\\"\\\"\\n stored = 0\\n for cls in classifications:\\n db.insert('ontology_classes', {\\n 'entity_id': cls['entity_id'],\\n 'class_name': cls['class_name'],\\n 'parent_class': cls.get('parent_class'),\\n 'confidence': cls.get('confidence', 1.0)\\n }, database='ki_content')\\n stored += 1\\n return {'stored': stored}\\n```\\n\\n### Step-Registry erweitern\\n\\n```python\\n# In \/opt\/scripts\/pipeline\/steps\/__init__.py\\nfrom .semantic 
import (\\n step_duplicate_check,\\n step_text_semantic_analyze,\\n step_text_semantic_store,\\n step_knowledge_semantic_analyze,\\n step_knowledge_semantic_store,\\n step_ontology_store,\\n)\\n\\nSTEP_REGISTRY = {\\n # ... bestehende Steps ...\\n 'duplicate_check': step_duplicate_check,\\n 'text_semantic_analyze': step_text_semantic_analyze,\\n 'text_semantic_store': step_text_semantic_store,\\n 'knowledge_semantic_analyze': step_knowledge_semantic_analyze,\\n 'knowledge_semantic_store': step_knowledge_semantic_store,\\n 'ontology_store': step_ontology_store,\\n}\\n```\\n\\n---\\n\\n## Qualitätssicherung\\n\\n```bash\\n# Syntax prüfen\\npython3 -m py_compile \/opt\/scripts\/pipeline\/steps\/semantic.py\\n\\n# Imports testen\\ncd \/opt\/scripts\/pipeline && python3 -c \\\"from steps.semantic import *; print('OK')\\\"\\n\\n# Ruff Check\\n\/opt\/scripts\/pipeline\/venv\/bin\/ruff check steps\/semantic.py\\n```\\n\\n---\\n\\n## Akzeptanzkriterien\\n- [ ] CLAUDE.md gelesen\\n- [ ] Bestehender Code analysiert (DRY-Prüfung)\\n- [ ] Constants importiert (keine Magic Numbers)\\n- [ ] 6 neue Step-Funktionen implementiert\\n- [ ] Step-Registry erweitert\\n- [ ] Syntax-Check bestanden\\n- [ ] Ruff-Check bestanden\",\n \"type\": \"ai_task\",\n \"status\": \"pending\",\n \"created_by\": \"mcp-tasks\",\n \"created_by_type\": \"ai\",\n \"parent_task_id\": 492,\n \"due_date\": null,\n \"created_at\": \"2025-12-28T01:33:12.736956\",\n \"updated_at\": \"2025-12-28T01:41:04.310610\",\n \"completed_at\": null,\n \"metadata\": {}\n },\n \"message\": \"Task #497 updated\"\n}"
}
]
}
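The response above confirms the update. The registry section of the task implies that the executor resolves `step_type` strings to callables at runtime; a minimal sketch of such a dispatcher follows, assuming a `STEP_REGISTRY` dict shaped like the one in the task description (the name `run_step` and the error handling are illustrative assumptions, not taken from the actual executor.py):

```python
from typing import Any, Callable

# Illustrative stand-in for the STEP_REGISTRY populated in steps/__init__.py
STEP_REGISTRY: dict[str, Callable[..., Any]] = {}


def run_step(step_type: str, payload: Any, config: dict) -> Any:
    """Resolve a step_type string to its registered callable and invoke it.

    Hypothetical helper; the real executor.py may differ.
    """
    try:
        step_fn = STEP_REGISTRY[step_type]
    except KeyError:
        known = ', '.join(sorted(STEP_REGISTRY)) or '<none registered>'
        raise ValueError(f'unknown step_type {step_type!r} (known: {known})') from None
    return step_fn(payload, config)
```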