feat(skills): add databricks-memory-prompts skill #564
Open
menonpg wants to merge 1 commit into
Pull request overview
Adds a new Databricks “skill” (databricks-memory-prompts) documenting patterns for memory-aware prompt construction using Unity Catalog tables, Vector Search retrieval, a Lakeflow learning loop, and MLflow Prompt Registry/Tracing lineage.
Changes:
- Introduces a new `SKILL.md` with end-to-end patterns (schema → retrieval → prompt enhancement → MLflow registration).
- Adds reference guides for Unity Catalog DDL, Vector Search setup, a Lakeflow learning pipeline, and MLflow integration patterns.
- Provides example snippets for tracing, experiments, and prompt evolution/lineage.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 17 comments.
Show a summary per file
| File | Description |
|---|---|
| `databricks-skills/databricks-memory-prompts/SKILL.md` | Main skill doc with quick start schema + MemoryPromptEnhancer example and usage patterns |
| `databricks-skills/databricks-memory-prompts/1-memory-schema.md` | Detailed Unity Catalog DDL for memory tables and supporting structures |
| `databricks-skills/databricks-memory-prompts/2-vector-search-setup.md` | Vector Search endpoint/index setup and query helper examples |
| `databricks-skills/databricks-memory-prompts/3-learning-pipeline.md` | Lakeflow pipeline example for extracting/updating/decaying memories |
| `databricks-skills/databricks-memory-prompts/4-mlflow-integration.md` | MLflow Prompt Registry + tracing + experiment tracking + evolution examples |
Comment on lines +33 to +41

    CREATE TABLE IF NOT EXISTS catalog.memory.decisions (
      id STRING DEFAULT uuid(),
      decision TEXT NOT NULL,
      context TEXT,
      rationale TEXT,
      created_at TIMESTAMP DEFAULT current_timestamp(),
      confidence DOUBLE DEFAULT 1.0,
      tags ARRAY<STRING>
    );
Comment on lines +44 to +52

    CREATE TABLE IF NOT EXISTS catalog.memory.patterns (
      id STRING DEFAULT uuid(),
      pattern TEXT NOT NULL,
      evidence TEXT,
      frequency INT DEFAULT 1,
      last_seen TIMESTAMP DEFAULT current_timestamp(),
      confidence DOUBLE,
      tags ARRAY<STRING>
    );
Comment on lines +55 to +63

    CREATE TABLE IF NOT EXISTS catalog.memory.feedback (
      id STRING DEFAULT uuid(),
      run_id STRING,
      input_hash STRING,
      output TEXT,
      correction TEXT,
      feedback_type STRING, -- 'correction', 'preference', 'complaint'
      created_at TIMESTAMP DEFAULT current_timestamp()
    );
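One thing worth checking in the three quick-start tables above: as far as I know, Databricks SQL has no `TEXT` type (free-form text columns are `STRING`), and column `DEFAULT` values on Delta tables need the `delta.feature.allowColumnDefaults` table feature. The sketch below shows the `decisions` table under those assumptions. Whether `uuid()` is accepted as a default expression should be verified on the target runtime, and the `DECISIONS_DDL` name is only illustrative.

```python
# Sketch only: the decisions table from the PR with STRING in place of TEXT.
# Assumes a Delta table in Unity Catalog; not verified against a live workspace.
DECISIONS_DDL = """
CREATE TABLE IF NOT EXISTS catalog.memory.decisions (
  id STRING DEFAULT uuid(),
  decision STRING NOT NULL,
  context STRING,
  rationale STRING,
  created_at TIMESTAMP DEFAULT current_timestamp(),
  confidence DOUBLE DEFAULT 1.0,
  tags ARRAY<STRING>
)
TBLPROPERTIES ('delta.feature.allowColumnDefaults' = 'supported')
"""

# With a live Spark session this would be executed as:
#     spark.sql(DECISIONS_DDL)
```

The same substitution would apply to the `patterns` and `feedback` tables.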
Comment on lines +83 to +101

    def retrieve_context(self, task_description: str, k: int = 5) -> dict:
        """Retrieve relevant memories for a task."""
        # Semantic search across memory
        results = self.vs_client.index(self.vector_index).similarity_search(
            query_text=task_description,
            columns=["memory_type", "content", "confidence"],
            num_results=k
        )

        # Group by type
        context = {"decisions": [], "patterns": [], "feedback": []}
        for row in results.get("result", {}).get("data_array", []):
            memory_type = row[0]
            if memory_type in context:
                context[memory_type].append({
                    "content": row[1],
                    "confidence": row[2]
                })
        return context
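A possible wrinkle in the snippet above: the index schema in this PR documents `memory_type` values as `decision, pattern, or feedback` (singular), while the grouping dict uses plural keys for two of them, so `memory_type in context` would silently skip decisions and patterns if the index stores the singular form. Here is a minimal sketch of the grouping step with an explicit mapping. `TYPE_TO_KEY` and `group_memories` are hypothetical names, and the response shape is assumed to be the `result.data_array` layout the PR code already reads.

```python
from typing import Any

# Hypothetical mapping from stored memory_type values to prompt sections.
TYPE_TO_KEY = {"decision": "decisions", "pattern": "patterns", "feedback": "feedback"}


def group_memories(response: dict[str, Any]) -> dict[str, list[dict[str, Any]]]:
    """Group rows shaped like [memory_type, content, confidence, ...] by section."""
    context: dict[str, list[dict[str, Any]]] = {key: [] for key in TYPE_TO_KEY.values()}
    for row in response.get("result", {}).get("data_array", []):
        # Only the first three columns are used; trailing columns are ignored.
        memory_type, content, confidence = row[0], row[1], row[2]
        # Accept either the singular stored value or an already-plural key.
        key = TYPE_TO_KEY.get(memory_type, memory_type)
        if key in context:
            context[key].append({"content": content, "confidence": confidence})
    return context


sample = {"result": {"data_array": [
    ["decision", "Mask emails before logging", 0.9, 0.81],
    ["pattern", "Phone numbers appear with country codes", 0.7, 0.64],
    ["unknown", "ignored", 0.1, 0.2],
]}}
grouped = group_memories(sample)
```

Keeping the grouping separate from the Vector Search call also makes it testable without a workspace.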
Comment on lines +103 to +124

    def enhance_prompt(self, base_prompt: str, task: str) -> str:
        """Enhance a prompt with memory context."""
        context = self.retrieve_context(task)

        # Build context block
        context_lines = []

        if context["decisions"]:
            context_lines.append("## Relevant Decisions")
            for d in context["decisions"]:
                context_lines.append(f"- {d['content']} (confidence: {d['confidence']:.2f})")

        if context["patterns"]:
            context_lines.append("\n## Learned Patterns")
            for p in context["patterns"]:
                context_lines.append(f"- {p['content']}")

        if context["feedback"]:
            context_lines.append("\n## Past Feedback")
            for f in context["feedback"]:
                context_lines.append(f"- {f['content']}")
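The diff context cuts off before the method returns, so how the context block is joined to the base prompt is not visible here. As a sketch under that caveat, the section-building logic could be factored into a pure function that is easy to unit test. `build_context_block`, `format_item`, and the final join are assumptions, not the PR's code. The sketch also treats a missing confidence as unknown, since formatting `None` with `:.2f` raises a `TypeError`.

```python
from typing import Optional

# Section order and headings mirror the PR snippet.
SECTIONS = [
    ("decisions", "## Relevant Decisions"),
    ("patterns", "## Learned Patterns"),
    ("feedback", "## Past Feedback"),
]


def format_item(content: str, confidence: Optional[float]) -> str:
    """Render one bullet, omitting the confidence when it is not available."""
    if confidence is None:
        return f"- {content}"
    return f"- {content} (confidence: {confidence:.2f})"


def build_context_block(context: dict) -> str:
    """Build the markdown context block; empty sections are skipped."""
    blocks = []
    for key, heading in SECTIONS:
        items = context.get(key) or []
        if not items:
            continue
        lines = [heading]
        for item in items:
            # Only decisions show a confidence, as in the PR snippet.
            conf = item.get("confidence") if key == "decisions" else None
            lines.append(format_item(item["content"], conf))
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks)


def enhance_prompt(base_prompt: str, context: dict) -> str:
    """Assumed join: return the base prompt unchanged when no memories match."""
    block = build_context_block(context)
    return f"{base_prompt}\n\n{block}" if block else base_prompt


example = enhance_prompt(
    "Redact PII from the text.",
    {
        "decisions": [{"content": "Mask emails", "confidence": 0.9}],
        "patterns": [],
        "feedback": [{"content": "Missed a phone number"}],
    },
)
```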
Comment on lines +353 to +362

    mem = spark.sql(f"""
        SELECT * FROM catalog.memory.{link.memory_type}s
        WHERE id = '{link.memory_id}'
    """).first()
    if mem:
        memories.append({
            "type": link.memory_type,
            "content": mem.get("decision") or mem.get("pattern") or mem.get("correction"),
            "influence": link.influence_score
        })
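A few details in this lookup may deserve a second look. `{link.memory_type}s` yields `feedbacks` for feedback rows, while the table defined earlier in the PR is `catalog.memory.feedback`. Interpolating `link.memory_id` directly into the SQL string is fragile. And, to my knowledge, a PySpark `Row` has no `.get()` method, so `mem.asDict().get(...)` or `mem["decision"]` would be needed. Below is a hedged sketch of an allow-listed lookup. `MEMORY_TABLES` and `build_memory_query` are hypothetical names, and the `:id` named parameter assumes a Spark version where `spark.sql(query, args=...)` supports it (3.4 or later).

```python
# Hypothetical allow-list: memory_type -> (table name, content column).
# Column names follow the quick-start schema shown earlier in this PR.
MEMORY_TABLES = {
    "decision": ("catalog.memory.decisions", "decision"),
    "pattern": ("catalog.memory.patterns", "pattern"),
    "feedback": ("catalog.memory.feedback", "correction"),
}


def build_memory_query(memory_type: str, memory_id: str) -> tuple[str, dict[str, str], str]:
    """Return (sql, args, content_column) for a single-memory lookup."""
    if memory_type not in MEMORY_TABLES:
        raise ValueError(f"unknown memory_type: {memory_type!r}")
    table, content_column = MEMORY_TABLES[memory_type]
    # Table and column come from the allow-list; the id is passed as a parameter.
    sql = f"SELECT id, {content_column} AS content FROM {table} WHERE id = :id"
    return sql, {"id": memory_id}, content_column


sql, args, column = build_memory_query("feedback", "abc-123")
# With a live session this could run as:
#     mem = spark.sql(sql, args=args).first()
#     content = mem["content"] if mem else None
```

Aliasing the content column also removes the need for the `or` chain over three column names.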
Comment on lines +14 to +17

    decision TEXT NOT NULL COMMENT 'The decision that was made',
    context TEXT COMMENT 'What situation led to this decision',
    rationale TEXT COMMENT 'Why this decision was made',
    alternatives TEXT COMMENT 'Other options considered',
Comment on lines +45 to +46

    pattern TEXT NOT NULL COMMENT 'The learned pattern',
    evidence TEXT COMMENT 'Examples or proof of this pattern',
Comment on lines +78 to +81

    input_text TEXT COMMENT 'The input that produced the output',
    input_hash STRING COMMENT 'Hash of input for deduplication',
    output_text TEXT COMMENT 'What the model produced',
    correction TEXT COMMENT 'What the user said it should be',
Comment on lines +113 to +116

    memory_type STRING NOT NULL COMMENT 'decision, pattern, or feedback',
    source_id STRING NOT NULL COMMENT 'ID in source table',
    content TEXT NOT NULL COMMENT 'Text to embed',
    embedding ARRAY<FLOAT> COMMENT 'Vector embedding',
b536ad2 to 39ad106
Add a skill for building AI applications with persistent memory using RAG + RLM (Recursive Language Modeling) architecture from soul.py.

Components:
- SKILL.md: Main skill with quick start and architecture overview
- 1-memory-schema.md: Unity Catalog DDL for decisions, patterns, feedback
- 2-vector-search-setup.md: Vector Search index configuration
- 3-learning-pipeline.md: Lakeflow pipeline for pattern extraction
- 4-mlflow-integration.md: Prompt Registry + Tracing integration

The core pattern:
1. Store what you learn (patterns, decisions) in Unity Catalog
2. Retrieve relevant context when building prompts
3. Inject context into prompts before calling the LLM

Based on: github.com/menonpg/soul.py
Contributed by: ThinkCreate.AI (thinkcreate.ai)
39ad106 to 490ca2e
Summary
Add a skill for building memory-aware prompts — AI applications that learn from production feedback using RAG + RLM (Recursive Language Modeling) architecture.
The Problem
You deploy a PII redaction prompt. Users report bugs:
You fix the prompt. A month later, a colleague deploys a similar prompt — same bugs. The learning was in your head, not in the system.
The Solution: RAG + RLM Memory Architecture
This skill implements the memory pattern from soul.py.
How the Three Tables Connect
RAG (Retrieval-Augmented Generation)
At prompt-time:
- `patterns` and `decisions` tables
RLM (Recursive Language Modeling)
Periodically:
- `feedback` table
- `ai_query()` to distill into patterns
- `patterns` table with confidence scores
How It Works
Step 1: Create tables (user must do this manually)
Step 2: Log feedback when users report issues
Step 3: RLM distillation (in 3-learning-pipeline.md)
Step 4: RAG retrieval at prompt-time
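The snippets for these steps did not survive on this page. As a rough sketch of Step 2 only, a feedback record matching the quick-start `feedback` columns (`run_id`, `input_hash`, `output`, `correction`, `feedback_type`) could be assembled as follows. `make_feedback_record` is a hypothetical helper, and SHA-256 over whitespace-normalized input is an assumed choice for `input_hash`, which the PR describes only as a hash of the input for deduplication.

```python
import hashlib

# Feedback types listed in the PR's schema comment.
ALLOWED_TYPES = {"correction", "preference", "complaint"}


def make_feedback_record(
    run_id: str,
    input_text: str,
    output: str,
    correction: str,
    feedback_type: str = "correction",
) -> dict:
    """Build a row for the feedback table; id and created_at are left to table defaults."""
    if feedback_type not in ALLOWED_TYPES:
        raise ValueError(f"unknown feedback_type: {feedback_type!r}")
    # Assumed normalization: collapse whitespace so trivially different inputs dedupe.
    normalized = " ".join(input_text.split())
    input_hash = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    return {
        "run_id": run_id,
        "input_hash": input_hash,
        "output": output,
        "correction": correction,
        "feedback_type": feedback_type,
    }


record = make_feedback_record(
    run_id="run-001",
    input_text="Call me at  555-0100",
    output="Call me at 555-0100",
    correction="Call me at [PHONE]",
)
```

Writing the record to `catalog.memory.feedback` is left out here, since it depends on the workspace and how the table defaults are configured.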
What the LLM sees:
Components
- SKILL.md
- 1-memory-schema.md
- 2-vector-search-setup.md
- 3-learning-pipeline.md
- 4-mlflow-integration.md
Prerequisites
Why This Matters
The RAG + RLM pattern from soul.py enables:
- `feedback` table
- `ai_query()` (RLM)
Without memory: Session 1 fixes a bug. Session 10 hits the same bug.
With memory: Session 1 logs the pattern. Session 10 inherits it automatically.
Related
Contribution from ThinkCreate.AI