feat(skills): add databricks-memory-prompts skill by menonpg · Pull Request #564 · databricks-solutions/ai-dev-kit

menonpg · 2026-06-18T00:18:36Z

Summary

Add a skill for building memory-aware prompts — AI applications that learn from production feedback using RAG + RLM (Recursive Language Modeling) architecture.

Based on soul.py — open source RAG + RLM memory architecture
Contributed by ThinkCreate.AI

The Problem

You deploy a PII redaction prompt. Users report bugs:

"It missed phone extensions like 555-1234 x789"
"It redacted Boston General Hospital but that is not patient PII"

You fix the prompt. A month later, a colleague deploys a similar prompt — same bugs. The learning was in your head, not in the system.

The Solution: RAG + RLM Memory Architecture

This skill implements the memory pattern from soul.py.

How the Three Tables Connect

┌─────────────────────────────────────────────────────────────────┐
│                    THE COMPLETE FLOW                             │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  USER REPORTS BUG                                               │
│  "555-1234 x789 wasn't redacted"                             │
│              │                                                   │
│              ▼                                                   │
│  ┌─────────────────────────────────────┐                        │
│  │     feedback table (raw input)      │                        │
│  │  "User said extension not caught"   │                        │
│  │  "User said hospital name redacted" │                        │
│  │  "User said brother name missed"    │                        │
│  └─────────────────────────────────────┘                        │
│              │                                                   │
│              │  Weekly Lakeflow job runs ai_query():            │
│              │  "Summarize these 3 complaints into 1 pattern"   │
│              │                                                   │
│              ▼  ← THIS IS RLM (LLM distills feedback)           │
│  ┌─────────────────────────────────────┐                        │
│  │     patterns table (distilled)      │                        │
│  │  "Phone extensions need handling"   │  confidence: 0.9       │
│  │  "Facility names are not PII"       │  confidence: 0.85      │
│  └─────────────────────────────────────┘                        │
│              │                                                   │
│              │  You explicitly document high-confidence ones    │
│              ▼                                                   │
│  ┌─────────────────────────────────────┐                        │
│  │     decisions table (explicit)      │                        │
│  │  "Use [NAME] not [REDACTED]"        │  confidence: 1.0       │
│  └─────────────────────────────────────┘                        │
│              │                                                   │
│              │  At prompt-time: SELECT FROM patterns, decisions │
│              ▼  ← THIS IS RAG (retrieve before generate)        │
│  ┌─────────────────────────────────────┐                        │
│  │     ENHANCED PROMPT                 │                        │
│  │  "You are a PII system...           │                        │
│  │   Learned patterns:                 │                        │
│  │   - Phone extensions need handling" │                        │
│  └─────────────────────────────────────┘                        │
│              │                                                   │
│              ▼                                                   │
│  LLM generates better output                                    │
│  (which may produce new feedback → cycle continues)             │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘

RAG (Retrieval-Augmented Generation)

At prompt-time:

Query patterns and decisions tables
Inject relevant context into the prompt
LLM sees your accumulated knowledge

RLM (Recursive Language Modeling)

Periodically:

Collect raw feedback from feedback table
Use ai_query() to distill into patterns
Store in patterns table with confidence scores
Recursive: These patterns inform future LLM calls, which produce better outputs, which generate less feedback
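
The collection and distillation steps above can be sketched in plain Python, independent of Spark. This is a minimal illustration, not the skill's implementation: `group_feedback` and `build_distillation_prompt` are hypothetical helper names, and the prompt wording is an assumption. The resulting string is what would be passed to ai_query() or any other LLM call.

```python
from collections import defaultdict


def group_feedback(rows):
    """Group (task_scope, correction) pairs by scope, dropping blanks and exact duplicates."""
    grouped = defaultdict(list)
    for scope, correction in rows:
        text = (correction or "").strip()
        # Check `text` first so empty corrections never create an empty scope entry.
        if text and text not in grouped[scope]:
            grouped[scope].append(text)
    return dict(grouped)


def build_distillation_prompt(scope, corrections):
    """Build one prompt asking the LLM to summarize several corrections into a pattern."""
    bullets = "\n".join(f"- {c}" for c in corrections)
    return (
        f"The following user corrections were reported for the task '{scope}'.\n"
        f"{bullets}\n"
        "Summarize them into one short, reusable pattern."
    )


rows = [
    ("pii_redaction", "Extension x789 missed"),
    ("pii_redaction", "Extension x789 missed"),
    ("pii_redaction", "Hospital name redacted"),
]
grouped = group_feedback(rows)
prompt = build_distillation_prompt("pii_redaction", grouped["pii_redaction"])
```

Sending all corrections for a scope in one prompt lets the model merge differently worded complaints, which a per-row call cannot do.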

How It Works

Step 1: Create tables (user must do this manually)

CREATE SCHEMA IF NOT EXISTS my_catalog.memory;

-- Databricks SQL has no TEXT type, so free-text columns use STRING.
-- Delta tables need the allowColumnDefaults feature for DEFAULT clauses.

-- Raw user corrections go here
CREATE TABLE IF NOT EXISTS my_catalog.memory.feedback (
    id STRING DEFAULT uuid(),
    correction STRING,
    task_scope STRING,
    created_at TIMESTAMP DEFAULT current_timestamp()
) TBLPROPERTIES ('delta.feature.allowColumnDefaults' = 'supported');

-- Distilled patterns (RLM output)
CREATE TABLE IF NOT EXISTS my_catalog.memory.patterns (
    id STRING DEFAULT uuid(),
    pattern STRING NOT NULL,
    evidence STRING,
    task_scope STRING,
    confidence DOUBLE DEFAULT 0.8,
    created_at TIMESTAMP DEFAULT current_timestamp()
) TBLPROPERTIES ('delta.feature.allowColumnDefaults' = 'supported');

-- Explicit decisions you document
CREATE TABLE IF NOT EXISTS my_catalog.memory.decisions (
    id STRING DEFAULT uuid(),
    decision STRING NOT NULL,
    rationale STRING,
    task_scope STRING,
    created_at TIMESTAMP DEFAULT current_timestamp()
) TBLPROPERTIES ('delta.feature.allowColumnDefaults' = 'supported');

Step 2: Log feedback when users report issues

INSERT INTO my_catalog.memory.feedback (correction, task_scope)
VALUES ('Phone extension x789 was not redacted', 'pii_redaction');
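
When feedback is logged from application code rather than typed by hand, user text should not be concatenated into the SQL string. A minimal sketch, assuming PySpark 3.4+ where `spark.sql` accepts named parameter markers through `args`; `log_feedback` is a hypothetical helper, not part of the skill, and the table name is the one from Step 1.

```python
def log_feedback(spark, correction, task_scope,
                 table="my_catalog.memory.feedback"):
    """Insert one user correction using named parameter markers."""
    correction = correction.strip()
    if not correction:
        raise ValueError("correction must not be empty")
    # The table name comes from trusted code; only user-supplied values are bound.
    spark.sql(
        f"INSERT INTO {table} (correction, task_scope) "
        "VALUES (:correction, :task_scope)",
        args={"correction": correction, "task_scope": task_scope},
    )
```

Usage would look like `log_feedback(spark, "Phone extension x789 was not redacted", "pii_redaction")`. Because the column list is explicit, the `id` and `created_at` defaults from Step 1 still apply.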

Step 3: RLM distillation (in 3-learning-pipeline.md)

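from pyspark.sql.functions import col, expr, lit

# Weekly job: distill feedback into patterns using ai_query()
# Note: GROUP BY correction counts only exact duplicate texts, so
# differently worded reports of the same bug are not merged here.
feedback_df = spark.sql("""
    SELECT correction, COUNT(*) as freq
    FROM my_catalog.memory.feedback
    WHERE task_scope = 'pii_redaction'
    GROUP BY correction
    HAVING COUNT(*) >= 3
""")

# Use LLM to extract a pattern from each recurring correction
patterns_df = (
    feedback_df
    .withColumn("pattern", expr("""
        ai_query('databricks-meta-llama-3-3-70b-instruct',
            concat('Extract one pattern from this feedback: ', correction))
    """))
    .withColumn("evidence", col("correction"))
    # Without task_scope the Step 4 filter would never match these rows
    .withColumn("task_scope", lit("pii_redaction"))
)

# Store distilled patterns
patterns_df.select("pattern", "evidence", "task_scope") \
    .write.mode("append").saveAsTable("my_catalog.memory.patterns")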
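
The patterns table carries a confidence score, but the snippet above does not set one. One possible heuristic, purely an assumption and not taken from the PR, derives confidence from how often a correction was reported, with a smoothing term so that a handful of reports never reaches 1.0. `confidence_from_frequency` and `prior_weight` are hypothetical names.

```python
def confidence_from_frequency(freq, prior_weight=1.0):
    """Map a report count to a score in [0, 1): freq / (freq + prior_weight)."""
    if freq < 0:
        raise ValueError("freq must be non-negative")
    if prior_weight <= 0:
        raise ValueError("prior_weight must be positive")
    return freq / (freq + prior_weight)


# Three reports with the default smoothing give 3 / (3 + 1).
score = confidence_from_frequency(3)
```

A larger `prior_weight` makes the score rise more slowly, which suits noisy feedback channels.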

Step 4: RAG retrieval at prompt-time

# Query the database for patterns
patterns_df = spark.sql("""
    SELECT pattern FROM my_catalog.memory.patterns
    WHERE task_scope = 'pii_redaction'
    ORDER BY confidence DESC
    LIMIT 5
""")

# Build enhanced prompt with retrieved context
enhanced_prompt = base_prompt + "\n\nLearned patterns:\n"
for row in patterns_df.collect():
    enhanced_prompt += f"- {row.pattern}\n"

What the LLM sees:

You are a PII redaction system...

Learned patterns:
- Phone numbers with extensions (x1234) require explicit handling
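
The prompt assembly in Step 4 can be factored into a pure function so it is testable without a cluster. This is a sketch under assumptions: `build_enhanced_prompt`, `min_confidence`, and `limit` are hypothetical names, and the input is a list of (pattern, confidence) pairs such as the rows collected from the patterns table.

```python
def build_enhanced_prompt(base_prompt, patterns, min_confidence=0.0, limit=5):
    """Append the highest-confidence patterns to base_prompt as a bullet list."""
    seen = set()
    kept = []
    # A NULL confidence is treated as 0.0 so sorting never compares None with a float.
    ranked = sorted(patterns, key=lambda p: p[1] or 0.0, reverse=True)
    for text, confidence in ranked:
        text = (text or "").strip()
        if not text or text in seen or (confidence or 0.0) < min_confidence:
            continue
        seen.add(text)
        kept.append(text)
        if len(kept) == limit:
            break
    if not kept:
        # No usable memory: leave the prompt unchanged rather than add an empty header.
        return base_prompt
    bullets = "\n".join(f"- {t}" for t in kept)
    return f"{base_prompt}\n\nLearned patterns:\n{bullets}\n"


prompt = build_enhanced_prompt(
    "You are a PII redaction system...",
    [("Phone numbers with extensions (x1234) require explicit handling", 0.9)],
)
```

Returning the base prompt untouched when nothing qualifies keeps a cold-start deployment identical to one without memory.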

Components

File	Purpose
`SKILL.md`	Main skill — quick start, RAG + RLM architecture overview
`1-memory-schema.md`	Production DDL with indexes, retention, confidence decay
`2-vector-search-setup.md`	Semantic retrieval via Databricks Vector Search
`3-learning-pipeline.md`	RLM distillation — auto-extract patterns from feedback using Lakeflow + ai_query()
`4-mlflow-integration.md`	Track which memories shaped each prompt version
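
The table mentions confidence decay without showing a formula. One common way to express it is exponential decay with a half-life, sketched below. This is an assumption about what decay could look like, not the logic in `1-memory-schema.md`; `decayed_confidence` and the 90-day default are hypothetical.

```python
def decayed_confidence(confidence, age_days, half_life_days=90.0):
    """Halve the confidence every half_life_days since the pattern was last seen."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    if half_life_days <= 0:
        raise ValueError("half_life_days must be positive")
    return confidence * 0.5 ** (age_days / half_life_days)


# A pattern stored at 0.8 and not seen for one half-life drops to half that value.
current = decayed_confidence(0.8, age_days=90)
```

Applying the decay at read time, rather than rewriting stored scores, keeps the original confidence available for auditing.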

Prerequisites

Unity Catalog with CREATE TABLE permission
Serverless SQL Warehouse or DBR 15.1+ cluster
Tables must be created manually (Step 1 above)

Why This Matters

"MLflow tracks what you deployed. Memory tracks what you learned."

The RAG + RLM pattern from soul.py enables:

Feedback accumulates → raw user corrections in feedback table
Patterns distill → LLM extracts patterns via ai_query() (RLM)
Context injects → patterns retrieved at prompt-time (RAG)
Learning compounds → better outputs produce less feedback

Without memory: Session 1 fixes a bug. Session 10 hits the same bug.

With memory: Session 1 logs the pattern. Session 10 inherits it automatically.

Pull request overview

Adds a new Databricks “skill” (databricks-memory-prompts) documenting patterns for memory-aware prompt construction using Unity Catalog tables, Vector Search retrieval, a Lakeflow learning loop, and MLflow Prompt Registry/Tracing lineage.

Changes:

Introduces a new SKILL.md with end-to-end patterns (schema → retrieval → prompt enhancement → MLflow registration).
Adds reference guides for Unity Catalog DDL, Vector Search setup, a Lakeflow learning pipeline, and MLflow integration patterns.
Provides example snippets for tracing, experiments, and prompt evolution/lineage.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 17 comments.


File	Description
`databricks-skills/databricks-memory-prompts/SKILL.md`	Main skill doc with quick start schema + `MemoryPromptEnhancer` example and usage patterns
`databricks-skills/databricks-memory-prompts/1-memory-schema.md`	Detailed Unity Catalog DDL for memory tables and supporting structures
`databricks-skills/databricks-memory-prompts/2-vector-search-setup.md`	Vector Search endpoint/index setup and query helper examples
`databricks-skills/databricks-memory-prompts/3-learning-pipeline.md`	Lakeflow pipeline example for extracting/updating/decaying memories
`databricks-skills/databricks-memory-prompts/4-mlflow-integration.md`	MLflow Prompt Registry + tracing + experiment tracking + evolution examples


+CREATE TABLE IF NOT EXISTS catalog.memory.decisions (
+    id STRING DEFAULT uuid(),
+    decision TEXT NOT NULL,
+    context TEXT,
+    rationale TEXT,
+    created_at TIMESTAMP DEFAULT current_timestamp(),
+    confidence DOUBLE DEFAULT 1.0,
+    tags ARRAY<STRING>
+);


+CREATE TABLE IF NOT EXISTS catalog.memory.patterns (
+    id STRING DEFAULT uuid(),
+    pattern TEXT NOT NULL,
+    evidence TEXT,
+    frequency INT DEFAULT 1,
+    last_seen TIMESTAMP DEFAULT current_timestamp(),
+    confidence DOUBLE,
+    tags ARRAY<STRING>
+);


+CREATE TABLE IF NOT EXISTS catalog.memory.feedback (
+    id STRING DEFAULT uuid(),
+    run_id STRING,
+    input_hash STRING,
+    output TEXT,
+    correction TEXT,
+    feedback_type STRING, -- 'correction', 'preference', 'complaint'
+    created_at TIMESTAMP DEFAULT current_timestamp()
+);


+    def retrieve_context(self, task_description: str, k: int = 5) -> dict:
+        """Retrieve relevant memories for a task."""
+        # Semantic search across memory
+        results = self.vs_client.index(self.vector_index).similarity_search(
+            query_text=task_description,
+            columns=["memory_type", "content", "confidence"],
+            num_results=k
+        )
+
+        # Group by type
+        context = {"decisions": [], "patterns": [], "feedback": []}
+        for row in results.get("result", {}).get("data_array", []):
+            memory_type = row[0]
+            if memory_type in context:
+                context[memory_type].append({
+                    "content": row[1],
+                    "confidence": row[2]
+                })
+        return context
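
Two details in the hunk above are worth checking. The grouping keys are plural ("decisions", "patterns") while the index's `memory_type` column is documented elsewhere in this diff as 'decision, pattern, or feedback', so singular values would be silently dropped. The rows are also read by position. The sketch below groups by column name instead, assuming the response carries column names under `manifest.columns` alongside `result.data_array`, which is my understanding of the Vector Search response shape. `group_results` and `TYPE_TO_BUCKET` are hypothetical names. Separately, the `databricks-vectorsearch` client obtains an index with `get_index(...)`; whether `vs_client.index(...)` is a wrapper defined elsewhere is not visible in this hunk.

```python
TYPE_TO_BUCKET = {
    "decision": "decisions",
    "pattern": "patterns",
    "feedback": "feedback",
}


def group_results(response):
    """Group similarity-search rows by memory type, using column names from the manifest."""
    columns = [c["name"] for c in response.get("manifest", {}).get("columns", [])]
    context = {"decisions": [], "patterns": [], "feedback": []}
    for row in response.get("result", {}).get("data_array") or []:
        record = dict(zip(columns, row))
        bucket = TYPE_TO_BUCKET.get(record.get("memory_type"))
        if bucket is None:
            continue  # unknown memory type: skip rather than fail
        context[bucket].append({
            "content": record.get("content"),
            "confidence": record.get("confidence"),
        })
    return context


# Hand-written response in the assumed shape; the trailing column is the similarity score.
sample = {
    "manifest": {"columns": [{"name": "memory_type"}, {"name": "content"},
                             {"name": "confidence"}, {"name": "score"}]},
    "result": {"data_array": [["pattern", "Facility names are not PII", 0.85, 0.91]]},
}
context = group_results(sample)
```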


+    def enhance_prompt(self, base_prompt: str, task: str) -> str:
+        """Enhance a prompt with memory context."""
+        context = self.retrieve_context(task)
+
+        # Build context block
+        context_lines = []
+
+        if context["decisions"]:
+            context_lines.append("## Relevant Decisions")
+            for d in context["decisions"]:
+                context_lines.append(f"- {d['content']} (confidence: {d['confidence']:.2f})")
+
+        if context["patterns"]:
+            context_lines.append("\n## Learned Patterns")
+            for p in context["patterns"]:
+                context_lines.append(f"- {p['content']}")
+
+        if context["feedback"]:
+            context_lines.append("\n## Past Feedback")
+            for f in context["feedback"]:
+                context_lines.append(f"- {f['content']}")
+


+            mem = spark.sql(f"""
+                SELECT * FROM catalog.memory.{link.memory_type}s
+                WHERE id = '{link.memory_id}'
+            """).first()
+            if mem:
+                memories.append({
+                    "type": link.memory_type,
+                    "content": mem.get("decision") or mem.get("pattern") or mem.get("correction"),
+                    "influence": link.influence_score
+                })
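
The lookup in this hunk has three likely problems: appending "s" to `memory_type` yields `feedbacks` for feedback rows, the id is interpolated directly into the SQL string, and a PySpark `Row` has no `.get` method as far as I know. A minimal alternative, assuming PySpark 3.4+ named parameter markers and the table and column names shown in this PR; `MEMORY_SOURCES` and `load_memory_content` are hypothetical names.

```python
MEMORY_SOURCES = {
    # memory_type -> (table, content column)
    "decision": ("catalog.memory.decisions", "decision"),
    "pattern": ("catalog.memory.patterns", "pattern"),
    "feedback": ("catalog.memory.feedback", "correction"),
}


def load_memory_content(spark, memory_type, memory_id):
    """Return the text of one memory row, or None if the id is not found."""
    if memory_type not in MEMORY_SOURCES:
        raise ValueError(f"unknown memory_type: {memory_type!r}")
    # Table and column come from the fixed mapping above, never from input.
    table, column = MEMORY_SOURCES[memory_type]
    row = spark.sql(
        f"SELECT {column} AS content FROM {table} WHERE id = :memory_id",
        args={"memory_id": memory_id},
    ).first()
    return None if row is None else row["content"]
```

Restricting the table name to a fixed mapping also rejects unexpected `memory_type` values before any query runs.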


+    decision TEXT NOT NULL COMMENT 'The decision that was made',
+    context TEXT COMMENT 'What situation led to this decision',
+    rationale TEXT COMMENT 'Why this decision was made',
+    alternatives TEXT COMMENT 'Other options considered',


+    pattern TEXT NOT NULL COMMENT 'The learned pattern',
+    evidence TEXT COMMENT 'Examples or proof of this pattern',


+    input_text TEXT COMMENT 'The input that produced the output',
+    input_hash STRING COMMENT 'Hash of input for deduplication',
+    output_text TEXT COMMENT 'What the model produced',
+    correction TEXT COMMENT 'What the user said it should be',


+    memory_type STRING NOT NULL COMMENT 'decision, pattern, or feedback',
+    source_id STRING NOT NULL COMMENT 'ID in source table',
+    content TEXT NOT NULL COMMENT 'Text to embed',
+    embedding ARRAY<FLOAT> COMMENT 'Vector embedding',


Add a skill for building AI applications with persistent memory using RAG + RLM (Recursive Language Modeling) architecture from soul.py.

Components:
- SKILL.md: Main skill with quick start and architecture overview
- 1-memory-schema.md: Unity Catalog DDL for decisions, patterns, feedback
- 2-vector-search-setup.md: Vector Search index configuration
- 3-learning-pipeline.md: Lakeflow pipeline for pattern extraction
- 4-mlflow-integration.md: Prompt Registry + Tracing integration

The core pattern:
1. Store what you learn (patterns, decisions) in Unity Catalog
2. Retrieve relevant context when building prompts
3. Inject context into prompts before calling the LLM

Based on: github.com/menonpg/soul.py
Contributed by: ThinkCreate.AI (thinkcreate.ai)

Copilot AI review requested due to automatic review settings June 18, 2026 00:18

Copilot started reviewing on behalf of menonpg June 18, 2026 00:19 View session

Copilot AI reviewed Jun 18, 2026


menonpg force-pushed the feature/memory-prompts-skill branch 2 times, most recently from b536ad2 to 39ad106 Compare June 18, 2026 00:47

menonpg force-pushed the feature/memory-prompts-skill branch from 39ad106 to 490ca2e Compare June 18, 2026 00:58

menonpg wants to merge 1 commit into databricks-solutions:main from menonpg:feature/memory-prompts-skill
