
[FEATURE]: Implement the Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models #15456

@charmandercha

Description


Feature hasn't been suggested before.

  • I have verified this feature I'm about to request hasn't been suggested before.

Describe the enhancement you want to request

Proposal: Agentic Context Engineering (ACE) Implementation - Technical Review Needed

Context

I am proposing an implementation of the paper Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models (ICLR 2026) by Zhang et al. (Stanford/SambaNova).

What has been done

I used an AI agent (running on a fork of OpenCode) to attempt implementing ACE, and have now released the skills that made that process possible: ACE-implementation

The repository contains:

  • 9 specialized skills designed for an AI agent to build ACE components (Foundation, Agentic Roles, Semantic Intelligence, Core Integration, Multi-Epoch Reflection, etc.)
  • An integration pattern with Model2Vec for embeddings (chosen for its very fast inference)
  • Structured documentation that guided the agent through the implementation

Important: These skills are NOT a working implementation - they are prompt-based skills that guide an AI agent through implementing ACE in any codebase. Think of them as a "recipe" that attempts to reproduce the process I went through.
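To make the embedding integration concrete, here is a minimal sketch of how semantic deduplication of playbook entries could work. The bag-of-words `embed` function below is only a stdlib stand-in for a real embedding model such as Model2Vec; all names are my own illustration, not code from the skills repo.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in for a real embedding model: a bag-of-words vector.
    # The skills assume Model2Vec here; this stub just keeps the sketch runnable.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def is_duplicate(new_entry: str, playbook: list[str], threshold: float = 0.8) -> bool:
    # Skip entries that are semantically close to something already stored,
    # so incremental updates do not bloat the context with near-duplicates.
    return any(cosine(embed(new_entry), embed(old)) >= threshold for old in playbook)

playbook = ["Prefer incremental delta updates over full context rewrites"]
print(is_duplicate("prefer incremental delta updates over full context rewrites", playbook))  # True
print(is_duplicate("Run the test suite before committing changes", playbook))  # False
```

With a real embedding model, the same structure catches paraphrases rather than only near-verbatim repeats.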

Why a fresh implementation by developers is the right approach

I lack the technical competence to evaluate whether these skills will actually produce efficient code. I used Gemini 3.1 Pro and MiniMax to assist in developing them, and the resulting prompts are of a much higher standard than I could write alone. However, without adequate technical knowledge, I cannot distinguish what works well from what might be wrong.

This is exactly why I am sharing these skills - not as a final product, but as a starting point. A proper implementation built from scratch by developers who understand the codebase would be far superior to anything an AI agent could produce following my prompts.

Why this matters: ACE Advantages

The paper presents significant advances:

  1. Brevity Bias Resolution: Previous methods lost domain-specific insights when summarizing to concise instructions. ACE preserves details through structured, incremental updates.

  2. Context Collapse Prevention: Previous iterative rewrites degraded information over time. ACE uses modular updates that maintain detailed knowledge.

  3. Self-learning without labeled supervision: The framework can adapt using natural execution feedback, not requiring expensive labels.

  4. Impressive results:

    • +10.6% on agent benchmarks (AppWorld)
    • +8.6% on finance (FiNER, Formula)
    • 86.9% lower adaptation latency
    • Matched the top-ranked agent using only a smaller open-source model
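A minimal sketch of what "structured, incremental updates" might mean in practice: a playbook of itemized bullets with helpful/harmful counters, updated by deltas rather than full rewrites. The structure loosely follows the paper's description; the names and pruning rule are my own illustrative assumptions.

```python
from dataclasses import dataclass, field
from itertools import count

@dataclass
class Bullet:
    id: int
    text: str
    helpful: int = 0
    harmful: int = 0

@dataclass
class Playbook:
    bullets: dict = field(default_factory=dict)
    _ids: object = field(default_factory=count)

    def add(self, text: str) -> int:
        # Incremental delta: append a new bullet instead of rewriting everything,
        # which is how ACE avoids brevity bias (advantage 1).
        bid = next(self._ids)
        self.bullets[bid] = Bullet(bid, text)
        return bid

    def vote(self, bid: int, helpful: bool) -> None:
        # Execution feedback adjusts counters rather than erasing content
        # (advantage 3: no labeled supervision needed).
        b = self.bullets[bid]
        if helpful:
            b.helpful += 1
        else:
            b.harmful += 1

    def prune(self, min_score: int = -2) -> None:
        # Only clearly harmful bullets are dropped, so knowledge is never
        # degraded by wholesale rewrites (advantage 2: no context collapse).
        self.bullets = {i: b for i, b in self.bullets.items()
                        if b.helpful - b.harmful > min_score}

pb = Playbook()
a = pb.add("Check API pagination limits before batch calls")
b = pb.add("Always retry immediately without backoff")
pb.vote(a, helpful=True)
for _ in range(3):
    pb.vote(b, helpful=False)
pb.prune()
print([x.text for x in pb.bullets.values()])  # only the helpful bullet survives
```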

What this means for a Terminal Agent: Cognitive Advancement

For a terminal agent, ACE represents a significant cognitive leap:

  • Continuous Evolution: Context is not static - it accumulates, refines, and organizes strategies automatically through multiple iterations
  • Adaptive Memory: Each interaction improves the agent's playbook, allowing it to learn from successes and failures
  • Efficiency: Low computational overhead with high adaptation quality, reducing API costs and latency
  • Scalability: Works both offline (system prompts) and online (agent memory), adapting to different scenarios
  • Self-Improvement: Ability to improve without human intervention, using natural execution feedback

What I'm asking for

I suggest developers evaluate the skills and consider:

  1. Test if the skills can guide an agent to produce a working ACE implementation
  2. Use them as inspiration to build a proper implementation from scratch (which would be much better)
  3. Identify inefficiencies or gaps in the prompts
  4. Contribute improvements or create a better version

The skills may not be perfect, but I believe they can serve as a reference for how an AI agent approached the problem. The theoretical foundation of the paper is solid and the results justify serious exploration.

Given the concern about the volume of unreviewed AI code suggestions, it seemed more sensible not to submit my AI-generated code directly, but instead to capture the entire process as a skill and ask someone more competent to carry out the implementation in a supervised manner.

I sincerely hope this content is helpful, because using Gemini 3.1 cost me $10.


Repository link: https://github.com/charmandercha/ACE-implementation
Paper: https://arxiv.org/abs/2510.04618
