Commit 3580fc4

Merge branch 'main' into markdown_fix
2 parents 82cfa1a + 9952953 commit 3580fc4

91 files changed: +4176 -289 lines changed

.agents/notion-agent.ts

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@ import type { AgentDefinition } from './types/agent-definition'
 const definition: AgentDefinition = {
   id: 'notion-query-agent',
   displayName: 'Notion Query Agent',
-  model: 'x-ai/grok-4-fast',
+  model: 'google/gemini-3.1-flash-lite-preview',

   spawnerPrompt:
     'Expert at querying Notion databases and pages to find information and answer questions about content stored in Notion workspaces.',

.agents/notion-researcher.ts

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@ const definition: AgentDefinition = {
   id: 'notion-researcher',
   publisher,
   displayName: 'Notion Researcher',
-  model: 'x-ai/grok-4-fast',
+  model: 'google/gemini-3.1-flash-lite-preview',

   spawnerPrompt:
     'Expert at conducting comprehensive research across Notion workspaces by spawning multiple notion agents in parallel waves to gather information from different angles and sources.',
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+# LESSONS
+
+## What went well
+- `git diff -- cli/src/index.tsx` immediately after editing made it easy to enforce exact scope for a one-line change.
+- Validating with `bun run cli/src/index.tsx --help` gave a quick, non-effectful end-to-end check that startup output works.
+
+## What was tricky
+- Bun script invocation shape from repo root was easy to misremember: `bun --cwd cli run typecheck` failed, while `bun run --cwd cli typecheck` succeeded.
+
+## Useful patterns
+- Entrypoint logs placed at the top of `main()` apply to all command paths that enter `main()`; verify with a non-interactive path first.
+- For tiny requests, combine: (1) minimal code edit, (2) scoped diff check, (3) one runtime smoke check, (4) one typecheck.
+
+## Future efficiency notes
+- Put exact validation commands directly in `PLAN.md` to avoid command-syntax backtracking during validation.
Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
+# PLAN
+
+## Implementation Steps
+1. Update `cli/src/index.tsx` by adding `console.log('Codebuff CLI starting')` as the first statement in `main()`.
+2. Inspect the diff to confirm scope: exactly one new `console.log` line in `cli/src/index.tsx` and no unintended edits.
+3. Run lightweight validation for CLI startup behavior:
+   - Run a non-interactive path (`--help`) and confirm the line appears once.
+   - Confirm the log sits before command branching in `main()` so it applies to all `main()` paths.
+
+## Dependencies / Ordering
+- Step 1 must happen before Step 2 and Step 3.
+- Step 2 should complete before Step 3 to ensure we validate the intended change only.
+
+## Risk Areas
+- Low risk overall.
+- Minor UX risk: the new stdout line appears for all command paths entering `main()` (including `--help`, `login`, and `publish`). This is intentional per spec.
Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+# SPEC
+
+## Overview
+Add a single startup `console.log` to the CLI entrypoint so there is explicit stdout output when the CLI boots.
+
+## Requirements
+1. Modify `cli/src/index.tsx` only for functional code changes.
+2. Add exactly one `console.log(...)` statement.
+3. Place the log at the start of `main()`.
+4. Use a static message string (no timestamp or dynamic args). Chosen message: `Codebuff CLI starting`.
+5. The log should print for any execution path that enters `main()` (including normal startup and command modes like `login`/`publish`).
+6. Keep all existing behavior unchanged aside from the added stdout line.
+
+## Technical Approach
+Insert one `console.log('Codebuff CLI starting')` call as the first statement inside `main()` so it prints once per process run before the rest of startup flow proceeds.
+
+## Files to Create/Modify
+- `cli/src/index.tsx` (modify)
+- `.agents/sessions/03-03-0909-add-console-log/SPEC.md` (this spec)
+
+## Out of Scope
+- Replacing existing logger usage with `console.log`
+- Adding additional logs
+- Refactoring startup flow or command handling
+- Any server/web/API changes

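The placement rule in the spec above (log first, branch afterwards) can be illustrated with a minimal sketch. The `main` below is a hypothetical stand-in, not the real entrypoint in `cli/src/index.tsx`, whose body is not visible in this commit view; the argument handling and return values are assumptions made only so the example is self-contained.

```typescript
// Hypothetical stand-in for the CLI entrypoint. The real main() in
// cli/src/index.tsx is not shown here; only the log placement mirrors the spec.
function main(argv: string[]): string {
  // First statement in main(): runs once per call, before any command branching,
  // so every path below sees the same startup line on stdout.
  console.log('Codebuff CLI starting')

  // Assumed command branching, for illustration only.
  if (argv.includes('--help')) return 'help'
  if (argv[0] === 'login') return 'login'
  if (argv[0] === 'publish') return 'publish'
  return 'interactive'
}
```

Because the log precedes the branching, a non-effectful path such as `--help` is enough to confirm it prints exactly once.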
.agents/skills/meta/SKILL.md

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+---
+name: meta
+description: Broad project-level implementation and validation heuristics
+---
+
+# Meta
+
+- When validating CLI changes, run a non-effectful command path first (for example `--help`) before any command that could trigger external side effects. (from .agents/sessions/03-03-0909-add-console-log)
+- For tightly scoped edits, pair runtime smoke-checks with `git diff -- <file>` to verify no unintended spillover. (from .agents/sessions/03-03-0909-add-console-log)
+- From monorepo root, run workspace scripts as `bun run --cwd <workspace> <script>`; if Bun prints global run help, re-check flag order/command shape. (from .agents/sessions/03-03-0909-add-console-log)

.agents/types/agent-definition.ts

Lines changed: 12 additions & 4 deletions
@@ -370,26 +370,32 @@ export type ModelName =
   // Recommended Models

   // OpenAI
+  | 'openai/gpt-5.3'
+  | 'openai/gpt-5.3-codex'
+  | 'openai/gpt-5.2'
   | 'openai/gpt-5.1'
   | 'openai/gpt-5.1-chat'
   | 'openai/gpt-5-mini'
   | 'openai/gpt-5-nano'

   // Anthropic
+  | 'anthropic/claude-sonnet-4.6'
+  | 'anthropic/claude-opus-4.6'
+  | 'anthropic/claude-haiku-4.5'
   | 'anthropic/claude-sonnet-4.5'
   | 'anthropic/claude-opus-4.1'
-  | 'anthropic/claude-opus-4.6'

   // Gemini
+  | 'google/gemini-3-pro-preview'
+  | 'google/gemini-3-flash-preview'
+  | 'google/gemini-3.1-flash-lite-preview'
   | 'google/gemini-2.5-pro'
   | 'google/gemini-2.5-flash'
   | 'google/gemini-2.5-flash-lite'
-  | 'google/gemini-2.5-flash-preview-09-2025'
-  | 'google/gemini-2.5-flash-lite-preview-09-2025'

   // X-AI
-  | 'x-ai/grok-4-07-09'
   | 'x-ai/grok-4-fast'
+  | 'x-ai/grok-4.1-fast'
   | 'x-ai/grok-code-fast-1'

   // Qwen
@@ -416,12 +422,14 @@ export type ModelName =
   | 'moonshotai/kimi-k2:nitro'
   | 'moonshotai/kimi-k2.5'
   | 'moonshotai/kimi-k2.5:nitro'
+  | 'z-ai/glm-5'
   | 'z-ai/glm-4.6'
   | 'z-ai/glm-4.6:nitro'
   | 'z-ai/glm-4.7'
   | 'z-ai/glm-4.7:nitro'
   | 'z-ai/glm-4.7-flash'
   | 'z-ai/glm-4.7-flash:nitro'
+  | 'minimax/minimax-m2.5'
   | (string & {})

 import type { ToolName, GetToolParams } from './tools'

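The `ModelName` union above ends with `(string & {})`, a common TypeScript idiom: it keeps the listed literals available for editor autocompletion while still accepting any other string. The sketch below uses a trimmed copy of the union; `providerOf` and the `my-org/experimental-model` id are hypothetical and exist only to show both kinds of value being accepted.

```typescript
// Trimmed copy of the union from the diff; only a few members are kept.
type ModelName =
  | 'openai/gpt-5.3'
  | 'anthropic/claude-sonnet-4.6'
  | 'google/gemini-3.1-flash-lite-preview'
  | (string & {}) // accepts any string without collapsing the literals into `string`

// A listed literal is accepted and offered by editor autocompletion.
const known: ModelName = 'google/gemini-3.1-flash-lite-preview'

// An unlisted id also type-checks (hypothetical model id).
const custom: ModelName = 'my-org/experimental-model'

// Hypothetical helper: returns the part of the id before the first '/'.
function providerOf(model: ModelName): string {
  const slash = model.indexOf('/')
  return slash === -1 ? model : model.slice(0, slash)
}
```

Writing plain `| string` instead would make the whole union simplify to `string`, which is why the intersection with `{}` is used.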
.agents/types/tools.ts

Lines changed: 17 additions & 0 deletions
@@ -3,6 +3,7 @@
  */
 export type ToolName =
   | 'add_message'
+  | 'apply_patch'
   | 'ask_user'
   | 'code_search'
   | 'end_turn'
@@ -33,6 +34,7 @@ export type ToolName =
  */
 export interface ToolParamsMap {
   add_message: AddMessageParams
+  apply_patch: ApplyPatchParams
   ask_user: AskUserParams
   code_search: CodeSearchParams
   end_turn: EndTurnParams
@@ -67,6 +69,21 @@ export interface AddMessageParams {
   content: string
 }

+/**
+ * Apply a file operation (create, update, or delete) using Codex-style apply_patch format.
+ */
+export interface ApplyPatchParams {
+  /** The file operation to perform. */
+  operation: {
+    /** Operation type: create_file, update_file, or delete_file */
+    type: 'create_file' | 'update_file' | 'delete_file'
+    /** File path relative to project root */
+    path: string
+    /** Diff content. Required for create_file and update_file. Lines prefixed with + for creates, unified diff with @@ hunks for updates. */
+    diff?: string
+  }
+}
+
 /**
  * Ask the user multiple choice questions and pause execution until they respond.
  */

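To make the shape of `ApplyPatchParams` concrete, here is a small sketch with one value per operation type. The interface is copied from the diff above; the file paths, the diff strings, and the `checkApplyPatch` helper are hypothetical. In particular, the exact hunk syntax the tool accepts is not shown in this commit, so the `update_file` string is only illustrative.

```typescript
// Copied from the diff above so the example is self-contained.
interface ApplyPatchParams {
  operation: {
    type: 'create_file' | 'update_file' | 'delete_file'
    path: string
    diff?: string
  }
}

// Create: every line of the new file is prefixed with '+'.
const createFile: ApplyPatchParams = {
  operation: { type: 'create_file', path: 'notes/hello.txt', diff: '+hello\n+world' },
}

// Update: a diff with an @@ hunk (illustrative format, see the note above).
const updateFile: ApplyPatchParams = {
  operation: {
    type: 'update_file',
    path: 'notes/hello.txt',
    diff: '@@ -1,2 +1,2 @@\n hello\n-world\n+there',
  },
}

// Delete: no diff is needed.
const deleteFile: ApplyPatchParams = {
  operation: { type: 'delete_file', path: 'notes/hello.txt' },
}

// Hypothetical helper: enforces the "diff required for create/update" rule
// from the doc comment. Returns an error message, or null when the params look valid.
function checkApplyPatch(params: ApplyPatchParams): string | null {
  const { type, diff } = params.operation
  if (type !== 'delete_file' && (diff === undefined || diff === '')) {
    return `diff is required for ${type}`
  }
  return null
}
```

Since `diff` is optional at the type level, a runtime check of this kind is one way to catch a `create_file` or `update_file` call that omits it.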
agents/__tests__/file-picker.test.ts

Lines changed: 21 additions & 56 deletions
@@ -80,12 +80,7 @@ describe('file-picker agent', () => {
   })

   describe('createFilePicker - max mode', () => {
-    test('uses grok model', () => {
-      const maxPicker = createFilePicker('max')
-      expect(maxPicker.model).toBe('x-ai/grok-4.1-fast')
-    })
-
-    test('spawns two file-listers in parallel', () => {
+    test('spawns single file-lister-max', () => {
       const maxPicker = createFilePicker('max')
       const mockAgentState = createMockAgentState()
       const mockLogger = {
@@ -105,9 +100,13 @@ describe('file-picker agent', () => {

       const toolCall = result.value as ToolCall<'spawn_agents'>
       expect(toolCall.toolName).toBe('spawn_agents')
-      expect(toolCall.input.agents).toHaveLength(2)
-      expect(toolCall.input.agents[0].agent_type).toBe('file-lister')
-      expect(toolCall.input.agents[1].agent_type).toBe('file-lister')
+      expect(toolCall.input.agents).toHaveLength(1)
+      expect(toolCall.input.agents[0].agent_type).toBe('file-lister-max')
+    })
+
+    test('includes file-lister-max in spawnableAgents', () => {
+      const maxPicker = createFilePicker('max')
+      expect(maxPicker.spawnableAgents).toContain('file-lister-max')
     })
   })

@@ -424,7 +423,7 @@ describe('file-picker agent', () => {
   })

   describe('handleStepsMax', () => {
-    test('spawns two file-listers in parallel', () => {
+    test('spawns single file-lister-max with prompt and params', () => {
       const maxPicker = createFilePicker('max')
       const mockAgentState = createMockAgentState()
       const mockLogger = {
@@ -445,16 +444,13 @@ describe('file-picker agent', () => {

       const toolCall = result.value as ToolCall<'spawn_agents'>
       expect(toolCall.toolName).toBe('spawn_agents')
-      expect(toolCall.input.agents).toHaveLength(2)
-
-      // Both should have same prompt and params
+      expect(toolCall.input.agents).toHaveLength(1)
+      expect(toolCall.input.agents[0].agent_type).toBe('file-lister-max')
       expect(toolCall.input.agents[0].prompt).toBe('Find auth files')
-      expect(toolCall.input.agents[1].prompt).toBe('Find auth files')
       expect(toolCall.input.agents[0].params).toEqual({ directories: ['src'] })
-      expect(toolCall.input.agents[1].params).toEqual({ directories: ['src'] })
     })

-    test('merges results from both file-listers', () => {
+    test('extracts results from file-lister-max', () => {
       const maxPicker = createFilePicker('max')
       const mockAgentState = createMockAgentState()
       const mockLogger = {
@@ -472,7 +468,6 @@ describe('file-picker agent', () => {

       generator.next()

-      // Mock result with two spawned agent results - wrapped in toolResult with production structure
       const mockToolResult = {
         agentState: createMockAgentState(),
         toolResult: [
@@ -481,29 +476,14 @@ describe('file-picker agent', () => {
             value: [
               {
                 agentName: 'File Lister',
-                agentType: 'file-lister',
+                agentType: 'file-lister-max',
                 value: {
                   type: 'lastMessage',
                   value: [
                     {
                       role: 'assistant',
                       content: [
-                        { type: 'text', text: 'src/auth.ts\nsrc/login.ts' },
-                      ],
-                    },
-                  ],
-                },
-              },
-              {
-                agentName: 'File Lister',
-                agentType: 'file-lister',
-                value: {
-                  type: 'lastMessage',
-                  value: [
-                    {
-                      role: 'assistant',
-                      content: [
-                        { type: 'text', text: 'src/user.ts\nsrc/auth.ts' }, // auth.ts is duplicate
+                        { type: 'text', text: 'src/auth.ts\nsrc/login.ts\nsrc/user.ts' },
                       ],
                     },
                   ],
@@ -517,7 +497,6 @@ describe('file-picker agent', () => {

       const result = generator.next(mockToolResult)

-      // Should merge and deduplicate
       const toolCall = result.value as ToolCall<'read_files'>
       const paths = toolCall.input.paths
       expect(paths).toHaveLength(3)
@@ -526,7 +505,7 @@ describe('file-picker agent', () => {
       expect(paths).toContain('src/user.ts')
     })

-    test('handles partial failures in max mode', () => {
+    test('handles error from file-lister-max', () => {
       const maxPicker = createFilePicker('max')
       const mockAgentState = createMockAgentState()
       const mockLogger = {
@@ -544,7 +523,6 @@ describe('file-picker agent', () => {

       generator.next()

-      // One success, one error - wrapped in toolResult with production structure
       const mockToolResult = {
         agentState: createMockAgentState(),
         toolResult: [
@@ -553,23 +531,10 @@ describe('file-picker agent', () => {
             value: [
               {
                 agentName: 'File Lister',
-                agentType: 'file-lister',
-                value: {
-                  type: 'lastMessage',
-                  value: [
-                    {
-                      role: 'assistant',
-                      content: [{ type: 'text', text: 'src/file.ts' }],
-                    },
-                  ],
-                },
-              },
-              {
-                agentName: 'File Lister',
-                agentType: 'file-lister',
+                agentType: 'file-lister-max',
                 value: {
                   type: 'error',
-                  message: 'Second file-lister failed',
+                  message: 'File lister max failed',
                 },
               },
             ],
@@ -580,10 +545,10 @@ describe('file-picker agent', () => {

       const result = generator.next(mockToolResult)

-      // Should still proceed with successful results
-      const toolCall = result.value as ToolCall<'read_files'>
-      expect(toolCall.toolName).toBe('read_files')
-      expect(toolCall.input.paths).toContain('src/file.ts')
+      const stepText = result.value as StepText
+      expect(stepText.type).toBe('STEP_TEXT')
+      expect(stepText.text).toContain('Error from file-lister')
+      expect(stepText.text).toContain('File lister max failed')
     })
   })

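The updated tests describe a generator that first yields one `spawn_agents` call for `file-lister-max`, then yields either a `read_files` call or an error `STEP_TEXT` depending on what comes back. The sketch below reproduces that control flow only. It is not the implementation under test: every type is a simplified, hypothetical stand-in, and the spawned result is flattened to a single object instead of the nested `toolResult` structure used in the real mocks.

```typescript
// Simplified, hypothetical stand-ins for the real tool-call and step types.
type SpawnCall = {
  toolName: 'spawn_agents'
  input: { agents: { agent_type: string; prompt: string; params: Record<string, unknown> }[] }
}
type ReadFilesCall = { toolName: 'read_files'; input: { paths: string[] } }
type StepText = { type: 'STEP_TEXT'; text: string }

// Flattened stand-in for what the spawned agent returns.
type SpawnedResult =
  | { type: 'lastMessage'; text: string }
  | { type: 'error'; message: string }

function* handleStepsMaxSketch(
  prompt: string,
  params: Record<string, unknown>,
): Generator<SpawnCall | ReadFilesCall | StepText, void, SpawnedResult | undefined> {
  // Step 1: spawn a single file-lister-max, forwarding prompt and params unchanged.
  const result = yield {
    toolName: 'spawn_agents',
    input: { agents: [{ agent_type: 'file-lister-max', prompt, params }] },
  }

  // Step 2a: on error (or no result), surface the message as step text and stop.
  if (!result || result.type === 'error') {
    yield { type: 'STEP_TEXT', text: `Error from file-lister: ${result?.message ?? 'no result'}` }
    return
  }

  // Step 2b: otherwise treat each non-empty line of the reply as a path to read.
  const paths = result.text
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
  yield { toolName: 'read_files', input: { paths } }
}
```

Driving it mirrors the tests: call `next()` once to get the spawn call, then `next(result)` with either a `lastMessage` or an `error` value to get the follow-up step.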
agents/base2/base-deep-evals.ts

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+import { createBaseDeep } from './base-deep'
+
+const definition = {
+  ...createBaseDeep({ noAskUser: true, noLearning: true }),
+  id: 'base-deep-evals',
+  displayName: 'Buffy the Codex Evals Orchestrator',
+}
+export default definition

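The new `base-deep-evals` definition relies on object spread order: properties written after the spread replace those returned by the factory. The sketch below shows that mechanism with a hypothetical stub in place of `createBaseDeep`, whose real return value and option semantics are not visible in this commit. The stub's fields, and the idea that `noAskUser` drops an `ask_user` tool, are assumptions for illustration.

```typescript
// Hypothetical stub: the real createBaseDeep in './base-deep' is not shown here.
interface BaseDeepOptions {
  noAskUser?: boolean
  noLearning?: boolean
}

function createBaseDeepStub(options: BaseDeepOptions = {}) {
  const toolNames = ['read_files', 'spawn_agents']
  // Assumed behaviour: noAskUser removes the interactive tool.
  if (!options.noAskUser) toolNames.push('ask_user')
  return { id: 'base-deep', displayName: 'Base Deep', toolNames }
}

// Properties listed after the spread override the ones from the factory.
const evalsDefinition = {
  ...createBaseDeepStub({ noAskUser: true, noLearning: true }),
  id: 'base-deep-evals',
  displayName: 'Buffy the Codex Evals Orchestrator',
}
```

Everything not explicitly overridden (here `toolNames`) is inherited from the factory result, so the evals variant stays in sync with the base definition.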