1 change: 1 addition & 0 deletions src/TALXIS.CLI.MCP/GuideReasoningEngine.cs
@@ -23,6 +23,7 @@ public class GuideReasoningEngine
["guide_deployment"] = ["deployment-sequence", "solution-management"],
["guide_data"] = ["data-migration-workflow"],
["guide_config"] = [],
["guide_testing"] = ["testing-workflow"],
};

/// <summary>
90 changes: 89 additions & 1 deletion src/TALXIS.CLI.MCP/Program.cs
@@ -30,6 +30,10 @@
var publicSkillLoader = new PublicSkillLoader();
publicSkillLoader.LoadIndex();

// Load UI testing step bindings catalog via reflection
var testingBindingsCatalog = new TestingBindingsCatalog();
testingBindingsCatalog.Load();
Review comment (Member): I'm not sure how large the performance penalty is, but I don't think we need to load this at MCP server start.
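One possible way to defer the load, as a minimal sketch rather than the PR's actual implementation: wrap the catalog in `Lazy<T>` so the reflection scan runs only on the first `guide_testing` call. The `TestingBindingsCatalog` below is a hypothetical stand-in with a trivial `Load()`, and `LazyCatalogSketch` is an illustrative name that does not exist in the repository.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the PR's TestingBindingsCatalog.
// The real Load() discovers bindings via reflection; this one adds a single entry.
public sealed class TestingBindingsCatalog
{
    private readonly List<string> _bindings = new();

    public int Count => _bindings.Count;

    public void Load()
    {
        _bindings.Add("Given I am logged in to the '{app}' app as '{user}'");
    }
}

public static class LazyCatalogSketch
{
    // Lazy<T> with a factory delegate is thread-safe by default
    // (LazyThreadSafetyMode.ExecutionAndPublication), so Load() runs at most once.
    public static Lazy<TestingBindingsCatalog> Create() =>
        new Lazy<TestingBindingsCatalog>(() =>
        {
            var catalog = new TestingBindingsCatalog();
            catalog.Load();
            return catalog;
        });

    public static void Main()
    {
        var lazyCatalog = Create();

        // Nothing has been loaded at "server start".
        Console.WriteLine(lazyCatalog.IsValueCreated); // False

        // First access (e.g. inside the guide_testing handler) triggers Load().
        var count = lazyCatalog.Value.Count;

        Console.WriteLine(lazyCatalog.IsValueCreated); // True
        Console.WriteLine(count);                      // 1
    }
}
```

With this shape, the handler would read `lazyCatalog.Value` instead of the eagerly loaded instance, and sessions that never call `guide_testing` would not pay the reflection cost.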


// Session-scoped active tool set — starts with always-on tools only
var activeToolSet = new ActiveToolSet();
var guideHandler = new GuideHandler(mcpToolRegistry.Catalog, activeToolSet, reasoningEngine);
@@ -71,6 +75,7 @@
- guide_deployment: Deployment lifecycle — import/export/pack solutions, manage components, publish. Requires profile.
- guide_data: LIVE data operations — SQL/FetchXML/OData queries, record CRUD, bulk ops, CMT migration. Requires profile.
- guide_config: CLI setup — auth credentials, connections, profiles, settings. Required before environment operations.
- guide_testing: UI test generation — discover available Reqnroll step bindings for Power Apps BDD tests.
Review comment (Member): We should keep this guide general so that it covers any type of testing.


WORKFLOW: Call a guide tool → use execute_operation for immediate execution → discovered tools become direct calls on next turn.

@@ -214,6 +219,10 @@ async ValueTask<CallToolResult> HandleGuideToolAsync(
};
}
}
else if (guideName == "guide_testing")
{
result = await HandleGuideTestingAsync(query, top, server, ct);
}
else if (workflowScope is not null)
{
result = await guideHandler.HandleWorkflowGuideAsync(workflowScope, query, top, server, ct, guideName);
@@ -228,6 +237,78 @@ async ValueTask<CallToolResult> HandleGuideToolAsync(
return result;
}

// Handles guide_testing calls — uses TestingBindingsCatalog + sampling to recommend step bindings
async ValueTask<CallToolResult> HandleGuideTestingAsync(
Review comment (Member): This Program.cs is already 1000 LoC. We're breaking the single responsibility principle, and we need to decompose it.
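A rough sketch of what such a decomposition could look like, assuming the handler moves into its own class. `GuideTestingHandler` and its delegate-based constructor are hypothetical. The sampling call is injected as a delegate so the sketch compiles without the MCP SDK types, and the empty-catalog error branch from the diff is left out for brevity.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical handler extracted from Program.cs; not part of the PR.
public sealed class GuideTestingHandler
{
    private readonly Func<string> _getCatalogPrompt;
    private readonly Func<string, string, CancellationToken, Task<string>> _sampleAsync;

    public GuideTestingHandler(
        Func<string> getCatalogPrompt,
        Func<string, string, CancellationToken, Task<string>> sampleAsync)
    {
        _getCatalogPrompt = getCatalogPrompt;
        _sampleAsync = sampleAsync;
    }

    // Mirrors two branches from the diff: an empty query returns the catalog
    // listing; otherwise the query is forwarded to the sampling delegate.
    public async Task<string> HandleAsync(string? query, CancellationToken ct)
    {
        var catalogPrompt = _getCatalogPrompt();
        if (string.IsNullOrEmpty(query))
        {
            return catalogPrompt;
        }

        // Placeholder system prompt; the PR builds a much longer one.
        var systemPrompt = "You are a test automation assistant.\n\n" + catalogPrompt;
        var userMessage = $"Generate a Gherkin test for this scenario: {query}";
        return await _sampleAsync(systemPrompt, userMessage, ct);
    }
}

public static class GuideTestingHandlerDemo
{
    public static async Task Main()
    {
        // Fake sampling delegate that echoes the user message.
        var handler = new GuideTestingHandler(
            () => "AVAILABLE STEP BINDINGS: (placeholder)",
            (system, user, ct) => Task.FromResult("[sampled] " + user));

        Console.WriteLine(await handler.HandleAsync(null, CancellationToken.None));
        Console.WriteLine(await handler.HandleAsync("open an account form", CancellationToken.None));
    }
}
```

Program.cs would then only construct the handler and route `guide_testing` calls to it, and the prompt-building logic could be tested without starting an MCP server.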

string query, int top, McpServer server, CancellationToken ct)
{
if (testingBindingsCatalog.Count == 0)
{
return new CallToolResult
{
Content = [new TextContentBlock { Text = "No step bindings loaded. Ensure the TALXIS.TestKit.Bindings assembly is available." }],
IsError = true
};
}

if (string.IsNullOrEmpty(query))
{
// No query — return full catalog listing
return new CallToolResult
{
Content = [new TextContentBlock { Text = testingBindingsCatalog.GetCatalogPrompt() }]
};
}

// Use sampling to select relevant bindings based on user's query
var skillsContext = reasoningEngine.GetSkillsContext("guide_testing");
var catalogPrompt = testingBindingsCatalog.GetCatalogPrompt();

var systemPrompt = $@"You are a Power Apps UI test automation assistant. Given the user's testing task and available step bindings, produce a Gherkin feature file or scenario.

FORMAT YOUR RESPONSE AS:
1. A complete Gherkin scenario (or scenarios) using the available step bindings
2. Include comments explaining any custom steps that would need to be implemented

RULES:
- Use ONLY the step bindings listed below when possible
- For login, ALWAYS start with: Given I am logged in to the '{{app}}' app as '{{user}}'
- Use realistic placeholder values based on the user's description
- Each scenario should test ONE behavior
- Include test data setup (Given steps) before actions (When steps)
- End with assertions (Then steps)
- If the available bindings don't cover something, note it as a custom step with a comment

{catalogPrompt}{skillsContext}";

var samplingParams = new CreateMessageRequestParams
{
Messages =
[
new SamplingMessage
{
Role = Role.User,
Content = [new TextContentBlock { Text = $"Generate a Gherkin test for this scenario: {query}" }]
}
],
SystemPrompt = systemPrompt,
MaxTokens = 2000,
ModelPreferences = new ModelPreferences
{
SpeedPriority = 0.6f,
CostPriority = 0.4f,
IntelligencePriority = 0.8f
},
};

var result = await server.SampleAsync(samplingParams, ct);
var responseText = result.Content.OfType<TextContentBlock>().FirstOrDefault()?.Text ?? "";

return new CallToolResult
{
Content = [new TextContentBlock { Text = responseText }]
};
}

// Bridge: execute_operation dispatches any tool from the internal catalog
async ValueTask<CallToolResult> HandleExecuteOperationAsync(
IDictionary<string, JsonElement>? arguments, RequestContext<CallToolRequestParams> ctx, CancellationToken ct)
@@ -473,7 +554,7 @@ async ValueTask<CallToolResult> ExecuteAsTaskAsync(
bool IsGuideTool(string toolName)
{
return toolName is "guide" or "guide_workspace" or "guide_environment"
- or "guide_deployment" or "guide_data" or "guide_config";
+ or "guide_deployment" or "guide_data" or "guide_config" or "guide_testing";
}

// Helper: checks if a tool is an MCP-specific in-process tool (not a CLI subprocess)
@@ -563,6 +644,13 @@ void RegisterAlwaysOnTools(ActiveToolSet toolSet, McpToolRegistry registry, Publ
InputSchema = BuildGuideInputSchema()
});

toolSet.AddAlwaysOn(new Tool
{
Name = "guide_testing",
Description = @"Helps generate Power Apps UI tests using Reqnroll (BDD). Discovers available step bindings from TALXIS.TestKit.Bindings and generates Gherkin scenarios. Provide a description of what you want to test and get ready-to-use feature file content with Given/When/Then steps.",
InputSchema = BuildGuideInputSchema()
});

// execute_operation bridge — for same-turn execution
toolSet.AddAlwaysOn(new Tool
{
69 changes: 69 additions & 0 deletions src/TALXIS.CLI.MCP/Skills/Internal/testing-workflow.md
@@ -0,0 +1,69 @@
# UI Testing Workflow

<!-- Internal reasoning skill: contains ONLY test-generation guidance. -->
<!-- For available step bindings, see the TestingBindingsCatalog prompt. -->

## User wants to "write a UI test" / "test a form" / "test navigation"

-> STRUCTURE: Feature file (.feature) with Scenario(s) using Given/When/Then
-> ALWAYS start with a Given step for login: `Given I am logged in to the '{app}' app as '{user}'`
-> ALWAYS use pre-built step bindings from TALXIS.TestKit.Bindings where available
-> ONLY write custom step bindings for app-specific logic not covered by the library

## Test Structure Best Practices

-> Feature files group related scenarios by business capability
-> Each scenario should be independent (no shared state between scenarios)
-> Use Background for shared Given steps across all scenarios in a feature
-> Keep scenarios focused on ONE behavior/assertion
-> Use Scenario Outline for data-driven tests

## Given/When/Then Conventions

-> Given: Setup preconditions (login, create test data, navigate to starting point)
-> When: Perform the action being tested (click, enter data, navigate)
-> Then: Assert the expected outcome (field values, visibility, error messages)
-> Avoid multiple When steps — split into separate scenarios instead

## Test Data Setup

-> Use `Given I have created '{alias}'` with JSON data files in a /data folder
-> Data files use Web API deep-insert syntax with @logicalName, @alias, @extends
-> Use faker.js templates for dynamic data ({{name.firstName}}, {{finance.amount}})
-> Set `deleteTestData: true` in appsettings.json for cleanup after scenarios

## Common Patterns

### Testing form field entry:
```gherkin
When I enter '{value}' into the '{field label}' field on the form
```

### Testing navigation:
```gherkin
When I open the sub area '{subarea}' under the '{area}' area
When I open the '{subarea}' sub area of the '{group}' group
```

### Testing command bar:
```gherkin
When I select the '{command}' command
Then I should be able to see the '{command}' command
```

### Testing grids/views:
```gherkin
When I open the record at position '{n}' in the grid
Then I can see '{n}' records in the grid
```

### Testing lookups:
```gherkin
When I select '{value}' from the '{field}' lookup field
```

## Error Recovery

-> If a step binding fails with a timeout: check if Driver.WaitForTransaction() is needed
-> If login fails: verify user credentials in appsettings.json and OTP token configuration
-> If element not found: the app may need WaitForPageToLoad before interaction
1 change: 1 addition & 0 deletions src/TALXIS.CLI.MCP/TALXIS.CLI.MCP.csproj
@@ -47,6 +47,7 @@
<ItemGroup>
<PackageReference Include="Microsoft.Extensions.Hosting" Version="10.0.5" />
<PackageReference Include="ModelContextProtocol" Version="1.2.0" />
<PackageReference Include="TALXIS.TestKit.Bindings" Version="1.0.10" ExcludeAssets="build;buildTransitive" />
<ProjectReference Include="../TALXIS.CLI/TALXIS.CLI.csproj" />
<ProjectReference Include="../TALXIS.CLI.Features.Data/TALXIS.CLI.Features.Data.csproj" />
<ProjectReference Include="../TALXIS.CLI.Features.Docs/TALXIS.CLI.Features.Docs.csproj" />