Merged
- Implement tests for Float32Ptr to validate pointer creation for float32 values.
- Create tests for ExtractJSON to ensure correct extraction of JSON from various input formats.
- Add tests for cleanJavaScriptStringConcat to verify string concatenation handling in JavaScript context.
- Introduce tests for StringSliceContains to check for string presence in slices.
- Implement tests for MergeStringMaps to validate merging behavior of multiple string maps, including overwrites and handling of nil/empty maps.
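The merge and lookup behavior those tests describe can be sketched as follows. This is a minimal illustration, not the PR's code: the function names come from the commit message, but the signatures, the left-to-right overwrite order, and the choice to always return a non-nil map are assumptions.

```go
package main

import "fmt"

// MergeStringMaps is a hypothetical reconstruction: maps are merged left to
// right, so a key in a later map overwrites the same key from an earlier one.
// Nil and empty maps contribute nothing.
func MergeStringMaps(maps ...map[string]string) map[string]string {
	merged := make(map[string]string)
	for _, m := range maps {
		for k, v := range m { // ranging over a nil map is a no-op
			merged[k] = v
		}
	}
	return merged
}

// StringSliceContains reports whether target is present in items
// (assumed signature).
func StringSliceContains(items []string, target string) bool {
	for _, s := range items {
		if s == target {
			return true
		}
	}
	return false
}

func main() {
	base := map[string]string{"model": "a", "lang": "en"}
	override := map[string]string{"model": "b"}

	merged := MergeStringMaps(base, nil, override)
	fmt.Println(merged["model"], merged["lang"], len(merged)) // b en 2
	fmt.Println(StringSliceContains([]string{"x", "y"}, "y")) // true
}
```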
…ove unused ChatMessage type
…Pex context conversion
… tests in export_test.go
- Changed modelParams from pointer to value in toGitHubModelsPrompt function for better clarity and safety.
- Updated the assignment of ModelParameters to use the value directly instead of dereferencing a pointer.
- Introduced a new test suite in export_test.go to cover various scenarios for GitHub models evaluation generation, including edge cases and expected outputs.
- Ensured that the tests validate the correct creation of files and their contents based on the provided context and options.
- Added NewPromptPex function to create a new PromptPex instance.
- Implemented Run method to execute the PromptPex pipeline with context management.
- Created context from prompt files or loaded existing context from JSON.
- Developed pipeline steps including intent generation, input specification, output rules, and tests.
- Added functionality for generating groundtruth outputs and evaluating test results.
- Implemented test expansion and rating features for improved test coverage.
- Introduced error handling and logging throughout the pipeline execution.
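The step ordering and error handling listed in this commit could look roughly like the sketch below. Everything here is hypothetical (the `Context` and `step` types, `runPipeline`, and the stub steps); it only illustrates running named stages in sequence and stopping at the first failure, not the actual `Run` method.

```go
package main

import (
	"errors"
	"fmt"
)

// Context is a hypothetical stand-in for the PromptPex context that the
// pipeline steps read from and write to.
type Context struct {
	Prompt string
	Done   []string // names of the steps that completed successfully
}

// step pairs a human-readable name with the function that performs it.
type step struct {
	name string
	run  func(*Context) error
}

// runPipeline executes the steps in order and stops at the first failure,
// wrapping the error with the name of the step that produced it.
func runPipeline(ctx *Context, steps []step) error {
	for _, s := range steps {
		if err := s.run(ctx); err != nil {
			return fmt.Errorf("step %q failed: %w", s.name, err)
		}
		ctx.Done = append(ctx.Done, s.name)
	}
	return nil
}

// noop and fail are stub steps; a real step would call a model here.
func noop(*Context) error { return nil }
func fail(*Context) error { return errors.New("model unavailable") }

func main() {
	ctx := &Context{Prompt: "Classify the sentiment of the input."}
	steps := []step{
		{name: "intent", run: noop},
		{name: "input specification", run: noop},
		{name: "output rules", run: fail},
		{name: "tests", run: noop},
	}
	if err := runPipeline(ctx, steps); err != nil {
		fmt.Println("pipeline stopped:", err)
	}
	fmt.Println("completed steps:", ctx.Done)
}
```

Recording completed step names on the context is one way a later run could skip work already done, which is in the spirit of loading an existing context from JSON, though the PR's own mechanism is not shown here.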
- Implemented TestCreateContext to validate various prompt YAML configurations and their expected context outputs.
- Added TestCreateContextRunIDUniqueness to ensure unique RunIDs are generated for multiple context creations.
- Created TestCreateContextWithNonExistentFile to handle cases where the prompt file does not exist.
- Developed TestCreateContextPromptValidation to check for valid and invalid prompt formats.
- Introduced TestGithubModelsEvalsGenerate to test the generation of GitHub Models eval files with various scenarios.
- Added TestToGitHubModelsPrompt to validate the conversion of prompts to GitHub Models format.
- Implemented TestExtractTemplateVariables and TestExtractVariablesFromText to ensure correct extraction of template variables.
- Created TestGetMapKeys and TestGetTestScenario to validate utility functions related to maps and test scenarios.
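As an illustration of what the template-variable tests exercise, here is a minimal sketch of extracting variable names from prompt text. It assumes `{{name}}` placeholders with optional inner whitespace; the `extractVariables` helper and its sorted, de-duplicated return value are hypothetical and may differ from the PR's implementation.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// placeholder matches {{name}} with optional whitespace inside the braces.
// The placeholder syntax is an assumption made for this sketch.
var placeholder = regexp.MustCompile(`\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}`)

// extractVariables returns the distinct variable names found in text,
// sorted alphabetically so the result is deterministic.
func extractVariables(text string) []string {
	seen := make(map[string]bool)
	for _, m := range placeholder.FindAllStringSubmatch(text, -1) {
		seen[m[1]] = true // m[1] is the captured variable name
	}
	names := make([]string, 0, len(seen))
	for name := range seen {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	text := "Classify {{input}} as {{ label }}; repeat {{input}}."
	fmt.Println(extractVariables(text)) // [input label]
}
```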
…tPex configuration
… summary generation
… improved summary reporting
…se and restore its implementation; remove obsolete promptpex.go and summary_test.go files
…covering various scenarios and error handling
…entiment analysis test prompt
…neFlags function and update flag parsing to use consistent naming
… in generate_test.go
…ck responses for sentiment analysis stages
…odology for test generation
…derMessagesToString for message formatting
Co-authored-by: Sarah Vessels <82317+cheshire137@users.noreply.github.com>
…clarify effort flag usage
Contributor
Author
I addressed a few comments.
sgoedecke
approved these changes
Aug 3, 2025
Collaborator
sgoedecke
left a comment
Happy to approve again after merge conflicts are fixed
Contributor
Pull Request Overview
This PR implements a comprehensive test generation feature for the GitHub Models CLI extension using the PromptPex methodology. The changes add an automated system for creating robust test cases from prompt files, along with significant infrastructure improvements for model output handling and developer workflows.
- PromptPex-based test generation: New `generate` command that systematically creates test cases by analyzing prompt intent, input specifications, and output rules
- Enhanced prompt file format: Added support for test data persistence, evaluators, and cleaner YAML output with `omitempty` tags
- Improved developer tooling: Better HTTP logging, template variable parsing utilities, and expanded CLI functionality
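The effect of `omitempty` on serialized output can be seen in a small sketch. To stay within the standard library it uses `encoding/json` rather than a YAML encoder, and the `promptFile` struct with its field names is hypothetical; the assumption is that the YAML tags in the PR behave analogously, dropping empty fields from the saved file.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// promptFile is a hypothetical, reduced stand-in for a prompt file struct.
// Fields tagged omitempty are left out of the output when they are empty.
type promptFile struct {
	Name       string   `json:"name"`
	Evaluators []string `json:"evaluators,omitempty"`
	TestData   []string `json:"testData,omitempty"`
}

// render serializes p and returns the encoded text, or "" on error.
func render(p promptFile) string {
	b, err := json.Marshal(p)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	// Only the populated field appears in the output.
	fmt.Println(render(promptFile{Name: "sentiment"})) // {"name":"sentiment"}

	// A non-empty slice is emitted as usual.
	fmt.Println(render(promptFile{Name: "sentiment", TestData: []string{"great!"}}))
}
```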
Reviewed Changes
Copilot reviewed 36 out of 37 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| cmd/generate/ | Complete implementation of PromptPex test generation pipeline with LLM interactions, parsing, and evaluation |
| pkg/util/util.go | Moved template variable parsing to shared utility for reuse across commands |
| pkg/prompt/prompt.go | Enhanced prompt file structure with test data support and save functionality |
| internal/azuremodels/ | Added HTTP logging capabilities for debugging API interactions |
| cmd/run/ | Updated to use shared template variable parsing utility |
| examples/ | Added example prompt file for test generation |
Contributor
Author
@sgoedecke I added a 'min' effort level that aggressively reduces the number of tests generated. I think we should have …
…r clean and build
sgoedecke
approved these changes
Aug 4, 2025
sgoedecke
approved these changes
Aug 4, 2025
Implement PromptPex strategy to generate tests for prompts automatically.
🚀 PromptPex Power-Up: Smarter Prompt Test Generation, Output Cleaning, and Dev Experience Improvements
This PR supercharges the `gh models` CLI extension with a suite of new features and quality-of-life upgrades focused on prompt test generation, output handling, and developer workflow. Highlights include:
🧪 PromptPex-Based Test Generation
`generate` command leveraging the PromptPex framework for systematic, rules-driven prompt test creation.
🧹 Model Output Cleaning Utilities
🧠 Context Persistence & Effort Configuration
🏆 Rules-Based Output Evaluation
🔬 Advanced Parsing & Robustness
🛠️ Utility Functions & CLI Enhancements
🧪 Expanded Unit Test Coverage
📝 Developer Experience & CI Improvements
🗂️ Prompt File Handling Upgrades
`omitempty` struct tags.
These changes collectively deliver smarter, more reliable prompt test automation, improved output handling, and a better developer experience for working with LLM prompts and tests.