export OPENAI_API_KEY=<YOUR_OPENAI_API_KEY>
arcade evals .
Split these into two separate code blocks so that the arcade evals call stands alone.
Reference your recent quickstart for how to handle environment variables, and consider calling this out in a warning (for example, "Don't forget to set your environment variables!").
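As a rough sketch of what I mean by the split: the key is set in its own step, and the evals call comes afterwards on its own. Note that check_openai_key is a hypothetical helper I made up to illustrate the warning; it is not part of the Arcade CLI, and the docs could just as well use a plain warning callout instead.

```shell
# Block 1: set the key once per shell session (placeholder kept from the page;
# replace it with a real key).
export OPENAI_API_KEY="<YOUR_OPENAI_API_KEY>"

# Hypothetical guard (an assumption, not an Arcade CLI feature):
# print a warning and return non-zero when the key is missing or empty.
check_openai_key() {
  if [ -z "${OPENAI_API_KEY:-}" ]; then
    echo "Warning: OPENAI_API_KEY is not set" >&2
    return 1
  fi
}

# Block 2 (shown separately in the docs), the standalone call:
#   arcade evals .
```

With the guard defined, a reader could run `check_openai_key && arcade evals .` to avoid starting an eval run without a key.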
I've reviewed:
Suggested edits for both
Edits for Evaluate Tools
my_server, but if you created an MCP server as per Prerequisites, you'll already be in that folder. Rephrase as "In your server's root folder, create a new Python file..."
Run Evaluations/Run evaluations with the Arcade CLI
Overall, this page is both a guide (how evals work), where the previous page is a tutorial, and a reference (all the options). It reads like a non-tutorial version of the previous page, which makes it a little repetitive. I would lean into making this a comprehensive guide for arcade evals and move the advanced content from Evaluate Tools into it. You might also split this into a guide and a separate reference, to keep the pages DRY and short (folks looking for a command reference are not looking for a tutorial).
The section on Handling multiple models needs to be removed. Currently only OpenAI is supported, though you could just point this out and say "more coming soon!"