env server #799
I think removing the scoring concurrency is fine. Users can make this part of their rubrics if they want (via class_objects or globals); I have used this for multi-part judge rubrics and it works well.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Description
This PR introduces the `EnvClient` and `EnvServer`, which expose the `run_rollout` and `run_group` methods of an environment from a separate process (pool). This is especially useful for multi-env training (e.g. in `prime-rl`) and multi-env evals (e.g. in `vf-eval` or online evals).
Example
Running `vf-eval` will spawn environments in env-server mode by default.
Design
Env Server Mode
You can put an environment into "env server mode" by calling
This will implicitly start an env server as a sidecar (in a subprocess) and try to route all calls to `run_rollout` and `run_group` to the env server.
EnvServer
An `EnvServer` is initialized like a regular environment, with an `env_id` and `env_args`.
EnvClient
An `EnvClient` communicates with an env server over the configured `address`.
Sidecar Pattern
To sidecar an env server (e.g. from `vf-eval`), simply wrap the `run_server` class method in a `Process` and connect the client to the same address.
Misc Changes
`vf.setup_logging(...)` supports logging to file now as well
`RolloutOutput` to be able to display error chains as before
Type of Change
Testing
`uv run pytest` locally.
Checklist
Additional Notes
Note
High Risk
Introduces a new multiprocess, networked execution path (ZMQ/msgpack) and refactors core rollout scheduling/serialization, which can affect correctness, performance, and cleanup behavior across evaluation runs.
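To make the shape of that execution path concrete, here is a rough sketch of a client routing a `run_rollout` call to a server over an address. It is not the PR's implementation: the PR uses ZMQ and msgpack in a subprocess, whereas this sketch swaps in the standard library's `multiprocessing.connection` and a background thread so it runs without extra dependencies. `SketchEnvClient`, `serve`, the message layout, and the stand-in `run_rollout` are all hypothetical.

```python
import threading
from multiprocessing.connection import Client, Listener


def run_rollout(example):
    """Stand-in for an environment's rollout method (hypothetical)."""
    return {"example_id": example["id"], "reward": float(len(example["prompt"]))}


def serve(listener):
    """Accept one client and answer requests until it asks to shut down."""
    with listener.accept() as conn:
        while True:
            request = conn.recv()
            if request["method"] == "shutdown":
                break
            if request["method"] == "run_rollout":
                conn.send({"ok": True, "result": run_rollout(request["payload"])})
            else:
                conn.send({"ok": False, "error": "unknown method " + request["method"]})


class SketchEnvClient:
    """Forwards calls to the server listening on the given address."""

    def __init__(self, address):
        self._conn = Client(address)

    def run_rollout(self, example):
        self._conn.send({"method": "run_rollout", "payload": example})
        reply = self._conn.recv()
        if not reply["ok"]:
            raise RuntimeError(reply["error"])
        return reply["result"]

    def close(self):
        self._conn.send({"method": "shutdown"})
        self._conn.close()


def demo():
    # Port 0 lets the OS pick a free port; listener.address reports it.
    listener = Listener(("127.0.0.1", 0))
    server = threading.Thread(target=serve, args=(listener,), daemon=True)
    server.start()
    client = SketchEnvClient(listener.address)
    try:
        return client.run_rollout({"id": 7, "prompt": "hello"})
    finally:
        client.close()
        server.join(timeout=5)
        listener.close()
```

The `finally` block mirrors the cleanup concern raised in the risk note: the client, the server loop, and the listening socket are all torn down even if the call fails.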
Overview
Adds an environment “server mode” for evaluation/training. `Environment` can now spawn a sidecar `ZMQEnvServer` process and route `run_rollout`/`run_group` over a new `EnvClient`/`ZMQEnvClient` using ZMQ + msgpack, and `vf-eval` is updated to start/stop the server around each run.
Refactors rollout execution and serialization. Generation/scoring no longer use separate generation vs scoring semaphores; a single concurrency limit is applied via `with_sem`, tasks are always cleaned up on exit, and `run_rollout`/`run_group` now return pre-serialized `RolloutOutput` objects (builder now accumulates outputs, not states).
Changes error and logging surfaces. Rollout `error` is now a structured `ErrorInfo` (type + chain strings) instead of a repr string, `ErrorChain` string/repr semantics are swapped to preserve prior displays, and logging supports optional file output; tests/docs/CLI config are updated accordingly. Dependencies add `pyzmq` and `msgpack`.
Written by Cursor Bugbot for commit ed2b7d9.
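As an illustration of the structured error described in the overview (a type plus chain strings instead of a repr string), the sketch below builds such a record from a Python exception. The class name `ErrorInfoSketch`, its field names, and the helper are assumptions made for this example; the actual `ErrorInfo` definition in the PR may differ.

```python
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class ErrorInfoSketch:
    """Hypothetical structured error: exception type plus chain strings."""
    error_type: str
    chain: List[str]


def error_info_from_exception(exc):
    """Walk __cause__/__context__ and record one string per exception."""
    chain = []
    seen = set()
    current = exc
    while current is not None and id(current) not in seen:
        seen.add(id(current))  # guard against cyclic exception chains
        chain.append(type(current).__name__ + ": " + str(current))
        current = current.__cause__ or current.__context__
    return ErrorInfoSketch(error_type=type(exc).__name__, chain=chain)


def example():
    """Produce a two-level chain: a ValueError wrapped in a RuntimeError."""
    try:
        try:
            int("not a number")
        except ValueError as inner:
            raise RuntimeError("rollout failed") from inner
    except RuntimeError as outer:
        return error_info_from_exception(outer)
```

A record like this contains only strings and lists, so `asdict(example())` yields a plain dictionary that a serializer such as msgpack or JSON can carry across a process boundary, which a live exception object cannot do as easily.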