Conversation
…nd removed everything else to prevent redundancies
…sampling logic for initial goals, configurable params for spawn settings
Greptile Summary

This PR implements random agent spawning with configurable dimensions and collision-free placement. Agents are spawned with randomized vehicle dimensions (width, length, height) on drivable lanes, with rejection sampling to prevent collisions and offroad placements. Initial goals are sampled using the 2.0 goal sampling logic.

Key Changes:
Critical Issues Found:
Confidence Score: 1/5
Important Files Changed
Sequence Diagram

```mermaid
sequenceDiagram
    participant Python as Drive.py
    participant Binding as binding.c
    participant Core as drive.h
    Python->>Binding: shared() with random_agents mode
    Binding->>Binding: Distribute agents across envs
    Binding-->>Python: agent_offsets, map_ids, env_count
    Python->>Binding: env_init() for each env
    Binding->>Core: Initialize Drive struct
    Core->>Core: load_map_binary()
    Core->>Core: set_active_agents()
    Core->>Core: spawn_agents_with_counts()
    loop For each agent
        Core->>Core: spawn_agent()
        Core->>Core: Random vehicle dimensions
        loop Rejection sampling (MAX_SPAWN_ATTEMPTS)
            Core->>Core: get_random_point_on_lane()
            Core->>Core: check_spawn_collision()
            Core->>Core: check_spawn_offroad()
            alt Valid spawn
                Core->>Core: break
            else Invalid
                Core->>Core: continue
            end
        end
        Core->>Core: Update agent position & state
    end
    Core->>Core: init_goal_positions()
    Core->>Core: sample_new_goal() for each agent
    Core-->>Python: Environment ready
```
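The rejection loop in the diagram can be reduced to a small sketch. All names here are illustrative assumptions (the real code lives in `drive.h` and checks 2D collision and off-road polygons; this keeps only the sampling structure, on a 1D "lane", and assumes a `MAX_SPAWN_ATTEMPTS` value):

```c
#include <stdbool.h>
#include <stdlib.h>

#define MAX_SPAWN_ATTEMPTS 100 /* assumed value; the real constant lives in drive.h */

/* Hypothetical reduction of the per-agent rejection loop: draw candidate
 * positions on a 1D "lane" and reject any candidate within min_gap of an
 * already-placed agent. */
static bool try_spawn_1d(const float *placed, int n_placed, float lane_len,
                         float min_gap, float *out_x) {
    for (int attempt = 0; attempt < MAX_SPAWN_ATTEMPTS; attempt++) {
        float x = lane_len * ((float)rand() / (float)RAND_MAX);
        bool valid = true;
        for (int i = 0; i < n_placed; i++) {
            float d = placed[i] - x;
            if (d < 0.0f) d = -d;
            if (d < min_gap) { valid = false; break; } /* collision: resample */
        }
        if (valid) { *out_x = x; return true; }
    }
    return false; /* attempts exhausted; caller decides how to handle this */
}
```

Returning false on exhaustion mirrors the diagram's bounded loop: the environment can report a spawn error instead of spinning forever on a crowded map.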
```c
float mod_heading = atan2f(agent->heading_y, agent->heading_x);
float cos_theta = dot/(mod_to_pt * mod_heading);
```

Incorrect cosine calculation: `mod_heading` uses `atan2f`, which returns an angle, not a magnitude. The cosine is already computed via the dot product formula cos(theta) = dot / (|v1| * |v2|), where |heading| should be the magnitude (1.0 for normalized heading vectors), not an `atan2f` result.

Suggested change:

```diff
-float mod_heading = atan2f(agent->heading_y, agent->heading_x);
+float mod_heading = sqrtf(agent->heading_x * agent->heading_x + agent->heading_y * agent->heading_y);
 float cos_theta = dot/(mod_to_pt * mod_heading);
```
There was a problem hiding this comment.
Pull request overview
This PR adds a new random_agents initialization mode to the Ocean Drive environment, enabling randomized per-environment agent counts and randomized vehicle dimensions/poses during spawning, with additional spawn-time collision/off-road rejection and integration with goal sampling.
Changes:
- Extend INI/env config and Python/C bindings to support `random_agents` mode and spawn dimension parameters.
- Implement random agent spawning logic in the C core, including collision/off-road checks and goal initialization behavior.
- Update demo/visualization entrypoints and default `drive.ini` settings to exercise the new mode.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 13 comments.
Show a summary per file
| File | Description |
|---|---|
| `pufferlib/ocean/env_config.h` | Adds new spawn/agent-count config fields and parses `init_mode` strings. |
| `pufferlib/ocean/drive/visualize.c` | Wires spawn settings into visualization env init; randomizes agent counts in random mode. |
| `pufferlib/ocean/drive/error.h` | Extends error types and adjusts error string formatting. |
| `pufferlib/ocean/drive/drive.py` | Adds Python-side config for random agent counts/dimensions and passes through to bindings. |
| `pufferlib/ocean/drive/drive.h` | Implements spawn rejection sampling, off-road checks, random init mode wiring, and goal sampling adjustments. |
| `pufferlib/ocean/drive/drive.c` | Updates demo to use spawn settings and random agent counts; changes default weights/map. |
| `pufferlib/ocean/drive/datatypes.h` | Adds `free_agents` helper to free an array of agents safely. |
| `pufferlib/ocean/drive/binding.c` | Adds random agent-count partitioning in `shared()` and passes spawn settings into env init. |
| `pufferlib/config/ocean/drive.ini` | Updates defaults to enable `random_agents` and sets spawn/agent-count parameters. |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Changed counts to max supported by sim, max_controlled, num_created and removed everything else to prevent redundancies
* Fixed memory leaks
…ed defaults to align with requirements
Force-pushed from e646569 to 09e725d
Pull request overview
Copilot reviewed 10 out of 10 changed files in this pull request and generated 10 comments.
```c
// Design Choice(we don't have wide vehicles on roads)
if (spawn_width > spawn_length)
    spawn_width = spawn_length;
float spawn_height = 1.5f; // Design Choice: Doesn't matter as we don't have flying cars in 2026
```

`spawn_height` is currently hardcoded to 1.5f, so the configured `env->spawn_settings.h` is ignored even though it's parsed and passed through config/bindings. Use the configured spawn height (and keep the z-centering adjustment) so spawns match the INI/Python parameters.

Suggested change:

```diff
-float spawn_height = 1.5f; // Design Choice: Doesn't matter as we don't have flying cars in 2026
+float spawn_height = spawn_settings.h;
```
```diff
 c_reset(&env);
 c_render(&env);
-Weights *weights = load_weights("resources/drive/puffer_drive_resampling_speed_lane.bin");
+Weights *weights = load_weights("best_policy_with_reward_conditioning.bin");
```

`load_weights("best_policy_with_reward_conditioning.bin")` appears to reference a file that isn't in the repo and also drops the `resources/drive/` prefix used elsewhere. This breaks the demo by default unless the artifact is committed and the path is corrected.

Is this comment correct? It's true that I don't see this file
```c
// Ensure last env can still meet min_agents_per_env requirement
int upper = (remaining - max_agents_per_env < min_agents_per_env) ? remaining - min_agents_per_env
                                                                  : max_agents_per_env;
count = min_agents_per_env + rand() % (upper - min_agents_per_env + 1);
```

The upper bound computation for `count = min_agents_per_env + rand() % (upper - min_agents_per_env + 1)` can yield `upper < min_agents_per_env` for infeasible combinations of remaining, min_agents_per_env, and max_agents_per_env (e.g. remaining=9, min=5, max=6), leading to modulo-by-zero/UB. Add a feasibility check and a fallback strategy (e.g. error out, relax the min for the final env, or adjust prior counts) before computing the modulo range.

Suggested change:

```diff
-count = min_agents_per_env + rand() % (upper - min_agents_per_env + 1);
+if (upper < min_agents_per_env) {
+    /* Infeasible to satisfy min_agents_per_env for all envs with remaining agents.
+     * Fall back by relaxing the minimum for this env: fill up to max_agents_per_env
+     * (or all remaining if fewer). This avoids a zero/negative modulo range.
+     */
+    count = remaining > max_agents_per_env ? max_agents_per_env : remaining;
+} else {
+    count = min_agents_per_env + rand() % (upper - min_agents_per_env + 1);
+}
```
```ini
reward_lane_align = 0
reward_lane_center = 0
```
```diff
 ; Determines the target distance to the new goal in the case of goal_behavior = generate_new_goals.
 ; Large numbers will select a goal point further away from the agent's current position.
-min_goal_distance = 0.5
+min_goal_distance = 5.0
```
```c
// Initialization modes
#define INIT_ALL_VALID 0
#define INIT_ONLY_CONTROLLABLE_AGENTS 1
#define RANDOM_AGENTS 2
```
I think this is sort of a bad name, it implies all sorts of things that aren't just "we randomize the number of agents in the scene". It could be called INIT_VARIABLE_AGENT_NUMBER even if that's mouthy
```c
if (env->init_mode == RANDOM_AGENTS) {
    // TODO: Needs to be handled later if we want GOAL_RESPAWN for random agents
    spawn_agents_with_counts(env);
```
A little puzzled by this. Anytime we need to respawn a single agent we respawn the whole thing?
```diff
 render = True
 render_async = False # Render interval of below 50 might cause process starvation and slowness in training
-render_interval = 250
+render_interval = 1000
```
Rendering was taking a really long amount of time even with async on so I just increased the interval
This only happens when max_agents is very high; previously, we were working with just 32 so it was not a problem
```c
env->num_agents = max_agents;
if (env->init_mode == INIT_VARIABLE_AGENT_NUMBER) {
    env->spawn_settings.max_agents_in_sim = max_agents_per_env; // Random Agents only supports controlled agents
```
Random Agents is not a good name to use throughout the codebase for this feature, it can mean too many things
```python
    f"Please reduce num_maps, add more maps to {map_dir}, or set allow_fewer_maps=True."
)

self.use_all_maps = use_all_maps
```
i'm sort of struggling to see why these use_all_maps changes showed up in this particular PR
This was a tiny bug with debugging using drive.py
This was the exact place of issue: https://github.com/Emerge-Lab/PufferDrive/pull/292/changes#diff-fae14af53080a7325bdddc95a97c9675183e4d0bb11f6e537d4c62e8bbb0eee6L441
```diff
-agent->log_velocity_y = (float *)malloc(array_size * sizeof(float));
-agent->log_heading = (float *)malloc(array_size * sizeof(float));
-agent->log_valid = (int *)malloc(array_size * sizeof(int));
+allocate_agent_trajectories(agent);
```
I am a tiny bit puzzled why this was pulled into a function
I didn't want to like bloat the code and the log trajectories was required for goal respawn behavior
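A sketch of what such a grouping helper might look like (struct and function names are assumptions; the PR's actual signature isn't shown in this thread), including the partial-failure cleanup that inline mallocs make easy to forget:

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical grouping of the per-agent log buffers behind one call, so
 * every spawn path allocates, and later frees, the same set of arrays. */
typedef struct {
    float *log_velocity_y;
    float *log_heading;
    int *log_valid;
} AgentLogsSketch;

static bool alloc_agent_logs(AgentLogsSketch *a, size_t n) {
    a->log_velocity_y = (float *)malloc(n * sizeof(float));
    a->log_heading = (float *)malloc(n * sizeof(float));
    a->log_valid = (int *)malloc(n * sizeof(int));
    if (!a->log_velocity_y || !a->log_heading || !a->log_valid) {
        /* Clean up any partial allocation instead of leaking. */
        free(a->log_velocity_y);
        free(a->log_heading);
        free(a->log_valid);
        return false;
    }
    return true;
}
```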
```c
    return (s >= 0 && s <= 1 && t >= 0 && t <= 1);
}

static bool check_offroad(Drive *env, Agent *agent) {
```
did we not have an offroad check elsewhere we could reuse?
Oh I guess you just needed to reuse it in a few spots, good
No like, this needs to be reworked, cuz it's a good practice
PufferDrive/pufferlib/ocean/drive/drive.h, line 2207 in 8fd053f
But yea, there is no single function for check_offroad
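For reference, the parameter test in the snippet above (`s` and `t` both in [0, 1]) is the tail of the standard parametric 2D segment-intersection check. A self-contained sketch, with illustrative names:

```c
#include <stdbool.h>

/* Solve p + s*r = q + t*u for the two segments and accept when both
 * parameters fall in [0, 1]. r and u are direction vectors (end - start). */
static bool segments_intersect(float px, float py, float rx, float ry,
                               float qx, float qy, float ux, float uy) {
    float denom = rx * uy - ry * ux; /* 2D cross product of directions */
    if (denom == 0.0f)
        return false; /* parallel; collinear overlap not handled in this sketch */
    float s = ((qx - px) * uy - (qy - py) * ux) / denom;
    float t = ((qx - px) * ry - (qy - py) * rx) / denom;
    return (s >= 0 && s <= 1 && t >= 0 && t <= 1);
}
```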
```c
float spawn_length = random_uniform(spawn_settings.min_l, spawn_settings.max_l);
float spawn_width = random_uniform(spawn_settings.min_w, spawn_settings.max_w);

// Design Choice(we don't have wide vehicles on roads)
if (spawn_width > spawn_length)
    spawn_width = spawn_length;
```
btw, this is sort of a biased sampler. It'll tend to sample a lot of squares. Just a heads up. This is pretty fast, so it's possible you can just rejection sample until you get a valid sample
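A sketch of the rejection-sampling alternative the reviewer suggests (helper names are ours; it assumes the width and length ranges overlap so a valid pair exists, otherwise the loop would never terminate):

```c
#include <stdlib.h>

/* Stand-in for the codebase's random_uniform. */
static float random_uniform_sketch(float lo, float hi) {
    return lo + (hi - lo) * ((float)rand() / (float)RAND_MAX);
}

/* Resample (width, length) pairs until width <= length, instead of clamping
 * width down to length. */
static void sample_dims(float min_w, float max_w, float min_l, float max_l,
                        float *width, float *length) {
    do {
        *width = random_uniform_sketch(min_w, max_w);
        *length = random_uniform_sketch(min_l, max_l);
    } while (*width > *length); /* reject invalid pairs; fast for typical ranges */
}
```

Clamping maps every width > length draw onto the width == length line, inflating the share of square footprints; resampling keeps the pair uniform over the valid width <= length region.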
- Spawning logic added with a random number of agents and randomized dimensions
- Integrated with the 2.0 goal sampling for initial goals
- Collision and off-road checks during spawns