[BUG] reward_lose not correctly set at timeout #3

@samvelyan

This is a migration copy of the original bug report here: facebookresearch#103.


When max_episode_steps is set, gym correctly triggers a reset at the step limit, but MiniHack (and perhaps NLE as well) does not set StepStatus.ABORTED.

As a consequence, the reward returned on timeout is not reward_lose.

To Reproduce

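import gym
import minihack  # registers the MiniHack environments with gym

MAX_STEPS = 20

env = gym.make(
    "MiniHack-KeyRoom-S5-v0",
    reward_lose=-1.0,
    max_episode_steps=MAX_STEPS,
)
timestep = env.reset()
env.render()

# Repeat the same action until the step limit is reached.
for i in range(MAX_STEPS):
    timestep = env.step(3)
    reward = timestep[1]
    info = timestep[-1]
    print(reward, info["end_status"])

# The episode has timed out, so the final status should be ABORTED (-1).
assert int(info["end_status"]) == -1

Expected behavior

int(info["end_status"]) should be -1 (StepStatus.ABORTED) on the final step, and the reward for that step should be reward_lose (-1.0 in this example).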
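
Until this is fixed, the expected behavior can be approximated on the user side. The following is only a minimal sketch of a possible workaround, not MiniHack's or gym's API: TimeoutAsLoss, _DummyEnv and run_demo are hypothetical names, the wrapper counts steps itself instead of relying on gym, and it assumes the 4-tuple step API used in the reproduction above.

```python
ABORTED = -1  # value the report expects for info["end_status"] on timeout


class TimeoutAsLoss:
    """Hypothetical wrapper: report a step-limit timeout as an aborted, losing episode."""

    def __init__(self, env, max_steps, reward_lose=-1.0):
        self.env = env
        self.max_steps = max_steps
        self.reward_lose = reward_lose
        self._steps = 0

    def reset(self):
        self._steps = 0
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self._steps += 1
        # Only override when the episode did not already end on its own.
        if not done and self._steps >= self.max_steps:
            done = True
            reward = self.reward_lose
            info = dict(info, end_status=ABORTED)
        return obs, reward, done, info


class _DummyEnv:
    """Stand-in environment that never terminates by itself (for illustration only)."""

    def reset(self):
        return 0

    def step(self, action):
        return 0, 0.0, False, {"end_status": 0}


def run_demo(max_steps=3):
    """Step the wrapped dummy environment until it reports done."""
    env = TimeoutAsLoss(_DummyEnv(), max_steps=max_steps)
    env.reset()
    history = []
    done = False
    while not done:
        _, reward, done, info = env.step(0)
        history.append((reward, info["end_status"]))
    return history
```

With a real environment, the wrapper would be applied to an env created without gym's own max_episode_steps, so that only one component enforces the limit.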

Potential reasons

gym.make itself accepts max_episode_steps as an argument.
For this reason, max_episode_steps is not included in the kwargs that gym.make forwards, so it is never passed through to the MiniHack constructor.
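
The suspected mechanism can be illustrated without gym at all. This is a sketch under stated assumptions: make and FakeMiniHackEnv below are hypothetical stand-ins that only mimic the behaviour described above, not the real gym.make or MiniHack classes.

```python
class FakeMiniHackEnv:
    """Hypothetical stand-in for the environment constructor."""

    def __init__(self, reward_lose=0.0, max_episode_steps=None):
        self.reward_lose = reward_lose
        # Stays None unless the factory forwards the value.
        self.max_episode_steps = max_episode_steps


def make(env_cls, max_episode_steps=None, **kwargs):
    """Hypothetical factory that names max_episode_steps as its own parameter."""
    # The value binds to the named parameter, so it is absent from kwargs
    # and the environment constructor never sees it.
    env = env_cls(**kwargs)
    # The factory keeps the limit for itself (e.g. to enforce it externally).
    return env, max_episode_steps


env, factory_limit = make(FakeMiniHackEnv, reward_lose=-1.0, max_episode_steps=20)
```

Here reward_lose reaches the constructor through kwargs, while env.max_episode_steps remains None: the environment has no way of knowing that a timeout occurred, which would explain why it never reports StepStatus.ABORTED.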
