This is a migration copy of the original bug report here: facebookresearch#103.
When setting max_episode_steps, while gym correctly triggers a reset, MiniHack (and perhaps even NLE) does not set StepStatus.ABORTED.
As a consequence, the reward for Timeout is not reward_lose.
To Reproduce
import gym
import gym.vector
import minihack
MAX_STEPS = 20
env = gym.make(
"MiniHack-KeyRoom-S5-v0",
reward_lose=-1.0,
max_episode_steps=MAX_STEPS,
)
timestep = env.reset()
env.render()
for i in range(MAX_STEPS):
timestep = env.step(3)
reward = timestep[1]
info = timestep[-1]
print(reward, info["end_status"])
assert int(info["end_status"]) == -1
Expected behavior
int(info["end_status"]) should be -1
Potential reasons
gym.make accepts max_episode_steps as argument.
For this reason, max_episode_steps is not included in kwargs and it is not passed through to the MiniHack constructor.