Skip to content

Model-based multi-task control in partially-observable environments

Notifications You must be signed in to change notification settings

suhasjs/model-based-rl

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Model-based multi-task RL for Partially observable environments.

The Environment

The environment use for this project is the L-world. It's a set of 2 gridworlds with L-shaped 'walls' the run along the side as shown here:

alt text

The agent cannot enter the GREEN squares in each case. The start state is the top left cell and the end state is the bottom right cell. A simple policy independent of the fact that green squares change isn't really possible since the environment changes and the optimal path is entirely different in each case.

The agent uses the following architecture to optimally determine the task( environment ) and solve it ( the agent can only observe a 5x5 square of states around it ) alt text

After running a fresh( randomly initialized ) agent on about 200 episodes, we see that the environment model has learnt the environment to a reasonable extent and can use the model to predict what the current task is:

alt text

Agent - YELLOW

Observed Cell - RED

Hidden Cell - GREY

Imagined Wall - PURPLE

Observed Wall - GREEN

Model vs Model-free performance: alt text

The Y-axis is the average number of steps required to reach the bottom-right square( Goal )

About

Model-based multi-task control in partially-observable environments

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • C++ 94.4%
  • Makefile 4.0%
  • C 1.6%