The environment used for this project is the L-world. It is a set of two gridworlds with L-shaped 'walls' that run along the side, as shown here:
In each case the agent cannot enter the GREEN squares. The start state is the top-left cell and the goal state is the bottom-right cell. A single fixed policy that ignores which wall layout is active cannot solve both tasks, since the environment changes and the optimal path is entirely different in each case.
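Below is a minimal sketch of what such an environment could look like. The class name `LWorld`, the grid size, and the exact wall coordinates are illustrative assumptions, not the project's actual implementation; only the overall shape (two tasks, L-shaped walls, top-left start, bottom-right goal) follows the description above.

```python
GRID_SIZE = 9  # assumed grid size


def l_wall(task: int) -> set:
    """Blocked (row, col) cells for the given task.
    The two L-shaped layouts below are placeholders for illustration."""
    if task == 0:
        return {(r, 3) for r in range(0, 6)} | {(5, c) for c in range(3, 7)}
    return {(r, 5) for r in range(3, 9)} | {(3, c) for c in range(2, 6)}


class LWorld:
    def __init__(self, task: int):
        self.walls = l_wall(task)
        self.goal = (GRID_SIZE - 1, GRID_SIZE - 1)  # bottom-right cell
        self.pos = (0, 0)                           # start at the top-left cell

    def step(self, action: int):
        """Actions: 0=up, 1=down, 2=left, 3=right."""
        dr, dc = [(-1, 0), (1, 0), (0, -1), (0, 1)][action]
        r, c = self.pos[0] + dr, self.pos[1] + dc
        # Moves into walls or off the grid leave the agent in place.
        if 0 <= r < GRID_SIZE and 0 <= c < GRID_SIZE and (r, c) not in self.walls:
            self.pos = (r, c)
        done = self.pos == self.goal
        reward = 0.0 if done else -1.0              # step penalty until the goal
        return self.pos, reward, done
```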
The agent uses the following architecture to determine the current task (environment) and solve it optimally. Note that the agent can only observe a 5x5 square of cells around itself.
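As a rough sketch of the 5x5 partial observation, building on the `LWorld` sketch above: the window is assumed to be centred on the agent, with out-of-grid cells padded as walls and cells encoded as 0 = free, 1 = wall. These encoding choices are assumptions for illustration.

```python
def observe(env: "LWorld", radius: int = 2):
    """Return a (2*radius+1) x (2*radius+1) patch of cells around the agent."""
    r0, c0 = env.pos
    patch = []
    for dr in range(-radius, radius + 1):
        row = []
        for dc in range(-radius, radius + 1):
            r, c = r0 + dr, c0 + dc
            out_of_bounds = not (0 <= r < GRID_SIZE and 0 <= c < GRID_SIZE)
            # Pad anything outside the grid as a wall.
            row.append(1 if out_of_bounds or (r, c) in env.walls else 0)
        patch.append(row)
    return patch  # 5x5 list of lists when radius == 2
```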

After running a fresh (randomly initialized) agent for about 200 episodes, the environment model has learned the layout reasonably well, and the agent can use it to predict which task it is currently in:
Agent - YELLOW
Observed Cell - RED
Hidden Cell - GREY
Imagined Wall - PURPLE
Observed Wall - GREEN
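One plausible way to use the learned model for task identification, which the "imagined wall" vs. "observed wall" distinction above suggests, is to compare the wall cells actually observed so far with the walls the model predicts for each candidate task, and pick the task that agrees best. The function and scoring rule below are a hedged sketch of this idea, not the project's actual method.

```python
import math


def infer_task(observed_cells: dict, learned_wall_maps: list) -> int:
    """observed_cells: {(row, col): 1 if seen as wall, 0 if seen as free}
    learned_wall_maps: one {(row, col): wall probability} dict per task."""
    best_task, best_score = 0, float("-inf")
    for task, wall_map in enumerate(learned_wall_maps):
        score = 0.0
        for cell, is_wall in observed_cells.items():
            # 0.5 means the model has no opinion about this cell yet;
            # clamp to avoid log(0) for cells the model is certain about.
            p_wall = min(max(wall_map.get(cell, 0.5), 1e-6), 1 - 1e-6)
            # Log-likelihood of the observation under this task's model.
            score += math.log(p_wall if is_wall else 1.0 - p_wall)
        if score > best_score:
            best_task, best_score = task, score
    return best_task
```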
Model-based vs. model-free performance:

The Y-axis is the average number of steps required to reach the bottom-right square (the goal).
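A minimal sketch of how such a metric could be computed, assuming an agent with an `act(state)` method and the `LWorld` sketch from above; the episode cap and interface names are assumptions.

```python
def average_steps(agent, make_env, episodes: int = 200, max_steps: int = 500) -> float:
    """Average number of steps per episode taken to reach the goal."""
    total = 0
    for _ in range(episodes):
        env = make_env()                 # fresh L-world with a (possibly random) task
        state, done, steps = env.pos, False, 0
        while not done and steps < max_steps:
            action = agent.act(state)    # assumed policy interface
            state, reward, done = env.step(action)
            steps += 1
        total += steps
    return total / episodes
```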

