Description
Hi, I'm interested in your work and appreciate that you shared the source code. I have some questions.
First, when I run the MT10-Conditioned task, each epoch takes about 200 s on average, which means all 7500 epochs would take roughly 17 to 18 days. I also ran the MT50-Fixed task, where each epoch takes about 2500 s on average. Although you use multiprocessing and the policy and Q-function networks are placed on the GPU, these networks use only about 1.5 GB of GPU memory. Is this training speed normal?
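For reference, here is how I arrive at the time estimate. This is only a rough sketch: the helper name `total_training_days` is my own, and the 7500-epoch figure for MT50-Fixed is an assumption carried over from the MT10 setting, since I have not confirmed the epoch count for that task.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def total_training_days(seconds_per_epoch: float, num_epochs: int) -> float:
    """Wall-clock days needed for num_epochs at a constant per-epoch cost."""
    return seconds_per_epoch * num_epochs / SECONDS_PER_DAY


# Measured average: about 200 s per epoch on MT10-Conditioned, 7500 epochs.
mt10_days = total_training_days(200, 7500)

# Measured average: about 2500 s per epoch on MT50-Fixed.
# Assumption: the same 7500 epochs as MT10 (not confirmed).
mt50_days = total_training_days(2500, 7500)

print(f"MT10-Conditioned: {mt10_days:.1f} days")
print(f"MT50-Fixed:       {mt50_days:.1f} days")
```

If the MT50 run really uses the same number of epochs, it would take more than 200 days at this speed, which is why I suspect something is wrong with my setup.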
Second, you use multiprocessing to collect data and perform multi-task learning. What does the multi-task training process look like? If each forward pass feeds a single state vector and a one-hot task ID vector into the policy network, the batch size would be 1, yet you define the batch size as 1280. Could you explain the specific training details?
I look forward to your reply. Thanks!