
logs33:Observe RL values

Higepon Taro Minowa edited this page Jul 1, 2018 · 3 revisions

Questions to answer

  • Should we change the temperature?
  • How quickly does it learn something?
  • Should we standardize the reward?
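
For reference, here is a minimal sketch of what the two knobs in these questions usually mean, assuming the policy samples from a softmax over logits and that rewards are standardized per batch. The function names `softmax_with_temperature` and `standardize_rewards` are hypothetical and are not taken from this project's code; the actual implementation may differ.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Turn logits into sampling probabilities (hypothetical helper).

    temperature > 1 flattens the distribution (more exploration),
    temperature < 1 sharpens it (more greedy).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the max before exponentiating for numerical stability.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def standardize_rewards(rewards, eps=1e-8):
    """Rescale a batch of rewards to roughly zero mean and unit variance
    (hypothetical helper; eps avoids division by zero)."""
    n = len(rewards)
    mean = sum(rewards) / n
    var = sum((r - mean) ** 2 for r in rewards) / n
    std = math.sqrt(var)
    return [(r - mean) / (std + eps) for r in rewards]


if __name__ == "__main__":
    logits = [2.0, 1.0, 0.1]
    for t in (0.5, 1.0, 2.0):
        print("temperature", t, softmax_with_temperature(logits, t))
    print("standardized", standardize_rewards([1.0, 2.0, 3.0]))
```

Printing the probabilities at a few temperatures alongside the raw and standardized rewards is one way to observe how these values behave during training.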
