logs:33 Long RL run
| Log Type | Detail |
|---|---|
| 1: What specific output am I working on right now? | Run RL with reward_qi + reward_s and see whether the reward goes up. |
| 2: Thinking out loud - hypotheses about the current problem - what to work on next - how can I verify | Just check the TensorBoard graph. |
| 3: A record of currently ongoing runs along with a short reminder of what question each run is supposed to answer | Did the reward go up? Do the answers look okay? |
| 4: Results of runs and conclusion | The reward was flat; it did not go up. |
| 5: Next steps | |
| 6: mega.nz | 20180709140004_rl_test |
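
The "reward was flat" conclusion in row 4 was read off the TensorBoard graph by eye. As a rough cross-check, the trend can be quantified with a least-squares slope over the logged reward values. The sketch below assumes the reward scalar has already been exported into two plain Python lists; `reward_slope`, `is_flat`, and the tolerance `tol` are hypothetical names and values chosen for illustration, not part of the original training code.

```python
def reward_slope(steps, rewards):
    """Least-squares slope of reward against training step."""
    n = len(steps)
    if n < 2 or n != len(rewards):
        raise ValueError("need two or more (step, reward) pairs of equal length")
    mean_x = sum(steps) / n
    mean_y = sum(rewards) / n
    var_x = sum((x - mean_x) ** 2 for x in steps)
    if var_x == 0:
        raise ValueError("all steps are identical; slope is undefined")
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(steps, rewards))
    return cov_xy / var_x


def is_flat(steps, rewards, tol=1e-3):
    """True when the absolute slope is below `tol` (an arbitrary, assumed threshold)."""
    return abs(reward_slope(steps, rewards)) < tol
```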

Midway through the run, the results were actually good:
ありがとうございます🙇🏻♀️💓こちらこそありがとうございます❇︎これからよろしくお願いします(^^)[PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD] [seq2] : こちら♡です!!!!!!!!!!!!!!!!!!!!!!!! -0.43 => (-16.16) <= -15.73 [RL greedy] : いえいえ(^^)よろしくお願いします🙇 -2.77 => [RL sample]: いえいえ(^^)よろしくお願いします🙇 -6.82 => (-20.74) <= -13.92
Later, however, the outputs became very short:
なのにもう汚いお勉強、えらいな![PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD] [seq2] : ありがとう!!!!!!!!!!!!!!!!!!!!!!!!!! -0.23 => (-19.05) <= -18.82 [RL greedy] : 痛かっゆこ -5.23 => [RL sample]: エアコン -7.77 => (-17.96) <= -10.19
By the end, the output is not even human-readable, and the reward is very low:
PS4とモンハンで計5万はやばいセットのやつ?[PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD] [seq2] : 💩の💩のやつ 💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩 -2.55 => (-8.05) <= -5.50 [RL greedy] : 一室ジャポニカ
|||)|||)⊂)⊂)⊂)⊂)⊂)`|||)`|||)`|||)`|||)`|||)`|||)`|||)`|||)`|||)`|||)`|||)`|||)`|||)`|||)`|||)`|||)`|||)`|||)`|||) -1.41 => [RL sample]: アサデス山中湖|||)|||)⊂)⊂)⊂)⊂)⊂)`|||)`|||)`|||)`|||)`|||)`|||)`|||)`|||)`|||)`|||)`|||)`|||)`|||)`|||)`|||)`|||)`|||)`|||)`|||) -13.26 => (-20.16) <= -6.90 reward_qi size= 64 28
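
The later samples above collapse into very short strings or long runs of a single repeated symbol. A simple heuristic can flag such outputs automatically instead of relying on reading the dumps. This is only a sketch: `looks_degenerate`, `max_char_share`, and the thresholds `min_len` and `max_share` are hypothetical and untuned, and the `[PAD]` token handling assumes padding appears literally in the decoded text as it does in the log.

```python
from collections import Counter

PAD = "[PAD]"  # padding token as it appears in the decoded log lines


def strip_pad(text):
    """Remove literal padding tokens and surrounding whitespace."""
    return text.replace(PAD, "").strip()


def max_char_share(text):
    """Fraction of the text taken up by its single most common character."""
    text = strip_pad(text)
    if not text:
        return 0.0
    most_common_count = Counter(text).most_common(1)[0][1]
    return most_common_count / len(text)


def looks_degenerate(text, min_len=6, max_share=0.5):
    """Flag outputs that are very short or dominated by one character.

    Both thresholds are assumptions for illustration only.
    """
    cleaned = strip_pad(text)
    return len(cleaned) < min_len or max_char_share(cleaned) > max_share
```

Run over the greedy and sampled outputs at each evaluation step, a check like this would give a rough count of collapsed responses to track alongside the reward curve.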