logs:33 Long RL run

Log Type	Detail
1: What specific output am I working on right now?	Run RL with reward_qi + reward_s and see if the reward goes up.
2: Thinking out loud - hypotheses about the current problem - what to work on next - how can I verify	Just check the TensorBoard graph.
3: A record of currently ongoing runs along with a short reminder of what question each run is supposed to answer	- Did the reward go up? - Does the answer look okay?
4: Results of runs and conclusion	- The reward was flat; it didn't go up.
5: Next steps
6: mega.nz	20180709140004_rl_test
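
The "did the reward go up?" question can also be checked numerically instead of by eye. Below is a minimal sketch, assuming the per-step rewards have already been exported from TensorBoard into a plain Python list. The function names `reward_slope` and `is_flat` and the tolerance `tol` are hypothetical, not part of this run's code.

```python
from statistics import mean


def reward_slope(rewards):
    """Least-squares slope of reward against step index (0, 1, 2, ...)."""
    n = len(rewards)
    if n < 2:
        raise ValueError("need at least two reward values")
    xs = range(n)
    x_bar = mean(xs)
    y_bar = mean(rewards)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, rewards))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


def is_flat(rewards, tol=1e-3):
    """True when the fitted slope is within +/- tol (tol is an arbitrary assumption)."""
    return abs(reward_slope(rewards)) < tol
```

A slope near zero over the whole run matches the "reward was flat" conclusion. A single fitted line can hide a rise followed by a collapse, so it may be worth running it on windows of the run as well.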

In the middle of the run, there were actually good results.

ありがとうございます🙇🏻‍♀️💓こちらこそありがとうございます❇︎これからよろしくお願いします(^^)[PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD]
[seq2] : こちら♡です！！！！！！！！！！！！！！！！！！！！！！！！ -0.43 => (-16.16) <= -15.73
[RL greedy] : いえいえ(^^)よろしくお願いします🙇 -2.77 =>
[RL sample]: いえいえ(^^)よろしくお願いします🙇 -6.82 => (-20.74) <= -13.92

But later the outputs became very short.

なのにもう汚いお勉強、えらいな！[PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD]
[seq2] : ありがとう！！！！！！！！！！！！！！！！！！！！！！！！！！ -0.23 => (-19.05) <= -18.82
[RL greedy] : 痛かっゆこ -5.23 =>
[RL sample]: エアコン -7.77 => (-17.96) <= -10.19

In the end, the output is not even human-readable and the reward is very low.

PS4とモンハンで計5万はやばいセットのやつ？[PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD]
[seq2] : 💩の💩のやつ 💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩💩 -2.55 => (-8.05) <= -5.50
[RL greedy] : 一室ジャポニカ|||)|||)⊂)⊂)⊂)⊂)⊂)｀|||)｀|||)｀|||)｀|||)｀|||)｀|||)｀|||)｀|||)｀|||)｀|||)｀|||)｀|||)｀|||)｀|||)｀|||)｀|||)｀|||)｀|||)｀|||) -1.41 =>
[RL sample]: アサデス山中湖|||)|||)⊂)⊂)⊂)⊂)⊂)｀|||)｀|||)｀|||)｀|||)｀|||)｀|||)｀|||)｀|||)｀|||)｀|||)｀|||)｀|||)｀|||)｀|||)｀|||)｀|||)｀|||)｀|||)｀|||) -13.26 => (-20.16) <= -6.90
reward_qi size= 64 28
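
The collapse seen in these samples (very short replies, then long runs of repeated symbols) could be flagged automatically during a run. The following is a rough character-level sketch, not the check used in this experiment. The function names and all thresholds (`min_len`, `max_run`, `min_distinct`) are hypothetical and would need tuning.

```python
def longest_run(text):
    """Length of the longest run of one repeated character."""
    best = run = 0
    prev = None
    for ch in text:
        run = run + 1 if ch == prev else 1
        prev = ch
        best = max(best, run)
    return best


def distinct_ratio(text):
    """Fraction of characters that are distinct (low means repetitive)."""
    return len(set(text)) / len(text) if text else 0.0


def looks_degenerate(text, min_len=5, max_run=5, min_distinct=0.3):
    """Heuristic flag; thresholds are arbitrary assumptions."""
    text = text.replace("[PAD]", "").strip()
    return (
        len(text) < min_len
        or longest_run(text) > max_run
        or distinct_ratio(text) < min_distinct
    )
```

With these thresholds, a reply like `エアコン` is flagged as too short and a repeated `｀|||)` pattern is flagged as repetitive. A reply such as `痛かっゆこ` passes even though it is not meaningful, so this only catches length and repetition problems, not nonsense text.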
