logs:33 Long RL run

Higepon Taro Minowa edited this page Jul 9, 2018 · 4 revisions
| Log Type | Detail |
| --- | --- |
| 1: What specific output am I working on right now? | Run reward_qi + reward_s RL and see if the reward goes up. |
| 2: Thinking out loud (hypotheses about the current problem, what to work on next, how to verify) | Just check the TensorBoard graph. |
| 3: A record of currently ongoing runs, with a short reminder of what question each run is supposed to answer | Did the reward go up? Does the answer look okay? |
| 4: Results of runs and conclusion | The reward was flat; it didn't go up. |
| 5: Next steps | |
| 6: mega.nz | |
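The run fine-tunes the seq2seq model with a policy-gradient objective whose reward is the sum of two terms, reward_qi and reward_s. A minimal REINFORCE-style sketch of that idea (all function names and numbers here are hypothetical illustrations, not the author's actual training code):

```python
import numpy as np

def reinforce_loss(logprobs, rewards, baseline=0.0):
    """Minimal REINFORCE sketch (hypothetical, not the actual code).

    logprobs: per-sample sequence log-probabilities (summed over tokens).
    rewards:  per-sample scalar rewards, e.g. reward_qi + reward_s.
    Returns the loss to minimize: -mean((R - baseline) * log p(reply)),
    so minimizing it raises the probability of high-reward replies.
    """
    logprobs = np.asarray(logprobs, dtype=float)
    advantages = np.asarray(rewards, dtype=float) - baseline
    return -np.mean(advantages * logprobs)

# Toy usage: two sampled replies with made-up log-probs and rewards.
loss = reinforce_loss(logprobs=[-6.82, -2.77], rewards=[1.0, 0.5], baseline=0.25)
```

If the combined reward is well shaped, this loss should fall and the mean reward rise, which is exactly what the TensorBoard graph is checked for above.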

In the middle of the run, there were actually some good results:

- Source: ありがとうございます🙇🏻‍♀️💓こちらこそありがとうございます❇︎これからよろしくお願いします(^^) ("Thank you 🙇🏻‍♀️💓 Thank you too ❇︎ I look forward to working with you (^^)") [PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD]
- [seq2]: こちら♡です!!!!!!!!!!!!!!!!!!!!!!!! ("Here ♡ it is!!!") -0.43 => (-16.16) <= -15.73
- [RL greedy]: いえいえ(^^)よろしくお願いします🙇 ("Not at all (^^), nice to meet you 🙇") -2.77 =>
- [RL sample]: いえいえ(^^)よろしくお願いします🙇 (same reply) -6.82 => (-20.74) <= -13.92

(TensorBoard reward graphs for the seq2seq and RL runs)

| hparam | seq2seq (src) | RL (dst) |
| --- | --- | --- |
| machine | client2 | client2 |
| batch_size | 64 | 64 |
| num_units | 512 | 512 |
| num_layers | 2 | 2 |
| vocab_size | 5000 | 5000 |
| embedding_size | 256 | 256 |
| learning_rate | 0.5 | 0.1 |
| learning_rate_decay | 0.99 | 0.99 |
| use_attention | True | True |
| encoder_length | 28 | 28 |
| decoder_length | 28 | 28 |
| max_gradient_norm | 5.0 | 5.0 |
| beam_width | 0 | 0 |
| num_train_steps | 1560 | 1560 |
| model_path | model/tweet_large | model/tweet_large_rl |
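The two hparam sets are identical except for the learning rate and the model path, which a small sketch makes explicit (a hypothetical reconstruction of the logged dicts, not the author's config code):

```python
# Hyperparameters as logged for the source seq2seq run.
seq2seq_hparams = {
    'machine': 'client2', 'batch_size': 64, 'num_units': 512,
    'num_layers': 2, 'vocab_size': 5000, 'embedding_size': 256,
    'learning_rate': 0.5, 'learning_rate_decay': 0.99,
    'use_attention': True, 'encoder_length': 28, 'decoder_length': 28,
    'max_gradient_norm': 5.0, 'beam_width': 0, 'num_train_steps': 1560,
    'model_path': 'model/tweet_large',
}

# The RL run reuses everything, overriding only two keys.
rl_hparams = dict(seq2seq_hparams,
                  learning_rate=0.1,
                  model_path='model/tweet_large_rl')

# Keys whose values differ between the two runs.
diff = sorted(k for k in seq2seq_hparams
              if seq2seq_hparams[k] != rl_hparams[k])
print(diff)  # → ['learning_rate', 'model_path']
```

Lowering the learning rate for the RL phase (0.5 → 0.1) is a common choice when fine-tuning a pretrained model, since large policy-gradient updates can quickly destroy the seq2seq initialization.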
