
logs16: RL test — shorter reply is better

Higepon Taro Minowa edited this page May 16, 2018 · 6 revisions
| Log Type | Detail |
| --- | --- |
| 1: What specific output am I working on right now? | Main purpose is seeing if RL works as expected. This should be reproducible. |
| 2: Thinking out loud<br>- hypotheses about the current problem<br>- what to work on next<br>- how can I verify | If the average reply length goes down and it's reproducible, we can conclude RL is working. |
| 3: A record of currently ongoing runs along with a short reminder of what question each run is supposed to answer | Did avg len go down? If not, why do you think it didn't? |
| 4: Results of runs and conclusion | Yes! It quickly converged to len == 1. |
| 5: Next steps | (1) Change the target to len == 2 (most reward) to see if this is reproducible. (2) Train seq2seq, then see if it converges the same way. |
| 6: mega.nz | 20180516_test_medium25 |
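The sanity check above can be sketched in a few lines. This is a hypothetical illustration, not the repo's actual code: `length_reward` and `average_length` are made-up names, and the reward shape (peak at a target length) is an assumption consistent with "len == 1 gets most reward".

```python
# Hypothetical sketch of a length-based reward for this RL sanity check.
# If RL works, maximizing this reward should drive the average reply
# length toward target_len, which is exactly the metric watched in the log.

def length_reward(reply_tokens, target_len=1):
    """Assumed reward shape: maximal (0) at target_len, decaying with distance."""
    return -abs(len(reply_tokens) - target_len)

def average_length(batch_of_replies):
    """Metric logged per step: should trend toward target_len if RL is working."""
    return sum(len(r) for r in batch_of_replies) / len(batch_of_replies)

# Toy batch of tokenized replies.
batch = [["hi"], ["how", "are", "you"], ["ok"]]
avg = average_length(batch)  # about 1.67 before training
```

Changing `target_len` to 2 corresponds to next step (1) in the table: if the average length then converges to 2 instead of 1, the effect is reproducible and attributable to the reward.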
RL

```
{'machine': 'client1', 'batch_size': 64, 'num_units': 512, 'num_layers': 2, 'vocab_size': 5000, 'embedding_size': 256, 'learning_rate': 0.5, 'learning_rate_decay': 0.99, 'use_attention': True, 'encoder_length': 28, 'decoder_length': 28, 'max_gradient_norm': 5.0, 'beam_width': 0, 'num_train_steps': 22, 'model_path': 'model/tweet_large'}
```

dst:

```
{'machine': 'client1', 'batch_size': 64, 'num_units': 512, 'num_layers': 2, 'vocab_size': 5000, 'embedding_size': 256, 'learning_rate': 0.1, 'learning_rate_decay': 0.99, 'use_attention': True, 'encoder_length': 28, 'decoder_length': 28, 'max_gradient_norm': 5.0, 'beam_width': 0, 'num_train_steps': 1560, 'model_path': 'model/tweet_large_rl'}
```
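The two hparam dicts differ only in a few fields; a quick diff makes the RL-specific changes explicit. The dict values below are copied from the log; the diff code itself is just an illustration.

```python
# Diff the pretraining hparams against the RL run's hparams to see
# exactly what changed for the RL phase (values copied from the log).
src = {'machine': 'client1', 'batch_size': 64, 'num_units': 512, 'num_layers': 2,
       'vocab_size': 5000, 'embedding_size': 256, 'learning_rate': 0.5,
       'learning_rate_decay': 0.99, 'use_attention': True, 'encoder_length': 28,
       'decoder_length': 28, 'max_gradient_norm': 5.0, 'beam_width': 0,
       'num_train_steps': 22, 'model_path': 'model/tweet_large'}
dst = {'machine': 'client1', 'batch_size': 64, 'num_units': 512, 'num_layers': 2,
       'vocab_size': 5000, 'embedding_size': 256, 'learning_rate': 0.1,
       'learning_rate_decay': 0.99, 'use_attention': True, 'encoder_length': 28,
       'decoder_length': 28, 'max_gradient_norm': 5.0, 'beam_width': 0,
       'num_train_steps': 1560, 'model_path': 'model/tweet_large_rl'}

changed = {k: (src[k], dst[k]) for k in src if src[k] != dst[k]}
# Only learning_rate (0.5 -> 0.1), num_train_steps (22 -> 1560), and
# model_path differ; the model architecture is identical across phases.
```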
