Transformer-based text generation training pipeline
Open-source project address
Main requirements
Python 3.6, PyTorch 1.6.0+cu101
Project description
A Transformer-based pipeline for text generation. (A chit-chat model is trained and tested on conversation data.) Training uses teacher forcing with a lower-triangular (causal) attention mask; see the loss computation in the code for details.
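The combination of a lower-triangular mask and teacher forcing can be sketched roughly as follows (a minimal illustration, not the repository's actual code; function names and the padding id are assumptions):

```python
import torch
import torch.nn as nn

def causal_mask(seq_len: int) -> torch.Tensor:
    # Upper-triangular entries are -inf, so each position may attend
    # only to itself and earlier positions (lower-triangular structure).
    return torch.triu(torch.full((seq_len, seq_len), float("-inf")), diagonal=1)

def teacher_forcing_loss(logits: torch.Tensor, targets: torch.Tensor,
                         pad_id: int = 0) -> torch.Tensor:
    # logits: (batch, seq_len, vocab); targets: (batch, seq_len).
    # Under teacher forcing the decoder receives the gold tokens shifted
    # right, so position t is trained to predict target token t.
    loss_fn = nn.CrossEntropyLoss(ignore_index=pad_id)
    return loss_fn(logits.reshape(-1, logits.size(-1)), targets.reshape(-1))
```

Because the mask hides future tokens, all positions can be trained in parallel from a single forward pass over the gold sequence.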
Model training
python train.py
Inference
python inference.py
Training details
Training data: LCCC-base (source: github.com/thu-coai/CD…)
Initial learning rate: 1e-4
batch_size:96
nheads_transformer: 15
embed_dim: 300 (uses pre-trained word vectors; source: github.com/Embedding/C…; download link: pan.baidu.com/s/1hJKTAz6P…)
encode_layers: 6
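The hyperparameters above fit together as a standard PyTorch encoder, roughly as sketched below (the vocabulary size and the commented-out embedding load are hypothetical placeholders, not the repository's actual values):

```python
import torch
import torch.nn as nn

EMBED_DIM = 300    # matches the 300-d pre-trained word vectors
N_HEADS = 15       # 300 / 15 = 20 dims per attention head
NUM_LAYERS = 6     # encode_layers
VOCAB_SIZE = 20000  # hypothetical; set from the actual vocabulary

embedding = nn.Embedding(VOCAB_SIZE, EMBED_DIM)
# embedding.weight.data.copy_(pretrained_vectors)  # load pre-trained vectors here
encoder_layer = nn.TransformerEncoderLayer(d_model=EMBED_DIM, nhead=N_HEADS)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=NUM_LAYERS)
optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-4)

# One batch of 96 token-id sequences of length 32 (sequence-first layout):
tokens = torch.randint(0, VOCAB_SIZE, (32, 96))
out = encoder(embedding(tokens))  # shape: (32, 96, 300)
```

Note that `nhead` must divide `d_model` evenly, which the 300/15 combination satisfies.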
Training results preview
Trained for 16 epochs (roughly 2M+ steps, about 10 days)