Tensor2Tensor: organizing the world's models and data. Our speaker is Laurence Moroney.
Review of TensorFlow
First, a quick review of TensorFlow:
TensorFlow can run anywhere. tf.data helps you build an efficient data input pipeline, tf.layers and tf.keras.Model help you quickly build a neural network, and tf.estimator together with DistributionStrategy helps you quickly set up distributed training.
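As a quick illustration, here is a minimal TF 2.x-style sketch of those pieces working together (the toy data below is made up purely for illustration):
import numpy as np
import tensorflow as tf

# Toy data standing in for a real dataset.
x = np.random.rand(1000, 20).astype("float32")
y = np.random.randint(0, 10, size=(1000,))

# tf.data: an efficient input pipeline.
dataset = tf.data.Dataset.from_tensor_slices((x, y)).shuffle(1000).batch(32)

# tf.keras: quickly assemble a network and train it.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.fit(dataset, epochs=2)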
Tensor2Tensor
However, for cutting-edge AI this is not enough. In areas such as image recognition, text translation, and text analysis, many people lack the knowledge and experience needed to master the current best practices, which makes it hard for them to benefit from the latest AI research. Tensor2Tensor was created to give the community a good platform for sharing.
Tensor2Tensor ships with a collection of datasets and models, together with their hyperparameters:
Through extensive investigation and experimentation, we found that these hyperparameter settings give the best performance for the corresponding models and datasets. Without Tensor2Tensor you would have to tune the parameters yourself through constant trial and error, which is very inefficient; that is exactly the problem Tensor2Tensor was designed to solve.
To make things work well out of the box, Tensor2Tensor also comes with a set of tools, such as the hyperparameter sets and distributed training on GPUs or TPUs, all available within Tensor2Tensor.
Tensor2Tensor open source
Tensor2Tensor is fully open source on GitHub:
Tensor2Tensor keeps up with the academic cutting edge
Here's an amusing example. Someone tweeted:
The AMSGrad algorithm is the latest SGD optimization algorithm.
Then, another user replied:
This is no longer the latest SGD optimizer; the latest is AdaFactor, which was implemented in Tensor2Tensor three weeks ago.
That person was quickly hired by Google. :-D
Of course, Laurence also showed a screenshot of the AdaFactor pseudocode for anyone interested in taking a closer look:
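The key trick in Adafactor is that it stores factored second-moment statistics (one vector per row and one per column of each weight matrix) instead of a full matrix, which greatly reduces optimizer memory. Below is a heavily simplified sketch of that idea; it is not the paper's full pseudocode, which also uses relative step sizes, update clipping, and a decay schedule.
import numpy as np

def adafactor_like_step(params, grad, row_avg, col_avg, lr=0.01, beta2=0.999, eps=1e-30):
    # Keep running averages of squared gradients per row and per column only.
    sq = grad ** 2 + eps
    row_avg = beta2 * row_avg + (1 - beta2) * sq.mean(axis=1)  # one statistic per row
    col_avg = beta2 * col_avg + (1 - beta2) * sq.mean(axis=0)  # one statistic per column
    # Reconstruct an approximate per-element second moment from the two factors.
    v = np.outer(row_avg, col_avg) / row_avg.mean()
    new_params = params - lr * grad / np.sqrt(v)
    return new_params, row_avg, col_avg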
In addition, Tensor2Tensor also implemented the Transformer model:
The Transformer is a brand-new model that Google proposed in its 2017 paper "Attention Is All You Need". By relying solely on the attention mechanism instead of the traditional CNNs and RNNs, it reached the top of the rankings of its day, and it is now widely used in NLP tasks such as machine translation, question answering, text summarization, and speech recognition.
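At the heart of the Transformer is scaled dot-product attention. Here is a tiny NumPy sketch of that single operation (multi-head attention, masking, and the rest of the architecture are omitted):
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """q, k, v: arrays of shape [seq_len, depth]."""
    scores = q @ k.T / np.sqrt(q.shape[-1])           # compare every query with every key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over the key positions
    return weights @ v                                # weighted sum of the values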
At the moment, we have a lot of people involved in our Tensor2Tensor project:
We strongly encourage researchers to use Tensor2Tensor to help with their studies.
Meet t2t-trainer
Let's take a look at t2t-trainer, a Tensor2Tensor tool that lets people train machine learning models without writing any code.
With Tensor2Tensor, you just need to define a few parameters and you’ll be done.
pip install tensor2tensor && t2t-trainer \
--problem=$PROBLEM \
--model=$MODEL \
--hparams_set=$HPARAMS \
--generate_data \
--data_dir=$DATA_DIR \
--output_dir=$TRAIN_DIR \
--train_steps=$TRAIN_STEPS \
--eval_steps=$EVAL_STEPS
There are three main parameters:
- --problem: the dataset or task to train on
- --model: the model to use
- --hparams_set: the hyperparameter set to use
The hyperparameter set is easy to explain: it is a named collection of hyperparameter values for a model, and by changing some of those values you can define a new set.
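If you are not sure which values you can pass for these three flags, you can inspect Tensor2Tensor's registry from Python. A small sketch (the exact lists depend on the version you have installed):
from tensor2tensor import problems
from tensor2tensor.utils import registry

print(problems.available()[:5])     # registered problems, e.g. "translate_ende_wmt32k"
print(registry.list_models()[:5])   # registered models, e.g. "transformer"
print(registry.list_hparams()[:5])  # registered hyperparameter sets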
Here are a few common examples.
Text summarization
The text summarization task is to extract the key information from a long piece of text.
Here’s what you can do:
pip install tensor2tensor && t2t-trainer \
--problem=summarize_cnn_dailymail32k \
--model=transformer \
--hparams_set=transformer_big \
--generate_data \
--data_dir=$DATA_DIR \
--output_dir=$TRAIN_DIR \
--train_steps=$TRAIN_STEPS \
--eval_steps=$EVAL_STEPS
With just a few lines like this, you'll have a pretty good text summarization model at the end of training!
Image classification
All you need to do is run a command like this:
pip install tensor2tensor && t2t-trainer \
--problem=image_cifar10 \
--model=shake_shake \
--hparams_set=shake_shake_big \
--generate_data \
--data_dir=$DATA_DIR \
--output_dir=$TRAIN_DIR \
--train_steps=$TRAIN_STEPS \
--eval_steps=$EVAL_STEPS
The model and hyperparameter set chosen here were the state of the art about a year ago!
Translation
To implement an EN-DE (English-German) translation model, all you need is:
pip install tensor2tensor && t2t-trainer \
--problem=translate_ende_wmt32k \
--model=transformer \
--hparams_set=transformer_big \
--generate_data \
--data_dir=$DATA_DIR \
--output_dir=$TRAIN_DIR \
--train_steps=$TRAIN_STEPS \
--eval_steps=$EVAL_STEPS
Achieved results: >29 BLEU, the current best!
Speech recognition
If you want to implement a speech recognition model, all you need is a few lines of command:
pip install tensor2tensor && t2t-trainer \
--problem=librispeech \
--model=transformer \
--hparams_set=transformer_librispeech \
--generate_data \
--data_dir=$DATA_DIR \
--output_dir=$TRAIN_DIR \
--train_steps=$TRAIN_STEPS \
--eval_steps=$EVAL_STEPS
Achieved results: <7.5 WER, close to the current best!
Image generation
The command follows the same pattern. (The settings below are a plausible sketch, assuming the image_cifar10 problem and Tensor2Tensor's imagetransformer model with its imagetransformer_base hyperparameter set; treat them as illustrative rather than the exact slide contents.)
pip install tensor2tensor && t2t-trainer \
--problem=image_cifar10 \
--model=imagetransformer \
--hparams_set=imagetransformer_base \
--generate_data \
--data_dir=$DATA_DIR \
--output_dir=$TRAIN_DIR \
--train_steps=$TRAIN_STEPS \
--eval_steps=$EVAL_STEPS
Achieved results: ~2.92 bits/dim, the current best!
Scale
For large amounts of data, training on an ordinary laptop is impractical; we need to train at scale, for example on a cluster of GPU machines or even in the cloud. Tensor2Tensor supports this kind of scaled-up training very well.
In a multi-GPU environment, all you need is:
t2t-trainer \
--worker_gpu=8 \
--problem=translate_ende_wmt32k \
--model=transformer \
--hparams_set=transformer_big \
--generate_data \
--data_dir=$DATA_DIR \
--output_dir=$TRAIN_DIR \
--train_steps=$TRAIN_STEPS \
--eval_steps=$EVAL_STEPS
By adding just one flag, --worker_gpu=8, your model can be trained in parallel on 8 GPUs!
In the Cloud TPU environment, all you need is:
t2t-trainer \
--use_tpu --cloud_tpu_name=$TPU_NAME \
--problem=translate_ende_wmt32k \
--model=transformer \
--hparams_set=transformer_big \
--generate_data \
--data_dir=$DATA_DIR \
--output_dir=$TRAIN_DIR \
--train_steps=$TRAIN_STEPS \
--eval_steps=$EVAL_STEPS
On Cloud ML Engine with hyperparameter tuning, all you need is:
t2t-trainer \
--cloud_mlengine --worker_gpu=8 \
--autotune --autotune_maximize \
--autotune_objective='metrics/neg_log_perplexity' \
--autotune_max_trials=100 \
--autotune_parallel_trials=20 \
--hparams_range=transformer_base_range \
--problem=translate_ende_wmt32k \
--model=transformer \
--hparams_set=transformer_big \
--generate_data \
--data_dir=$DATA_DIR \
--output_dir=$TRAIN_DIR \
--train_steps=$TRAIN_STEPS \
--eval_steps=$EVAL_STEPS
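The --hparams_range flag names a registered set of hyperparameter ranges for the tuner to search over. As a sketch of how such a range can be defined (the function name and the specific ranges below are illustrative assumptions, not the actual transformer_base_range definition):
from tensor2tensor.utils import registry

@registry.register_ranged_hparams
def my_transformer_range(rhp):
  # Search the learning rate on a log scale and a few discrete layer counts.
  rhp.set_float("learning_rate", 0.01, 0.4, scale=rhp.LOG_SCALE)
  rhp.set_discrete("num_hidden_layers", [2, 4, 6])
  rhp.set_float("layer_prepostprocess_dropout", 0.0, 0.3)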
Want more control?
Tensor2Tensor provides a lot of very handy tools, but what if you want finer-grained control?
Datasets
The first thing many people want to control is the dataset. For example, many people don't want to use an entire Tensor2Tensor dataset, only part of it. What should they do?
First, we create the corresponding problem, specify a data directory data_dir, and generate the data.
Now that the data has been generated, you can do whatever you want with it, which gives you finer-grained control over the dataset.
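For example, here is a minimal sketch of generating a problem's data and then reading it back from Python (the directory paths are just examples, and should exist before you run this):
import tensorflow as tf
from tensor2tensor import problems

# Look up a registered problem and generate its TFRecord files.
problem = problems.problem("translate_ende_wmt32k")
problem.generate_data(data_dir="/tmp/t2t_data", tmp_dir="/tmp/t2t_tmp")

# The files in data_dir are now yours to filter, sample, or mix as you like;
# you can also read them back as a tf.data.Dataset.
dataset = problem.dataset(tf.estimator.ModeKeys.TRAIN, data_dir="/tmp/t2t_data")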
Implement the model with Keras
Others want to build the model themselves with Keras layers.
Tensor2Tensor already implements a lot of models; if you want to build a better model on top of one of them, you can do something like this (for example):
from tensor2tensor.models import byte_net
from tensor2tensor.utils import registry

# Select the hyperparameter set
hparams = registry.hparams('bytenet_base')
# Instantiate the model
model = byte_net.ByteNet(hparams, mode='train')
# Call the model on a features dict (embedded_inputs and embedded_targets are
# tensors you have prepared yourself)
features = {'inputs': embedded_inputs, 'targets': embedded_targets}
outputs, _ = model(features)
You get the hyperparameters, you build the model, and then you get the output by calling it.
(What the speaker shows here seems only loosely related to the section title; it has no direct relationship with Keras.)
Implement your own data sets and models
To implement your own dataset and model, you can do the following (see the sketch below):
- subclass Problem, or one of its subclasses, to create a custom dataset
- subclass T2TModel to implement your own model
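Here is a minimal sketch of both extension points; the class names and sample data are hypothetical and purely for illustration:
from tensor2tensor.data_generators import text_problems
from tensor2tensor.utils import registry, t2t_model

@registry.register_problem
class MyTranslateProblem(text_problems.Text2TextProblem):
  """A custom dataset: yield input/target text pairs from your own source."""

  @property
  def is_generate_per_split(self):
    return False  # generate once and let T2T split into train/dev

  @property
  def approx_vocab_size(self):
    return 2**13

  def generate_samples(self, data_dir, tmp_dir, dataset_split):
    # Replace this with code that reads your real data.
    yield {"inputs": "hello world", "targets": "bonjour le monde"}

@registry.register_model
class MyModel(t2t_model.T2TModel):
  """A custom model: transform the features dict inside body()."""

  def body(self, features):
    # Identity body, just to show where your own network goes.
    return features["inputs"]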
Conclusion
For now, our Tensor2Tensor contains the following:
- Datasets
- Models
- Scripts
Going forward, we will continue to improve Tensor2Tensor in a number of areas.
Thank you! Read more tech tips from Google Developer Conference 2018