- This article is excerpted from Lovemiss-Y’s blog at blog.csdn.net/qq_27825451…
- It helps to see the difference between dynamically built and statically built RNNs
Introduction: TensorFlow provides a good encapsulation of RNN implementations, but do you really understand how they work? This time I am sharing a few pictures of my own handwritten notes; if anything is incorrect, I hope readers will point it out. Since redrawing the diagrams every time I edit an article is laborious, I have uploaded the handwritten pictures instead.
Notice the mapping y0 -> y0, y1 -> y1, y2 -> y2, ... The raw outputs have shape [batch_size, y_max_length, vocab_size]. But if we add a softmax layer on top, each output becomes a probability over the words in the vocabulary, so the y0 after the softmax is not the same as the raw y0.
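As a minimal sketch of that softmax step (the sizes and variable names below are illustrative assumptions, not from the original notes):

```python
import tensorflow as tf

batch_size, y_max_length, vocab_size, hidden_size = 32, 10, 5000, 128  # example values

# raw RNN outputs: [batch_size, y_max_length, hidden_size]
rnn_outputs = tf.placeholder(tf.float32, [batch_size, y_max_length, hidden_size])

# project to the vocabulary: [batch_size, y_max_length, vocab_size]
logits = tf.layers.dense(rnn_outputs, vocab_size)

# softmax over the last axis: each y_t becomes a distribution over the vocab
y_probs = tf.nn.softmax(logits, axis=-1)
```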
Basic RNN cell: the direct input parameter is the number of hidden neurons, num_units (which is also the dimension of the output and of the state passed along). Static RNN: the inputs are rnn_cell and inputs.
```python
import tensorflow as tf

timesteps, n_inputs = 2, 3  # example values; adjust to your data

# X: [batch_size, timesteps, n_inputs]
X = tf.placeholder(tf.float32, [None, timesteps, n_inputs])
# static_rnn expects a Python list of `timesteps` tensors of shape [batch_size, n_inputs]
xx = tf.unstack(tf.transpose(X, perm=[1, 0, 2]))
basic_cell = tf.nn.rnn_cell.BasicRNNCell(num_units=3)
outputs, states = tf.nn.static_rnn(basic_cell, xx, dtype=tf.float32)
```
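A quick usage sketch (batch size 4 is an arbitrary example): static_rnn returns a Python list of `timesteps` output tensors, one per step.

```python
import numpy as np

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    batch = np.random.rand(4, timesteps, n_inputs)   # 4 example sequences
    out_vals = sess.run(outputs, feed_dict={X: batch})
    print(len(out_vals), out_vals[0].shape)          # timesteps, (4, 3)
```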
Dynamic RNN: the inputs are likewise rnn_cell and inputs, but no manual reshaping is needed.
```python
# x: [batch_size, timesteps, n_inputs]; dynamic_rnn consumes this tensor directly
x = tf.placeholder(tf.float32, [None, timesteps, n_inputs])
basic_cell = tf.nn.rnn_cell.BasicRNNCell(num_units=3)
outputs, states = tf.nn.dynamic_rnn(basic_cell, x, dtype=tf.float32)
```
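By contrast, dynamic_rnn returns a single tensor of shape [batch_size, timesteps, num_units] rather than a list. Reusing the names from the block above:

```python
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    batch = np.random.rand(4, timesteps, n_inputs)
    out_val, state_val = sess.run([outputs, states], feed_dict={x: batch})
    print(out_val.shape)    # (4, timesteps, 3)
    print(state_val.shape)  # (4, 3) -- only the final state
```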
With dynamic_rnn these code details are hidden: the transpose/unstack conversion of the input is performed by the framework itself. The static form can still be used, but it is not recommended.
New Understanding of RNN:
- The first point concerns sequences. The correct sequence form is [X0, X1, X2, X3, ...], where each Xt is itself a vector [x0, x1, x2, ...]. So a complete sequence is [[x01, x02, x03, ...], [x11, x12, x13, ...], ...], a two-level structure; with batch_size included it becomes a three-level structure (see the first sketch after this list).
- We used the higher-level API, which only exposes num_units, the number of hidden neurons (the output dimension of the state); everything else is encapsulated. Going deeper into the underlying principle, the loop over time steps is actually implemented by the framework, so we do not need to write it ourselves, but it is still worth understanding the mechanism (see the second sketch below).
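A concrete sketch of that nesting, with made-up numbers purely for illustration:

```python
# one sequence: a list of time steps, each step a feature vector (two levels)
seq = [[0.1, 0.2, 0.3],   # X0 = [x01, x02, x03]
       [0.4, 0.5, 0.6]]   # X1 = [x11, x12, x13]

# with batch_size included: a batch of such sequences (three levels)
batch = [seq, seq]        # [batch_size, timesteps, n_inputs] = [2, 2, 3]
```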
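And to make the hidden loop concrete, here is a minimal hand-rolled equivalent of static_rnn, assuming the `xx` list from the static example above; it is a sketch for understanding, not the framework's actual implementation:

```python
# call the cell once per time step, threading the state through by hand
cell = tf.nn.rnn_cell.BasicRNNCell(num_units=3)
state = cell.zero_state(batch_size=tf.shape(xx[0])[0], dtype=tf.float32)

manual_outputs = []
for x_t in xx:                         # one iteration per time step
    out_t, state = cell(x_t, state)    # for a basic cell, out_t equals the new state
    manual_outputs.append(out_t)
```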