This is the third and final article in the TensorFlow SavedModel series. In “Saving and Loading the TensorFlow SavedModel Model”, we covered how to save a TensorFlow model in SavedModel format and how to load it. In “How to View Information about the TensorFlow SavedModel Format Model”, we demonstrated how to view a model’s signature and computational graph structure. In this article, we will explore how to combine two models: in simple terms, we will feed the output of the first model into the second model as its input, forming a new model.

Background

Why do we need to merge two models?

Take the code in “TensorFlow SavedModel Saving and Loading” as an example: the handwritten digit recognition model accepts input with shape [?, 784]. The ? means the model can accept input in batches; we can ignore that for now and simply fix it to 1. The 784 comes from flattening a 28 x 28 grayscale image into a one-dimensional array.

The problem is that we usually feed the model images, perhaps from a file or from a camera. To complicate matters, if we invoke a model deployed on the server side over HTTP, binary data is not easy to transport over HTTP, so we usually Base64-encode the image data. The server therefore receives the data as a Base64 string, while the model expects a binary vector.
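
For example, a client might prepare its HTTP request along the lines of the following sketch (the payload field name "image" is my own invention; the actual format depends on how the server is deployed):

import base64
import json

# Read the image and Base64-encode it so it travels safely inside JSON/HTTP.
# Web-safe (urlsafe) encoding is used, matching the decoding done later on.
with open("5.png", "rb") as f:
  encoded = str(base64.urlsafe_b64encode(f.read()), "utf-8")

# A hypothetical JSON payload: the server receives a Base64 string,
# but the model expects a binary vector.
payload = json.dumps({"image": encoded})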

Naturally, we can think of two solutions:

  1. Retrain a model that accepts Base64 strings.

    The problem with this solution is that retraining a model is time-consuming or even infeasible. Because the example in this article is relatively simple, retraining would be fine; but a deep convolutional neural network can take days to train, and retraining it is very costly. More often, we use models trained by others, such as MobileNet and InceptionV3 commonly used in image recognition, which were trained by companies like Google and Microsoft with enormous resources; we are not in a position to retrain them.

  2. Convert Base64 to binary data on the server side

    This solution isn’t complicated to implement (a sketch follows this list), but what if we deploy with a solution like TensorFlow Model Server? We could of course stand up another server that accepts the client’s Base64 image data, decodes it, and forwards it to the TensorFlow Model Server, but this undoubtedly increases the workload and complexity on the server side.
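
For illustration, here is a minimal sketch of solution 2 using Flask. The endpoint name, the payload field, and the forwarding step are all hypothetical:

import base64

from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route("/predict", methods=["POST"])
def predict():
  # Decode the client's Base64 payload back into raw image bytes.
  image_bytes = base64.urlsafe_b64decode(request.json["image"])
  # ... preprocess image_bytes, forward the result to the TensorFlow
  # Model Server, and relay its response back to the client ...
  return jsonify({"status": "forwarded", "size": len(image_bytes)})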

In this article, we will present a third solution: write a TensorFlow model that receives Base64 image data and outputs the binary vector, then feed the output of this first model into the second model as its input, concatenate the two, save the result as a new model, and finally deploy the new model.

The Base64-decoding TensorFlow model

TensorFlow includes a large number of image- and array-processing operations, so implementing this model is relatively simple. The model Base64-decodes the input, decodes the PNG image, resizes it to 28 x 28, and finally flattens it into a (1, 784) array, which matches the input of the handwritten digit recognition model. The code is as follows:

import tensorflow as tf

with tf.Graph().as_default() as g1:
  base64_str = tf.placeholder(tf.string, name='input_string')
  input_str = tf.decode_base64(base64_str)
  decoded_image = tf.image.decode_png(input_str, channels=1)
  # Convert uint8 to float32 in the range [0, 1]
  decoded_image_as_float = tf.image.convert_image_dtype(decoded_image,
                                                        tf.float32)
  decoded_image_4d = tf.expand_dims(decoded_image_as_float, 0)
  resize_shape = tf.stack([28, 28])
  resize_shape_as_int = tf.cast(resize_shape, dtype=tf.int32)
  resized_image = tf.image.resize_bilinear(decoded_image_4d,
                                           resize_shape_as_int)
  # Flatten into a one-dimensional array
  resized_image_1d = tf.reshape(resized_image, (-1, 28 * 28))
  print(resized_image_1d.shape)
  tf.identity(resized_image_1d, name="DecodeJPGOutput")

g1def = g1.as_graph_def()

This model contains no variables, only fixed operations, so no training is required.
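
As a quick sanity check (my own addition, not part of the original workflow), you can run this graph on its own and confirm that the output shape matches the recognition model’s input:

import base64

with tf.Session(graph=g1) as sess:
  with open("./5.png", "rb") as f:
    encoded = str(base64.urlsafe_b64encode(f.read()), "utf-8")
  out = sess.run("DecodeJPGOutput:0", feed_dict={"input_string:0": encoded})
  print(out.shape)  # expected: (1, 784)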

Load the handwritten digit recognition model

For the handwritten digit recognition model, please refer to the article “TensorFlow SavedModel Saving and Loading”. The model is saved under ./model, and the loading code is as follows:

from tensorflow.python.framework import graph_util
from tensorflow.python.saved_model import tag_constants
from tensorflow.python.tools import saved_model_utils

with tf.Graph().as_default() as g2:
  with tf.Session(graph=g2) as sess:
    input_graph_def = saved_model_utils.get_meta_graph_def(
        "./model", tag_constants.SERVING).graph_def

    tf.saved_model.loader.load(sess, ["serve"], "./model")

    g2def = graph_util.convert_variables_to_constants(
        sess,
        input_graph_def,
        ["myOutput"],
        variable_names_whitelist=None,
        variable_names_blacklist=None)

Here g2 defines a second graph, separate from the previous model’s graph. Note the call to graph_util.convert_variables_to_constants, which converts the variables in the model into constants, the so-called freeze-graph operation.
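
To double-check that the freeze worked, a small inspection like the following sketch counts the op types left in g2def:

# After freezing, no Variable ops should remain in the GraphDef.
const_ops = [n.name for n in g2def.node if n.op == "Const"]
variable_ops = [n.name for n in g2def.node
                if n.op in ("Variable", "VariableV2")]
print("Const ops: %d, Variable ops: %d" % (len(const_ops), len(variable_ops)))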

I was stuck on this problem for a long time while figuring out how to connect the two models. My first idea was to merge the graphs and then load the variable values, but after trying, it didn’t work. My next idea was to iterate over the variables of the handwriting recognition model, read their values, and copy them into the corresponding variables of the merged model, but using that model always raised errors about uninitialized variables.

Finally, I took inspiration from the conversion of TensorFlow models to TensorFlow Lite models and froze the variables in the model into constants, so there is no variable-loading problem and no uninitialized-variable problem.

When convert_variables_to_constants runs, you can see that two variables are converted to constant ops, namely the W and b of the handwritten digit recognition model:

Converted 2 variables to const ops.

Connect the two models

Using the tf.import_graph_def method, we can import a graph definition into the current graph. Note that in the second import_graph_def call, the input_map maps the second graph’s input to the output of the first import; this is what concatenates the two computational graphs. The code is as follows:

with tf.Graph().as_default() as g_combined:
  with tf.Session(graph=g_combined) as sess:

    x = tf.placeholder(tf.string, name="base64_input")

    y, = tf.import_graph_def(g1def, input_map={"input_string:0": x}, return_elements=["DecodeJPGOutput:0"])

    z, = tf.import_graph_def(g2def, input_map={"myInput:0": y}, return_elements=["myOutput:0"])
    tf.identity(z, "myOutput")

    tf.saved_model.simple_save(sess,
              "./modelbase64",
              inputs={"base64_input": x},
              outputs={"myOutput": z})

Since the first model contains no variables and the second model’s variables have been converted to constant ops, the final saved model contains no variables:

modelbase64/
├── saved_model.pb
└── variables/
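
As in the previous article on viewing model information, you can inspect the merged model’s signature with saved_model_cli (simple_save registers it under the serving_default key):

saved_model_cli show --dir ./modelbase64 --tag_set serve --signature_def serving_default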

Test

Let’s write some test code to see whether the merged model works:

import base64

import numpy as np

with tf.Session(graph=tf.Graph()) as sess:
  sess.run(tf.global_variables_initializer())

  tf.saved_model.loader.load(sess, ["serve"], "./modelbase64")
  graph = tf.get_default_graph()

  with open("./5.png", "rb") as image_file:
    encoded_string = str(base64.urlsafe_b64encode(image_file.read()), "utf-8")

  x = sess.graph.get_tensor_by_name('base64_input:0')
  y = sess.graph.get_tensor_by_name('myOutput:0')

  scores = sess.run(y, feed_dict={x: encoded_string})
  print("predict: %d, actual: %d" % (np.argmax(scores, 1), 5))

Here the model’s input is base64_input, and the output is still myOutput. I tested with two images, and both work well.
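
To close the loop on the original motivation, here is a sketch of a client calling the merged model once it is deployed on TensorFlow Model Server. The model name mnist and the REST port 8501 are assumptions; since the model’s input is itself a plain Base64 text string, it can be passed directly as a string instance:

import base64
import json

import requests

with open("./5.png", "rb") as f:
  encoded = str(base64.urlsafe_b64encode(f.read()), "utf-8")

# Assumed deployment: model name "mnist", REST API on port 8501.
resp = requests.post(
    "http://localhost:8501/v1/models/mnist:predict",
    data=json.dumps({"instances": [encoded]}))
print(resp.json())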

Summary

These three articles are actually a summary of what I learned while building my WeChat mini program. To better illustrate the problem I used a very simple model, but the approach applies equally to complex models.

The complete code for this article can be found at: github.com/mogoweb/aie…

I hope this article was helpful to you. Thanks for reading! Meanwhile, please follow my WeChat official account: Yunshui Mushi.