This is the fourth article in the "When wechat applets meet TensorFlow" series. After reading it, you will learn:
- How to view the signature of a TensorFlow SavedModel
- How to load a TensorFlow SavedModel
- How to modify an existing TensorFlow model to add an input layer
If you want to learn more about this project, check out the first three articles in this series:
- When wechat applets meet TensorFlow: Server implementation
- When wechat applet meets TensorFlow: Server implementation supplement
- When wechat applet meets TensorFlow: applet implementation
For background on the TensorFlow SavedModel format, please refer to these earlier articles:
- Tensorflow SavedModel saving and loading
- How to view information about the TensorFlow SavedModel format model
- How do I merge two TensorFlow models
The problem
So far we have built a simple WeChat applet, with the model served on the server side by the open source Simple TensorFlow Serving. However, this scheme still has one major problem: the image data transmitted between the applet and the server is the JSON representation of a (299, 299, 3) array. The biggest drawback of representing binary data as JSON is its size: a single 299 x 299 image ends up around 3 to 4 MB. In practice, binary data sent over HTTP is usually base64 encoded; the base64 output is larger than the raw binary, but still far smaller than the JSON representation.
So the question becomes: how do we get the server to accept base64-encoded image data?
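To get a feel for the difference, here is a rough back-of-the-envelope comparison. This is only a sketch; dog.png is a placeholder file name, not part of the project:

```python
import base64
import json

import numpy as np

# The payload the applet currently sends: a (299, 299, 3) float array as JSON.
image = np.random.rand(299, 299, 3).astype(np.float32)
json_payload = json.dumps({"data": {"image": image.tolist()}})
print("JSON float array: %.1f MB" % (len(json_payload) / 1024.0 / 1024.0))

# The alternative: base64-encode the original compressed image file instead.
with open("dog.png", "rb") as f:
    encoded = base64.urlsafe_b64encode(f.read())
print("base64 of the image file: %.1f KB" % (len(encoded) / 1024.0))
```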
View the signature of the model
To solve this problem, let's first look at the model's input and output and see what its signature is. The signature here is not a digital signature used to prove that the model has not been tampered with; it is more like the interface description of a module in a programming language: function names, input parameter types, output parameter types, and so on. With the saved_model_cli.py tool that ships with TensorFlow, we can inspect the model's signature:
```
python ./tensorflow/python/tools/saved_model_cli.py show --dir /data/ai/workspace/aiexamples/AIDog/serving/models/inception_v3/ --all

MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:

signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['image'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 299, 299, 3)
        name: Placeholder:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['prediction'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 120)
        name: final_result:0
  Method name is: tensorflow/serving/predict
```
From this we can see that the model's input is named image, with shape (-1, 299, 299, 3); the -1 is the batch dimension, and since we usually feed one image at a time, it is normally 1. The output is named prediction, with shape (-1, 120); the -1 again matches the batch dimension, and 120 is the number of dog breeds, one probability per breed.
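In other words, a single preprocessed image just has to be wrapped in a batch of one before it is fed to the model; with NumPy, for example:

```python
import numpy as np

image = np.zeros((299, 299, 3), dtype=np.float32)  # one preprocessed image
batch = np.expand_dims(image, axis=0)               # shape becomes (1, 299, 299, 3)
```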
The question now is: can we add a base64-decoding layer in front of the model's input?
You might be tempted to write some server-side code that decodes the base64 string before handing the data to Simple TensorFlow Serving, or even to modify the logic of Simple TensorFlow Serving itself. However, either approach adds extra work on the server and makes the deployment no longer a generic, off-the-shelf setup, so let's drop that idea.
Modify the model and add an input layer
In fact, the earlier article “How to Merge two TensorFlow models” already covered how to connect two models, so here we only recap briefly. First, we build a small graph that does base64 decoding, PNG decoding, and image resizing:
```python
import tensorflow as tf

# Input geometry of the Inception V3 retrain model.
input_height, input_width, input_depth = 299, 299, 3

base64_str = tf.placeholder(tf.string, name='input_string')
input_str = tf.decode_base64(base64_str)
decoded_image = tf.image.decode_png(input_str, channels=input_depth)
# Convert uint8 pixel values to float32 in the range [0, 1]
decoded_image_as_float = tf.image.convert_image_dtype(decoded_image,
                                                      tf.float32)
decoded_image_4d = tf.expand_dims(decoded_image_as_float, 0)
resize_shape = tf.stack([input_height, input_width])
resize_shape_as_int = tf.cast(resize_shape, dtype=tf.int32)
resized_image = tf.image.resize_bilinear(decoded_image_4d,
                                         resize_shape_as_int)
tf.identity(resized_image, name="DecodePNGOutput")
```
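Note that the merging code further down refers to this preprocessing graph as g1def. How it is obtained is not shown above; since the preprocessing graph contains no variables, one straightforward way (my assumption, not necessarily the exact code in the repository) is to build the ops inside their own graph and export the GraphDef directly:

```python
with tf.Graph().as_default() as g1:
    # ... build the base64/PNG decoding and resizing ops shown above ...
    g1def = g1.as_graph_def()
```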
Next, load the retrained model:
```python
from tensorflow.python.framework import graph_util
from tensorflow.python.saved_model import tag_constants
from tensorflow.python.tools import saved_model_utils

with tf.Graph().as_default() as g2:
    with tf.Session(graph=g2) as sess:
        input_graph_def = saved_model_utils.get_meta_graph_def(
            FLAGS.origin_model_dir, tag_constants.SERVING).graph_def
        tf.saved_model.loader.load(sess, [tag_constants.SERVING], FLAGS.origin_model_dir)
        g2def = graph_util.convert_variables_to_constants(
            sess,
            input_graph_def,
            ["final_result"],
            variable_names_whitelist=None,
            variable_names_blacklist=None)
```
Calling graph_util.convert_variables_to_constants converts the variables in the model into constants; this is the so-called freeze graph operation.
With tf.import_graph_def we can import a graph into an existing graph. Note that in the second import_graph_def call, the input is mapped to the output of the first imported graph; this is what chains the two computation graphs together before saving them as one model. The code is as follows:
```python
with tf.Graph().as_default() as g_combined:
    with tf.Session(graph=g_combined) as sess:
        x = tf.placeholder(tf.string, name="base64_string")
        y, = tf.import_graph_def(g1def, input_map={"input_string:0": x},
                                 return_elements=["DecodePNGOutput:0"])
        z, = tf.import_graph_def(g2def, input_map={"Placeholder:0": y},
                                 return_elements=["final_result:0"])
        tf.identity(z, "myOutput")
        tf.saved_model.simple_save(sess,
                                   FLAGS.model_dir,
                                   inputs={"image": x},
                                   outputs={"prediction": z})
```
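Before deploying, it is worth a quick sanity check that the merged model really accepts a base64 string end to end. A minimal sketch, assuming the model was exported to export_dir (whatever path FLAGS.model_dir pointed to above) and using a hypothetical test image dog.png:

```python
import base64

import tensorflow as tf
from tensorflow.python.saved_model import tag_constants

export_dir = "./models/inception_v3/2"  # assumed export path; adjust to your layout

with open("dog.png", "rb") as f:
    img_b64 = str(base64.urlsafe_b64encode(f.read()), "utf-8")

with tf.Graph().as_default():
    with tf.Session() as sess:
        tf.saved_model.loader.load(sess, [tag_constants.SERVING], export_dir)
        prediction = sess.run("myOutput:0",
                              feed_dict={"base64_string:0": img_b64})
        print(prediction.shape)  # expect (1, 120): one probability per dog breed
```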
What if you don't know the name of the retrained model's input node? (Note that the signature of the deployed model cannot be used here; input_map needs the actual graph node names.) The node names can be listed by traversing the graph:
```python
for n in g2def.node:
    print(n.name)
```
Model deployment and testing
Deployment works exactly as before: export the merged model and place it under the Simple TensorFlow Serving model directory as a new version alongside the original model, for example under ./models/inception_v3/ (the original model sits in the ./models/inception_v3/1/ subdirectory).
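For reference, a possible layout (the version number 2 for the new model is only an assumption; use whichever version directory you exported to):

```
models/
└── inception_v3/
    ├── 1/                  # original model: takes the (299, 299, 3) float array
    │   ├── saved_model.pb
    │   └── variables/
    └── 2/                  # merged model: takes a base64-encoded image string
        ├── saved_model.pb
        └── variables/
```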
We modify the original test_client.py code to add a model_version parameter to determine which version of the model to communicate with:
```python
import base64

import requests

with open(file_name, "rb") as image_file:
    encoded_string = str(base64.urlsafe_b64encode(image_file.read()), "utf-8")

if enable_ssl:
    endpoint = "https://127.0.0.1:8500"
else:
    endpoint = "http://127.0.0.1:8500"

json_data = {"model_name": model_name,
             "model_version": model_version,
             "data": {"image": encoded_string}
             }
result = requests.post(endpoint, json=json_data)
```
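The exact JSON layout of the response depends on the Simple TensorFlow Serving version, so the simplest approach is to print it once and then pick out the 120 probabilities:

```python
print(result.status_code)
print(result.json())  # the breed probabilities are in here; key names depend on the serving version
```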
Summary
After more than a week of research and repeated attempts, the problem of transmitting image data as base64 is finally solved. The difficulty was that, although the model is produced by a retrain script I wrote, the code is not that easy to follow, and my attempts to add an input layer inside the retrain script itself were unsuccessful. The breakthrough came from the freeze graph step used when converting a TensorFlow model to TensorFlow Lite: by freezing the variables in the graph into constants, the problem of loading variables in the merged model goes away. There are methods circulating online for restoring the variables instead, but none of them worked for me; TensorFlow evolves so quickly that some of the older approaches are probably outdated.
The full code for this article can be found at: github.com/mogoweb/aie…
Click “Read the original” to go directly to the project on GitHub.
So far, the key problems have been solved. Next, I will keep improving the presentation of the WeChat applet and work on raising the recognition accuracy. Follow my WeChat official account, Yunshui Mushi, for the latest updates.
References
- How to Show Signatures of Tensorflow Saved Model
- Serving Image-Based Deep Learning Models with TensorFlow-Serving’s RESTful API
- Tensorflow: How to replace a node in a calculation graph?