Whether or not you work in computer vision, it is worth knowing something about it. Even if you are not working on AI right now, you can still improve your application with open APIs. For example, the Baidu developer platform provides many AI-related APIs, and currently popular apps such as “white stroke” are in fact built on Baidu's API. So you might consider whether voice or text recognition features could power your app.
The computer vision work we do is mostly image processing, which involves OpenCV and TensorFlow on Android, as well as the camera and other image-handling logic. That inevitably touches JNI and the NDK. In this article we therefore want to discuss the following topics:
- Packaging an image compression library on Android
- Packaging and optimizing a camera library on Android
- JNI and NDK calls, and using CMake on Android
- Integrating and using OpenCV on Android
- Integrating and using TensorFlow on Android
We have in fact covered some of today's topics in earlier articles, so the underlying background can be reused directly. Links to the relevant articles are provided; if you are interested, follow them for more detail.
1. Image compression library packaging on Android
Why compress images at all? An image that is too large uploads slowly and hurts the user experience, while an image that is compressed too much lowers the recognition accuracy. For every 1 percentage point of improvement in recognition accuracy, the annotation team may have to label tens of thousands more images. After testing, we found that a short edge of about 1100 pixels was suitable, so we needed to develop our own compression strategy.
We have already discussed this in previous articles, which introduced Bitmap compression on Android. The following article explains how we packaged the image compression library and the underlying principles of image compression on Android:
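To make the "short edge of about 1100" rule concrete, here is a minimal sketch of how the target dimensions could be computed. The class and method names are hypothetical and are not part of our library; the choice to never upscale smaller images is also an assumption.

```java
/** Hypothetical helper: computes target dimensions so that the short edge is capped. */
final class ShortEdgeScaler {
    private ShortEdgeScaler() {}

    /**
     * Returns {width, height} scaled so that the shorter edge is at most
     * targetShortEdge while the aspect ratio is preserved.
     * Images that are already small enough are returned unchanged (assumption).
     */
    static int[] scaleToShortEdge(int width, int height, int targetShortEdge) {
        if (width <= 0 || height <= 0 || targetShortEdge <= 0) {
            throw new IllegalArgumentException("dimensions must be positive");
        }
        int shortEdge = Math.min(width, height);
        if (shortEdge <= targetShortEdge) {
            return new int[] {width, height};
        }
        double ratio = (double) targetShortEdge / shortEdge;
        return new int[] {
                (int) Math.round(width * ratio),
                (int) Math.round(height * ratio)
        };
    }
}
```

For example, a 2200 x 4400 photo would be scaled to 1100 x 2200 before upload, while an 800 x 600 thumbnail would be left alone.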
Open source an Android image compression framework
The article above was based on the first version of our library, which covered the basic functionality. In later versions we improved the library and added more features. Here we focus on the new APIs and on how we extended the first release while staying compatible with it.
In practice, we found that more often than not you need to process Bitmaps rather than files, and the first version of the library was no longer sufficient for that. So we decided to extend the library to support more application scenarios, including:
- Retrieving the compressed Bitmap directly in the current thread rather than receiving the result through a callback or RxJava: some of our code already runs asynchronously inside RxJava, so callbacks or another RxJava chain would complicate the logical structure of the program.
- Accepting a Bitmap or byte[] directly as the input to compress, instead of writing to a File first and then reading and compressing the File: in some scenarios, for example when you get data from the camera, what you actually receive is a byte[]. With continuous shooting, repeatedly writing to a file and then reading and compressing it hurts performance.
- Returning a Bitmap directly instead of only a File: sometimes we need to optimize parts of the application, such as the preview of image processing results, and returning a File there hurts both the performance and the logical structure of the application.
Originally, we wanted Glide to support compression of custom data structures and custom image retrieval logic, but time and compatibility issues led us to reject this idea in favor of a simpler, more straightforward approach:
- For input arguments, use overloaded functions to accept the different parameter types;
- During compression, the get() method returns the compressed intermediate result directly to the caller;
- Add an asBitmap() method that switches the output type from File to Bitmap.
So, in later versions, you can get Bitmap results directly like this:
Bitmap result = Compress.with(this, bytes)
        .setScaleMode(ScaleMode.SCALE_SMALLER) // Specify which edge is scaled
        .asBitmap() // Calling this method indicates that the expected return type is Bitmap instead of File
        .get(); // Return the compressed Bitmap directly in the current thread
The design of the asBitmap() method can be explained briefly here.
Image links: www.processon.com/view/link/5…
In the first version we used the design in the first figure above. Both compression strategies derive from AbstractStrategy. You can think of the strategy() method as turning a corner: it returns a concrete strategy, and from then on the methods you can call are limited to those of that concrete strategy.
The later design applies the same idea to asBitmap(): it returns a concrete builder that carries on with the logic for returning a Bitmap. Here we return the BitmapBuilder object from the second figure, while AbstractStrategy keeps the logic for returning a File. By taking this turn away from the original chain, we can easily hand the subsequent build logic over to BitmapBuilder. At the same time, we introduced the generic RequestBuilder for code reuse, so AbstractStrategy and BitmapBuilder only need to implement it and specify their own resource types. And because the AbstractStrategy object has already been built up by the preceding calls, we just pass it as a parameter into the concrete RequestBuilder to get a Bitmap from it. (Bitmap is the “common currency” that ties everything together.) This way we reuse a lot of code and extend the functionality while staying compatible with the original version.
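The "turning a corner" idea can be sketched in plain Java as follows. This is only an illustration of the design described above, not the library's actual code: FakeBitmap stands in for android.graphics.Bitmap, a String path stands in for File, and HalveStrategy and the method bodies are invented for the example.

```java
/** Common contract: every builder yields a result of type T synchronously via get(). */
interface RequestBuilder<T> {
    T get();
}

/** Stand-in for android.graphics.Bitmap so the sketch runs on a plain JVM. */
final class FakeBitmap {
    final int width;
    final int height;

    FakeBitmap(int width, int height) {
        this.width = width;
        this.height = height;
    }
}

/** Default path: compress, then return a "file" (here just a path String). */
abstract class AbstractStrategy implements RequestBuilder<String> {
    protected final FakeBitmap source;

    AbstractStrategy(FakeBitmap source) {
        this.source = source;
    }

    /** Concrete strategies implement the actual compression. */
    protected abstract FakeBitmap compress();

    @Override
    public String get() {
        FakeBitmap b = compress();
        return "/cache/compressed_" + b.width + "x" + b.height + ".jpg";
    }

    /** Turns the corner: from here on the chain yields a bitmap, not a file. */
    BitmapBuilder asBitmap() {
        return new BitmapBuilder(this);
    }
}

/** Reuses the already configured strategy and only changes the result type. */
final class BitmapBuilder implements RequestBuilder<FakeBitmap> {
    private final AbstractStrategy strategy;

    BitmapBuilder(AbstractStrategy strategy) {
        this.strategy = strategy;
    }

    @Override
    public FakeBitmap get() {
        return strategy.compress();
    }
}

/** Hypothetical concrete strategy: halves both dimensions. */
final class HalveStrategy extends AbstractStrategy {
    HalveStrategy(FakeBitmap source) {
        super(source);
    }

    @Override
    protected FakeBitmap compress() {
        return new FakeBitmap(source.width / 2, source.height / 2);
    }
}
```

With this shape, `new HalveStrategy(src).get()` keeps the original File-style result, while `new HalveStrategy(src).asBitmap().get()` returns the bitmap without touching the strategy's own code.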
2. Android camera library packaging and performance optimization
An intuitive user experience is important for a consumer-facing (ToC) application. In our business scenario, there is little point in using artificial intelligence if photo recognition is less efficient than doing the job by hand. After all, our goal is to improve other people's productivity, so the camera has to be responsive.
At the beginning of the project we used Google's open source CameraView. In practice, however, we gradually found a number of poor design decisions in this library that hurt our application's performance:
The figure shows a performance analysis of the program's execution using TraceView.
- Unnecessary data structures slow down camera startup: when reading the supported sizes from the camera's properties, the library first builds a hash structure that maps each aspect ratio to a list of sizes. Later, the actual computation reads sizes back out of that hash table. This is a poor design: walking the size list when a size is needed costs very little, and the hash table is rarely used, so building it adds unnecessary work to the camera startup phase.
- The camera is opened on the main thread, which hurts UI responsiveness: as long as the interface responds quickly, users accept even a black waiting screen more readily than a button press that does nothing. TraceView shows that the camera's open() call accounts for about 25% of the camera startup time, so it should not run on the main thread.
- No support for video recording and preview: this library does not support camera video recording and preview. Real-time processing is an important part of computer vision, so even though our project does not use it yet, we should plan to support it. (This part is basically OpenGL + Camera.)
As a result, we developed our own camera library. One of the reasons for building it was to support OpenGL, but time has been too limited for us to give these issues much attention:
Image links: www.processon.com/view/link/5…
As for the library, I have only implemented all of its logic and debugged it on my own phone without problems. It needs more testing and verification before being used in production. Android camera development mainly covers two APIs, Camera1 and Camera2. Once you understand the logic of one method, the others are implemented similarly; see the project source for details. Because my time and energy are limited, I cannot explain the camera APIs in detail here.
We can briefly summarize some of the contents of this design:
There are three main design patterns:
- Facade pattern: to stay compatible with both Camera1 and Camera2, we need to expose one unified API, so we use the facade pattern for the encapsulation. Two implementations are defined, Camera1Manager and Camera2Manager, corresponding to the two camera APIs. Both implement the CameraManager facade interface. The benefit is a uniform external interface, so that together with the factory and strategy patterns we can let users choose freely between Camera1 and Camera2.
- Strategy pattern + factory pattern: to accommodate different application scenarios, we want to give users as much freedom as possible. We therefore take a strategy-based approach and expose interfaces through which users compute parameters such as the camera sizes they ultimately want. For this we define a class called ConfigurationProvider, a singleton that holds the calculation strategies for camera parameters and also manages the in-memory cache. In this way, users can freely specify how many parameters are calculated, including the preview size, photo size, and video size.
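As a rough illustration of how the facade, strategy and factory pieces fit together, consider the sketch below. Only the names CameraManager, Camera1Manager, Camera2Manager and ConfigurationProvider come from the text; CameraManagerCreator, apiName(), the method signatures and the default selection rule (Camera2 on API level 21 and above, where that API first became available) are assumptions made for the example.

```java
/** Facade: the single API callers see, whichever camera API sits underneath. */
interface CameraManager {
    String apiName();
}

final class Camera1Manager implements CameraManager {
    @Override
    public String apiName() {
        return "Camera1";
    }
}

final class Camera2Manager implements CameraManager {
    @Override
    public String apiName() {
        return "Camera2";
    }
}

/** Strategy that doubles as a factory: decides which manager to create. */
interface CameraManagerCreator {
    CameraManager create(int sdkLevel);
}

/** Singleton holding the user-replaceable strategies. */
final class ConfigurationProvider {
    private static final ConfigurationProvider INSTANCE = new ConfigurationProvider();

    // Default policy (assumption): prefer Camera2 where it exists.
    private CameraManagerCreator creator =
            sdk -> sdk >= 21 ? new Camera2Manager() : new Camera1Manager();

    private ConfigurationProvider() {}

    static ConfigurationProvider get() {
        return INSTANCE;
    }

    /** Lets the application swap in its own selection policy. */
    void setCameraManagerCreator(CameraManagerCreator creator) {
        this.creator = creator;
    }

    CameraManager createCameraManager(int sdkLevel) {
        return creator.create(sdkLevel);
    }
}
```

An application that wants to force the older API everywhere could call `ConfigurationProvider.get().setCameraManagerCreator(sdk -> new Camera1Manager())` once at startup; the rest of the code keeps talking to CameraManager only.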
Three main optimization points:
- Memory cache optimization: the sizes and other information a camera supports are fixed properties of that camera. Caching them in memory means they do not have to be retrieved and processed again, which noticeably improves responsiveness the next time the camera starts.
- Lazy initialization (nothing is computed until it is used): to improve responsiveness we even optimized the numeric calculations. The gain may not be obvious, but it is available if you are willing to fine-tune. The program still uses floating-point calculations, and in early versions we even used bit-shift operations directly for the fields that serve as keys of the hash table. Of course, the effect depends on the overall amount of data: the more data, the more noticeable the improvement.
- Asynchronous thread optimization: in earlier versions we used private locks. Thread-safety issues are unavoidable because the camera's open() call and the parameter setup have to run on two different threads. The so-called private lock is similar to the synchronized containers returned by Collections.synchronizedXxx(), that is, every method of the container is locked. This solves the problem, but the resulting program structure is not pretty. So in later versions we used a HandlerThread directly for the asynchronous calls. A HandlerThread is a Thread with its own Looper, so a Handler can be attached to it to queue work.
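The memory cache point, combined with the earlier observation that walking the size list is cheap, can be sketched like this. CameraSize, CameraSizeCache and both method names are hypothetical; the real library queries the camera APIs, which is replaced here by a Supplier so the example runs on a plain JVM.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Minimal stand-in for a camera output size. */
final class CameraSize {
    final int width;
    final int height;

    CameraSize(int width, int height) {
        this.width = width;
        this.height = height;
    }
}

final class CameraSizeCache {
    // Supported sizes never change for a given camera, so cache them per camera id.
    private static final Map<Integer, List<CameraSize>> CACHE = new ConcurrentHashMap<>();

    private CameraSizeCache() {}

    /** Runs the (expensive) query only the first time a camera id is seen. */
    static List<CameraSize> supportedSizes(int cameraId, Supplier<List<CameraSize>> query) {
        return CACHE.computeIfAbsent(cameraId, id -> query.get());
    }

    /**
     * Plain linear scan: picks the size whose aspect ratio is closest to the
     * requested one, preferring the larger area on ties. Returns null for an
     * empty list.
     */
    static CameraSize closestTo(List<CameraSize> sizes, double aspectRatio) {
        CameraSize best = null;
        double bestDiff = Double.MAX_VALUE;
        for (CameraSize s : sizes) {
            double diff = Math.abs((double) s.width / s.height - aspectRatio);
            boolean largerOnTie = best != null && diff == bestDiff
                    && (long) s.width * s.height > (long) best.width * best.height;
            if (diff < bestDiff || largerOnTie) {
                best = s;
                bestDiff = diff;
            }
        }
        return best;
    }
}
```

No auxiliary hash structure is built at startup here: the list is scanned only when a size is actually requested, and the query itself is paid for once per camera.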
(For more information, see Android Camera Library Development Practices)
3. JNI and NDK calls, and CMake applications in Android
We used to think that calling C++ code from Java on Android was complicated, because it requires either dynamic or static registration. With static registration you have to compile step by step; with dynamic registration you have to register methods one by one in the native layer. CMake later made all of this much easier. CMake is of course familiar to developers who have worked on native code, and application-layer developers can pick it up too. It lets you easily move some of your implementation logic into the native layer, which is safer than the Java layer, and you can do more interesting things with C++ and the NDK.
To use CMake on Android, you need to install the relevant SDK tools in Android Studio (AS):
Next, you need to do a simple configuration in Gradle:
Although this step is easy to carry out, getting the configuration right may require some expertise.
The Gradle configuration specifies a path that points to CMakeLists.txt, the CMake configuration file. This is where the third-party native libraries used by the application are configured. For example, the following is the CMake configuration from an earlier project. It configures the OpenCV libraries, locates our own code, and references some related NDK libraries:
# Set the minimum version of CMake required
cmake_minimum_required(VERSION 3.4.1)
# Specify the directories for the header files
include_directories(
        src/main/cpp/include
        ../../common)
add_library(opencv_calib3d STATIC IMPORTED)
add_library(opencv_core STATIC IMPORTED)
add_library(opencv_features2d STATIC IMPORTED)
#if(EXISTS ${PROJECT_SOURCE_DIR}/opencv/libs/${ANDROID_ABI}/libtbb.a)
#    add_library(tbb STATIC IMPORTED)
#endif()
set_target_properties(opencv_calib3d PROPERTIES IMPORTED_LOCATION ${PROJECT_SOURCE_DIR}/opencv/libs/${ANDROID_ABI}/libopencv_calib3d.a)
set_target_properties(opencv_core PROPERTIES IMPORTED_LOCATION ${PROJECT_SOURCE_DIR}/opencv/libs/${ANDROID_ABI}/libopencv_core.a)
set_target_properties(opencv_features2d PROPERTIES IMPORTED_LOCATION ${PROJECT_SOURCE_DIR}/opencv/libs/${ANDROID_ABI}/libopencv_features2d.a)
add_library(everlens SHARED
        src/main/cpp/img_proc.cpp
        src/main/cpp/img_cropper.cpp
        src/main/cpp/android_utils.cpp
        ../../common/EdgeFinder.cpp
        ../../common/ImageFilter.cpp)
find_library(log-lib log)
if(EXISTS ${PROJECT_SOURCE_DIR}/opencv/libs/${ANDROID_ABI}/libtbb.a)
    # ...
endif()
Some of the CMake commands have been summarized in an earlier article, which also gives the address of the official documentation. To learn more, see the article below:
Summary of commonly used CMake instructions
The main benefit of using CMake is that Android Studio supports it well:
- Based on the mapping between native layer code and Java layer code, Ctrl + left click jumps directly between a native method and its Java counterpart.
- There is no tedious dynamic or static registration to do; you only need to configure CMake and Gradle, so you can concentrate on the logic of your code.
Of course, even with CMake you sometimes still need to know about dynamic registration in JNI, for example when the native layer has to read information from objects passed in from the Java layer. For instance:
#include <jni.h>
#include <string>
#include <android_utils.h>

// Define a structure and its instance gPointInfo
static struct {
    jclass jClassPoint;
    jmethodID jMethodInit;
    jfieldID jFieldIDX;
    jfieldID jFieldIDY;
} gPointInfo;

// Initialize the class information. Note how the mappings are expressed:
// the signatures look like the descriptors in decompiled bytecode
static void initClassInfo(JNIEnv *env) {
    gPointInfo.jClassPoint = reinterpret_cast<jclass>(env->NewGlobalRef(env->FindClass("android/graphics/Point")));
    gPointInfo.jMethodInit = env->GetMethodID(gPointInfo.jClassPoint, "<init>", "(II)V");
    gPointInfo.jFieldIDX = env->GetFieldID(gPointInfo.jClassPoint, "x", "I");
    gPointInfo.jFieldIDY = env->GetFieldID(gPointInfo.jClassPoint, "y", "I");
}

// Dynamic registration: the initialization happens here
extern "C"
JNIEXPORT jint JNICALL
JNI_OnLoad(JavaVM *vm, void *reserved) {
    JNIEnv *env = NULL;
    if (vm->GetEnv((void **) &env, JNI_VERSION_1_4) != JNI_OK) {
        return JNI_ERR;
    }
    initClassInfo(env);
    return JNI_VERSION_1_4;
}
(More on this: Summary of using JNI on Android)
4. OpenCV integration and application in Android: picture clipping and perspective transformation
4.1 Integration of OpenCV
Depending on the application scenario, you can of course use a Java library wrapped by someone else instead of referencing OpenCV's C++ libraries. If you do not need particularly complex functionality and only want simple image processing, a ready-made Java wrapper will suffice. But if, like us, you need to wrap and compile C++ algorithms written by your algorithm colleagues, or even use OpenCV's extension modules, a Java wrapper may not meet your needs.
Here is the Github address for OpenCV and its extensions:
- OpenCV
- OpenCV-contrib
Even with these repositories you still cannot use OpenCV in your application directly. They contain OpenCV's source code and header files, which must be compiled before they can be used in your own project.
Build OpenCV 3.3 Android SDK on Mac OSX
Alternatively, there are precompiled builds of OpenCV and its extension modules that we can configure in CMake and reference directly from our project:
So the structure of the final project is as follows:
Circled on the left are some configurations of OpenCV and CMake, and on the right are wrapped Java methods.
4.2 Application of OpenCV
OpenCV can do a lot of image processing that Android's native Bitmap cannot, for example perspective transformation after irregular cropping, or grayscale conversion. No matter how you process images in the native layer, on Android both the input to the native layer and the output from it are Bitmaps, while cv::Mat is the universal currency for image processing inside the native layer. A complete image processing flow is therefore roughly as follows:
- Step 1: convert the Java layer Bitmap into a native layer Mat;
- Step 2: process the image using the Mat;
- Step 3: convert the native layer Mat back into a Java layer Bitmap and return it.
To convert a Java layer Bitmap to a native layer Mat you can use the following methods:
#include <jni.h>
#include <android/bitmap.h>
#include "android_utils.h"

void bitmap_to_mat(JNIEnv *env, jobject &srcBitmap, Mat &srcMat) {
    void *srcPixels = 0;
    AndroidBitmapInfo srcBitmapInfo;
    try {
        // Call the AndroidBitmap functions to get the bitmap information and lock its pixels
        AndroidBitmap_getInfo(env, srcBitmap, &srcBitmapInfo);
        AndroidBitmap_lockPixels(env, srcBitmap, &srcPixels);
        uint32_t srcHeight = srcBitmapInfo.height;
        uint32_t srcWidth = srcBitmapInfo.width;
        srcMat.create(srcHeight, srcWidth, CV_8UC4);
        // Build a Mat with the matching channel layout according to the bitmap format
        if (srcBitmapInfo.format == ANDROID_BITMAP_FORMAT_RGBA_8888) {
            Mat tmp(srcHeight, srcWidth, CV_8UC4, srcPixels);
            cvtColor(tmp, srcMat, COLOR_RGBA2RGB);
        } else {
            Mat tmp = Mat(srcHeight, srcWidth, CV_8UC2, srcPixels);
            cvtColor(tmp, srcMat, COLOR_BGR5652RGBA);
        }
        AndroidBitmap_unlockPixels(env, srcBitmap);
    } catch (cv::Exception &e) {
        AndroidBitmap_unlockPixels(env, srcBitmap);
        // Build a Java layer exception and throw it
        jclass je = env->FindClass("java/lang/Exception");
        env->ThrowNew(je, e.what());
    } catch (...) {
        AndroidBitmap_unlockPixels(env, srcBitmap);
        jclass je = env->FindClass("java/lang/Exception");
        env->ThrowNew(je, "unknown");
        return;
    }
}
This code first obtains the image's information from the Bitmap by calling the image-related functions in the NDK. It then uses the size and color format to construct a Mat in OpenCV. Mat is similar to a matrix in MATLAB: it holds the pixel data of the image and provides methods such as eye() and zeros() for constructing special matrices.
Once the Bitmap has been converted to a Mat, it is time to use it. Here is the algorithm we use to crop an image and apply a perspective transformation:
// Convert Java layer vertices into native layer Point objects
static std::vector<Point> pointsToNative(JNIEnv *env, jobjectArray points_) {
    int arrayLength = env->GetArrayLength(points_);
    std::vector<Point> result;
    for (int i = 0; i < arrayLength; i++) {
        jobject point_ = env->GetObjectArrayElement(points_, i);
        int pX = env->GetIntField(point_, gPointInfo.jFieldIDX);
        int pY = env->GetIntField(point_, gPointInfo.jFieldIDY);
        result.push_back(Point(pX, pY));
    }
    return result;
}

// Cropping and perspective transformation
extern "C"
JNIEXPORT void JNICALL
Java_xxxx_MyCropper_nativeCrop(JNIEnv *env, jclass type, jobject srcBitmap, jobjectArray points_, jobject outBitmap) {
    std::vector<Point> points = pointsToNative(env, points_);
    if (points.size() != 4) {
        return; // exactly four vertices are required
    }
    // Get the four vertices
    Point leftTop = points[0], rightTop = points[1], rightBottom = points[2], leftBottom = points[3];
    // Get the Mats corresponding to the source image and the result image
    Mat srcBitmapMat, dstBitmapMat;
    bitmap_to_mat(env, srcBitmap, srcBitmapMat);
    AndroidBitmapInfo outBitmapInfo;
    AndroidBitmap_getInfo(env, outBitmap, &outBitmapInfo);
    int newHeight = outBitmapInfo.height, newWidth = outBitmapInfo.width;
    dstBitmapMat = Mat::zeros(newHeight, newWidth, srcBitmapMat.type());
    // Put the vertices of the image into collections for the perspective call
    std::vector<Point2f> srcTriangle, dstTriangle;
    srcTriangle.push_back(Point2f(leftTop.x, leftTop.y));
    srcTriangle.push_back(Point2f(rightTop.x, rightTop.y));
    srcTriangle.push_back(Point2f(leftBottom.x, leftBottom.y));
    srcTriangle.push_back(Point2f(rightBottom.x, rightBottom.y));
    // Destination corners, in the same order as the source corners
    dstTriangle.push_back(Point2f(0, 0));
    dstTriangle.push_back(Point2f(newWidth, 0));
    dstTriangle.push_back(Point2f(0, newHeight));
    dstTriangle.push_back(Point2f(newWidth, newHeight));
    // Get the mapping transformation matrix
    Mat transform = getPerspectiveTransform(srcTriangle, dstTriangle);
    warpPerspective(srcBitmapMat, dstBitmapMat, transform, dstBitmapMat.size());
    // Convert the Mat to a Bitmap and output it to the Java layer
    mat_to_bitmap(env, dstBitmapMat, outBitmap);
}
The final output of the algorithm is:
5. TensorFlow integration and application on Android: Image edge detection
When I first tried to detect the edges of a picture, I found the results of the OpenCV algorithm were not satisfactory, so I later chose TensorFlow for edge detection. This involves integrating TensorFlow Lite on the Android side. (Some time ago I also read an introduction to iQiyi's SmartVR.) Using TF on Android is not difficult thanks to the official resources. Several samples are available in TensorFlow's open source repository:
- TensorFlow GitHub repository
- TensorFlow Lite official site
Using TensorFlow for edge detection is admittedly overkill, but for us Android developers it is a good opportunity to stretch ourselves and learn how to integrate TensorFlow on Android. After all, this is a hot topic right now, and maybe one day you will train a model on a whim.
Introducing TensorFlow on Android is not complicated; you just need to add the relevant repository and dependency:
allprojects {
    repositories {
        maven { url 'https://google.bintray.com/tensorflow' }
    }
}

dependencies {
    // ...
    // tensorflow
    api 'org.tensorflow:tensorflow-lite:0.0.0-nightly'
}
The difficulty lies in handling the inputs and outputs of the trained model. You can think of a model as a function that has already been tuned: you supply input in the required format and it gives you output in a fixed format. What exactly the inputs and outputs are is up to the person who trained the model.
In our earlier development, the colleagues who trained the model initially called it from Python. Python code is concise, but it is a nightmare for client developers: because of libraries like NumPy and Pillow, a task that takes one line in Python can take a long time to “translate”. Later we switched to C++ with OpenCV. That is easier for iOS developers because they can mix Objective-C with C++; Android developers have a few more things to do.
Here are some of the processing done in the Java layer before loading the model and calling Tensorflow:
import android.content.Context;
import android.content.res.AssetFileDescriptor;
import android.content.res.AssetManager;
import android.graphics.Bitmap;

import org.tensorflow.lite.Interpreter;

import java.io.FileInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class TFManager {
    // volatile is required for double-checked locking to be safe
    private static volatile TFManager instance;
    private static final float IMAGE_MEAN = 128.0f;
    private static final float IMAGE_STD = 128.0f;
    private Interpreter interpreter;
    private int[] intValues;
    private ByteBuffer imgData;
    private int inputSize = 256;

    public static TFManager create(Context context) { // DCL
        if (instance == null) {
            synchronized (TFManager.class) {
                if (instance == null) {
                    instance = new TFManager(context);
                }
            }
        }
        return instance;
    }

    // Load the model from assets to initialize TF
    private static MappedByteBuffer loadModelFile(AssetManager assets, String modelFilename) throws IOException {
        AssetFileDescriptor fileDescriptor = assets.openFd(modelFilename);
        FileInputStream inputStream = new FileInputStream(fileDescriptor.getFileDescriptor());
        FileChannel fileChannel = inputStream.getChannel();
        long startOffset = fileDescriptor.getStartOffset();
        long declaredLength = fileDescriptor.getDeclaredLength();
        return fileChannel.map(FileChannel.MapMode.READ_ONLY, startOffset, declaredLength);
    }

    // Initialize TF
    private TFManager(Context context) {
        try {
            interpreter = new Interpreter(loadModelFile(context.getAssets(), "Model.tflite"));
            interpreter.resizeInput(0, new int[] {/* model input dimensions */});
            intValues = new int[inputSize * inputSize];
            // 256 * 256 pixels, 3 channels, 4 bytes per float
            imgData = ByteBuffer.allocateDirect(inputSize * inputSize * 4 * 3);
            imgData.order(ByteOrder.nativeOrder());
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Edge recognition
    public EdgePoint[] recognize(Bitmap bitmap) {
        long timeStart = System.currentTimeMillis();
        Bitmap scaledBitmap = Bitmap.createScaledBitmap(bitmap, inputSize, inputSize, true);
        scaledBitmap.getPixels(intValues, 0, scaledBitmap.getWidth(), 0, 0, scaledBitmap.getWidth(), scaledBitmap.getHeight());
        imgData.rewind(); // start writing from the beginning on every call
        for (int i = 0; i < inputSize; ++i) {
            for (int j = 0; j < inputSize; ++j) {
                int pixelValue = intValues[i * inputSize + j];
                // Normalize RGB to between -1 and 1
                imgData.putFloat((((pixelValue >> 16) & 0xFF) - IMAGE_MEAN) / IMAGE_STD); // R
                imgData.putFloat((((pixelValue >> 8) & 0xFF) - IMAGE_MEAN) / IMAGE_STD); // G
                imgData.putFloat(((pixelValue & 0xFF) - IMAGE_MEAN) / IMAGE_STD); // B
            }
        }
        LogUtils.d("----------TFManager prepare imgData cost : " + (System.currentTimeMillis() - timeStart));
        timeStart = System.currentTimeMillis();
        Map<Integer, Object> outputMap = new HashMap<>();
        outputMap.put(0, new float[1][256][256][5]);
        Object[] inputArray = {imgData};
        // Call TF for recognition
        interpreter.runForMultipleInputsOutputs(inputArray, outputMap);
        // Process the recognition result, mainly by converting it to image pixels
        float[][][][] arr = (float[][][][]) outputMap.get(0);
        int[][] colors = new int[5][256 * 256];
        for (int i = 0; i < 5; i++) {
            for (int j = 0; j < 256; j++) {
                for (int k = 0; k < 256; k++) {
                    colors[i][j * 256 + k] = (int) (arr[0][j][k][i] * 256);
                }
            }
        }
        LogUtils.d("----------TFManager handle TF result cost : " + (System.currentTimeMillis() - timeStart));
        timeStart = System.currentTimeMillis();
        // Pass the resulting pixels to the native layer in a fixed format for edge recognition
        EdgePoint[] points = ImgProc.findEdges(bitmap, colors);
        LogUtils.d("----------TFManager" + Arrays.toString(points));
        LogUtils.d("----------TFManager find edges cost : " + (System.currentTimeMillis() - timeStart));
        return points;
    }
}
The main execution flow of the program is as follows:
Load the model: open an input stream on the model file in assets, then obtain a channel from the stream using NIO classes. A byte buffer is then obtained from the channel; when a file is read or written, the channel interacts with the buffer directly. Besides acting as a buffer, it is memory-mapped, similar to mmap, which makes reading and writing the file more efficient.
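The memory-mapping step does not depend on Android and can be tried on any JVM. The following is a minimal sketch; MappedFileReader and mapReadOnly are names made up for this example, and a plain file path replaces the asset file descriptor used in the real code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class MappedFileReader {
    private MappedFileReader() {}

    /**
     * Maps `length` bytes of `file`, starting at `offset`, into memory as a
     * read-only buffer. The mapping stays valid after the channel is closed.
     */
    static MappedByteBuffer mapReadOnly(Path file, long offset, long length) {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            return channel.map(FileChannel.MapMode.READ_ONLY, offset, length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The offset and length parameters play the same role as getStartOffset() and getDeclaredLength() in loadModelFile: an asset is usually a slice of a larger file, so only that slice is mapped.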
Initialize and configure TensorFlow. Some of the parameters are used to set things such as the number of threads. The parameters below mainly adapt TF to the requirements of the model. For example, our model for locating the vertices of an image takes a 256 * 256 image with only the three RGB channels, so the following lines of code are used:
// inputSize = 256;
interpreter.resizeInput(0, new int[] {/* model input dimensions */});
intValues = new int[inputSize * inputSize];
// 256 * 256 pixels, 3 channels, 4 bytes per float
imgData = ByteBuffer.allocateDirect(inputSize * inputSize * 4 * 3);
Preprocess the image to be recognized. The image first has to be scaled to 256 x 256. A Bitmap method is then used to read the pixels of the image. The subsequent lines extract the R, G and B components of each pixel and normalize each of them. The results are all written to imgData, which serves as the model's input.
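The normalization step can be isolated into a small, Android-free helper to make the arithmetic easier to check. PixelNormalizer and toModelInput are hypothetical names; the mean and standard deviation of 128 and the R, G, B write order follow the TFManager code, and the pixels are assumed to be packed ARGB integers as returned by Bitmap.getPixels().

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class PixelNormalizer {
    static final float IMAGE_MEAN = 128.0f;
    static final float IMAGE_STD = 128.0f;

    private PixelNormalizer() {}

    /**
     * Converts packed ARGB pixels into a direct buffer of floats in R, G, B
     * order, each channel mapped from [0, 255] to roughly [-1, 1).
     * The buffer holds 3 floats (12 bytes) per pixel.
     */
    static ByteBuffer toModelInput(int[] argbPixels) {
        ByteBuffer buffer = ByteBuffer
                .allocateDirect(argbPixels.length * 3 * 4)
                .order(ByteOrder.nativeOrder());
        for (int pixel : argbPixels) {
            buffer.putFloat((((pixel >> 16) & 0xFF) - IMAGE_MEAN) / IMAGE_STD); // R
            buffer.putFloat((((pixel >> 8) & 0xFF) - IMAGE_MEAN) / IMAGE_STD);  // G
            buffer.putFloat(((pixel & 0xFF) - IMAGE_MEAN) / IMAGE_STD);         // B
        }
        buffer.rewind(); // leave the buffer positioned at the start for the reader
        return buffer;
    }
}
```

With these constants a channel value of 0 maps to -1.0, 128 maps to 0.0 and 255 maps to 127/128, so the alpha channel is simply dropped and the rest lands in the range the model was trained on.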
Build a Java object in the model's output format to serve as the output parameter, then call the model to run the recognition.
Process the output of the model. As defined above, the output object handed to the model is a new float[1][256][256][5]. Because the output values are not raw pixel values, they need to be multiplied by 256 to get the pixels of the real images. Finally, these pixels can be turned into Bitmaps using Bitmap methods.
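The output conversion is just a re-indexing plus a scale factor, which the following sketch spells out. MaskConverter and toChannelMasks are hypothetical names; the [1][height][width][channels] layout and the multiplication by 256 follow the description above.

```java
final class MaskConverter {
    private MaskConverter() {}

    /**
     * Converts a model output of shape [1][height][width][channels] into one
     * flat, row-major int array per channel, scaling each value by 256.
     */
    static int[][] toChannelMasks(float[][][][] output) {
        int height = output[0].length;
        int width = output[0][0].length;
        int channels = output[0][0][0].length;
        int[][] masks = new int[channels][height * width];
        for (int c = 0; c < channels; c++) {
            for (int y = 0; y < height; y++) {
                for (int x = 0; x < width; x++) {
                    masks[c][y * width + x] = (int) (output[0][y][x][c] * 256);
                }
            }
        }
        return masks;
    }
}
```

For the model described here, the input would be a float[1][256][256][5] and the result five arrays of 65536 values each, one per edge mask.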
Calling the model is not the end of the process, because it only yields five recognized images. These five images contain just the edges of the picture, so they must be processed further to obtain its vertices. This part needs some algorithms; it could be done in the Java layer, but doing it in the native layer with the help of OpenCV makes the whole process easier. Therefore another JNI call is involved:
extern "C"
JNIEXPORT void JNICALL
Java_com_xxx_ImgProc_nativeFindEdges(JNIEnv *env, jclass type, jintArray mask1_, jintArray mask2_, jintArray mask3_, jintArray mask4_, jintArray mask5_, jobject origin, jobjectArray points) {
    // Array elements passed in from the Java layer
    jint *mask1 = env->GetIntArrayElements(mask1_, NULL);
    jint *mask2 = env->GetIntArrayElements(mask2_, NULL);
    jint *mask3 = env->GetIntArrayElements(mask3_, NULL);
    jint *mask4 = env->GetIntArrayElements(mask4_, NULL);
    jint *mask5 = env->GetIntArrayElements(mask5_, NULL);
    // Convert the original image to a native layer Mat
    Mat originMat;
    bitmap_to_mat(env, origin, originMat);
    // Collect the pixel arrays
    std::vector<jint*> jints = {mask1, mask2, mask3, mask4, mask5};
    // Build a Mat from each pixel array and collect the Mats
    std::vector<cv::Mat> masks;
    for (int k = 0; k < 5; ++k) {
        Mat mask(256, 256, CV_8UC1);
        for (int i = 0; i < 256; ++i) {
            for (int j = 0; j < 256; ++j) {
                mask.at<uint8_t>(i, j) = (char) (*(jints[k] + i * 256 + j));
            }
        }
        masks.push_back(mask);
    }
    try {
        // Call the algorithm for edge detection
        EdgeFinder finder = ImageEngine::EdgeFinder(
                originMat, masks[0], masks[1], masks[2], masks[3], masks[4]);
        vector<cv::Point2d> retPoints = finder.FindBorderCrossPoint();
        // Convert the resulting points into Java layer objects
        jclass class_point = env->FindClass("com/xxx/EdgePoint");
        jmethodID method_point = env->GetMethodID(class_point, "<init>", "(FF)V");
        // Return the vertices through the Java array
        for (int i = 0; i < 4; ++i) {
            jobject point = env->NewObject(class_point, method_point, retPoints[i].x, retPoints[i].y);
            env->SetObjectArrayElement(points, i, point);
        }
    } catch (cv::Exception &e) {
        jclass je = env->FindClass("java/lang/Exception");
        env->ThrowNew(je, e.what());
    } catch (...) {
        jclass je = env->FindClass("java/lang/Exception");
        env->ThrowNew(je, "unknown");
    }
    // Release resources
    env->ReleaseIntArrayElements(mask1_, mask1, 0);
    env->ReleaseIntArrayElements(mask2_, mask2, 0);
    env->ReleaseIntArrayElements(mask3_, mask3, 0);
    env->ReleaseIntArrayElements(mask4_, mask4, 0);
    env->ReleaseIntArrayElements(mask5_, mask5, 0);
}
The main logic here is to pass the previously obtained pixels, together with the Bitmap of the original image, to the native layer, build the corresponding OpenCV Mats from these pixels, and hand them to the algorithm as parameters to obtain the vertex information. Finally, the vertices are mapped to the Java layer class and returned in an array.
As the flow above shows, the whole invocation actually involves several JNI calls:
- Calling Bitmap methods is itself a JNI call, since Bitmap relies on the Skia library at the bottom of Android for image processing;
- TF Lite is itself a wrapper around native layer methods, so calling it also involves JNI calls;
- Finally, processing the model's recognition results involves JNI calls as well.
Each JNI call requires extra conversion work: Java layer objects have to be converted into native layer objects at the start of the function, and native layer objects are converted back into Java layer objects after the algorithm runs. This is an area we can optimize in the future.
That covers the application of computer vision on Android, mainly JNI plus some uses of OpenCV and TensorFlow. Before that, I introduced image compression and the packaging of the camera library. If you need image processing capabilities in your application, I think these will definitely be useful.
Thanks for reading