Preface

When I was in college, I studied OpenCV for a while and built a little tool along the lines of RunFace:

Given two photos, the first containing a sheet of white paper, it composited the second photo onto the white paper in the first.

I recently noticed that OpenCV now has a corresponding front-end library, so I wanted to build a pure front-end version. This time the goal is the reverse effect: extracting the rectangular image from a composite photo.

See rectangle-extract-opencvjs

Online address

The main application scenario: removing the background from a photo of an ID card to produce a scan-like copy.

Of course, this is easy to do with Photoshop

Key implementation points

1. Generating opencv.js

You can compile it yourself using the Emscripten (LLVM) toolchain.

See Build OpenCV.js in the official documentation.

I used the ready-made JS file directly; it is about 8 MB.

I wonder whether it is possible to compile only some of the modules, so that the resulting bundle is smaller.
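For reference, a self-build roughly follows the official helper script; the module whitelist lives in platforms/js/opencv_js.config.py, which you can trim to shrink the output (a sketch of the steps, assuming an activated emsdk environment):

```shell
# Clone OpenCV and build opencv.js with Emscripten (emsdk must be installed and activated)
git clone https://github.com/opencv/opencv.git
cd opencv
# Edit platforms/js/opencv_js.config.py to whitelist only the modules you need
# (e.g. core, imgproc) before building, to shrink the resulting file
python ./platforms/js/build_js.py build_js --build_wasm
# The result ends up in build_js/bin/opencv.js
```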

2. The algorithm pipeline

Take the original image below as an example

Preprocessing

This is divided into resizing and filtering.

Resizing shrinks the image until its longer side is under 200px, both to improve processing efficiency and to make the filtering work better.

Filtering serves to preserve edges (edge sharpening) while denoising (removing texture), which helps the subsequent extraction of the target image.

Either bilateral filtering or mean-shift filtering can be used.

Foreground extraction

In this application scenario the target rectangle occupies most of the image, so we can simply flood-fill from the midpoint.

That is, the floodFill algorithm yields a grayscale mask.

Median filtering is then applied to remove noise.

This leaves us with a binary image.

For a more general scenario, user interaction could be introduced: flood-fill from the point the user touches.
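The flood-fill step is handled by cv.floodFill in the code below, but the idea can be sketched in plain JavaScript: starting from a seed pixel, a BFS marks every connected pixel whose value stays within a fixed tolerance of the seed value (the FLOODFILL_FIXED_RANGE behaviour). A simplified grayscale, 4-connectivity sketch (the grid and tolerances are illustrative assumptions):

```javascript
// Simplified fixed-range flood fill on a grayscale 2D array (illustrative sketch).
// Marks pixels connected to the seed whose value differs from the seed value
// by at most loDiff/upDiff, mimicking cv.FLOODFILL_FIXED_RANGE.
function floodFillMask (gray, seedX, seedY, loDiff, upDiff) {
  const rows = gray.length
  const cols = gray[0].length
  const mask = Array.from({ length: rows }, () => new Array(cols).fill(0))
  const seedVal = gray[seedY][seedX]
  const queue = [[seedX, seedY]]
  mask[seedY][seedX] = 255
  while (queue.length) {
    const [x, y] = queue.shift()
    for (const [dx, dy] of [[1, 0], [-1, 0], [0, 1], [0, -1]]) {
      const nx = x + dx
      const ny = y + dy
      if (nx < 0 || ny < 0 || nx >= cols || ny >= rows || mask[ny][nx]) continue
      const diff = gray[ny][nx] - seedVal
      if (diff >= -loDiff && diff <= upDiff) {
        mask[ny][nx] = 255
        queue.push([nx, ny])
      }
    }
  }
  return mask
}

// A bright 3x3 card on a dark background: seeding at the centre fills the card only.
const gray = [
  [10, 10, 10, 10, 10],
  [10, 200, 205, 210, 10],
  [10, 198, 202, 207, 10],
  [10, 201, 203, 206, 10],
  [10, 10, 10, 10, 10]
]
const mask = floodFillMask(gray, 2, 2, 35, 35)
```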

Line detection

First, the Canny operator is used for edge detection.

Then the HoughLines transform extracts straight lines.

HoughLinesP (line-segment detection) was not chosen here, mainly because the segments it returns are short and vary widely in direction, which is inconvenient for the subsequent calculation; it also cannot handle images where part of the target is missing.

Of course, the scene here does not need to consider the partial absence of the target image.

Vertex coordinate calculation

Calculate the pairwise intersections of all the lines.

Filter out coordinates that are out of range
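For reference, the intersection of two Hough lines can also be computed directly from their (ρ, θ) parameters by solving x·cosθ + y·sinθ = ρ for both lines; the repo code instead converts each line to two endpoints and uses slope-intercept form. A minimal sketch of the direct approach:

```javascript
// Intersection of two lines given in Hough normal form: x*cos(t) + y*sin(t) = rho.
// Returns null when the lines are (nearly) parallel.
function intersectHough (rho1, theta1, rho2, theta2) {
  const det = Math.sin(theta2 - theta1) // determinant of the 2x2 linear system
  if (Math.abs(det) < 1e-6) return null
  const x = (rho1 * Math.sin(theta2) - rho2 * Math.sin(theta1)) / det
  const y = (rho2 * Math.cos(theta1) - rho1 * Math.cos(theta2)) / det
  return { x, y }
}

// Vertical line x = 5 (theta = 0) and horizontal line y = 3 (theta = PI/2)
const p = intersectHough(5, 0, 3, Math.PI / 2)
```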

I was going to use k-means for clustering, but found OpenCV's implementation hard to use here (maybe I was holding it wrong).

As a workaround: first compute the midpoint, then sort all the intersections by their polar coordinates around it, and assign points within a certain Euclidean distance of each other to the same cluster.

This yields multiple clusters; take the four with the most members and compute each one's mean coordinates.

Finally, take the top-left corner (both coordinates less than the midpoint's) as the first point and sort the four points clockwise.

Matrix transformation

Map the four coordinates in the original image to the four corners of the target image.

One open question is what the width and height of the new target image should be; here we simply reuse the original image's width and height.

If time permits, I will look into automatically estimating the target image's width and height.
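One common document-scanner heuristic for that (an assumption on my part, not what the repo does) is to measure the edge lengths of the detected quadrilateral: take the output width as the larger of the two horizontal edges and the height as the larger of the two vertical edges:

```javascript
// Estimate the output size from the four clockwise vertices
// (top-left, top-right, bottom-right, bottom-left) by measuring edge lengths.
// A common document-scanner heuristic; not part of the repo's current code.
function estimateTargetSize ([tl, tr, br, bl]) {
  const dist = (p, q) => Math.hypot(p.x - q.x, p.y - q.y)
  const width = Math.round(Math.max(dist(tl, tr), dist(bl, br)))
  const height = Math.round(Math.max(dist(tl, bl), dist(tr, br)))
  return { width, height }
}

// A slightly tilted card roughly 100 wide and 58 tall
const size = estimateTargetSize([
  { x: 0, y: 0 }, { x: 100, y: 5 }, { x: 102, y: 60 }, { x: 2, y: 58 }
])
```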

Finally, scale down by half horizontally and vertically to get the final result.

3. Interface layout


Use bootstrap for layout

4. Code implementation

The interfaces are used in much the same way as in other language bindings.

If you’re not sure, check the API documentation

Or you can play some of these graphics demos

In general the documentation is sparse and many interfaces go unexplained; you can only reason by analogy with the APIs in other languages.

The code is shown below. For more, check out the GitHub repository.

const g_nLowDifference = 35; // maximum negative (darker) difference for flood fill
const g_nUpDifference = 35; // maximum positive (brighter) difference for flood fill
const UNCAL_THETA = 0.5;
class Line {
  constructor(rho, theta) {
    this.rho = rho
    this.theta = theta
    let a = Math.cos(theta);
    let b = Math.sin(theta);
    let x0 = a * rho;
    let y0 = b * rho;
    this.startPoint = { x: x0 - 400 * b, y: y0 + 400 * a };
    this.endPoint = { x: x0 + 400 * b, y: y0 - 400 * a };
  }
}
/** * @param {Object} srcMat */
function itemExtract (srcMat, name) {
  let scale = getScale(Math.max(srcMat.rows, srcMat.cols))
  let preMat = preProcess(srcMat, scale)
  let grayMat = getSegmentImage(preMat)
  let lines = getLinesWithDetect(grayMat)
  let points = getFourVertex(lines, scale, { height: srcMat.rows, width: srcMat.cols })
  let result = getResultWithMap(srcMat, points)
  cv.imshow(name, result);
  preMat.delete()
  grayMat.delete()
  srcMat.delete()
  result.delete()
}
/** * @param {*} len */
function getScale (len) {
  let scale = 1
  while (len > 200) {
    scale /= 2
    len >>= 1
  }
  return scale
}
/** * Preprocessing * @param {*} src */
function preProcess (src, scale) {
  let smallMat = resize(src, scale)
  let result = filter(smallMat)
  smallMat.delete()
  return result
}
/** * @param {*} src * @param {*} scale */
function resize (src, scale = 1) {
  let smallMat = new cv.Mat();
  let dsize = new cv.Size(0, 0);
  cv.resize(src, smallMat, dsize, scale, scale, cv.INTER_AREA)
  return smallMat
}
/** * @param {*} src */
function filter (src) {
  let dst = new cv.Mat();
  cv.cvtColor(src, src, cv.COLOR_RGBA2RGB, 0);
  // Bilateral filtering
  cv.bilateralFilter(src, dst, 9, 75, 75, cv.BORDER_DEFAULT);
  return dst
}
/** * @param {*} src */
function getSegmentImage (src) {
  const mask = new cv.Mat(src.rows + 2, src.cols + 2, cv.CV_8U, [0, 0, 0, 0])
  const seed = new cv.Point(src.cols >> 1, src.rows >> 1)
  let flags = 4 + (255 << 8) + cv.FLOODFILL_FIXED_RANGE
  let ccomp = new cv.Rect()
  let newVal = new cv.Scalar(255, 255, 255)
  // Threshold the mask, then flood-fill from the midpoint
  cv.threshold(mask, mask, 1, 128, cv.THRESH_BINARY);
  cv.floodFill(src, mask, seed, newVal, ccomp, new cv.Scalar(g_nLowDifference, g_nLowDifference, g_nLowDifference), new cv.Scalar(g_nUpDifference, g_nUpDifference, g_nUpDifference), flags);
  // Perform filtering again to remove noise
  cv.medianBlur(mask, mask, 9);
  return mask
}


function getLinesFromData32F (data32F) {
  let lines = []
  let len = data32F.length / 2
  for (let i = 0; i < len; ++i) {
    let rho = data32F[i * 2];
    let theta = data32F[i * 2 + 1];
    lines.push(new Line(rho, theta))
  }
  return lines
}
/** * @param {*} src */
function getLinesWithDetect (src) {
  let dst = cv.Mat.zeros(src.rows, src.cols, cv.CV_8UC3);
  let lines = new cv.Mat();
  // Canny operator for edge detection
  cv.Canny(src, src, 50, 200, 3);
  cv.HoughLines(src, lines, 1, Math.PI / 180, 30, 0, 0, 0, Math.PI);
  // draw lines
  for (let i = 0; i < lines.rows; ++i) {
    let rho = lines.data32F[i * 2];
    let theta = lines.data32F[i * 2 + 1];
    let a = Math.cos(theta);
    let b = Math.sin(theta);
    let x0 = a * rho;
    let y0 = b * rho;
    let startPoint = { x: x0 - 400 * b, y: y0 + 400 * a };
    let endPoint = { x: x0 + 400 * b, y: y0 - 400 * a };
    cv.line(dst, startPoint, endPoint, [255, 0, 0, 255]);
  }
  let lineArray = getLinesFromData32F(lines.data32F)
  // drawLineMat(src.rows, src.cols, lineArray)
  lines.delete(); dst.delete()
  return lineArray
}
/** * Compute the intersection point of two lines * @param {*} l1 * @param {*} l2 */
function getIntersection (l1, l2) {
  // If the angle difference is too small, the lines are nearly parallel; skip them
  let minTheta = Math.min(l1.theta, l2.theta)
  let maxTheta = Math.max(l1.theta, l2.theta)
  if (Math.abs(l1.theta - l2.theta) < UNCAL_THETA || Math.abs(minTheta + Math.PI - maxTheta) < UNCAL_THETA) {
    return;
  }
  // Calculate the intersection of two lines
  let intersection;
  //y = a * x + b;
  let a1 = Math.abs(l1.startPoint.x - l1.endPoint.x) < Number.EPSILON ? 0 : (l1.startPoint.y - l1.endPoint.y) / (l1.startPoint.x - l1.endPoint.x);
  let b1 = l1.startPoint.y - a1 * (l1.startPoint.x);
  let a2 = Math.abs((l2.startPoint.x - l2.endPoint.x)) < Number.EPSILON ? 0 : (l2.startPoint.y - l2.endPoint.y) / (l2.startPoint.x - l2.endPoint.x);
  let b2 = l2.startPoint.y - a2 * (l2.startPoint.x);
  if (Math.abs(a2 - a1) > Number.EPSILON) {
    let x = (b1 - b2) / (a2 - a1)
    let y = a1 * x + b1
    intersection = { x, y }
  }
  return intersection
}
/** * @param {*} lines */
function getAllIntersections (lines) {
  let points = []
  for (let i = 0; i < lines.length; i++) {
    for (let j = i + 1; j < lines.length; j++) {
      let point = getIntersection(lines[i], lines[j])
      if (point) {
        points.push(point)
      }
    }
  }
  return points
}
/** * @param {*} points * @param {*} param1 */
function getClusterPoints (points, { width, height }) {
  points.sort((p1, p2) => {
    if (p1.x !== p2.x) {
      return p1.x - p2.x
    } else {
      return p1.y - p2.y
    }
  })
  const distance = Math.max(40, (width + height) / 20)
  const isNear = (p1, p2) => Math.abs(p1.x - p2.x) + Math.abs(p1.y - p2.y) < distance
  let clusters = [[points[0]]]
  for (let i = 1; i < points.length; i++) {
    if (isNear(points[i], points[i - 1])) {
      clusters[clusters.length - 1].push(points[i])
    } else {
      clusters.push([points[i]])
    }
  }
  // Remove the least number of clusters, keep only four clusters
  clusters = clusters.sort((c1, c2) => c2.length - c1.length).slice(0, 4)
  const result = clusters.map(cluster => {
    const x = ~~(cluster.reduce((sum, cur) => sum + cur.x, 0) / cluster.length)
    const y = ~~(cluster.reduce((sum, cur) => sum + cur.y, 0) / cluster.length)
    return { x, y }
  })
  return result
}
/** * order clockwise with the first point * @param {*} points */
function getSortedVertex (points) {
  let center = {
    x: points.reduce((sum, p) => sum + p.x, 0) / 4,
    y: points.reduce((sum, p) => sum + p.y, 0) / 4
  }
  let sortedPoints = []
  sortedPoints.push(points.find(p => p.x < center.x && p.y < center.y))
  sortedPoints.push(points.find(p => p.x > center.x && p.y < center.y))
  sortedPoints.push(points.find(p => p.x > center.x && p.y > center.y))
  sortedPoints.push(points.find(p => p.x < center.x && p.y > center.y))
  return sortedPoints
}

/** * Get the coordinates of the four vertices according to the cluster */
function getFourVertex (lines, scale, { width, height }) {
  // Zoom + filter
  let allPoints = getAllIntersections(lines).map(point => ({
    x: ~~(point.x / scale), y: ~~(point.y / scale)
  })).filter(({ x, y }) => !(x < 0 || x > width || y < 0 || y > height))
  const points = getClusterPoints(allPoints, { width, height })
  const sortedPoints = getSortedVertex(points)
  return sortedPoints
}
/** * Crop and map (perspective warp) * @param {*} src * @param {*} points */
function getResultWithMap (src, points) {
  let array = []
  points.forEach(point => {
    array.push(point.x)
    array.push(point.y)
  })
  console.log(points, array)
  let dst = new cv.Mat();
  let dstWidth = src.cols
  let dstHeight = src.rows
  let dsize = new cv.Size(dstWidth, dstHeight);
  let srcTri = cv.matFromArray(4, 1, cv.CV_32FC2, array);
  let dstTri = cv.matFromArray(4, 1, cv.CV_32FC2, [0, 0, dstWidth, 0, dstWidth, dstHeight, 0, dstHeight]);
  let M = cv.getPerspectiveTransform(srcTri, dstTri);
  cv.warpPerspective(src, dst, M, dsize);
  let resizeDst = resize(dst, 0.5)
  M.delete(); srcTri.delete(); dstTri.delete(); dst.delete()
  return resizeDst
}
function drawLineMat (rows, cols, lines) {
  let dst = cv.Mat.zeros(rows, cols, cv.CV_8UC3);
  let color = new cv.Scalar(255, 0, 0);
  for (let line of lines) {
    cv.line(dst, line.startPoint, line.endPoint, color);
  }
  cv.imshow("canvasOutput", dst);
  dst.delete()
}

Note: Mat objects must be freed manually with delete() once you are done with them, otherwise WebAssembly memory will be exhausted.
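One way to make those delete() calls harder to forget (a sketch of my own, not part of the repo) is a small helper that frees every Mat in a finally block, even if processing throws. It works with anything that exposes a delete() method, so the usage example below uses a stand-in object rather than a real cv.Mat:

```javascript
// Run `fn` with the given Mat-like objects and guarantee each gets delete()d,
// even when `fn` throws. Works with anything exposing a delete() method.
function withMats (mats, fn) {
  try {
    return fn(...mats)
  } finally {
    for (const mat of mats) mat.delete()
  }
}

// Usage sketch (a real call would pass cv.Mat instances; a stand-in is used here):
const fakeMat = { deleted: false, delete () { this.deleted = true } }
const result = withMats([fakeMat], m => 42)
```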

Conclusion

This simple example served as a first contact with opencv.js.

OpenCV provides many interfaces that make it easy to process images in the front end; more application scenarios may emerge over time.

Future work

Performance: try writing the relevant algorithm modules in AssemblyScript and generating WASM to replace the 8 MB opencv.js file.

Features: add touch interaction for smarter recognition of the target rectangle; adapt the target image's width and height automatically.

Code structure: borrow the idea of middleware for the processing pipeline.

I will write another article once these optimizations are done.

You are also welcome to try it out and open PRs.

Reference documentation

  1. The official demos
  2. Using OpenCV to detect and extract the rectangular canvas or paper in an image
  3. How can you use k-means clustering to posterize an image using OpenCV JavaScript?