1. The origin

Last week I received a requirement: the product manager wanted an H5 page embedded in WeChat to silently collect detailed information about a picture, such as when the photo was taken, its geographical location, and the device fingerprint. On the one hand, I had so much work recently that it was cutting into my slacking-off time. On the other hand, this requirement could not be fully implemented, because some picture information is simply not available, and I was too lazy to explain. So I told the product manager:

This requirement can’t be done.

I thought that would be the end of the matter, but to my surprise, I had run into a product manager who can write code...

The next morning, my direct technical lead came to me and said,

“XXX (the product manager) has written a demo that obtains picture information. This requirement can be done, so you take it from here.”

This is what I was thinking:

Oops, wrong picture:

Anyway, in the end the requirement fell to me.

Hence this article: Getting Exif data for images.

2. The main text

Exif (Exchangeable image file format) is a standard for metadata stored in image files, including information such as when the image was taken and its GPS location.

Some time ago there was a wave of concern that sending original pictures through WeChat would leak private data. In fact, the blame belongs to Exif... Exif data is generated by the camera and can be modified or deleted by the user. In other words, you can change the Exif data however you like. Like this:

Of course, sending the original photo on WeChat can be risky, since you might accidentally tell others where you took it. But this is not entirely WeChat’s fault: in theory, any program that uploads photos carries the same risk of leaking private information. You can also delete the Exif data directly by clicking the “Remove Properties and Personal Information” option in the lower left corner:

Because Exif data can be forged and deleted, it should in principle be treated only as a reference, not as reliable information.

So how do you get the Exif data?

exif-js is an open-source project on GitHub with 4.1K stars: github.com/exif-js/exi…

It is very easy to use. I wrote a demo that reads the Exif data of an image:

jsrun.net/kwTKp/edit

Take a photo with your phone and upload it:

As you can see, the Exif data of the picture has been obtained, and the specific address has been looked up through the Baidu Maps API.
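Before the GPS values can be sent to a map API they usually need converting. Exif stores latitude and longitude as degrees, minutes, and seconds plus a reference letter (N/S/E/W). Below is a minimal sketch of that conversion. The function name dmsToDecimal and the sample coordinates are my own assumptions for illustration, not something taken from the demo.

```javascript
// Hypothetical helper: convert an Exif-style [degrees, minutes, seconds]
// triple plus its reference letter into signed decimal degrees.
function dmsToDecimal(dms, ref) {
  // Number() also unwraps values that arrive as Number objects
  const [degrees, minutes, seconds] = dms.map(Number);
  const decimal = degrees + minutes / 60 + seconds / 3600;
  // South latitudes and west longitudes are negative in decimal notation
  return ref === "S" || ref === "W" ? -decimal : decimal;
}

// Made-up sample values, roughly central Beijing
const latitude = dmsToDecimal([39, 54, 27], "N");
const longitude = dmsToDecimal([116, 23, 50], "E");
console.log(latitude, longitude);
```

The resulting pair of decimal numbers is the form most geocoding services expect, although a given map provider may also require its own coordinate system.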

However, Exif data may not be available for images that are not taken directly by the device, or images that have been compressed, such as images downloaded directly from the Internet.

As a front-end developer with ambitions, and at the very least so as not to lose to the product manager, we need to know not only that it works, but also why it works.

So, check out the exif-js source code:

More than a thousand lines. Goodbye.

Oh well, it’s only a thousand lines. Nothing more to say: time to grind through it.


The following is the source code analysis.

The analysis is fairly long, so feel free to skip it.

Just give me a like at the end 🙂


The source starts with an anonymous self-executing function, which has its own scope and therefore avoids polluting, or being polluted by, external code.

The EXIF object is then defined and exported, so that code outside the function can get and use the EXIF object.

exports comes from the CommonJS specification. If the current environment supports exports, the module is exported through exports. If it does not, EXIF is attached to the global object instead.
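The export logic described above can be sketched roughly as follows. This is a simplified illustration, not the library's actual code: exposeModule, the name "Demo", and the fake exports/global objects are all hypothetical and exist only so that both branches can be exercised in one script.

```javascript
// Hypothetical helper that mimics "export via CommonJS if available,
// otherwise attach to the global object".
function exposeModule(name, lib, exportsObj, root) {
  if (typeof exportsObj !== "undefined" && exportsObj !== null) {
    // CommonJS-style environment: attach the library to the exports object
    exportsObj[name] = lib;
  } else {
    // No module system: fall back to a property on the global object
    root[name] = lib;
  }
  return lib;
}

const lib = { version: "0.0.1" }; // stand-in for the real library object

// Case 1: an exports object exists, so the global object stays untouched
const fakeExports = {};
const fakeGlobal = {};
exposeModule("Demo", lib, fakeExports, fakeGlobal);

// Case 2: no exports object, as in a plain <script> tag in the browser
const browserGlobal = {};
exposeModule("Demo", lib, undefined, browserGlobal);

console.log(Object.keys(fakeExports), Object.keys(fakeGlobal), Object.keys(browserGlobal));
```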

EXIF defines a number of properties and methods, such as the getData method we used in the demo:

// exif-js
EXIF.getData = function(img, callback) {
  // ... some non-core code omitted

  // Check whether the img object already has an exifData property
  if (!imageHasData(img)) {
      // A locally uploaded image usually has no exifData yet, so call getImageData to obtain it
      getImageData(img, callback);
  } else {
      // exifData is already present, so invoke the callback directly
      if (callback) {
          callback.call(img);
      }
  }
};

If the img object has an exifData attribute value, callback is called directly. If the img object does not have an exifData attribute value, a getImageData method is called.

Because an image may be specified through the src attribute or uploaded locally, and the src value may be a base64 data URI, a Blob URL, or an HTTP/HTTPS URL, getImageData handles each of these cases. In every case the image is converted into an ArrayBuffer object.
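To make the base64 case concrete, here is a minimal sketch of turning a data URI into an ArrayBuffer. It is my own illustration of the idea, not the library's implementation, and dataUriToArrayBuffer is a hypothetical name. The sample string is a four-byte fragment (the first bytes of a typical JPEG), not a real image.

```javascript
// Hypothetical helper: decode the base64 part of a data URI into an ArrayBuffer.
function dataUriToArrayBuffer(dataUri) {
  // Everything after the first comma is the base64 payload
  const base64 = dataUri.slice(dataUri.indexOf(",") + 1);
  // atob exists in browsers and recent Node.js; fall back to Buffer otherwise
  const binary = typeof atob === "function"
    ? atob(base64)
    : Buffer.from(base64, "base64").toString("binary");
  const buffer = new ArrayBuffer(binary.length);
  const bytes = new Uint8Array(buffer);
  for (let i = 0; i < binary.length; i++) {
    // Each character of the decoded string holds exactly one byte
    bytes[i] = binary.charCodeAt(i);
  }
  return buffer;
}

const jpegStart = dataUriToArrayBuffer("data:image/jpeg;base64,/9j/4A==");
console.log(new Uint8Array(jpegStart)); // bytes 0xFF 0xD8 0xFF 0xE0
```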

An ArrayBuffer is an array of bytes used to represent a generic, fixed-length buffer of raw binary data.

An ArrayBuffer cannot be read or written directly. Its contents are accessed through a view, so the source code wraps the ArrayBuffer in a DataView object to read the bytes.

These objects are all ways that JS provides for handling binary data. If you want to understand them further, I recommend this article: “Talking about the JS binary family: Blob, ArrayBuffer and Buffer”: https://zhuanlan.zhihu.com/p/97768916
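A small sketch of reading bytes through a DataView, using only standard JavaScript APIs. The four sample bytes are made up for illustration: a marker followed by a two-byte length, which is the kind of data the parsing code further down has to read.

```javascript
// Fill a 4-byte buffer through a Uint8Array view
const rawBuffer = new ArrayBuffer(4);
new Uint8Array(rawBuffer).set([0xFF, 0xE1, 0x00, 0x10]);

// A DataView reads multi-byte values at arbitrary offsets with explicit endianness
const view = new DataView(rawBuffer);
const firstByte = view.getUint8(0);               // single byte: 0xFF
const marker = view.getUint16(0);                 // big-endian by default: 0xFFE1
const sizeBigEndian = view.getUint16(2);          // 0x0010
const sizeLittleEndian = view.getUint16(2, true); // same bytes read as 0x1000

console.log(firstByte, marker.toString(16), sizeBigEndian, sizeLittleEndian);
```

The second argument of getUint16 matters for Exif, because the data inside an Exif block may be stored in either byte order.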

Let’s continue looking at the source code:

Once you have the ArrayBuffer object, you can parse the binary data.

// Parse the image's Exif data
function findEXIFinJPEG(file) {
    var dataView = new DataView(file);

    // Check the first two bytes of the file: a JPEG always starts with 0xFFD8
    if ((dataView.getUint8(0) != 0xFF) || (dataView.getUint8(1) != 0xD8)) {
        return false; // not a valid jpeg
    }

    var offset = 2,
        length = file.byteLength,
        marker;

    // Walk through the JPEG segments one by one
    while (offset < length) {
        // A marker such as 0xFFE1 is two bytes. Check that the first byte equals 0xFF
        if (dataView.getUint8(offset) != 0xFF) {
            return false; // not a valid marker, something is wrong
        }

        marker = dataView.getUint8(offset + 1);
        // we could implement handling for other markers here,
        // but we're only looking for 0xFFE1 for EXIF data

        // Check whether the second byte equals 0xE1, i.e. 225
        if (marker == 225) {
            return readEXIFData(dataView, offset + 4, dataView.getUint16(offset + 2) - 2);
        } else {
            // Skip this segment: 2 marker bytes plus the segment length
            offset += 2 + dataView.getUint16(offset + 2);
        }
    }
}

Some background first: the Exif segment of a JPEG begins with the marker 0xFFE1, so the while loop in the source code walks through the file until it finds that marker and then calls the readEXIFData method.
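To see the marker search in action without a real photo, here is a sketch that runs the same kind of loop over a hand-built byte sequence. findApp1Offset and the sample bytes are my own assumptions for illustration. The sample is not a displayable JPEG, only the start marker followed by two tiny segments.

```javascript
// Hypothetical helper: return the byte offset of the 0xFFE1 marker, or -1 if absent.
function findApp1Offset(buffer) {
  const view = new DataView(buffer);
  // A JPEG must start with the two bytes 0xFF 0xD8
  if (view.byteLength < 2 || view.getUint16(0) !== 0xFFD8) return -1;
  let offset = 2;
  while (offset + 4 <= view.byteLength) {
    if (view.getUint8(offset) !== 0xFF) return -1; // not a valid marker
    const marker = view.getUint8(offset + 1);
    if (marker === 0xE1) return offset;
    // The big-endian length counts its own two bytes but not the marker bytes
    offset += 2 + view.getUint16(offset + 2);
  }
  return -1;
}

const sample = new Uint8Array([
  0xFF, 0xD8,                                                // start of image
  0xFF, 0xE0, 0x00, 0x04, 0x00, 0x00,                        // 0xFFE0 segment, length 4
  0xFF, 0xE1, 0x00, 0x08, 0x45, 0x78, 0x69, 0x66, 0x00, 0x00 // 0xFFE1 segment, "Exif\0\0"
]).buffer;

const app1Offset = findApp1Offset(sample);
console.log(app1Offset); // the 0xFFE1 marker sits at byte 8
```

The loop never scans byte by byte. It jumps from one segment to the next using each segment's own length field, which is why the search is cheap even for large photos.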

readEXIFData parses the Exif data: it walks the binary data, matches each attribute with its corresponding value, collects them into a tags object, and returns that object once parsing is done.
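The idea of matching attributes to values can be sketched on a tiny example. Exif data is laid out like a TIFF file: a header gives the byte order and the offset of a directory, and the directory holds 12-byte entries of tag, type, count, and value. The sketch below is a deliberately reduced illustration that only reads single SHORT values. readShortTags, TAG_NAMES, and the sample bytes are hypothetical and are not the library's code.

```javascript
// Tiny tag table for the sketch; real parsers know hundreds of tags
const TAG_NAMES = { 0x0112: "Orientation" };

// Hypothetical helper: read single-value SHORT entries from a TIFF-style block.
function readShortTags(buffer) {
  const view = new DataView(buffer);
  const byteOrder = view.getUint16(0);
  let littleEndian;
  if (byteOrder === 0x4949) littleEndian = true;       // "II": little-endian
  else if (byteOrder === 0x4D4D) littleEndian = false; // "MM": big-endian
  else return null;
  if (view.getUint16(2, littleEndian) !== 0x002A) return null; // TIFF magic number 42

  const ifdOffset = view.getUint32(4, littleEndian);
  const entryCount = view.getUint16(ifdOffset, littleEndian);
  const tags = {};
  for (let i = 0; i < entryCount; i++) {
    const entry = ifdOffset + 2 + i * 12; // each directory entry is 12 bytes
    const tag = view.getUint16(entry, littleEndian);
    const type = view.getUint16(entry + 2, littleEndian);
    const count = view.getUint32(entry + 4, littleEndian);
    if (type === 3 && count === 1) { // type 3 is SHORT, stored inline in the entry
      const name = TAG_NAMES[tag] || "0x" + tag.toString(16);
      tags[name] = view.getUint16(entry + 8, littleEndian);
    }
  }
  return tags;
}

const tiffBlock = new Uint8Array([
  0x4D, 0x4D, 0x00, 0x2A, 0x00, 0x00, 0x00, 0x08, // "MM", 42, directory at offset 8
  0x00, 0x01,                                     // one entry
  0x01, 0x12, 0x00, 0x03, 0x00, 0x00, 0x00, 0x01, // tag 0x0112, type SHORT, count 1
  0x00, 0x06, 0x00, 0x00,                         // value 6, then padding
  0x00, 0x00, 0x00, 0x00                          // no further directory
]).buffer;

const parsedTags = readShortTags(tiffBlock);
console.log(parsedTags);
```

A full parser also has to follow offsets for values that do not fit in four bytes, such as strings and GPS rationals, which is where most of the thousand lines go.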

Besides Exif data, the source code also parses IPTC data (author, copyright, caption, details, etc.) and XMP data (Extensible Metadata Platform, a set of standards proposed by Adobe for creating, processing, and exchanging metadata).


Further reading

For the exif-js API, attributes, and usage notes, you can refer to this blog: code.ciaoca.com/javascript/…


The product manager also asked for a device fingerprint. Due to space constraints, I’ll write about that next time. Here’s a demo that uses Fingerprintjs: jsrun.net/2wTKp/edit


3. Summary

As a bald programmer in a harmonious society, I reflected on this: I really shouldn’t go to the product manager and just say “this requirement can’t be done”. If it truly can’t be done, you should present a concrete technical feasibility analysis and only then conclude that it can’t be done. Otherwise, you should say “I’ll look into the feasibility”, carefully work out where the technical difficulties lie, and then talk to the product manager.

To sum up:

You can’t just say, “This requirement can’t be done.”

All of the above is just idle chatter, so don’t take it too seriously.

In fact, the real focus of this article is the analysis of how Exif data is obtained, in the middle part.

A bit of small talk

The Mid-Autumn Festival is coming. My hometown has been seriously affected by the epidemic recently, so I can’t go home. I can only enjoy the moon and eat mooncakes remotely this Mid-Autumn Festival. I hope the epidemic ends as soon as possible.

I wish you all a happy Mid-Autumn Festival in advance, with reunion and happiness!