About this article

This article summarizes the problems I have run into when passing JSON data around since I started building the website, along with the solutions currently in use. The database is MongoDB, the back end is Python, and the front end is Riot.js in a “semi-separated” form: the first page of data is rendered directly into the HTML by a server-side template engine, which avoids loading the home page twice, while other dynamic content is loaded with Ajax. Throughout this process the data travels as JSON, but each link in the chain needs a different approach and brings its own problems. This article records and summarizes them.



1. What is JSON?

JSON (JavaScript Object Notation) is a lightweight data interchange format conceived and designed by Douglas Crockford. The older XML is probably better known. JSON is not meant to replace XML, but it is much more compact and better suited to data transfer in web development (JSON is to JavaScript what XML is to Lisp). As the name suggests, the JSON format follows the syntax of object literals in JavaScript. Many other languages have similar types, such as dict in Python. Some document-oriented NoSQL databases, such as MongoDB, also use a JSON-like format for data storage. In general, JSON defines a text format that makes it easy to convert values in a program to strings and back. JSON describes data with the following forms:

  1. Object: {"key": value}

  2. Array (list): [value, value, …]

  3. String: "string"

  4. Number: number

  5. Boolean value: true/false

  6. Null: null
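
As a quick illustration of the forms listed above, here is a minimal sketch using Python's standard json module. The sample `record` is made up for this example and is not taken from the website's data.

```python
import json

# A made-up record that uses the JSON forms listed above.
record = {
    "name": "Yu",              # string
    "rank": 1,                 # number
    "is_admin": True,          # boolean
    "tags": ["a", "b"],        # list (JSON array)
    "profile": {"age": 30},    # nested object
}

text = json.dumps(record)      # dict -> JSON text (a Python str)
restored = json.loads(text)    # JSON text -> dict

print(text)
print(restored == record)
```

Note that Python's True is written as true in the JSON text; the conversion back restores the Python value.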

Now that you understand the basic concepts of JSON, let’s summarize the data interactions shown in the figure above.

2. Python <=> MongoDB

The interaction between Python and MongoDB is handled by existing driver libraries such as PyMongo and Motor, whose interfaces are very friendly. There is no need to understand the underlying implementation; we simply work with native Python dictionaries:

import motor.motor_tornado

client = motor.motor_tornado.MotorClient()
db = client['test']
user_col = db['user']

async def add_user():
  # insert_one is a coroutine in Motor, so it must be awaited
  await user_col.insert_one(dict(
    name = 'Yu',
    is_admin = True,
  ))

The only thing to note is that an id in MongoDB is stored as ObjectId("572df0b78a83851d5f24e2c1"), which corresponds to the Python class bson.objectid.ObjectId. A query by id therefore needs an instance of that class:

from bson.objectid import ObjectId

# inside a coroutine (Motor calls must be awaited)
user = await db.user.find_one(dict(
  _id = ObjectId("572df0b78a83851d5f24e2c1")
))
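
Ids usually reach the back end as plain strings, so they have to be converted before querying. The following is a minimal sketch, assuming the id uses the 24-character hexadecimal form shown above. `looks_like_object_id` is a hypothetical helper written for this illustration, not part of PyMongo or Motor.

```python
import re

# 24 hexadecimal characters, the string form of an ObjectId shown above.
_HEX24 = re.compile(r"[0-9a-fA-F]{24}")

def looks_like_object_id(value):
    """Return True if value is a str in the 24-hex-digit ObjectId form."""
    return isinstance(value, str) and _HEX24.fullmatch(value) is not None

print(looks_like_object_id("572df0b78a83851d5f24e2c1"))
print(looks_like_object_id("not-an-id"))
```

Only strings that pass such a check are worth wrapping in ObjectId(...). The bson package also offers ObjectId.is_valid for the same purpose, which is probably the better choice when the driver is available.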

3. Python <=> Ajax

Data exchange between the front end and the back end is most commonly done through Ajax, and this is where I hit the first minor pitfall. In a previous article, I summarized a Python coding pitfall. HTTP itself knows nothing about JSON or XML: everything is transmitted as binary data, and we only choose how the front end should interpret that data by setting Content-Type in the header. When transferring JSON, it is generally set to Content-Type: application/json. In recent versions of Tornado, it is enough to write the dictionary directly:

# Handler
async def post(self):
  user = await self.db.user.find_one({})
  self.write(user)

This looks fine, but it raises an error: TypeError: ObjectId('572df0b58a83851d5f24e2b1') is not JSON serializable. The reason is that writing a dictionary to the HTTP response still goes through a json.dumps(user) call, and json.dumps does not accept the ObjectId type. So I went for the most intuitive solution:

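import json
from bson.objectid import ObjectId

class JSONEncoder(json.JSONEncoder):
  def default(self, obj):
    # ObjectId is not JSON serializable, so emit its string form
    if isinstance(obj, ObjectId):
      return str(obj)
    # anything else falls back to the base class, which raises TypeError
    return super().default(obj)

# Handler
async def post(self):
  user = await self.db.user.find_one({})
  # encode() is an instance method, so instantiate the encoder first
  self.write(JSONEncoder().encode(user))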

There will be no errors this time, and our own JSONEncoder can handle ObjectId, but another problem arises:



After JSONEncoder.encode, the dictionary has been converted to a string, and when a string is written to the HTTP response the Content-Type becomes text/html. The front end then treats the received data as a string rather than a usable JavaScript object. Of course, there is a further remedy, which is to have the front end convert it once more:

$.post(API, {}, function(res){
  var data = JSON.parse(res);
  console.log(data._id);
})
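
The Content-Type difference described above can be mimicked without a web server. The sketch below is a simplified stand-in for what happens when a dict versus a str is written to the response. `prepare_response` is a hypothetical function written for this illustration and is not Tornado's API.

```python
import json

def prepare_response(chunk):
    """Simplified imitation of how a dict and a str end up with different Content-Types."""
    if isinstance(chunk, dict):
        # A dict is serialized by the framework and labelled as JSON.
        return json.dumps(chunk), "application/json; charset=UTF-8"
    # A str is sent as-is and keeps the default HTML content type.
    return chunk, "text/html; charset=UTF-8"

user = {"_id": "572df0b58a83851d5f24e2b1", "name": "Yu"}

body_from_dict, type_from_dict = prepare_response(user)
body_from_str, type_from_str = prepare_response(json.dumps(user))

print(type_from_dict)
print(type_from_str)
```

The two bodies are identical and only the header differs, which is why the front end receives a plain string in the second case. Setting the header explicitly in the handler, for example with self.set_header('Content-Type', 'application/json'), would be another way to address this.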

The problem is solved for the time being, and the JSON transformation looks like this:

Python => json.dumps => HTTP   => JavaScript  => JSON.parse
dict   => str        => binary => string      => Object

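However, JSON.parse throws an error when the string is not strictly valid JSON, for example when it uses single quotes or contains unescaped special characters such as a raw newline: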

JSON.parse("{'abs': '\n'}"); // VM536:1 Uncaught SyntaxError: Unexpected token ' in JSON at position 1(...)
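
The same strictness can be reproduced on the Python side, since json.loads also rejects single-quoted strings. This is a minimal sketch with the standard library only; the variable names are chosen for this example.

```python
import json

bad = "{'abs': '\n'}"             # single quotes plus a raw newline inside the string
try:
    json.loads(bad)
    parsed_ok = True
except json.JSONDecodeError:
    parsed_ok = False

good = json.dumps({"abs": "\n"})  # the serializer emits double quotes and escapes the newline
print(parsed_ok)
print(good)
print(json.loads(good) == {"abs": "\n"})
```

In other words, text produced by a real JSON serializer parses cleanly, while hand-built or re-quoted strings are where the trouble starts.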

This is the downside of patching each problem as it appears: one fix leads to a chain of further changes. Let’s walk back through the chain of JSON transformations above to see whether there is a better solution. It is quite simple: follow the established rules, and when an exception arises, adapt your own data to the rules rather than changing the rules:

# Handler

async def post(self):
  user = await self.db.user.find_one({})
  user['_id'] = str(user['_id'])
  self.write(user)

Of course, if the result is a list of documents, each of them needs the same conversion:

# DB
async def get_top_users(self, n = 20):
  users = []
  async for user in self.db.user.find({}).sort('rank', -1).limit(n):
    user['_id'] = str(user['_id'])
    users.append(user)
  return users
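
When documents contain ids in nested fields as well, the per-field conversion above can be generalized. The following is a minimal sketch under assumptions: `stringify_ids` is a hypothetical helper, the sample `users` list and its `friends` field are made up, and a small stand-in class replaces ObjectId when the bson package is not installed.

```python
import json

try:
    from bson.objectid import ObjectId
except ImportError:
    # Stand-in so the sketch runs without the driver installed (assumption for this example only).
    class ObjectId:
        def __init__(self, oid):
            self._oid = str(oid)

        def __str__(self):
            return self._oid

def stringify_ids(value):
    """Return a copy of value with every ObjectId replaced by its string form."""
    if isinstance(value, ObjectId):
        return str(value)
    if isinstance(value, dict):
        return {key: stringify_ids(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [stringify_ids(item) for item in value]
    return value

users = [
    {"_id": ObjectId("572df0b78a83851d5f24e2c1"), "name": "Yu",
     "friends": [ObjectId("572df0b58a83851d5f24e2b1")]},
]
clean = stringify_ids(users)
print(json.dumps(clean))
```

The result contains only plain dictionaries, lists, and strings, so it can be serialized with the standard json module without a custom encoder.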

4. Python <=> HTML+Riot.js

If the problem above could be solved by following the rules, the next one is a story of challenging them. Apart from the part loaded dynamically with Ajax, the rest of the data on the page is rendered by the back-end template engine, that is, hard-coded into the HTML. An HTML file is just plain text until the browser loads and parses it, and what we need is to simply stuff the data

This is correct, but solving the ObjectId() problem mentioned above requires some additional processing (especially for quotation marks). I also tried a silly approach to the ObjectId problem (before running into the JSON.parse error above):

# Handler
async def get(self):
  # get_top_users is a coroutine, so it must be awaited
  users = await self.db.get_top_users()
  render_data = dict(
    # encode() needs an encoder instance
    users = JSONEncoder().encode(users)
  )
  self.render('users.html', **render_data)




In fact, template rendering is similar to HTTP transfer, except that a string variable is substituted into the template as its bare value (without surrounding quotes). So it is perfectly possible to render a variable straight into the page as generated JavaScript without worrying about special characters ({% raw… %} is the Tornado template syntax that outputs a value without HTML escaping).
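
The idea of rendering data into the page as generated JavaScript can be sketched with the standard library alone. This is an illustration under assumptions, not the template used on the site: `json_for_script` is a hypothetical helper, the `users` sample is made up, and plain string concatenation stands in for the template engine.

```python
import json

def json_for_script(value):
    """Serialize value to JSON that is safe to place inside a <script> element."""
    # "</" could close the surrounding <script> tag early; "<\/" means the same to JSON and JavaScript.
    return json.dumps(value).replace("</", "<\\/")

users = [{"_id": "572df0b78a83851d5f24e2c1", "name": "Yu </script>", "bio": "line1\nline2"}]
payload = json_for_script(users)

# Stand-in for the template step: the JSON text becomes a JavaScript literal.
html = "<script>var users = " + payload + ";</script>"
print(html)
```

Because the JSON text is itself a valid JavaScript literal, the browser gets a ready-made object and no JSON.parse call is needed. As far as I know, tornado.escape.json_encode applies the same "</" replacement, so it may be usable in place of the helper above.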




Conclusion

JSON is a great data format, but there are many details to watch when converting between different environments. Beyond that, follow the established rules, and when a special case comes up, change your own approach rather than trying to change the rules. This principle may not apply to every problem, but widely accepted rules should not be challenged lightly.



Source: Pyhub.cc