JSON (JavaScript Object Notation) is a lightweight data interchange format that uses a combination of objects and arrays to represent data. In this section, we’ll learn how to use Python to read and write JSON files.

1. Objects and arrays

In JavaScript, everything is an object, so any supported type, such as a string, number, object, or array, can be represented in JSON. Objects and arrays are the two most distinctive and commonly used types, and are briefly described below.

  • Object: in JavaScript, an object is content wrapped in curly braces {}, with the key-value pair structure {key1: value1, key2: value2, ...}. In object-oriented terms, key is a property of the object and value is its corresponding value. In JavaScript, key names can be integers or strings (JSON itself requires keys to be double-quoted strings). The value can be of any type.
  • Array: in JavaScript, an array is content wrapped in square brackets [], with the index structure ["java", "javascript", "vb", ...]. In JavaScript, an array is a special data type that can also use key-value pairs like an object, but indexes are used far more often. Again, the value can be of any type.

So, a JSON object can be written as follows:

[{
    "name": "Bob",
    "gender": "male",
    "birthday": "1992-10-18"
}, {
    "name": "Selina",
    "gender": "female",
    "birthday": "1995-10-18"
}]

Content enclosed in square brackets is the equivalent of a list, and each element of the list can be of any type. In this example, each element is the equivalent of a dictionary, enclosed in curly braces.

JSON can be freely composed of these two forms, nested to any depth, and still has a clear structure, which makes it an excellent way to exchange data.
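As a rough illustration of this nesting, the sketch below parses a small document in which an object contains an array of objects, one of which contains another array. It uses loads() from Python's standard json module, which part 2 of this section introduces. The field names (team, members, skills) are invented for this example and are not part of the data used elsewhere in this section.

```python
import json

# Hypothetical sample data: an object holding an array of objects,
# each of which holds another array.
nested_text = '''
{
    "team": "crawler",
    "members": [
        {"name": "Bob", "skills": ["python", "javascript"]},
        {"name": "Selina", "skills": []}
    ]
}
'''

team = json.loads(nested_text)

# Walk down the nesting: object -> array -> object -> array -> string
first_skill = team["members"][0]["skills"][0]
print(first_skill)           # python
print(len(team["members"]))  # 2
```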

2. Reading JSON

Python provides the easy-to-use json library for reading and writing JSON files. Call its loads() method to convert a JSON text string into a JSON object, and its dumps() method to convert a JSON object into a text string.

For example, here is a JSON string of type str, which we use Python to convert into an operable data structure, such as a list or dictionary:

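import json

# A JSON document held in an ordinary Python string
json_str = '''
[{"name": "Bob", "gender": "male", "birthday": "1992-10-18"},
 {"name": "Selina", "gender": "female", "birthday": "1995-10-18"}]
'''
print(type(json_str))
data = json.loads(json_str)  # parse the JSON text into Python objects
print(data)
print(type(data))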

The running results are as follows:

<class 'str'>
[{'name': 'Bob', 'gender': 'male', 'birthday': '1992-10-18'}, {'name': 'Selina', 'gender': 'female', 'birthday': '1995-10-18'}]
<class 'list'>

Here, loads() converts the string into a JSON object. Since the outermost layer is a pair of square brackets, the resulting type is a list.
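To make the link between the outermost JSON value and the resulting Python type concrete, here is a small sketch. The married and spouse fields are made up for illustration, and the example also shows how the JSON literals false and null come out on the Python side.

```python
import json

# The outermost JSON value decides the Python type that loads() returns.
as_list = json.loads('[1, 2, 3]')
as_dict = json.loads('{"name": "Bob", "married": false, "spouse": null}')

print(type(as_list))  # <class 'list'>
print(type(as_dict))  # <class 'dict'>

# JSON literals are mapped to their Python counterparts as well.
print(as_dict["married"])  # False
print(as_dict["spouse"])   # None
```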

In this way, we can use the index to retrieve the corresponding content. For example, if you want to fetch the name attribute from the first element, you can do this:

data[0]['name']
data[0].get('name')

The results are:

Bob

You get the first dictionary element by indexing it with 0 in brackets, and then call its key name to get the corresponding key value. There are two ways to get a key value, either by enclosing the key name in brackets or passing in the key name through the get() method. The get() method is recommended, so that if the key name does not exist, no error is reported and None is returned. Alternatively, the get() method can pass in a second argument (the default value), as shown in the following example:

data[0].get('age')
data[0].get('age', 25)

The running results are as follows:

None
25

Here we’re trying to get the age, but it doesn’t exist in the original dictionary, so None is returned by default. If a second argument (the default value) is passed, the default value is returned if it does not exist.
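The difference between the two lookup styles can be seen side by side in the following sketch. It works on a plain dictionary with the same fields as the first parsed element, and the lookup_failed flag is just a helper name for this example.

```python
person = {"name": "Bob", "gender": "male", "birthday": "1992-10-18"}

# Square-bracket access raises KeyError when the key is missing.
lookup_failed = False
try:
    person["age"]
except KeyError:
    lookup_failed = True

print(lookup_failed)          # True
print(person.get("age"))      # None
print(person.get("age", 25))  # 25
```

Square brackets are still a reasonable choice when a missing key should be treated as a bug, since the exception points at the problem immediately.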

Note that strings in JSON data must be enclosed in double quotes, not single quotes. For example, the following code raises an error:

import json

json_str = '''
[{
    'name': 'Bob', 'gender': 'male', 'birthday': '1992-10-18'
}]
'''
data = json.loads(json_str)  # raises json.decoder.JSONDecodeError

The running results are as follows:

json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 3 column 5 (char 8)

A JSON parsing error is reported. This is because the strings in the data are enclosed in single quotes. Be sure to use double quotes in JSON strings, otherwise loads() fails to parse them.

If the JSON is read from a text file, such as a data.json file containing the JSON string defined earlier, we can first read the text from the file and then convert it with loads():
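When the input may be malformed, one option is to catch the exception rather than let the program stop. The sketch below assumes that falling back to None is acceptable for the caller; msg, lineno and colno are attributes of json.JSONDecodeError.

```python
import json

bad_text = "[{'name': 'Bob'}]"  # single quotes: not valid JSON

try:
    data = json.loads(bad_text)
except json.JSONDecodeError as error:
    # Report where parsing stopped, then fall back to None (an assumption
    # made for this example; real code may prefer to re-raise).
    print(f"Invalid JSON ({error.msg}) at line {error.lineno}, column {error.colno}")
    data = None
```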

import json

# Read the whole file as text, then parse that text as JSON
with open('data.json', 'r') as file:
    json_str = file.read()
    data = json.loads(json_str)
    print(data)

The running results are as follows:

[{'name': 'Bob', 'gender': 'male', 'birthday': '1992-10-18'}, {'name': 'Selina', 'gender': 'female', 'birthday': '1995-10-18'}]

3. Writing JSON

In addition, we can call the dumps() method to convert a JSON object into a string. For example, to write the list from the example above back to a text file:

import json

data = [{
    'name': 'Bob',
    'gender': 'male',
    'birthday': '1992-10-18'
}]
# Serialize the list to a JSON string and write it to the file
with open('data.json', 'w') as file:
    file.write(json.dumps(data))

With the dumps() method, you can turn a JSON object into a string and then call the file’s write() method to write text, as shown in Figure 5-2.

Figure 5-2 Writing results

In addition, if you want the saved file to keep JSON's indented layout, you can add the indent parameter, which sets the number of characters used for each indentation level. The following is an example:

with open('data.json', 'w') as file:
    file.write(json.dumps(data, indent=2))

Figure 5-3 shows the writing result.

Figure 5-3 Writing results

The resulting content will be automatically indented and formatted more clearly.
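The indented output can also be checked from code rather than by opening the file. The sketch below additionally uses json.dump() and json.load(), the file-object counterparts of dumps() and loads() in the same module, as an alternative to calling write() and read() by hand. The temporary directory and the people.json file name are assumptions made so that the example cleans up after itself.

```python
import json
import os
import tempfile

data = [{"name": "Bob", "gender": "male", "birthday": "1992-10-18"}]

# Write to a throwaway directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "people.json")  # hypothetical file name

    with open(path, "w", encoding="utf-8") as file:
        json.dump(data, file, indent=2)  # serialize straight into the file

    with open(path, "r", encoding="utf-8") as file:
        formatted_text = file.read()

    with open(path, "r", encoding="utf-8") as file:
        restored = json.load(file)  # parse straight from the file

print(formatted_text)    # the indented JSON text
print(restored == data)  # True
```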

Also, what happens if the JSON contains Chinese characters? For example, we change some of the previous JSON values to Chinese and write them to the file using the same method:

import json

data = [{
    'name': '王伟',  # a Chinese name, so the output contains non-ASCII text
    'gender': 'male',
    'birthday': '1992-10-18'
}]
with open('data.json', 'w') as file:
    file.write(json.dumps(data, indent=2))

Figure 5-4 shows the write result.

Figure 5-4 Writing results

As you can see, the Chinese characters are written as Unicode escape sequences (of the form \uXXXX), which is not what we want.

To output the Chinese characters themselves, set the ensure_ascii parameter to False and specify the encoding of the output file:

with open('data.json', 'w', encoding='utf-8') as file:
    file.write(json.dumps(data, indent=2, ensure_ascii=False))

Figure 5-5 shows the write result.

Figure 5-5 Writing results

As you can see, the JSON is now written with the Chinese characters intact.

In this section, we learned how to read and write JSON files using Python, a skill that is used often in the data parsing covered later.
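The effect of ensure_ascii can also be compared directly on strings, without writing a file. This is a minimal sketch; the record below is a sample value chosen for illustration.

```python
import json

record = {"name": "王伟"}  # sample value containing Chinese characters

escaped = json.dumps(record)                       # default: ensure_ascii=True
readable = json.dumps(record, ensure_ascii=False)  # keep the characters as-is

print(escaped)   # the name appears as \uXXXX escape sequences
print(readable)  # the name appears as Chinese characters

# Both forms decode back to the same Python object.
print(json.loads(escaped) == json.loads(readable))  # True
```

Either form is valid JSON. The escaped form is safe in any encoding, while the readable form needs the file to be written in an encoding such as UTF-8 that can hold the characters.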

This resource was first published on Cui Qingcai's personal blog, Jingmi: Python3 Web Crawler Development in Practice tutorial | Jingmi
