JSON (JavaScript Object Notation) is a lightweight data interchange format that uses a combination of objects and arrays to represent data. In this section, we’ll learn how to use Python to save data to JSON files.
1. Objects and arrays
In JavaScript, everything is an object, so any supported type, such as strings, numbers, objects, and arrays, can be represented in JSON. Objects and arrays are the two most distinctive and commonly used types, and they are briefly described below.
- Object: in JavaScript, an object is content wrapped in curly braces {}, with the key-value structure {key1: value1, key2: value2, ...}. In object-oriented languages, key is a property of the object and value is the corresponding value. JavaScript accepts integers and strings as key names, but in JSON itself every key must be a double-quoted string. The value can be of any type.
- Array: in JavaScript, an array is content wrapped in square brackets [], with the index structure ["java", "javascript", "vb", ...]. In JavaScript, an array is a special data type that can also hold key-value pairs like an object, but it is mostly used through its indexes. Again, the values can be of any type.
So, a JSON document can be written as follows:
[{
    "name": "Bob",
    "gender": "male",
    "birthday": "1992-10-18"
}, {
    "name": "Selina",
    "gender": "female",
    "birthday": "1995-10-18"
}]
The content enclosed in square brackets is equivalent to a list, and each element of the list can be of any type. In this case, each element is the equivalent of a dictionary, enclosed in curly braces.
JSON can be freely composed of these two forms and nested to any depth while keeping a clear structure, which makes it an excellent format for exchanging data.
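As a minimal sketch of how the two forms nest, the snippet below parses a small document with Python's json module, which is introduced in the next part. The document and its keys (library, books, tags) are made up for illustration.

```python
import json

# Hypothetical document: an object that contains an array of objects
nested_text = (
    '{"library": "demo", "books": ['
    '{"title": "A", "tags": ["x", "y"]}, '
    '{"title": "B", "tags": []}]}'
)

nested = json.loads(nested_text)

# JSON objects become dicts and JSON arrays become lists
print(type(nested))                    # <class 'dict'>
print(type(nested["books"]))           # <class 'list'>
print(nested["books"][0]["tags"][1])   # y
```

Each level is reached by chaining a key lookup or an index, whichever matches the form at that level.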
2. Reading JSON
Python provides the easy-to-use json library for reading and writing JSON files. Its loads() method converts a JSON text string into a JSON object, and its dumps() method converts a JSON object into a text string.
For example, here is a JSON string of type str, which we use Python to convert into an operable data structure, such as a list or dictionary:
import json

# A JSON document held in an ordinary Python string
json_str = '''
[{"name": "Bob", "gender": "male", "birthday": "1992-10-18"},
 {"name": "Selina", "gender": "female", "birthday": "1995-10-18"}]
'''
print(type(json_str))
data = json.loads(json_str)  # parse the text into Python objects
print(data)
print(type(data))
The running results are as follows:
<class 'str'>
[{'name': 'Bob', 'gender': 'male', 'birthday': '1992-10-18'}, {'name': 'Selina', 'gender': 'female', 'birthday': '1995-10-18'}]
<class 'list'>
The loads() method converts the string into a JSON object. Since the outermost layer is a pair of square brackets, the final type is a list.
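The same rule applies to the other JSON value types. As a rough illustration (the sample strings are made up), this sketch prints the Python type that loads() returns for each kind of top-level value:

```python
import json

# One sample per JSON value type
samples = ['{"a": 1}', '[1, 2]', '"text"', '3', '2.5', 'true', 'null']

for text in samples:
    value = json.loads(text)
    print(text, '->', type(value).__name__)
```

The printed type names are dict, list, str, int, float, bool, and NoneType, in that order.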
In this way, we can use the index to retrieve the corresponding content. For example, if you want to fetch the name attribute from the first element, you can do this:
data[0]['name']
data[0].get('name')
The results are:
Bob
You get the first dictionary element by indexing it with 0 in brackets, and then call its key name to get the corresponding key value. There are two ways to get a key value, either by enclosing the key name in brackets or passing in the key name through the get() method. The get() method is recommended, so that if the key name does not exist, no error is reported and None is returned. Alternatively, the get() method can pass in a second argument (the default value), as shown in the following example:
data[0].get('age')
data[0].get('age', 25)
The running results are as follows:
None
25
Here we’re trying to get the age, but it doesn’t exist in the original dictionary, so None is returned by default. If a second argument (the default value) is passed, the default value is returned if it does not exist.
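To make the difference between the two access styles concrete, here is a small sketch that reuses the sample record. It shows that bracket access raises KeyError for a missing key, while get() does not:

```python
data = [{"name": "Bob", "gender": "male", "birthday": "1992-10-18"}]

try:
    age = data[0]["age"]   # bracket access raises KeyError for a missing key
except KeyError:
    age = None             # fall back by hand

print(age)                      # None
print(data[0].get("age"))       # None
print(data[0].get("age", 25))   # 25
```

With get(), the fallback is part of the call, so no try/except is needed.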
Note that strings in JSON data must be enclosed in double quotes, not single quotes. For example, an error occurs if you use the following expression:
import json
json_str = '''
[{
    'name': 'Bob', 'gender': 'male', 'birthday': '1992-10-18'
}]
'''
data = json.loads(json_str)  # raises json.decoder.JSONDecodeError
The running results are as follows:
json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 3 column 5 (char 8)
A JSON parsing error is reported. This is because the data is enclosed in single quotes, so be sure to use double quotes in the JSON string, otherwise loads() fails to parse it.
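If the input may be malformed, the error can be caught instead of letting the program stop. The following is only a sketch: parse_or_none is a hypothetical helper name, not part of the json library, while json.JSONDecodeError and its msg and pos attributes are provided by the standard library.

```python
import json

bad_text = "[{'name': 'Bob'}]"    # single quotes: not valid JSON
good_text = '[{"name": "Bob"}]'   # double quotes: valid JSON


def parse_or_none(text):
    """Hypothetical helper: return the parsed data, or None for invalid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as error:
        # msg describes the problem, pos is the character offset where it occurred
        print('Invalid JSON:', error.msg, 'at position', error.pos)
        return None


print(parse_or_none(bad_text))    # None
print(parse_or_none(good_text))   # [{'name': 'Bob'}]
```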
If the JSON is read from a text file, for example a data.json file whose content is the JSON string defined above, we can first read the file's content as text and then convert it with loads():
import json

with open('data.json', 'r') as file:
    json_str = file.read()
    data = json.loads(json_str)
    print(data)
The running results are as follows:
[{'name': 'Bob', 'gender': 'male', 'birthday': '1992-10-18'}, {'name': 'Selina', 'gender': 'female', 'birthday': '1995-10-18'}]
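The json module also has a load() function (without the s) that takes an open file object and parses its content directly, which saves the separate read() step. The sketch below first writes a small sample file to a temporary directory so that it can run on its own. The temporary path and the sample content are assumptions made for the example.

```python
import json
import os
import tempfile

# Create a small JSON file so the example is self-contained
path = os.path.join(tempfile.mkdtemp(), 'data.json')
with open(path, 'w', encoding='utf-8') as file:
    file.write('[{"name": "Bob", "gender": "male"}]')

# json.load() reads and parses the file object in one call
with open(path, 'r', encoding='utf-8') as file:
    loaded = json.load(file)

print(loaded)  # [{'name': 'Bob', 'gender': 'male'}]
```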
3. Writing JSON
In addition, we can call the dumps() method to convert a JSON object into a string. For example, to write the list from the example above back to a text file:
import json

data = [{
    'name': 'Bob',
    'gender': 'male',
    'birthday': '1992-10-18'
}]
with open('data.json', 'w') as file:
    file.write(json.dumps(data))  # serialize to text, then write it
With the dumps() method, you can turn a JSON object into a string and then call the file’s write() method to write text, as shown in Figure 5-2.
Figure 5-2 Writing results
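As a possible shortcut, the json module also provides dump() (without the s), which serializes the object and writes it to an open file in a single call. This sketch writes to a temporary directory, which is an assumption made so that it does not overwrite an existing data.json, and then reads the file back to show what was written.

```python
import json
import os
import tempfile

data = [{'name': 'Bob', 'gender': 'male', 'birthday': '1992-10-18'}]

path = os.path.join(tempfile.mkdtemp(), 'data.json')
with open(path, 'w', encoding='utf-8') as file:
    json.dump(data, file)  # serialize and write in one call

# Read the file back to check its content
with open(path, 'r', encoding='utf-8') as file:
    written = file.read()

print(written)
```

The text written by dump() is the same string that dumps() returns for the same object.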
In addition, if you want the saved JSON to keep an indented, readable layout, you can add the indent parameter, which sets the number of characters used for each indentation level. The following is an example:
with open('data.json', 'w') as file:
    file.write(json.dumps(data, indent=2))
Figure 5-3 shows the writing result.
Figure 5-3 Writing results
The resulting content will be automatically indented and formatted more clearly.
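Since the figures are not reproduced here, the following sketch prints both forms to the console for comparison. The one-record list is a shortened version of the sample data.

```python
import json

data = [{'name': 'Bob', 'gender': 'male'}]

compact = json.dumps(data)           # everything on one line
pretty = json.dumps(data, indent=2)  # two spaces per nesting level

print(compact)
print(pretty)
```

Both strings parse back to the same data; only the whitespace differs.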
What happens if the JSON contains Chinese characters? For example, we change some of the previous values to Chinese and write them to the file using the same method:
import json

data = [{
    'name': '王伟',
    'gender': '男',
    'birthday': '1992-10-18'
}]
with open('data.json', 'w') as file:
    file.write(json.dumps(data, indent=2))
Figure 5-4 shows the write result.
Figure 5-4 Writing results
As you can see, the Chinese characters have been written as Unicode escape sequences, which is not what we want.
To output the Chinese characters themselves, set the ensure_ascii parameter to False and specify the encoding of the output file:
with open('data.json', 'w', encoding='utf-8') as file:
    file.write(json.dumps(data, indent=2, ensure_ascii=False))
Figure 5-5 shows the write result.
Figure 5-5 Writing results
As you can see, this will output JSON in Chinese.
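The effect of ensure_ascii can also be checked without writing a file. In this sketch the name 王伟 is only sample data. The two strings look different, but they decode to the same object.

```python
import json

data = [{'name': '王伟'}]

escaped = json.dumps(data)                       # default: ensure_ascii=True
readable = json.dumps(data, ensure_ascii=False)  # keep non-ASCII characters

print(escaped)   # [{"name": "\u738b\u4f1f"}]
print(readable)  # [{"name": "王伟"}]
print(json.loads(escaped) == json.loads(readable))  # True
```

The escaped form is still valid JSON and is safe for ASCII-only channels; ensure_ascii=False mainly helps people who read the file.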
In this section, we learned how to read and write JSON files using Python, a technique that is used often in the data parsing covered later.
This article was first published on Cui Qingcai's personal blog, Jingmi: Python3 web crawler development in practice tutorial