A solution to Django exported CSV garbled problem

Output a CSV sample using Django

Environment: Python3 + Django 2.2.24

The code editor default encoding is UTF-8

import csv
from django.http import HttpResponse

def some_view(request) :
    response = HttpResponse(content_type='text/csv')
    response['Content-Disposition'] = 'attachment; filename="somefilename.csv"'
    writer = csv.writer(response)
    writer.writerow(['name'.'gender'.'age'.'address'])
    writer.writerow(['Joe'.'male'.'18'.'Beijing'])
    return response
Copy the code

In this way, even if the encoding of the exported file is UTF-8, the CSV file opened in Excel will still be garbled. But it works fine when opened with a text editor such as Notepad.

A reason for

By default, the UTF-8 encoding on The Windows platform has a Byte Order Mark (BOM). Therefore, when you use Excel on Windows to open a document with pure UTF-8 encoding, the file content is not identified because there is no BOM, resulting in garbled characters.

Utf-8 does not require a BOM, although the Unicode standard allows its use in UTF-8. So UTF-8 without a BOM is the standard form, and it is mostly Microsoft’s custom to place a BOM in a UTF-8 file (incidentally, it is also Microsoft’s custom to call the little enenteed UTF-16 with a BOM “Unicode” without specifying it). A BOM (Byte Order Mark) is intended for UTF-16 and UTF-32 and is used to mark byte orders. Microsoft uses a BOM in UTF-8 because it clearly distinguishes UTF-8 from encodings such as ASCII, but such files can cause problems in operating systems other than Windows.

When you open the exported file using Windows Notepad, you can see that the encoding is UTF-8

The solution

The encoding format with BOM can be set when the Response object is constructed: UTF-8-SIG

import csv
from django.http import HttpResponse

def some_view(request) :
    response = HttpResponse(content_type='text/csv')
    # Set code with BOM
    response.charset = 'utf-8-sig'
    response['Content-Disposition'] = 'attachment; filename="somefilename.csv"'
    writer = csv.writer(response)
    writer.writerow(['name'.'gender'.'age'.'address'])
    writer.writerow(['Joe'.'male'.'18'.'Beijing'])
    return response
Copy the code

This export file can be directly opened with Excel without garbled; When opened using Windows Notepad, utF-8 with BOM is displayed.

Further reflection

You can use the user-agent in the Request to determine the client system. If the User’s system is Windows, set the encoding of the exported file to UTF-8 with BOM; otherwise, use UTF-8.

Example of implementation:

import csv
from django.http import HttpResponse

def some_view(request) :
    response = HttpResponse(content_type='text/csv')
    # Determine the client system through user-agent, and then set the code with BOM
    response.charset = 'utf-8-sig' if "Windows" in request.headers.get('User-Agent') else 'utf-8'
    response['Content-Disposition'] = 'attachment; filename="somefilename.csv"'
    writer = csv.writer(response)
    writer.writerow(['name'.'gender'.'age'.'address'])
    writer.writerow(['Joe'.'male'.'18'.'Beijing'])
    return response
Copy the code

Example Change the encoding of UTF-8 to UTF-8 with BOM

If you already have a CSV file with utF-8 encoding, you need to save it as a new file and set the encoding.

As follows, save as using Windows Notepad: