<pre name="code" class="python">        wordList = textParse(open('email/ham/%d.txt' % i).read())

UnicodeDecodeError: 'gbk' codec can't decode byte 0xae in position 199: illegal multibyte sequence

Everything I found online said the problem was the file encoding, so I tried UTF-8, GBK, ASCII and other encodings, but none of them solved the problem.

Then I looked more closely at the message itself: it could not decode byte 0xae in position 199. That pointed to one specific character in the file rather than to the file as a whole.
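To see what is actually sitting at the position named in the error, the file can be opened in binary mode, which skips decoding entirely. The following is a minimal sketch, not the method used in this post: `show_bytes_around` is a hypothetical helper, and the demo file is a throwaway stand-in built so that byte 0xAE lands at offset 199.

```python
import os
import tempfile


def show_bytes_around(path, position, width=10):
    """Return the raw bytes surrounding `position` in the file at `path`."""
    # Binary mode does no decoding, so it cannot raise UnicodeDecodeError.
    with open(path, 'rb') as f:
        data = f.read()
    start = max(0, position - width)
    return data[start:position + width + 1]


# Hypothetical demo content: 199 filler bytes, then 0xAE at offset 199.
sample = b'a' * 199 + b'\xae' + b' rest of the message'
with tempfile.NamedTemporaryFile(delete=False, suffix='.txt') as tmp:
    tmp.write(sample)
    demo_path = tmp.name

context = show_bytes_around(demo_path, 199, width=5)
print(context)  # the offending byte shows up as \xae in the repr
os.remove(demo_path)
```

For the real file, the call would be something like `show_bytes_around('email/ham/%d.txt' % i, 199)`, with `i` set to the index of the email that fails.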

When I opened the file, the second line contained a stray “�” in a place where I would normally expect a plain “?”. For some reason the character had changed when the file was brought into Eclipse. After I deleted it, everything worked.
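Deleting the character by hand works, but an alternative is to tell `open()` how to handle bytes it cannot decode. In Python 3, `open()` without an `encoding` argument uses the platform's default encoding, which is presumably why GBK appears in the error on a Chinese-locale Windows machine. The sketch below is an assumption about how one might avoid editing the data files, not what was done in this post. `read_text` is a hypothetical helper and the demo bytes are made up.

```python
import os
import tempfile


def read_text(path, encoding='utf-8', errors='ignore'):
    """Read a text file, silently dropping bytes that cannot be decoded."""
    with open(path, encoding=encoding, errors=errors) as f:
        return f.read()


# Hypothetical demo file containing one stray 0xAE byte.
with tempfile.NamedTemporaryFile(delete=False, suffix='.txt') as tmp:
    tmp.write(b'Hello\xae world')
    demo_path = tmp.name

# Option 1: decode as UTF-8 and drop the undecodable byte.
dropped = read_text(demo_path)

# Option 2: decode as Latin-1, which maps every byte to a character,
# so it never fails (0xAE becomes the registered-trademark sign).
kept = read_text(demo_path, encoding='latin-1', errors='strict')

print(repr(dropped))
print(repr(kept))
os.remove(demo_path)
```

Applied to the failing line, this would read `wordList = textParse(read_text('email/ham/%d.txt' % i))`. Dropping bytes loses information, so this only suits cases like this one, where the stray character does not matter to the word list.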