This article is participating in Python Theme Month. See the link for details
1. Introduction
Today's third-party library is xpinyin, which converts Chinese characters to pinyin. It is also a handy way to check how well you really know your Chinese.
🔗 GitHub link: xpinyin
2. Install
python3 -m pip install xpinyin
3. Usage examples
3.1 Get pinyin (the default delimiter is '-')
>>> from xpinyin import Pinyin
>>> p = Pinyin()
>>> p.get_pinyin('北京')
'bei-jing'
3.2 Pinyin with tones
# Show tones as tone marks
>>> p.get_pinyin('北京', tone_marks='marks')
'běi-jīng'
# Show tones as numbers
>>> p.get_pinyin('北京', tone_marks='numbers')
'bei3-jing1'
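The two tone styles above carry the same information. As a rough illustration of how they relate, here is a standard-library sketch that turns numbered syllables into tone-marked ones using the usual pinyin placement rule. The names `numbered_to_marked` and `convert` are made up for this example; this is not xpinyin's own code, and it only handles lowercase syllables.

```python
import re

# Tone-marked forms of each vowel, indexed by tone number 1-4.
TONE_MARKS = {
    "a": "āáǎà",
    "e": "ēéěè",
    "i": "īíǐì",
    "o": "ōóǒò",
    "u": "ūúǔù",
    "ü": "ǖǘǚǜ",
}


def numbered_to_marked(syllable):
    """Convert one numbered syllable such as 'bei3' to 'běi' (hypothetical helper)."""
    match = re.fullmatch(r"([a-zü]+)([1-5])", syllable)
    if match is None:
        return syllable  # not a numbered syllable; leave it untouched
    letters, tone = match.group(1), int(match.group(2))
    if tone == 5:
        return letters  # neutral tone carries no mark
    # Placement rule: 'a' or 'e' takes the mark; otherwise the 'o' in 'ou';
    # otherwise the last vowel in the syllable.
    if "a" in letters:
        index = letters.index("a")
    elif "e" in letters:
        index = letters.index("e")
    elif "ou" in letters:
        index = letters.index("o")
    else:
        positions = [i for i, ch in enumerate(letters) if ch in TONE_MARKS]
        if not positions:
            return letters  # no vowel to mark
        index = positions[-1]
    marked = TONE_MARKS[letters[index]][tone - 1]
    return letters[:index] + marked + letters[index + 1:]


def convert(text, splitter="-"):
    """Convert a splitter-joined string of numbered syllables."""
    return splitter.join(numbered_to_marked(s) for s in text.split(splitter))


print(convert("bei3-jing1"))
```

In practice there is no need for this, since `tone_marks='marks'` already does the job; the sketch only shows that the number and the mark encode the same tone.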
3.3 Changing the delimiter
The delimiter is controlled by the splitter parameter.
# Delimiter is a space
>>> p.get_pinyin('北京', tone_marks='marks', splitter=' ')
'běi jīng'
>>> p.get_pinyin('北京', tone_marks='numbers', splitter=' ')
'bei3 jing1'
# No delimiter
>>> p.get_pinyin('北京', tone_marks='marks', splitter='')
'běijīng'
>>> p.get_pinyin('北京', tone_marks='numbers', splitter='')
'bei3jing1'
3.4 Obtaining initials
- Flat-tongue (non-retroflex) initials
>>> p.get_initials("上海", splitter='-')
'S-H'
- Retroflex initials (zh, ch, sh)
>>> p.get_initials("上海", splitter='-', with_retroflex=True)
'SH-H'
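To make the difference between the two modes concrete, here is a small sketch that derives initials from syllables that are already in pinyin. The helpers `initial` and `initials` are hypothetical and only mimic the output shown above; they are not how xpinyin is implemented.

```python
RETROFLEX = ("zh", "ch", "sh")


def initial(syllable, with_retroflex=False):
    """Return the uppercase initial of one pinyin syllable (hypothetical helper)."""
    s = syllable.lower()
    # Keep the two-letter retroflex initial only when asked to.
    if with_retroflex and s[:2] in RETROFLEX:
        return s[:2].upper()
    return s[:1].upper()


def initials(syllables, splitter="-", with_retroflex=False):
    """Join the initials of several syllables with the given splitter."""
    return splitter.join(initial(s, with_retroflex) for s in syllables)


print(initials(["shang", "hai"]))
print(initials(["shang", "hai"], with_retroflex=True))
```

With `with_retroflex=False` the retroflex "sh" collapses to "S", which is why the two calls above give different results for the same word.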
3.5 Get all readings of polyphonic characters
Many Chinese characters are polyphonic, so the library can also list every combination of readings for a string.
>>> p.get_pinyins('模样')
['mo-yang', 'mo-xiang', 'mu-yang', 'mu-xiang']
>>> p.get_pinyins('模样', tone_marks='marks', splitter=' ')
['mó yáng', 'mó yàng', 'mó xiàng', 'mú yáng', 'mú yàng', 'mú xiàng']
>>> p.get_pinyins('伤害', tone_marks='marks', splitter=' ')
['shāng hài', 'shāng hé']
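The lists above look like every combination of each character's readings. A minimal sketch of that idea with `itertools.product` follows; the `READINGS` table is written by hand from the toneless output shown above, and `all_pinyins` is a made-up name, so treat this as an illustration rather than the library's actual logic.

```python
from itertools import product

# Hand-written, toneless readings for illustration only.
READINGS = {
    "模": ["mo", "mu"],
    "样": ["yang", "xiang"],
}


def all_pinyins(chars, readings, splitter="-"):
    """List every combination of per-character readings (hypothetical helper)."""
    # Characters without an entry pass through unchanged.
    per_char = [readings.get(ch, [ch]) for ch in chars]
    return [splitter.join(combo) for combo in product(*per_char)]


print(all_pinyins("模样", READINGS))
```

The number of results is the product of the reading counts per character, so long strings with several polyphonic characters can produce large lists.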
4. Conclusion
This library is quite good. Take 模样, for example: I had no idea it could be read with xiang, or that 伤害 could be read with he.
How embarrassing 😂. Chinese culture is extensive and profound, and I feel I should go back and relearn the language.
If you are interested, you can explore the project on GitHub. The author maintains a Mandarin.dat file, which records the hexadecimal codes of Chinese characters together with their pinyin and tones.
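A table keyed by hexadecimal code point can be used for lookup roughly as sketched below. The layout assumed here (one line per character, the hex code, a tab, then space-separated readings ending in a tone digit) and the two sample lines are assumptions for illustration, so check the real file before relying on them. `load_table` and `to_pinyin` are hypothetical names.

```python
# Illustrative sample in an ASSUMED layout: HEX<TAB>READING1 READING2 ...
SAMPLE_DATA = "5317\tBEI3\n4EAC\tJING1\n"


def load_table(text):
    """Parse the assumed layout into {hex code point: [readings]}."""
    table = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        code, readings = line.split("\t", 1)
        table[code.upper()] = readings.split()
    return table


def to_pinyin(chars, table, splitter="-"):
    """Look up each character by its hex code point and take the first reading."""
    parts = []
    for ch in chars:
        key = format(ord(ch), "X")  # e.g. '北' -> '5317'
        readings = table.get(key)
        if readings:
            parts.append(readings[0][:-1].lower())  # drop the trailing tone digit
        else:
            parts.append(ch)  # unknown characters pass through unchanged
    return splitter.join(parts)


table = load_table(SAMPLE_DATA)
print(to_pinyin("北京", table))
```

Keying by code point keeps the data file plain ASCII apart from nothing at all, which makes it easy to edit and diff.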
Here is the link again: 🔗 xpinyin