
1. Introduction

Today's pick is a third-party library, xpinyin, which converts Chinese characters to pinyin. It is also a handy way to check whether your own Chinese is up to scratch.

🔗 GitHub link: xpinyin

2. Install

python3 -m pip install xpinyin

3. Usage examples

3.1 Get pinyin (the default delimiter is '-')

>>> from xpinyin import Pinyin
>>> p = Pinyin()
>>> p.get_pinyin('北京')
'bei-jing'

3.2 Pinyin with tones

# Show tones as diacritic marks
>>> p.get_pinyin('北京', tone_marks='marks')
'běi-jīng'

# Show tones as numbers
>>> p.get_pinyin('北京', tone_marks='numbers')
'bei3-jing1'
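
To make the relationship between the two tone formats concrete, here is a minimal sketch that turns numbered pinyin into marked pinyin using the usual placement rule (the mark goes on a or e if present, on the o in ou, otherwise on the last vowel). The names TONE_MARKS, number_to_mark and convert are made up for this illustration; this is not how xpinyin does it internally.

```python
# Illustrative only: a hand-rolled converter, not xpinyin's implementation.
TONE_MARKS = {
    "a": "āáǎà",
    "e": "ēéěè",
    "i": "īíǐì",
    "o": "ōóǒò",
    "u": "ūúǔù",
    "ü": "ǖǘǚǜ",
}


def number_to_mark(syllable: str) -> str:
    """Convert one syllable such as 'bei3' into 'běi'."""
    if not syllable or syllable[-1] not in "012345":
        return syllable  # no trailing tone digit, leave unchanged
    body, tone = syllable[:-1], int(syllable[-1])
    if tone not in (1, 2, 3, 4):
        return body  # neutral tone (written 0 or 5) carries no mark
    # Pick the vowel that carries the tone mark.
    if "a" in body:
        idx = body.index("a")
    elif "e" in body:
        idx = body.index("e")
    elif "ou" in body:
        idx = body.index("o")
    else:
        vowels = [i for i, ch in enumerate(body) if ch in TONE_MARKS]
        if not vowels:
            return body  # nothing to put a mark on
        idx = vowels[-1]
    marked = TONE_MARKS[body[idx]][tone - 1]
    return body[:idx] + marked + body[idx + 1:]


def convert(text: str, splitter: str = "-") -> str:
    """Convert a splitter-joined string of numbered syllables."""
    return splitter.join(number_to_mark(s) for s in text.split(splitter))


print(convert("bei3-jing1"))  # běi-jīng
```

The sketch only handles lowercase syllables with a single trailing digit, which is enough to see why 'bei3' and 'běi' are the same reading written two ways.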

3.3 Changing the delimiter

The delimiter is controlled by the splitter parameter.

# Delimiter is a space
>>> p.get_pinyin('北京', tone_marks='marks', splitter=' ')
'běi jīng'

>>> p.get_pinyin('北京', tone_marks='numbers', splitter=' ')
'bei3 jing1'

# No delimiter
>>> p.get_pinyin('北京', tone_marks='marks', splitter='')
'běijīng'

>>> p.get_pinyin('北京', tone_marks='numbers', splitter='')
'bei3jing1'

3.4 Getting initials

  • Without retroflex initials (flat-tongue sounds)
>>> p.get_initials("上海", splitter='-')
'S-H'
  • With retroflex initials (curled-tongue sounds)
>>> p.get_initials("上海", splitter='-', with_retroflex=True)
'SH-H'

3.5 Get every reading combination for polyphonic characters

Many Chinese characters are polyphonic, meaning one character has several pronunciations. The library can list every combination of readings for a word.

>>> p.get_pinyins('模样')
['mo-yang', 'mo-xiang', 'mu-yang', 'mu-xiang']

>>> p.get_pinyins('模样', tone_marks='marks', splitter=' ')
['mó yáng', 'mó yàng', 'mó xiàng', 'mú yáng', 'mú yàng', 'mú xiàng']

>>> p.get_pinyins('伤害', tone_marks='marks', splitter=' ')
['shāng hài', 'shāng hé']
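
One way to picture where these combinations come from: if each character has its own list of readings, the readings of a word are the Cartesian product of those lists. The sketch below uses a small hand-written READINGS table for the word 模样 ("appearance") and a made-up helper all_readings; both are assumptions for illustration, and this is not necessarily how xpinyin builds its result.

```python
from itertools import product

# Hypothetical per-character readings, written by hand for this example.
READINGS = {
    "模": ["mó", "mú"],
    "样": ["yáng", "yàng", "xiàng"],
}


def all_readings(word: str, splitter: str = " ") -> list[str]:
    """Return every combination of readings for the characters in word."""
    # Characters missing from the table are passed through unchanged.
    per_char = [READINGS.get(ch, [ch]) for ch in word]
    return [splitter.join(combo) for combo in product(*per_char)]


print(all_readings("模样"))
# ['mó yáng', 'mó yàng', 'mó xiàng', 'mú yáng', 'mú yàng', 'mú xiàng']
```

Two readings for the first character times three for the second gives six combinations, so the list grows quickly for longer words.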

4. Closing thoughts

This library is quite good. For example, I had no idea that 样 in 模样 (appearance) could be read xiàng, or that 害 in 伤害 (harm) could be read hé.

Embarrassing 😂. The Chinese language is broad and deep, and I feel like I should retake my Chinese classes.

If you are interested, you can explore the project on GitHub. The author maintains a Mandarin.dat file, which maps the hexadecimal code of each Chinese character to its pinyin and tones.
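
To give a feel for how such a data file could drive the conversion, here is a minimal sketch. The line layout (hexadecimal code point, a tab, then space-separated readings with tone numbers) is an assumption made for this example, so check the real file before relying on it. SAMPLE, load_table and to_pinyin are hypothetical names, not part of xpinyin.

```python
# Assumed layout, for illustration only: "HEX<TAB>READING1 READING2 ..."
# The sample lines are generated from the characters themselves so the
# hexadecimal code points are guaranteed to match.
SAMPLE = "\n".join([
    f"{ord('北'):04X}\tBEI3",
    f"{ord('京'):04X}\tJING1",
    f"{ord('害'):04X}\tHAI4 HE2",
])


def load_table(text: str) -> dict[str, list[str]]:
    """Parse the assumed format into {character: [readings]}."""
    table = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        code, readings = line.split("\t", 1)
        table[chr(int(code, 16))] = readings.lower().split()
    return table


def to_pinyin(word: str, table: dict[str, list[str]], splitter: str = "-") -> str:
    """Join the first reading of each character, dropping the tone digit."""
    parts = [table[ch][0][:-1] if ch in table else ch for ch in word]
    return splitter.join(parts)


table = load_table(SAMPLE)
print(to_pinyin("北京", table))  # bei-jing
print(table["害"])               # ['hai4', 'he2']
```

Keeping every reading per character in the table is what makes both the default single reading and the polyphonic listing possible from the same data.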

Here is the link again: 🔗 xpinyin