There is one thing in life that nobody thinks about until it happens, and then it suddenly becomes extremely important and demands a big decision in a very short time: choosing a name for a new baby.
I suspect many people are just like me. At first I was flustered, but I figured that with so many Chinese characters I could simply pick a couple and be done. So I flipped through dictionaries, searched the web, and went through Tang poems, Song ci, the Book of Songs, and even martial arts novels. Yet the names I came up with after long thought often ran into opinions and objections from family members: the name does not read smoothly, it duplicates a relative's name, and so on. I fell into a cycle of searching and rejecting, and grew more and more confused.
So I went back to searching online and found many articles along the lines of "good names for baby boys". They list hundreds of names at once, so dazzling that none of them is usable. There are also many name-scoring websites and apps where you enter a name and get an eight-character (bazi) score and a five-grid (wuge) score, which feels like a useful reference. However, they either make you enter names one at a time, or offer very few names of their own, or cannot meet requirements such as a restricted character, or start charging. I could not find one that worked.
So I wanted to write a program that does the following:
- Its main function is to produce a batch of names for reference, each scored against the baby's birth date and time (the eight characters);
- The name library can be extended: for example, if you find a set of good names from the Book of Songs online and want to see how they score, you add them and they are used;
- The characters used in the name can be restricted: for example, some family genealogies fix a generation character, so if the current generation is "guo", the name must contain "guo";
- Every name in the list gets a score, so the names can be sorted from high to low.
This gives us a list of names that fit our child's birth date, our genealogical restriction and our preferences, each with a score that serves as a baseline from which to refine our ideas. Of course, if you have new ideas, you can always add new names to the dictionary and recalculate.
Code structure of the program
Code introduction:
- /chinese-name-score: code root directory
- /chinese-name-score/main: code directory
- /chinese-name-score/main/dicts: dictionary file directory
- /chinese-name-score/main/dicts/names_boys_double.txt
- /chinese-name-score/main/dicts/names_boys_single.txt
- /chinese-name-score/main/dicts/names_girls_single.txt
- /chinese-name-score/main/dicts/names_grils_double.txt
- /chinese-name-score/main/outputs: output data directory
- /chinese-name-score/main/outputs/names_girls_source_wxy.txt
- /chinese-name-score/main/scripts: scripts that preprocess the dictionary files
- /chinese-name-score/main/scripts/unique_file_lines.py: cleans a dictionary file by removing duplicate names and blank lines
- /chinese-name-score/main/sys_config.py: system configuration, including the target URL to crawl and the dictionary file paths
- /chinese-name-score/main/user_config.py: user configuration, including the baby's year, month, day and time of birth, gender, etc.
- /chinese-name-score/main/get_name_score.py: entry point of the program
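The preprocessing step described for unique_file_lines.py (dropping duplicate names and blank lines) could look roughly like the Python 3 sketch below. This is not the repository's script: the function names and the UTF-8 default encoding are assumptions made for illustration.

```python
def unique_lines(lines):
    """Return non-blank lines with duplicates removed, keeping first-seen order."""
    seen = set()
    result = []
    for line in lines:
        name = line.strip()
        if name and name not in seen:
            seen.add(name)
            result.append(name)
    return result


def clean_dictionary_file(path, encoding="utf-8"):
    """Rewrite a dictionary file in place without blank or duplicate lines.

    The encoding is an assumption; pass the one your dictionary files use.
    """
    with open(path, encoding=encoding) as f:
        cleaned = unique_lines(f)
    with open(path, "w", encoding=encoding) as f:
        f.write("\n".join(cleaned) + "\n")
    return cleaned


print(unique_lines(["Guohao", "", "Guotian", "Guohao", "  "]))
```

Keeping the first-seen order means names appended to the end of a dictionary file stay at the end after cleaning.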
How to use the code:
- If you have no restricted character, open the dictionary files names_boys_double.txt and names_grils_double.txt. You can add lists of names you have found here, one per line, at the end of the file.
- If you have a restricted character, open the dictionary files names_boys_single.txt and names_girls_single.txt.
- Open user_config.py and configure it, as shown in the next section.
- Run the script get_name_score.py.
- Go to the outputs directory and view the output file; you can copy it into Excel for sorting and other operations.
Configuration entry for the program
The configuration of the program is as follows:
    # coding:GB18030

    setting = {}

    # Restricted character: if set, the single-character dictionary is used,
    # otherwise the two-character dictionary is used
    setting["limit_world"] = "guo"
    # Surname
    setting["name_prefix"] = "Li"
    # Gender
    setting["sex"] = "male"
    # Province
    setting["area_province"] = "Beijing"
    # City
    setting["area_region"] = "Haidian"
    # Calendar year of birth
    setting['year'] = "2017"
    # Calendar month of birth
    setting['month'] = "1"
    # Calendar day of birth
    setting['day'] = "11"
    # Hour of birth
    setting['hour'] = "..."
    # Minute of birth
    setting['minute'] = "..."
    # Name of the output file
    setting['output_fname'] = "names_girls_source_xxx.txt"
Based on the configuration item setting["limit_world"], the program automatically decides whether to use the single-character dictionary or the two-character dictionary:
- If this item is set, for example to "guo", the program combines it with every character in the single-character dictionary, in both orders, and scores each result; for example, both Guohao and Haoguo are scored;
- If it is not set and the string is left empty, the program reads only the *_double.txt two-character dictionary.
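The dictionary selection just described can be sketched as follows. This is an illustration in Python 3 rather than the program's actual code: the function name and its arguments are hypothetical, and romanized strings stand in for the Chinese characters stored in the dictionary files.

```python
def candidate_given_names(limit_word, single_words, double_names):
    """Return the given names to score.

    With a restricted character, pair it with every single character in both
    orders; without one, use the ready-made two-character names as they are.
    """
    if not limit_word:
        return list(double_names)
    candidates = []
    for word in single_words:
        candidates.append(limit_word + word)  # restricted character first
        candidates.append(word + limit_word)  # restricted character second
    return candidates


print(candidate_given_names("guo", ["hao", "ming"], []))
print(candidate_given_names("", [], ["haoran", "zihan"]))
```

Each single character yields two candidates, so the number of names to score is twice the size of the single-character dictionary.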
How the program works
This is a simple crawler. Open http://life.httpcn.com/xingming.asp and you will see a POST form. Fill in the required parameters and submit it, and a results page opens; the bottom of that page contains the eight-character score and the five-grid score.
To get the scores, the program has to do two things: submit the form automatically and fetch the results page, then extract the scores from that page.
The first is easy with urllib2 (the code is in /chinese-name-score/main/get_name_score.py):
    import urllib
    import urllib2

    # URL-encode the form fields and send them as the POST body
    post_data = urllib.urlencode(params)
    req = urllib2.urlopen(sys_config.REQUEST_URL, post_data)
    content = req.read()
Here params is a dict of form parameters. The call submits a POST request carrying that data, and the result page is returned in content.
The params are set as follows:
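urllib2 exists only in Python 2. For readers on Python 3, the same POST could be sketched with urllib.request as below. The helper names are hypothetical, and encoding the form fields as GB18030 is an assumption based on the encoding the article's code declares elsewhere.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

REQUEST_URL = "http://life.httpcn.com/xingming.asp"  # URL given in the article


def build_request(params, url=REQUEST_URL, encoding="gb18030"):
    """Build a POST request whose body is the URL-encoded form fields."""
    body = urlencode(params, encoding=encoding).encode("ascii")
    return Request(url, data=body)


def fetch_result_page(params, timeout=10):
    """Submit the form and return the raw bytes of the result page."""
    with urlopen(build_request(params), timeout=timeout) as resp:
        return resp.read()


# Building the request does not touch the network; only fetch_result_page does.
req = build_request({"xing": "李", "act": "submit"})
print(req.get_method())  # POST, because a data payload is attached
```

Separating request construction from the network call makes it possible to inspect the encoded body before sending anything.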
    params = {}

    # Date type: 0 indicates the Gregorian calendar
    params['data_type'] = "0"
    params['year'] = "%s" % str(user_config.setting["year"])
    params['month'] = "%s" % str(user_config.setting["month"])
    params['day'] = "%s" % str(user_config.setting["day"])
    params['hour'] = "%s" % str(user_config.setting["hour"])
    params['minute'] = "%s" % str(user_config.setting["minute"])
    params['pid'] = "%s" % str(user_config.setting["area_province"])
    params['cid'] = "%s" % str(user_config.setting["area_region"])
    # Surname
    params['xing'] = "%s" % (user_config.setting["name_prefix"])
    # Given name
    params['ming'] = name_postfix
    # Gender: "1" for male, "0" otherwise
    if user_config.setting["sex"] == "male":
        params['sex'] = "1"
    else:
        params['sex'] = "0"

    params['act'] = "submit"
    params['isbz'] = "1"
The second thing, extracting the scores from the web page, is handled by BeautifulSoup4, whose syntax is very simple:
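    import re
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(content, 'html.parser', from_encoding="GB18030")
    full_name = get_full_name(name_postfix)

    # Labels are given in English here; the results page itself is in Chinese
    # print soup.find(string=re.compile(u"name five-grid score"))
    for node in soup.find_all("div", class_="chaxun_b"):
        node_cont = node.get_text()
        if u'name five-grid score' in node_cont:
            # Find the label text, then read the bold score in the next sibling
            name_wuge = node.find(string=re.compile(u"name five-grid score"))
            result_data['wuge_score'] = name_wuge.next_sibling.b.get_text()
        if u'name eight-character score' in node_cont:
            name_wuge = node.find(string=re.compile(u"name eight-character score"))
            result_data['bazi_score'] = name_wuge.next_sibling.b.get_text()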
In this way the HTML is parsed and the eight-character and five-grid scores are extracted.
Sample run results
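To try the extraction idea without installing BeautifulSoup or touching the network, the following Python 3 sketch uses a regular expression on an invented fragment. It is a stand-in technique, not the article's method: the sample markup, the English labels and the function name are all assumptions, and the real page's structure may differ.

```python
import re

# Made-up fragment: the real results page is in Chinese and its markup may differ.
SAMPLE_HTML = """
<div class="chaxun_b">name eight-character score: <font><b>61.5</b></font></div>
<div class="chaxun_b">name five-grid score: <font><b>78.6</b></font></div>
"""


def extract_score(html, label):
    """Return the first number inside a <b> tag that follows the label, or None."""
    pattern = re.escape(label) + r".*?<b>\s*([\d.]+)\s*</b>"
    match = re.search(pattern, html, flags=re.S)
    return float(match.group(1)) if match else None


bazi = extract_score(SAMPLE_HTML, "name eight-character score")
wuge = extract_score(SAMPLE_HTML, "name five-grid score")
print(bazi, wuge, round(bazi + wuge, 1))
```

A regular expression is brittle against markup changes, which is one reason a real HTML parser is the more robust choice for the actual crawler.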
    1/1287 Li Guojin name eight-character score=61.5 name five-grid score=78.6 total score=140.1
    2/1287 Li Guotie name eight-character score=61 name five-grid score=89.7 total score=150.7
    3/1287 Li Guojing name eight-character score=21 name five-grid score=81.6 total score=102.6
    4/1287 Li Mingguo name eight-character score=21 name five-grid score=90.3 total score=111.3
    5/1287 Li Ruoguo name eight-character score=64 name five-grid score=78.3 total score=142.3
    6/1287 Li Guojing name eight-character score=21 name five-grid score=89.8 total score=110.8
    7/1287 … name eight-character score=22 name five-grid score=87.2 total score=109.2
    8/1287 … name eight-character score=21 name five-grid score=81.6 total score=102.6
    9/1287 … name eight-character score=21 name five-grid score=83.7 total score=104.7
    10/1287 Li Guotian name eight-character score=21 name five-grid score=81.6 total score=102.6
    11/1287 Li Guotian name eight-character score=22 name five-grid score=83.7 total score=105.7
    12/1287 Li Guotian name eight-character score=22 name five-grid score=93.7 total score=115.7
With these scores the names can be sorted, which makes for a useful reference.
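As an alternative to sorting in Excel, the ranking could be done in a few lines of Python 3. The sketch below assumes each result is a dict; the bazi_score and wuge_score keys follow the article's result_data, while the name key, the function names and the tab-separated layout are hypothetical. The three rows are taken from the sample run.

```python
import csv
import io

results = [
    {"name": "Li Guojin", "bazi_score": 61.5, "wuge_score": 78.6},
    {"name": "Li Guotie", "bazi_score": 61.0, "wuge_score": 89.7},
    {"name": "Li Mingguo", "bazi_score": 21.0, "wuge_score": 90.3},
]


def rank_by_total(rows):
    """Add a total_score to each row and return the rows sorted high to low."""
    ranked = []
    for row in rows:
        total = round(row["bazi_score"] + row["wuge_score"], 1)
        ranked.append(dict(row, total_score=total))
    return sorted(ranked, key=lambda r: r["total_score"], reverse=True)


def to_tsv(rows):
    """Render the rows as tab-separated text that pastes cleanly into a spreadsheet."""
    buf = io.StringIO()
    fields = ["name", "bazi_score", "wuge_score", "total_score"]
    writer = csv.DictWriter(buf, fieldnames=fields, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


ranked = rank_by_total(results)
print(to_tsv(ranked))
```

Rounding the total to one decimal place avoids floating-point noise such as trailing digits when two scores are added.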
Helpful hints
- Scores depend on many factors, such as the time of birth, the restricted character, and the stroke count of the restricted character. These conditions mean that some names simply cannot score high, so do not let that bother you.
- At present the program can crawl only one website, http://life.httpcn.com/xingming.asp.
- This list is for reference only. I have read articles pointing out that many famous and great people in history have names that score very low, yet they achieved great things. A name may have some influence, but sometimes a catchy one is best.
- After choosing a name from the list, search for it on Baidu, Renren.com and similar sites, in case someone with a bad reputation has the same name, or too many people share it.
- The eight-character score is a tradition handed down in China, while the five-grid score was invented in Japan in modern times. You can also try naming methods based on the Western zodiac. Oddly, the eight-character score and the five-grid score often differ greatly, which shows that all of this is for reference only.
The code for this article has been uploaded to GitHub: github.com/peiss/chine…
Original article: www.crazyant.net/2076.html. Please credit the source when reposting.