
During development and testing, we inevitably need a batch of test data to check that the code runs correctly, and that data often has to be created by hand. This is manageable when only a little data is needed, but with a large amount it becomes tedious and error-prone.

Also, hand-made data tends to be unfriendly, for example:

test1.11111 123456789
test2.22222 123456789
test3.33333 987654321

It is all meaningless fake data, and to be honest, I do this occasionally too.

But when you think about it, this kind of test data is pretty poor: besides wasting time, it carries no meaning.

Fortunately, there is a third-party Python library that solves this problem: Faker, which generates all kinds of fake data that looks like real data.

It can be installed with the pip install Faker command.

Basic usage

Faker is very simple to use: import the Faker class from the faker module, instantiate it, and call the method for the kind of data you want.

from faker import Faker


fake = Faker()
name = fake.name()
address = fake.address()
print(name)
print(address)

The result is as follows:

Alicia Miller
3944 Steven Forges
Johnfort, FL 57383

In the code above we generate a name and an address. The Faker library generates English data by default. We can use the locale argument to specify the language, for example:

from faker import Faker


fake = Faker(locale='zh_CN')
name = fake.name()
address = fake.address()
print(name)
print(address)

The address part of the output looks like this (shown here in translation):

Block P, Shenhe Hu Street, Taiwan County, Beijing 309525

Looks good, doesn't it? Although "Taiwan County, Beijing" did catch me off guard. The generated address is not a real address but a random combination of province, city, road, and so on. Some combinations happen to look normal, and either way it does not affect use as test data, so there is little to complain about.

Other languages are supported as well. Some of the locale codes are:

  • Simplified Chinese: zh_CN
  • Traditional Chinese: zh_TW
  • American English: en_US
  • British English: en_GB
  • Korean: ko_KR
  • Japanese: ja_JP

Of course, most of these are rarely needed in everyday work.

Commonly used functions

Besides fake.name() and fake.address(), which generate names and addresses, Faker has many other methods for generating other kinds of information.

Personal information

name(): random full name
name_female(): female full name
name_male(): male full name
first_name_female(): female first name
last_name_female(): female last name
last_name_male(): male last name
ssn(): ID number
phone_number(): random phone number

Geographic information

fake.country(): country
fake.country_code(): country code
fake.province(): province
fake.city_suffix(): city suffix (city, county)
fake.district(): district
fake.street_address(): street address
fake.street_suffix(): street suffix (street, road)
fake.address(): full address
fake.geo_coordinate(): geographic coordinate
fake.latitude(): geographic coordinate (latitude)
fake.longitude(): geographic coordinate (longitude)
fake.postcode(): postcode

Company information

bs(): company service description
company(): company name
company_prefix(): company name prefix
company_suffix(): company type

Network information

ascii_company_email(): company email address (ASCII)
company_email(): company email address
email(): email address
safe_email(): safe email address
domain_name(): domain name
domain_word(): second-level domain name
ipv4(): IPv4 address
ipv6(): IPv6 address
mac_address(): random MAC address
tld(): domain name suffix (.com, .net.cn, etc.)
uri_extension(): URL file extension
uri_page(): URL page name (without extension)
uri_path(): URL file path
url(): random URL
user_name(): random username
image_url(): random image URL

Text and encryption

paragraph(): one random paragraph
paragraphs(): several random paragraphs
sentence(): one random sentence
sentences(): more than one sentence
text(): a random block of text
word(): one random word
words(): more than one word

Encoding related

binary(): random binary data
md5(): MD5 hash
password(): password
sha1(): SHA-1 hash
sha256(): SHA-256 hash
uuid4(): UUID

At the time of writing, Faker supports 304 types of data. Use dir(fake) to check which types of data the Faker library supports.

Custom Faker data types

Faker also supports custom Providers: if none of the built-in data types is what we need, we can define our own.

from faker import Faker
from faker.providers import BaseProvider


# Create a custom Provider
class MyProvider(BaseProvider):
    def my_data_type(self):
        return 'my_data_type'


# Add the Provider
fake = Faker()
fake.add_provider(MyProvider)
print(fake.my_data_type())

Conclusion

The methods covered above are only some of the common ones; Faker can generate far more kinds of data than these. Its source code is also well worth studying, so take a look if you are interested.

Writing original content takes effort. If you found this helpful, please give it a like before you go ~

Finally, thanks to my girlfriend for her tolerance, understanding and support in work and life!