
Fake (almost) everything with Faker

I was recently tasked with creating some random customer data, with names, phone numbers, addresses, and the usual other stuff. At first, I thought I’d just generate random strings and numbers (some gibberish) and call it a day. But then I remembered my colleagues using a package for that. I know, there’s always a package for everything. Well, almost everything.

Anyway, I thought I’d give it a shot. I’ve started writing some serious Python code these days, and it seemed like a good idea to explore the various packages available for Python. I ran the pip command, installed the package, and started generating some random people in my CSV files. It was fun. So I thought I’d document the process because, given my history, I’ll definitely forget about Faker.


Installing Faker

Installing Faker is no different from installing any other Python package with pip. Any one of the following commands will install it.

pip install faker
pip3 install faker
python -m pip install faker
python3 -m pip install faker

Use whichever command matches the version of Python you have installed and how pip is set up on your machine. It shouldn’t take more than a couple of minutes.


Importing Faker into your code and initialising it

Importing Faker into your code is nothing special either. Simply add the following import statement at the beginning of your Python file and you should be good to go.

from faker import Faker

Once you have imported the package, you need to create an object of the Faker class. You can do that with the following statement. The locale parameter is optional, though. You can skip it and you’ll be totally fine.

faker = Faker(locale='en_US')

Let’s look at what it can do first

Before we dive into the code, let’s have a look at what it can do for us first.

My name is Mx. Linda Dunn III , I'm a gender neutral person. You can call me at 001-099-311-6470, or email me at caroljohnson@hotmail.com, or visit my home at 2703 Fitzpatrick Squares Suite 785
New Crystal, MN 18112

My name is Dr. John Harris MD , I'm a male. You can call me at (276)611-1727, or email me at combstiffany@brown-rivers.org, or visit my home at 7409 Peterson Locks Apt. 270
South Kimfurt, IL 79246

My name is Dr. Ann Huynh DVM , I'm a female. You can call me at 543.024.8936, or email me at timothy30@shea-poole.com, or visit my home at 5144 Rubio Island
South Kenneth, WI 22855

This is the output of a simple Python script that I wrote to generate fake customer data, or fake people. Looking at this, it’s amazing how realistic it looks. And the code I used to get this output is the following:

from faker import Faker

faker = Faker(locale='en_US')

print("My name is %s %s %s , I'm a gender neutral person. You can call me at %s, or email me at %s, or visit my home at %s" % 
    (faker.prefix_nonbinary(), faker.name_nonbinary(), faker.suffix_nonbinary(), faker.phone_number(), faker.ascii_free_email(), faker.address())
)

print("My name is %s %s %s , I'm a male. You can call me at %s, or email me at %s, or visit my home at %s" % 
    (faker.prefix_male(), faker.name_male(), faker.suffix_male(), faker.phone_number(), faker.ascii_company_email(), faker.address())
)

print("My name is %s %s %s , I'm a female. You can call me at %s, or email me at %s, or visit my home at %s" % 
    (faker.prefix_female(), faker.name_female(), faker.suffix_female(), faker.phone_number(), faker.company_email(), faker.address())
)

You can see now how easy it is to generate large numbers of fake customers, for testing of course. And the fun doesn’t end here. There’s a lot more where that came from. You can generate a whole company, for example:

The company I just created!
David PLC
Providers of Horizontal value-added knowledge user
Phone: 001-891-255-4642x93803
Email: ksanchez@cochran.com
234 Torres Ports
West Rhonda, AL 96210

As you can see from the output above, we provide some great horizontal value-added knowledge user. That’s supposed to be the company’s catch phrase.

And I kid you not, there’s a method called bs(). I don’t know when you’d ever use it, but you can call Faker’s bs() any time you want. See what I did there?


How does this help?

Well, I thought you’d have already figured that part out. Anyway, when you need data to test with, and you need that data to be as realistic as possible, you can use Faker to generate test data easily and quickly.

Actually, I’m not sure about the “quickly” part of my last sentence. It’s definitely easy to generate the data. But to generate one million customer records with first name, last name, email, phone, etc., it took almost 350 seconds on a 2019 16-inch base model MacBook Pro. So make of it what you will.


Summary

Nonetheless, this is definitely a very handy and fun package to have in your arsenal. You can generate any number of customers or friends (swing however you swing) very easily, with a complete offline and online profile for each person. You can generate home phone and email, work phone and email, home address, work address, interests, profiles, credit cards, license plate numbers, and a lot more. So do head over to the package’s GitHub repo, take a look around, and take it for a spin. The source code is pretty easy to understand as well.

And if you like what you see here, or on my Medium blog, and would like to see more helpful technical posts like this in the future, consider supporting me on Patreon and GitHub.
