Python is one of the most popular and powerful programming languages. Because it is free and open source, anyone can use it, and most Fedora systems come with the language already installed. Python is useful for a wide variety of tasks, including processing comma-separated value (CSV) data. CSV files often start out life as tables or spreadsheets. This article explains how to work with CSV data in Python 3.

CSV data is exactly what it sounds like. CSV files place data in rows, with values separated by commas. Every row has the same fields. Short CSV files are usually easy to read and understand. But longer data files, or files with more fields, can be hard to parse with the naked eye, so computers do a better job in those cases.

This is a simple example where the fields are Name, Email, and Country. Here the first line of the CSV data defines the fields, although that is not always so.

Name,Email,Country
John Q. Smith,[email protected],USA
Petr Novak,[email protected],CZ
Bernard Jones,[email protected],UK
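
To make the row-and-field structure concrete, here is a minimal sketch that parses data of the same shape with the standard library's `csv.reader`, which returns each row as a plain list of strings. The email addresses are placeholders invented for the illustration, and the data is read from an in-memory string rather than a file so the snippet runs on its own.

```python
import csv
import io

# Placeholder data in the same shape as the example; the addresses are made up.
text = (
    "Name,Email,Country\n"
    "John Q. Smith,john@example.com,USA\n"
    "Petr Novak,petr@example.com,CZ\n"
)

# csv.reader yields each row as a list of strings.
rows = list(csv.reader(io.StringIO(text)))

# The first row holds the field names; the remaining rows hold the records.
header, records = rows[0], rows[1:]

print(header)
print(records[0])
```

Because every value comes back as a string, any numeric fields would need to be converted explicitly.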

Reading CSV from spreadsheets

Python includes a csv module that reads and writes CSV data. Most spreadsheet applications, whether native (such as Excel or Numbers) or web-based (such as Google Sheets), can export CSV data. In fact, many other services that publish tabular reports can also export to CSV (PayPal, for example).

The Python csv module has a built-in reader class called DictReader that treats each row of data as an OrderedDict (from Python 3.8 onward, a regular dict). It requires a file object to access the CSV data. So, if the file above is example.csv in the current directory, the following code snippet is one way to get this data:

from csv import DictReader

# Open the file for reading and wrap it in a DictReader
f = open('example.csv', 'r')
d = DictReader(f)

# Collect every row into a list
data = []
for row in d:
    data.append(row)

Now the in-memory data object is a list of OrderedDict objects (plain dict objects on Python 3.8 and later):

[OrderedDict([('Name', 'John Q. Smith'),
               ('Email', '[email protected]'),
               ('Country', 'USA')]),
  OrderedDict([('Name', 'Petr Novak'),
               ('Email', '[email protected]'),
               ('Country', 'CZ')]),
  OrderedDict([('Name', 'Bernard Jones'),
               ('Email', '[email protected]'),
               ('Country', 'UK')])]

Referring to each of these objects is easy:

>>> print(data[0]['Country'])
USA
>>> print(data[2]['Email'])
[email protected]
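
As a further illustration of working with these row objects, the sketch below loads records of the same shape inside a `with` block, so the file is closed automatically, and then filters them by field. The helper name `load_rows` is hypothetical and the email addresses are placeholders; the sample is written to a temporary file only so the snippet can run on its own.

```python
import csv
import os
import tempfile

# Placeholder data mirroring example.csv; the addresses are made up.
SAMPLE = (
    "Name,Email,Country\n"
    "John Q. Smith,john@example.com,USA\n"
    "Petr Novak,petr@example.com,CZ\n"
    "Bernard Jones,bernard@example.com,UK\n"
)

def load_rows(path):
    """Return every row of the CSV file at `path` as a list of dict-like rows."""
    # The csv documentation recommends newline='' when opening CSV files.
    with open(path, newline='') as f:
        return list(csv.DictReader(f))

# Write the sample to a temporary file so the sketch is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'example.csv')
    with open(path, 'w', newline='') as f:
        f.write(SAMPLE)
    data = load_rows(path)

# Rows behave like dictionaries, so fields can be looked up by name.
countries = [row['Country'] for row in data]
czech_names = [row['Name'] for row in data if row['Country'] == 'CZ']

print(countries)
print(czech_names)
```

Calling `list()` on the reader does the same job as the explicit loop with `append`, in one step.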

By the way, if you need to work with a CSV file that has no header line of field names, the DictReader class lets you define them yourself. In the example above, add the fieldnames argument and pass a sequence of the names:

d = DictReader(f, fieldnames=['Name', 'Email', 'Country'])
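
A minimal sketch of the headerless case, assuming the data arrives with no field-name line at all. The rows are read from an in-memory string, and the email addresses are placeholders invented for the illustration.

```python
import csv
import io

# Headerless placeholder data; the addresses are made up.
text = (
    "John Q. Smith,john@example.com,USA\n"
    "Petr Novak,petr@example.com,CZ\n"
)

# With fieldnames supplied, the first line is treated as data, not as a header.
reader = csv.DictReader(io.StringIO(text), fieldnames=['Name', 'Email', 'Country'])
rows = list(reader)

print(len(rows))
print(rows[0]['Name'])
```

Note that if the file does have a header line and fieldnames is passed anyway, that header line comes back as an ordinary data row.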

A real example

I recently needed to pick a random winner from a long list of people. The CSV data I pulled from the spreadsheet was a simple list of names and email addresses.

Fortunately, Python has a helpful random module that does a good job of generating random values. The randrange function in the module's Random class was exactly what I needed. You give it a range of integers (a start and a stop) and, optionally, a step between them. The function then returns a random integer from that range, which means I could get a random row number within the total number of rows of data.
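
A short sketch of how randrange treats its start, stop, and step arguments. The list of entries is hypothetical, and the fixed seed is there only to make the run repeatable; a real drawing would leave it out.

```python
from random import Random

# A fixed seed makes the sketch repeatable; omit it for a real drawing.
r = Random(42)

entries = ['first', 'second', 'third', 'fourth', 'fifth']  # hypothetical rows

# randrange(start, stop, step) picks from start, start + step, ... below stop.
index = r.randrange(0, len(entries), 1)       # any valid list index
even_index = r.randrange(0, len(entries), 2)  # one of 0, 2, 4

print(entries[index])
print(even_index)
```

Since stop is excluded, passing `len(entries)` as the stop value always produces a valid index into the list.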

This little program works fine:

from csv import DictReader
from random import Random

# Load every row of the CSV file into a list
with open('mydata.csv', newline='') as f:
    data = list(DictReader(f))

# Pick a random row index from 0 up to len(data) - 1
r = Random()
winner = data[r.randrange(0, len(data), 1)]
print('The winner is:', winner['Name'])
print('Email address:', winner['Email'])
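
As a possible variation on the program above, and not the approach the article itself uses, `random.choice` selects an element directly without computing an index. In this sketch the function name `pick_winner` is hypothetical, and the entries are placeholders read from an in-memory string standing in for mydata.csv.

```python
import csv
import io
import random

def pick_winner(csv_file, rng=random):
    """Return one randomly chosen row from an open CSV file with a header line."""
    rows = list(csv.DictReader(csv_file))
    if not rows:
        raise ValueError('no entries to choose from')
    # choice() selects one element directly, so no index arithmetic is needed.
    return rng.choice(rows)

# Placeholder entries standing in for mydata.csv; the addresses are made up.
entries = io.StringIO(
    "Name,Email\n"
    "Alice Example,alice@example.com\n"
    "Bob Example,bob@example.com\n"
)

winner = pick_winner(entries)
print('The winner is:', winner['Name'])
print('Email address:', winner['Email'])
```

Checking for an empty list first gives a clearer error than the one raised when choosing from no rows at all.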

Obviously, this is a very simple example. Spreadsheets themselves offer sophisticated ways to analyze data. But if you want to do something outside the realm of your spreadsheet application, Python might be just the trick!

Photograph by Isaac Smith, published by Unsplash.


Via: fedoramagazine.org/using-data-…

By Paul W. Frields, lujun9972

This article was originally compiled by LCTT and published by Linux China