Author | Aniruddha Bhandari
Compiled by | vitamin k
Source | Analytics Vidhya
Overview
- This Python style guide will help you write neat Python code
- Learn the different Python conventions and other nuances of Python programming in this style guide
Introduction
Have you ever come across a badly written piece of Python code? I know many of you will nod.
Writing code is part of the role of a data scientist or analyst. Writing nice, clean Python code, on the other hand, is another matter entirely. It can make or break your image as a proficient programmer in analytics or data science (or even software development).
So how do we write this supposedly beautiful Python code?
Welcome to the Python style guide
Many people in data science and analytics come from non-programming backgrounds. We started by learning the basics of programming, followed by understanding the theory behind machine learning, and then started conquering data sets.
In the process, we often didn’t practice core programming and didn’t pay attention to programming conventions.
That’s what this Python style guide will tackle. We’ll review the Python programming conventions described in the PEP 8 document, and you’ll come out a better programmer!
Table of contents
- Why is this Python style guide important for data science?
- What is PEP 8?
- Understand Python naming conventions
- Code layout in the Python style guide
- Be familiar with correct Python comments
- Whitespace in Python code
- General programming advice for Python
- Automatic formatting of Python code
Why is this Python style guide important for data science?
Formatting is an important aspect of programming for several reasons, especially for data science projects:
- Readability
Good formatting will inevitably improve the readability of your code. It not only makes your code more organized, but also makes it easier for the reader to understand what is going on in the program. This is especially useful if your program runs into thousands of lines.
You’ll have lots of data frames, lists, functions, plots, etc., and if you don’t follow the right formatting guidelines, you can easily lose track of your own code!
- Collaboration
If you’re collaborating on a team project, as most data scientists do, good formatting becomes an essential task.
This ensures that the code is understood correctly without causing any trouble. In addition, following a common format pattern maintains program consistency throughout the project lifecycle.
- Bug fixes
Having well-formed code will also help you when you need to fix bugs in your programs. Wrong indentation, improper naming, and so on can easily make debugging a nightmare!
Therefore, it is best to start your program in the right writing style!
With that in mind, let’s take a quick look at the PEP 8 style guide that this article covers!
What is PEP 8?
PEP 8 is the style guide for Python code; PEP stands for Python Enhancement Proposal. It was written by Guido van Rossum, Barry Warsaw, and Nick Coghlan. It describes the rules for writing beautiful and readable Python code.
Following the PEP 8 coding style ensures consistency in Python code, making it easier for other readers, contributors, or yourself to understand.
This article covers the most important aspects of the PEP 8 guidelines: how to name Python objects, how to structure code, when to include comments and whitespace, and finally some general programming advice that is important but easily overlooked by most Python programmers.
Let’s learn to write better code!
The official PEP-8 documentation can be found here.
www.python.org/dev/peps/pe…
Understand Python naming conventions
Shakespeare famously asked, “What’s in a name?” If he had met a programmer, he would have gotten a quick reply: “A lot!”
Yes, when you write a piece of code, the names you choose for variables, functions, etc., have a big impact on the understandability of the code. Look at the following code:
# Function 1
def func(x):
    a = x.split()[0]
    b = x.split()[1]
    return a, b

print(func('Analytics Vidhya'))

# Function 2
def name_split(full_name):
    first_name = full_name.split()[0]
    last_name = full_name.split()[1]
    return first_name, last_name

print(name_split('Analytics Vidhya'))
# Output
('Analytics', 'Vidhya')
('Analytics', 'Vidhya')
Both functions do the same thing, but the latter provides a better intuition of what’s going on, even without any comments!
That’s why choosing the right name and following the right naming convention can make a huge difference when writing a program. That being said, let’s look at how to name objects in Python!
General naming tips
These tips can be applied to naming any entity and should be strictly followed.
- Follow the same pattern
thisVariable, ThatVariable, some_other_variable, BIG_NO
- Avoid overly long names, but don’t be too frugal with names either
this_could_be_a_bad_name = "Avoid this!"
t = "This isn't good either"
- Use sensible and descriptive names. This will help you remember the purpose of the code later
x = "My Name"  # Avoid this
full_name = "My Name"  # This is better
- Avoid names that start with a number
1_name = "This is bad!"
- Avoid special characters, such as @, !, #, $, etc.
phone_#  # Bad name
Variable naming
- Variable names should always be lowercase
blog = "Analytics Vidhya"
- For longer variable names, separate words with underscores. This improves readability
awesome_blog = "Analytics Vidhya"
- Try not to use single-character variable names such as “I” (uppercase i), “O” (uppercase o), and “l” (lowercase L). They can be indistinguishable from the numerals 1 and 0. Take a look:
O = 0 + l + I + 1
- Naming global variables follows the same convention
Function naming
- Follow the lowercase-with-underscores naming convention
- Use expressive names
# Avoid
def con():
    ...

# This is better
def connect():
    ...
- If a function parameter name clashes with a keyword, use a trailing underscore instead of an abbreviation. For example, turn break into break_ rather than brk
def break_time(break_):
    print("Your break time is", break_, "long")
Class naming
- Follow the CapWords (also called CamelCase or StudlyCaps) naming convention. Start each word with a capital letter and don’t use underscores between words
# Follow CapWords
class MySampleClass:
    pass
- If a class has subclasses that may use the same attribute names, consider adding a double leading underscore to the class attribute
This ensures that the attribute __age in class Person is accessed as _Person__age. This is Python’s name mangling, and it ensures that there are no name collisions
class Person:
    def __init__(self):
        self.__age = 18

obj = Person()
obj.__age  # Error
obj._Person__age  # Correct
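To see why name mangling prevents collisions, here is a minimal sketch. The Student subclass and the two getter methods are hypothetical names added for illustration; they are not part of the original example.

```python
class Person:
    def __init__(self):
        self.__age = 18  # stored on the instance as _Person__age

    def person_age(self):
        return self.__age


class Student(Person):  # hypothetical subclass reusing the same attribute name
    def __init__(self):
        super().__init__()
        self.__age = 21  # stored as _Student__age, so nothing is overwritten

    def student_age(self):
        return self.__age


student = Student()
print(student.person_age())   # 18
print(student.student_age())  # 21
print(sorted(vars(student)))  # ['_Person__age', '_Student__age']
```

Each class keeps its own copy of the attribute, so the subclass cannot accidentally clobber the value its parent relies on.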
- Use the suffix “Error” for exception classes
class CustomError(Exception):
    """Custom exception class"""
Class method naming
- The first argument to an instance method (the basic method of a class, with no decorator attached) should always be self. It points to the calling object
- The first argument to a class method should always be cls. It points to the class, not the object instance
class SampleClass:
    def instance_method(self, del_):
        print("Instance method")

    @classmethod
    def class_method(cls):
        print("Class method")
Package and module naming
- Keep names short and clear
- Follow the lowercase naming convention
- For long module names, underscores are preferred
- Avoid underscores in package names
testpackage  # Package name
sample_module.py  # Module name
Constant naming
- Constants are usually declared and assigned in modules
- Constant names should be all uppercase
- Use underscores for longer names
# The following constants are defined in the global.py module
PI = 3.14
GRAVITY = 9.8
SPEED_OF_LIGHT = 3 * 10**8
Code layout in the Python style guide
Now that you know how to name entities in Python, the next question should be how to structure code in Python!
Honestly, this is very important, because without proper structure your code can go haywire and become a major hurdle for any reviewer.
So without further ado, let’s take a look at the basics of code layout in Python.
Indentation
Indentation is one of the most important aspects of code layout and plays a critical role in Python. It tells Python which lines belong to a code block. Missing indentation can turn out to be a serious mistake.
Indentation determines which code block the code statement belongs to. Imagine trying to write a nested for loop. Writing a line of code outside the respective loops may not give you a syntax error, but you will certainly end up with a logic error that can be time-consuming in debugging.
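The nested-loop case can be sketched as follows. Both function names are hypothetical and exist only for this illustration; the only difference between them is the indentation level of one line.

```python
def count_inside_inner_loop(rows, cols):
    total = 0
    for _ in range(rows):
        for _ in range(cols):
            total += 1  # belongs to the inner loop: runs rows * cols times
    return total


def count_inside_outer_loop(rows, cols):
    total = 0
    for _ in range(rows):
        for _ in range(cols):
            pass
        total += 1  # one level less: runs only once per row
    return total


print(count_inside_inner_loop(3, 4))  # 12
print(count_inside_outer_loop(3, 4))  # 3
```

Neither version raises an error, which is exactly why this kind of mistake is slow to track down.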
Follow the indentation style mentioned below for a consistent Python scripting style.
- Always follow the four-space indentation rule
# Example
if value < 0:
    print("Negative value")

# Another example
for i in range(5):
    print("Follow this rule religiously!")
- Prefer spaces over tabs
Spaces are the recommended way to indent. Tabs may still be used when the code has already been indented with tabs.
if True:
    print('4 spaces of indentation used!')
- Break a large expression into several lines
There are several ways to deal with this situation. One way is to align the continuation lines with the opening delimiter.
# Aligned with the opening delimiter.
def name_split(first_name,
               middle_name,
               last_name):
    print(first_name, middle_name, last_name)

# Another example.
ans = solution(value_one, value_two,
               value_three, value_four)
The second method uses the four-space indentation rule. This requires an additional level of indentation to distinguish the parameters from the other code within the block.
# Use extra indentation.
def name_split(
        first_name,
        middle_name,
        last_name):
    print(first_name, middle_name, last_name)
Finally, you can even use a “hanging indent”. In Python, a hanging indent is the style in which the line containing the opening bracket ends with that bracket, and the following lines are indented until the closing bracket.
# Hanging indent.
ans = solution(
    value_one, value_two,
    value_three, value_four)
- Multi-line if statements can be a problem
In an if statement with multiple conditions, the two-character keyword if, a space, and the opening parenthesis naturally add up to a four-space indent. As the first snippet below shows, this can be a problem: the continuation lines are indented by the same amount, so the condition cannot be told apart from the block of code it guards. Now, what do we do?
Well, there are a few ways we can get around it:
# This is a problem.
if (condition_one and
    condition_two):
    print("Implement this")
One way is to use extra indentation!
# Use extra indentation.
if (condition_one and
        condition_two):
    print("Implement this")
Another approach is to add a comment between the if statement condition and the code block to separate the two:
# Add a comment.
if (condition_one and
    condition_two):
    # This condition is valid
    print("Implement this")
- Closing brackets
Suppose you have a very long dictionary. You put all the key-value pairs on separate lines, but where do you put the closing bracket? Does it go on the same line as the last key-value pair, or on a line of its own? And if it goes on its own line, how should it be indented?
There are several ways to solve this problem.
One way is to align the closing bracket with the first non-whitespace character of the previous line.
learning_path = {
    'Step 1': 'Learn programming',
    'Step 2': 'Learn machine learning',
    'Step 3': 'Crack on the hackathons'
    }
The second way is to make it the first character of a new line.
learning_path = {
    'Step 1': 'Learn programming',
    'Step 2': 'Learn machine learning',
    'Step 3': 'Crack on the hackathons'
}
- Break lines before binary operators
If you try to fit too many operators and operands on a single line, it quickly becomes hard to read. Instead, break the expression into several lines for better readability.
The obvious question now is: do you break before or after the operator? The convention is to break lines before operators. This makes it easy to see each operator and the operand it acts on.
# Break lines before operators.
gdp = (consumption
       + government_spending
       + investment
       + net_exports)
Using blank lines
Bunching lines of code together only makes it harder for the reader to understand your code. A good way to make your code look cleaner and prettier is to introduce an appropriate number of blank lines.
- Top-level functions and classes should be separated by two blank lines
# Separating classes and top-level functions.
class SampleClass():
    pass


def sample_function():
    print("Top level function")
- Methods in a class should be separated by a single blank line
# Separating methods within a class.
class MyClass():
    def method_one(self):
        print("First method")

    def method_two(self):
        print("Second method")
- Try not to include blank lines between pieces of code that have related logic and function
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

def remove_stopwords(text):
    stop_words = stopwords.words("english")
    tokens = word_tokenize(text)
    clean_text = [word for word in tokens if word not in stop_words]
    return clean_text
- Use blank lines sparingly within functions to separate logical sections. This makes the code easier to understand
def remove_stopwords(text):
    stop_words = stopwords.words("english")
    tokens = word_tokenize(text)
    clean_text = [word for word in tokens if word not in stop_words]

    clean_text = ' '.join(clean_text)
    clean_text = clean_text.lower()

    return clean_text
Maximum line length
- Limit each line to 79 characters
When you write code in Python, avoid squeezing more than 79 characters into a single line. That is the recommended limit, and it should be your guideline for keeping statements short.
- You can split a statement across multiple lines to turn it into shorter lines of code
num_list = [y for y in range(100)
            if y % 2 == 0
            if y % 5 == 0]
print(num_list)
Imports
Part of the reason many data scientists like Python is the sheer number of libraries that make working with data easier. So it is safe to assume you’ll end up importing a bunch of libraries and modules for any data science task.
- Imports should always come at the top of the Python script
- Separate libraries should be imported on separate lines
import numpy as np
import pandas as pd

df = pd.read_csv(r'/sample.csv')
- Imports should be grouped in the following order:
  - Standard library imports
  - Related third-party imports
  - Local application/library-specific imports
- Include a blank line after each group of imports
from glob import glob

import matplotlib
import numpy as np
import pandas as pd
import spacy

import mypackage
- You can import multiple classes from the same module in a single line
from math import ceil, floor
Be familiar with correct Python comments
Understanding a piece of uncommented code can be a laborious task. Even the original writers of the code have a hard time remembering exactly what happened in the line of code after a while.
Therefore, it is best to comment the code in a timely manner so that the reader can correctly understand what you are trying to achieve with the code.
General tips
- Comments always begin with a capital letter
- Comments should be complete sentences
- Update comments when updating code
- Avoid commenting on the obvious
Block comments
- Describe the piece of code that follows them
- Have the same indentation as that piece of code
- Start with a # followed by a single space
# Remove non-alphanumeric characters from the user input string.
import re

raw_text = input('Enter string:')
text = re.sub(r'\W+', ' ', raw_text)
Inline comments
- These comments are on the same line as the code statement
- They should be separated from the code statement by at least two spaces
- Start with the usual #, followed by a space
- Don’t use them to state the obvious
- Use them sparingly as they can be distracting
info_dict = {}  # Dictionary to store the extracted information
Documentation strings
- Used to describe public modules, classes, functions, and methods
- Also known as “docstrings”
- They stand out from other comments because they are enclosed in triple quotation marks
- If the docstring fits on a single line, put the closing """ on the same line
- If the docstring spans multiple lines, put the closing """ on a line of its own
def square_num(x):
    """Return the square of a number."""
    return x**2


def power(x, y):
    """Multi-line comment.
    Return x raised to the power y.
    """
    return x**y
Whitespace in Python code
Whitespace is often ignored as a trivial aspect when writing beautiful code. But using whitespace correctly can greatly improve the readability of your code. They help prevent overcrowding of code statements and expressions. This inevitably helps the reader navigate the code with ease.
Key points
- Avoid putting whitespace immediately inside brackets
# Correct
df['text'] = df['text'].apply(preprocess)
- Do not put whitespace before commas, semicolons, or colons
# Correct
name_split = lambda x: x.split()
- Do not put whitespace between a name and the opening bracket that follows it
# Correct
print('This is the right way')

# Correct
for i in range(5):
    name_dict[i] = input_list[i]
- When multiple operators are used, add spaces only around the operators with the lowest priority
ans = x**2 + b*x + c
- In slices, the colon acts as a binary operator
Colons should be treated as the lowest-priority operators. Each colon must have an equal amount of space on either side
df_train[lower_bound+5 : upper_bound-5]
- Trailing whitespace should be avoided
- Do not put spaces around the = sign for function parameter defaults
def exp(base, power=2):
    return base**power
- Always surround the following binary operators with a single space on either side:
  - Assignment operators (=, +=, -=, etc.)
  - Comparisons (==, <, >, !=, <>, <=, >=, in, not in, is, is not)
  - Booleans (and, or, not)
brooklyn = ['Amy', 'Terry', 'Gina', 'Jake']
count = 0
for name in brooklyn:
    if name == 'Jake':
        print('Cool')
        count += 1
General programming advice for Python
In general, there are many ways to write a piece of code. When they accomplish the same task, it is best to use the recommended writing method and maintain consistency. I’ve covered some of them in this section.
- Always use “is” or “is not” when comparing with singletons such as None. Do not use the equality operators
# Wrong
if name != None:
    print("Not null")

# Correct
if name is not None:
    print("Not null")
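One reason to prefer “is” here is that the equality operators can be overridden by a class, while identity checks cannot. A minimal sketch, using a hypothetical AlwaysEqual class invented purely for this illustration:

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with everything."""

    def __eq__(self, other):
        return True


value = AlwaysEqual()

print(value == None)  # True: the overridden __eq__ decides the result
print(value is None)  # False: identity cannot be overridden
```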
- Do not use comparison operators to compare boolean values to True or False. While the comparison may feel intuitive, it is not necessary. Just write the boolean expression itself
# Correct
if valid:
    print("Correct")

# Wrong
if valid == True:
    print("Wrong")
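The comparison is not only redundant; it can also give a surprising answer when the value is truthy but not an actual boolean. A small sketch, assuming a hypothetical non-empty list called results:

```python
results = ["Amy", "Jake"]  # hypothetical non-empty list

print(bool(results))    # True: a non-empty list is truthy
print(results == True)  # False: a list is never equal to True
```

So `if results:` enters the block, while `if results == True:` would silently skip it.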
- Instead of binding a lambda function to an identifier, use a regular def statement. Assigning a lambda function to a name defeats its purpose, and a def also makes tracebacks easier to follow
# Prefer this
def func(x):
    return x**2

# Over this
func = lambda x: x**2
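The traceback point can be illustrated by looking at the name each function carries. In this sketch, square and square_lambda are hypothetical names chosen for the comparison:

```python
def square(x):
    return x**2


square_lambda = lambda x: x**2  # shown only as the pattern to avoid

print(square.__name__)         # square
print(square_lambda.__name__)  # <lambda>
```

A function created with def reports its own name in tracebacks, whereas every lambda shows up as `<lambda>`.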
- When you catch an exception, name the exception you want to catch. Don’t use a bare except clause. This ensures that the except block does not swallow other exceptions, such as the KeyboardInterrupt raised when you try to interrupt execution
try:
    x = 1/0
except ZeroDivisionError:
    print('Cannot divide by zero')
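Here is a rough sketch of the difference. The two function names are hypothetical, and KeyboardInterrupt is raised by hand to stand in for pressing Ctrl+C:

```python
def swallow_everything():
    try:
        raise KeyboardInterrupt  # stands in for pressing Ctrl+C
    except:  # bare except, shown only as the pattern to avoid
        return "interrupt swallowed"


def catch_named():
    try:
        raise KeyboardInterrupt
    except ZeroDivisionError:
        return "never reached"


print(swallow_everything())  # interrupt swallowed

propagated = False
try:
    catch_named()
except KeyboardInterrupt:
    propagated = True  # the named except clause let the interrupt through

print(propagated)  # True
```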
- Be consistent with your return statements. That is, either all return statements in a function should return an expression, or none of them should. Also, where no value is returned, write return None explicitly rather than a bare return
# Wrong
def sample(x):
    if x > 0:
        return x+1
    elif x == 0:
        return
    else:
        return x-1

# Correct
def sample(x):
    if x > 0:
        return x+1
    elif x == 0:
        return None
    else:
        return x-1
- If you want to check for prefixes or suffixes in strings, use “.startswith()” and “.endswith()” instead of slicing the string. They are cleaner and less error-prone
# Correct
if name.endswith('and'):
    print('Great!')
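To see why slicing is more error-prone, consider this small sketch; the filename value is a hypothetical example:

```python
filename = "report_final.csv"  # hypothetical file name

# With slicing, the slice length has to match the suffix length by hand.
print(filename[-3:] == ".csv")  # False: ".csv" has four characters, not three
print(filename[-4:] == ".csv")  # True

# The string methods need no counting, and they also accept a tuple of options.
print(filename.endswith(".csv"))                   # True
print(filename.startswith(("report", "summary")))  # True
```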
Automatic formatting of Python code
Formatting is not a problem when you write small programs. But imagine having to follow the correct formatting rules for a complex program that runs thousands of lines! This is definitely a daunting task. And, most of the time, you don’t even remember all the formatting rules.
How can we solve this problem? Well, we can do this with some automatic formatters!
An autoformatter is a program that identifies formatting errors and fixes them in place. Black is one such auto-formatter that automatically formats Python code to fit the PEP8 coding style, reducing your load.
BLACK:pypi.org/project/bla…
It can be easily installed using pip by typing the following command in the terminal:
pip install black
But let’s see how black helps in the real world. Let’s use it to format programs with the following types of errors:
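As a stand-in for the script in question, here is a hypothetical style_script.py with the kind of spacing and blank-line problems discussed in this guide. The function and variable names are made up for illustration, and the code runs as written despite its formatting:

```python
import math
def  circle_area( radius ):
    return math.pi*radius**2
x=circle_area( 2 )
values=[ 1,2,3 ]
print( 'Area:',round(x,2) )
```

A formatter such as Black would be expected to clean up things like the stray spaces inside the brackets, the missing spaces around = and after commas, and the missing blank lines around the function.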
Now, all we need to do is go to the terminal and type the following command:
black style_script.py
When it finishes, Black will have made its changes and will print a message confirming it.
Once you open the program again, the changes will be reflected in it.
The code is now formatted correctly, which helps in case you accidentally violate the formatting rules.
Black can also be integrated with editors such as Atom, Sublime Text, and Visual Studio Code, and even with Jupyter Notebook! This is definitely an extension you won’t want to miss.
In addition to Black, there are other automatic formatters, such as autopep8 and yapf, which you can also try!
End notes
We’ve covered many of the key points of the Python style guide. If you consistently follow these principles in your code, you’ll end up with cleaner and more readable code.
In addition, when you work as a team on a project, it is beneficial to follow a common standard. It makes it easier for other collaborators to understand. Start adding these style tips to your Python code!
The original link: www.analyticsvidhya.com/blog/2020/0…