From TowardsDataScience, by Peter Nistrup. Compiled by Heart of the Machine, with contributions from: The Demon Lord.

This article lists some tips for improving or speeding up your daily data analysis work, including:

1. Pandas Profiling

2. Use Cufflinks and Plotly to draw Pandas data

3. IPython magic commands

4. Formatting in Jupyter

5. Jupyter Shortcut keys

6. In Jupyter (or IPython), make a cell display multiple outputs at the same time

7. Create slides for Jupyter Notebook in real time

1. Pandas Profiling

This tool is powerful. The following figure shows the result of simply calling the df.profile_report() method:

To use the tool, you only need to install and import the Pandas Profiling package.

This article does not go into further detail on this tool. To learn more, read: https://towardsdatascience.com/exploring-your-data-with-just-1-line-of-python-4b35ce21a82d

2. Use Cufflinks and Plotly to draw Pandas data

Most “experienced” data scientists or analysts are familiar with Matplotlib and pandas. That is, you can quickly plot a simple pd.DataFrame or pd.Series just by calling the .plot() method:


A little boring?

That’s all well and good, but what about an interactive, pannable, zoomable, scalable plot? It’s time for Cufflinks to step up! (Cufflinks is a further wrapper built on top of Plotly.)

To install Cufflinks in your environment, just run the following in a terminal:
pip install cufflinks --upgrade

See the image below:

Much better!

Note that the only changes in the figure above are importing Cufflinks and configuring it with cf.go_offline(), and switching the .plot() method to .iplot().

Other methods such as .scatter_matrix() can also provide great visualizations:

For those of you who need to do a lot of data visualization, read Cufflinks and Plotly’s documentation to find out more.


  • Cufflinks documentation: https://plot.ly/ipython-notebooks/cufflinks/

  • Plotly documentation: https://plot.ly/

3. IPython magic commands

IPython’s “magics” are a set of enhancements that IPython adds on top of standard Python syntax. Magic commands come in two kinds: line magics, which are prefixed with % and operate on a single line of input, and cell magics, which are prefixed with %% and operate on multiple lines of input. Here are some useful features provided by IPython magic commands:

%lsmagic: Lists all magic commands

If you only remember one magic command, it has to be this one. Executing the %lsmagic command provides a list of all available magic commands:


%debug: Interactive debugging

This is probably the magic command I use most often.

Most data scientists have encountered this situation: the block of code being executed keeps breaking, and you desperately write 20 print() statements, trying to print out the contents of each variable. Then, when you finally fix the problem, you have to go back and delete all print() statements again.

But you don’t have to do that anymore. When an error occurs, simply run the %debug command: it opens an interactive debugger at the point of the failure, where you can inspect variables and run any code you want:

What happened in the picture above?

  1. We have a function that takes a list as input and squares all even numbers.

  2. We ran the function, but something went wrong, and we don’t know what!

  3. Use the %debug command on this function.

  4. Let the debugger tell us the values of x and type(x).

  5. The problem is obvious: we passed ‘6’ to the function as a string!

This is especially useful for more complex functions.
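A minimal sketch of the scenario described in the steps above might look like this. The function name square_evens and its body are assumptions for illustration, not the original author's code; only the bug itself (a string ‘6’ in the input list) comes from the text.

```python
def square_evens(values):
    """Return a copy of `values` with every even number squared."""
    result = []
    for x in values:
        if x % 2 == 0:  # raises TypeError when x is a str such as '6'
            result.append(x ** 2)
        else:
            result.append(x)
    return result


try:
    square_evens([1, 2, 3, 4, 5, '6'])
except TypeError as exc:
    # Caught here only so the sketch runs as a plain script.
    failure = exc
    print(f"TypeError: {exc}")

# With a real integer in place of the string, the function works.
print(square_evens([1, 2, 3, 4, 5, 6]))
```

In a notebook you would drop the try/except, let the cell fail, and then run %debug in the next cell to look at x and type(x) at the debugger prompt.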

%store: Passes variables between notebooks

This command is also cool. Suppose you spent some time cleaning the data in one notebook, and now you want to test some functionality in another notebook. Do you implement that functionality in the same notebook, or do you save the data and load it in the other notebook? With the %store command, neither is necessary! This command stores the variable, which you can then retrieve from any other notebook:

  • %store [variable]: Stores the variable.

  • %store -r [variable]: Reads/retrieves the stored variable.
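To build some intuition for what the two commands do, here is a conceptual sketch in plain Python using pickle. This is not IPython's actual implementation (IPython keeps stored variables in its own database); the helpers store and restore, the STORE_DIR location, and the sample data are all hypothetical.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical stand-in for a location that both notebooks can reach.
STORE_DIR = Path(tempfile.mkdtemp())


def store(name, value):
    """Persist `value` under `name`, roughly like `%store name`."""
    (STORE_DIR / f"{name}.pkl").write_bytes(pickle.dumps(value))


def restore(name):
    """Load a previously stored value, roughly like `%store -r name`."""
    return pickle.loads((STORE_DIR / f"{name}.pkl").read_bytes())


# "Notebook A": save the cleaned data.
cleaned = {"city": ["Oslo", "Lima"], "temp_c": [4.5, 19.0]}
store("cleaned", cleaned)

# "Notebook B": read it back as an independent copy.
cleaned_again = restore("cleaned")
print(cleaned_again == cleaned)
```

One practical consequence of this kind of persistence is that the retrieved object is a copy: changing it in the second notebook does not affect the first.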

%who: Lists all global variables.

Have you ever assigned a value to a variable and forgotten its name? Or accidentally deleted the cell responsible for assigning values to variables? Using the %who command, you can get a list of all global variables:

%%time: Timing magic command


You can use this command to obtain all timing information. Simply apply the %%time command to any executable code and you get the following output:
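To make it clearer what is being measured, here is a rough plain-Python equivalent. %%time reports both CPU time and wall-clock time for a single run of the cell; the helper name timed below is hypothetical, and this is only a sketch of the idea, not how IPython implements the magic.

```python
import time


def timed(func, *args, **kwargs):
    """Run func once and return (result, wall_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()   # wall-clock time
    cpu_start = time.process_time()    # CPU time used by this process
    result = func(*args, **kwargs)
    cpu_seconds = time.process_time() - cpu_start
    wall_seconds = time.perf_counter() - wall_start
    return result, wall_seconds, cpu_seconds


total, wall, cpu = timed(sum, range(1_000_000))
print(f"CPU time: {cpu:.4f} s, wall time: {wall:.4f} s")
```

The two numbers can differ noticeably: code that waits on disk or network accumulates wall time but very little CPU time.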

%%writefile: writes cell content to a file

This magic command is useful when you write a complex function or class in a Notebook and want to save it to a separate file. Simply add %%writefile followed by the file name as the first line of the cell that contains the function or class:

As shown above, we can save the created function to the utils.py file and import it at will. This can be done in other notebooks as long as they are in the same directory as the utils.py file.
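Outside a notebook, the same effect can be approximated in plain Python. This sketch assumes a hypothetical function square as the cell content (the original function is not shown), writes it to utils.py in a temporary directory, and loads the file back as a module.

```python
import importlib.util
import tempfile
from pathlib import Path

# Hypothetical cell content; in a notebook this would be the body of a
# cell whose first line is: %%writefile utils.py
source = '''\
def square(x):
    """Return x squared."""
    return x * x
'''

workdir = Path(tempfile.mkdtemp())   # stand-in for the notebook's directory
target = workdir / "utils.py"
target.write_text(source)            # the step %%writefile performs

# Load the file explicitly by path. In a notebook that sits in the same
# directory, a plain `import utils` achieves the same thing.
spec = importlib.util.spec_from_file_location("utils", target)
utils = importlib.util.module_from_spec(spec)
spec.loader.exec_module(utils)

print(utils.square(7))  # prints 49
```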

4. Formatting in Jupyter

This feature is cool! Jupyter supports HTML/CSS formatting inside Markdown cells. Here are the styles I use most often:

Blue, fashionable:

<div class="alert alert-block alert-info">   
This is <b>fancy</b>!
</div>

Red, slightly flustered:

<div class="alert alert-block alert-danger">   
This is <b>baaaaad</b>!
</div>

Green and calm:
<div class="alert alert-block alert-success"> 
This is <b>gooood</b>!
</div>

The following image shows them in action:

This is useful when you want to present some discoveries in Notebook format!


5. Jupyter Shortcut keys

To learn about the keyboard shortcuts, open the command palette (Ctrl + Shift + P), which lists all of the Notebook’s commands. Here are a few basic ones:

  • Esc: Enters the command mode. In command mode, you can use arrow keys to navigate through the Notebook.

In command mode:
  • A and B: Insert a new cell above or below the current cell.

  • M: Changes the current cell to a Markdown cell.

  • Y: Changes the current cell to a code cell.

  • D,D: Deletes the current cell.

  • Enter: The current cell returns to edit mode.

In edit mode:
  • Shift + Tab: Shows the docstring for the object you just typed in the current cell. Press the shortcut repeatedly to cycle through the documentation modes.

  • Ctrl + Shift + -: Splits the current cell at the cursor position.

  • Esc + F: Find and replace code (excluding output).

  • Esc + O: Toggles cell output.

Select multiple cells:

  • Shift + Down and Shift + Up: Select the lower or upper cell.

  • Shift + M: Merges selected cells.

Note that after multiple cells are selected, you can perform delete/copy/cut/paste/run operations in batches.

6. In Jupyter (or IPython), make a cell display multiple outputs at the same time

Have you ever wanted to show both .head() and .tail() of a pandas DataFrame in one cell, but given up because it was too cumbersome to create an additional code cell just to run the .tail() method? Don’t worry: with the following lines, a cell displays every output you ask for:
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

The following figure shows the results of multiple outputs:
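The setting's name hints at how this works: IPython parses each cell into AST nodes and decides which expression nodes get displayed. By default ("last_expr") only a trailing expression is shown; with "all", every top-level expression is. The following is a simplified, standard-library-only sketch of that idea. run_cell is a hypothetical helper, not IPython's implementation, and plain list slices stand in for a DataFrame's head and tail.

```python
import ast


def run_cell(source, interactivity="last_expr"):
    """Execute `source` and return the values a notebook would display.

    Simplified model: "last_expr" shows only a trailing expression,
    "all" shows every top-level expression statement.
    """
    namespace = {}
    shown = []
    nodes = ast.parse(source).body
    for index, node in enumerate(nodes):
        is_last = index == len(nodes) - 1
        if isinstance(node, ast.Expr) and (interactivity == "all" or is_last):
            # Evaluate the expression and keep its value for display.
            code = compile(ast.Expression(node.value), "<cell>", "eval")
            shown.append(eval(code, namespace))
        else:
            # Run the statement for its side effects only.
            code = compile(ast.Module([node], type_ignores=[]), "<cell>", "exec")
            exec(code, namespace)
    return shown


cell = "data = [3, 1, 2]\ndata[:2]\ndata[-2:]\n"
print(run_cell(cell))          # [[1, 2]]
print(run_cell(cell, "all"))   # [[3, 1], [1, 2]]
```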

7. Create slides for Jupyter Notebook in real time

With RISE, you can instantly turn Jupyter Notebook into a slide show with a single keystroke. And the Notebook is still active, so you can perform live coding while showing slides!

To use the tool, simply install RISE via conda or pip.
conda install -c conda-forge rise

or


pip install RISE

Now you can click the new toolbar button to turn your notebook into a nice slide show: