Preface

If you are still wondering whether it is better to learn data science in Python or in R, my answer now is: you don't have to choose. The two languages can now call each other. You can use R for data processing (tidyverse) and visualization (ggplot2), and do your development in Python. See also: R vs Python: What's the Difference?

All it takes is the reticulate package

The reticulate package provides a full suite of tools for interoperability between Python and R, available in both R and RStudio, including:

1) Support for multiple ways to call Python from R: R Markdown, sourcing Python scripts, importing Python modules, and using Python interactively within an R session.

2) Conversion between R and Python objects (for example, between R data frames and pandas DataFrames, or between R matrices and NumPy arrays).

reticulate embeds a Python session within the R session, which enables seamless, high-performance interoperability. If you are an R developer who does some work in Python, or a member of a bilingual data science team, reticulate can greatly simplify your workflow. Official documentation is available on the reticulate website.

[Figure: outline of this article]

Preparation

Install the reticulate package

The package can be installed and loaded directly, which is very convenient.

 install.packages("reticulate")
 library("reticulate")

Install a Python library

Suppose we want to use pandas but do not have it installed. One way is to run the following code under the R engine, which installs the third-party Python library and checks whether the installation succeeded.

 py_install("pandas")              # install the pandas library
 py_module_available("pandas")     # check whether the installation succeeded
 # repl_python()                   # open a Python REPL; its banner shows the Python in use
 # use_python("D:/anaconda/python.exe")  # change the Python path

Import the Python library

Once the third-party library is installed, you can import it, again under the R engine.

 pd <- import("pandas")

As you can see, this differs from Python code in that:

  • import("os") is used instead of import os
  • $ is used instead of .
  • <- is used instead of =

3. Common operations

reticulate includes a Python engine for R Markdown with the following features (see "reticulate: R Interface to Python"):

1) Python code blocks run in a single Python session embedded in the R session, so variables and state are shared between Python code blocks.

2) Python output is printed, including graphical output from matplotlib.

3) Objects created in Python code blocks can be accessed from R through the py object.

4) Objects created in R code blocks can be accessed from Python through the r object.

Plotting

You can plot directly with the Python engine.

 import numpy as np
 import matplotlib.pyplot as plt
 # Compute the x and y coordinates of points on a sine curve
 x = np.arange(0, 3 * np.pi, 0.1)  # step of 0.1 is assumed; the original value was lost
 y = np.sin(x)
 # Plot the sine wave
 plt.plot(x, y)
 plt.show()

Read a file

Whatever Python can do, we can do here as well, much as we would in a Jupyter Notebook. For example, read the following data table in CSV format.

 import pandas as pd
 df = pd.read_csv("test.csv", encoding="gbk")
 df.head()

The returned df is a Python object living inside the R session, so the table is printed in plain pandas style and does not look very pretty.
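The encoding="gbk" argument in the read_csv call above matters when the file was saved with a Chinese Windows code page. As a rough illustration of why, the sketch below uses only Python's standard library (the csv module rather than pandas, so it runs without extra packages) and made-up sample rows, since the real test.csv is not shown.

```python
import csv
import io

# Made-up sample rows standing in for test.csv (the real file is not shown).
text = "city,province\n温州,浙江\n"
raw = text.encode("gbk")  # the bytes a GBK-encoded file would contain

# Decoding with the wrong codec fails ...
try:
    raw.decode("utf-8")
    decoded_as_utf8 = True
except UnicodeDecodeError:
    decoded_as_utf8 = False

# ... while the matching codec recovers the original text.
rows = list(csv.DictReader(io.StringIO(raw.decode("gbk"))))

print(decoded_as_utf8)
print(rows)
```

If a file read this way shows garbled characters or raises a decoding error, the encoding argument is the first thing to check.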

Call Python variables in R code blocks

Everything so far has been about running Python code blocks inside R Markdown, not about running Python code or using Python variables inside R code blocks. Now let's try accessing Python variables from R code blocks.

py$python_variable_name

To access a Python variable from an R code block, write py$python_variable_name:

  • py is the object through which R reaches the embedded Python session
  • $ is equivalent to the dot (.) in Python
  • python_variable_name is the name of the variable defined in the Python code block

For example, the Python variable df created above can be accessed from R:

 ```{r}
 py$df
 ```

When you access the Python object df this way, it is converted to an R object by default, so the content is the same but it is displayed in R's style, which looks better.

source_python()

With source_python("path/to/file.py") from the reticulate package, you can import the variables defined in a .py file and use them in R code blocks. For example, I prepare the strings A and B in data.py:

 A = 'I am Zhuang Shanshan, '
 B = 'from Wenzhou, Zhejiang'

Run data.py in the R code block and use the variables:

 source_python("data.py")
 paste0(A, B)  # "I am Zhuang Shanshan, from Wenzhou, Zhejiang"
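source_python() makes functions available to R as well as variables. As a minimal sketch, a data.py along the following lines would expose both; the function name make_intro is hypothetical and only there for illustration.

```python
# data.py (hypothetical contents, for illustration only)

A = "I am Zhuang Shanshan, "
B = "from Wenzhou, Zhejiang"


def make_intro(name, place):
    """Build an introduction sentence from a name and a place."""
    return f"I am {name}, from {place}"
```

After source_python("data.py"), calling make_intro("Zhuang Shanshan", "Wenzhou, Zhejiang") from an R code block should give the same sentence as paste0(A, B).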

py_run_file()

To run the test.py file from the project folder inside an R code block, use the following code:

 library(reticulate)
 py_run_file("test.py")
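The contents of test.py are not shown here. As a minimal sketch, assume it computes a couple of values at module level; the variable names squares and total are hypothetical.

```python
# test.py (hypothetical contents; the original file is not shown)

# Squares of the first five positive integers.
squares = [n * n for n in range(1, 6)]

# Their sum, computed at module level so it stays available after the run.
total = sum(squares)

print(squares, total)
```

With the default settings, py_run_file() creates these objects in the main module of the embedded Python session, so after the call they can be read from R as py$squares and py$total.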

Data type comparison
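As a rough sketch of the comparison, the snippet below lists example Python values next to the R type that the reticulate documentation pairs each with. It covers only built-in Python types; the documentation also pairs R matrices with NumPy arrays and R data frames with pandas DataFrames. The dictionary name r_to_python is just illustrative.

```python
# R type (as named in the reticulate docs) -> an example Python value.
r_to_python = {
    "single-element vector": 3.14,                # Python scalar
    "multi-element vector": [1.0, 2.0, 3.0],      # Python list
    "list of multiple types": (1, True, "foo"),   # Python tuple
    "named list": {"a": 1, "b": 2},               # Python dict
    "NULL": None,                                 # Python None
    "TRUE": True,                                 # Python bool
}

# Print each R type next to the name of the matching Python type.
for r_type, value in r_to_python.items():
    print(f"{r_type:<24} {type(value).__name__}")
```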
