Preface
If you are still wondering whether it is better to learn data science in Python or in R, my answer now is: you do not have to choose. Variables in the two languages can be accessed from each other, so you can use R for data processing (tidyverse) and visualization (ggplot2) and Python for development. (See also "R vs Python: What's the Difference?")
All of this is handled by a single package: reticulate.
The reticulate package provides a comprehensive set of tools for interoperability between Python and R, and it works in both R and RStudio. These include:
1) Support for calling Python from R in several ways: R Markdown, sourcing Python scripts, importing Python modules, and using Python interactively within an R session.
2) Conversion between R and Python objects (for example, between R data frames and pandas data frames, or between R matrices and NumPy arrays).
reticulate embeds a Python session within the R session, which makes the interoperability seamless and fast. If you are an R developer who does some work in Python, or a member of a data science team that uses both languages, reticulate can greatly simplify your workflow. The official documentation is available on the reticulate website.
Preparation
Install the reticulate package
The package can be installed and loaded directly, which is very convenient.
install.packages("reticulate")
library("reticulate")
Install a Python library
Suppose we want to use the pandas library but do not have it installed. One option is to run the following code under the R engine: it installs the third-party Python library and checks whether the installation succeeded.
py_install("pandas")            # install the pandas library
py_module_available("pandas")   # check whether pandas is now available
# use_python("D:/anaconda/python.exe")   # optionally point reticulate at a specific Python interpreter
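For comparison, the same kind of availability check can be made from the Python side. This is a minimal sketch using only the standard library. The helper name module_available is hypothetical, and this is not a description of how reticulate implements py_module_available().

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(name) is not None


# "os" ships with Python; the second name is deliberately nonsensical.
print(module_available("os"))
print(module_available("surely_not_a_real_module_xyz"))
```

Calling module_available("pandas") in the same way reports whether pandas is visible to the Python interpreter that runs the check.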
Import a Python library
After installing the third-party libraries, you can import them, again under the R engine.
pd <- import("pandas")
As you can see, this differs from Python code in three ways:
- import("os") instead of import os
- $ instead of .
- <- instead of =
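The three differences listed above are easier to see side by side. The sketch below is ordinary Python, with the reticulate spelling of each line given in a comment. The variable names current_dir and entries are chosen here purely for illustration.

```python
import os                            # in R: os <- import("os")

current_dir = os.getcwd()            # in R: current_dir <- os$getcwd()
entries = os.listdir(current_dir)    # in R: entries <- os$listdir(current_dir)

print(current_dir)
print(len(entries))
```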
Common operations
The reticulate package includes a Python engine for R Markdown (see "reticulate: R interface to Python") with the following features:
1) Python blocks run in a single Python session embedded in the R session, so variables and state are shared between Python blocks.
2) Python output is printed, including graphical output from Matplotlib.
3) Objects created in Python blocks can be accessed from R through the py object (for example, py$x).
4) Objects created in R blocks can be accessed from Python through the r object (for example, r.x).
Plotting
You can plot directly with the Python engine.
import numpy as np
import matplotlib.pyplot as plt
x = np.arange(0, 3 * np.pi, 0.1)  # x values; this range is an assumption
y = np.sin(x)
plt.title("sine wave form")
plt.plot(x, y)
plt.show()
Read the file
Anything Python can do can be done here as well, much as in a Jupyter Notebook. Now let's read a data table in CSV format.
import pandas as pd
df = pd.read_csv("test.csv", encoding="gbk")
df.head()
The returned df is a Python object, and the table it prints in R is not nicely formatted.
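The encoding="gbk" argument in the read_csv call above tells the reader how the bytes of the file map to characters, which matters for files saved with a Chinese Windows code page. The sketch below illustrates the same idea with the standard-library csv module in place of pandas. The file contents are made-up sample data written to a temporary directory, not the author's test.csv.

```python
import csv
import os
import tempfile

# Made-up sample rows; the real test.csv from the article is not available.
rows = [["city", "province"], ["温州", "浙江"]]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.csv")

    # Write the file as GBK, the way a Chinese-locale spreadsheet export might.
    with open(path, "w", encoding="gbk", newline="") as f:
        csv.writer(f).writerows(rows)

    # Read it back with the matching encoding so the characters decode correctly.
    with open(path, encoding="gbk", newline="") as f:
        records = list(csv.DictReader(f))

print(records)
```

Opening the same file with a different encoding, such as UTF-8, would typically raise a decoding error or yield garbled text.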
Call a Python variable in an R block
Everything so far has been about running Python blocks in R Markdown, not about running Python code or calling Python variables from R blocks. Now let's try calling Python variables in R blocks.
py$python_variable_name
An R code block accesses a Python variable with py$python_variable_name, where:
- py is the R object that stands for the embedded Python session
- $ is equivalent to the dot (.) in Python
- python_variable_name is the name of a variable defined in a Python code block
For example, the Python variable df created above is accessed in R like this:
```{r}
py$df
```
When you access the Python object df this way, R converts it to an R object by default, so the content is the same but it is displayed more nicely.
source_python()
Calling source_python("path/to/file.py") from the reticulate package imports the variables defined in the .py file, so that they can be used in R code blocks. For example, I prepare two strings, A and B, in data.py:
A = "I'm Zhuang Shanyi,"
B = " from Wenzhou, Zhejiang"
Run data.py in the R block
library(reticulate)
source_python("data.py")   # make the variables defined in data.py available in R
paste0(A, B)               # concatenate the two strings
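To see which names a script like data.py exposes, it can help to inspect it from Python itself. The sketch below is an analogy, not reticulate's implementation: it writes a temporary copy of data.py with two strings like those above, executes it with the standard-library runpy.run_path, and keeps the top-level names that do not start with an underscore.

```python
import os
import runpy
import tempfile

# A stand-in for data.py with two strings A and B.
source = '''A = "I'm Zhuang Shanyi,"
B = " from Wenzhou, Zhejiang"
'''

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.py")
    with open(path, "w", encoding="utf-8") as f:
        f.write(source)

    # Execute the script and collect its global namespace as a dict.
    namespace = runpy.run_path(path)

# Drop dunder entries such as __name__ and __builtins__.
public = {k: v for k, v in namespace.items() if not k.startswith("_")}
combined = public["A"] + public["B"]

print(sorted(public))
print(combined)
```

The combined value is the Python counterpart of paste0(A, B) in the R block.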
py_run_file()
To run the test.py file in the project folder from an R block, use the following code:
library(reticulate)
py_run_file("test.py")