R + Python in the same document

Author

Rodrigo Hermont Ozon

Published

June 9, 2023

Quarto

Quarto lets you combine multiple programming languages, including Python and R, in a single document. To use both in the same Quarto document, you need Python and R installed, plus the reticulate package in R, which passes objects between the two languages.
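Before rendering, it can help to confirm that the Python packages imported later in this document are actually installed. The snippet below is a minimal sketch using only the standard library. The helper name missing_modules is made up for illustration, and the check only tells you about the Python interpreter that runs it, which is assumed here to be the same one reticulate is configured to use.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names that the import system cannot find."""
    return [name for name in names if find_spec(name) is None]


# Packages imported by the Python chunks in this document.
required = ["pandas", "numpy", "statsmodels"]

missing = missing_modules(required)
if missing:
    print("Install before rendering:", ", ".join(missing))
else:
    print("All Python packages found.")
```

If anything is reported as missing, install it into that interpreter's environment and run the check again.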

Here’s an example that uses Python and R together in a Quarto document: the data is wrangled with the R tidyverse, modelled in Python as a pandas DataFrame, and plotted back in R:

Loading Python and R Libraries

library(reticulate)
library(tidyverse)
library(dplyr)
library(ggplot2)

import pandas as pd
import statsmodels.api as sm
import numpy as np

R and Python in the same script example

First, we read the CSV file in R with read_csv() from readr (loaded as part of the tidyverse):

lemurs <- read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-08-24/lemur_data.csv')

Data wrangling

Filter the data to keep only adult male Collared Brown Lemurs, and extract only the age and weight columns:

lemur_data <- lemurs |>
  filter(taxon == "ECOL",
         sex == "M",
         age_category == "adult") |>
  select(age_at_wt_mo, weight_g) |>
  rename(Age = age_at_wt_mo,
         Weight = weight_g)

glimpse(lemur_data)
Rows: 1,307
Columns: 2
$ Age    <dbl> 129.90, 132.10, 140.32, 157.94, 164.58, 184.18, 196.64, 208.77,…
$ Weight <dbl> 2805.0, 3001.0, 2429.0, 2597.0, 2497.0, 2225.0, 3223.0, 2433.0,…
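For readers more at home in Python, the same filter, select, and rename steps can be sketched in plain Python without any library. This is only an illustration of the logic: the rows below are made up, reusing the dataset's column names, and the function name wrangle is hypothetical.

```python
# Made-up rows that reuse the lemur dataset's column names (not real data).
rows = [
    {"taxon": "ECOL", "sex": "M", "age_category": "adult",
     "age_at_wt_mo": 130.0, "weight_g": 2800.0},
    {"taxon": "ECOL", "sex": "F", "age_category": "adult",
     "age_at_wt_mo": 120.0, "weight_g": 2500.0},
    {"taxon": "ECOL", "sex": "M", "age_category": "young_adult",
     "age_at_wt_mo": 30.0, "weight_g": 1900.0},
    {"taxon": "PCOQ", "sex": "M", "age_category": "adult",
     "age_at_wt_mo": 140.0, "weight_g": 4000.0},
]


def wrangle(rows):
    """Keep adult male ECOL rows, then keep and rename the two columns."""
    return [
        {"Age": row["age_at_wt_mo"], "Weight": row["weight_g"]}
        for row in rows
        if row["taxon"] == "ECOL"
        and row["sex"] == "M"
        and row["age_category"] == "adult"
    ]


lemur_rows = wrangle(rows)
print(lemur_rows)
```

Only the first made-up row passes all three conditions, so the result holds a single record with the renamed Age and Weight keys.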

#| label: modelling
#| echo: true
#| message: false
lemur_data_py = r.lemur_data # reticulate exposes the R data frame to Python as a pandas DataFrame

y = lemur_data_py[["Weight"]]
x = lemur_data_py[["Age"]]
x = sm.add_constant(x)

mod = sm.OLS(y, x).fit()

lemur_data_py["Predicted"] = mod.predict(x)
lemur_data_py["Residuals"] = mod.resid
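To make the Predicted and Residuals columns less of a black box, here is a dependency-free sketch of what an ordinary least squares fit with one predictor and an intercept computes. It is not how statsmodels is implemented internally; it uses the textbook closed-form formulas, and the age and weight values are made up for illustration. The function name fit_line is hypothetical.

```python
def fit_line(x, y):
    """Closed-form least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up values standing in for the Age and Weight columns.
age = [1.0, 2.0, 3.0, 4.0]
weight = [3.0, 5.0, 7.0, 10.0]

intercept, slope = fit_line(age, weight)

# Predicted values lie on the fitted line; residuals are observed minus predicted.
predicted = [intercept + slope * a for a in age]
residuals = [w - p for w, p in zip(weight, predicted)]

print(intercept, slope)
print(residuals)
```

Because the model includes an intercept, the residuals sum to zero up to floating-point error, which is why a horizontal reference line at zero is a natural guide in a residual plot.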

Now we plot the residuals in R with the ggplot2 package:

lemur_residuals <- py$lemur_data_py # reticulate converts the pandas DataFrame back to an R data frame

ggplot(data = lemur_residuals,
       mapping = aes(x = Predicted,
                     y = Residuals)) +
  geom_point(colour = "#2F4F4F") +
  geom_hline(yintercept = 0,
             colour = "red") +
  theme(panel.background = element_rect(fill = "#eaf2f2",
                                        colour = "#eaf2f2"),
        plot.background = element_rect(fill = "#eaf2f2",
                                       colour = "#eaf2f2"))


The same exchange works for other objects, such as a NumPy array created in Python:

py_arr = np.array([1, 2, 3, 4, 5])

for item in py_arr:
    print(item, end=", ")
  
1, 2, 3, 4, 5, 

Now we read this array from R as a vector:

r_arr <- as.vector(py$py_arr)

print(r_arr)
[1] 1 2 3 4 5

 

 


References

https://www.r-bloggers.com/2023/01/combining-r-and-python-with-reticulate-and-quarto/

https://blog.devgenius.io/heads-up-quarto-is-here-to-stay-aa861ef87491