
FeatureEngineeringClient and R

athos
New Contributor

Hi! I'm trying to find a way to create a feature table from R using reticulate.

Is it possible? So far I haven't been able to pass a PySpark DataFrame from R to the create_table() function.

The code I'm trying to get working follows:

install.packages("reticulate")
library(reticulate)
# Point reticulate at the cluster's Python interpreter
os <- import("os")
use_python(os$sys$executable)

library(tidyverse)
library(sparklyr)
# Connect to Spark
spark <- spark_connect(method = "databricks")

fs <- import("databricks.feature_engineering")
fe <- fs$FeatureEngineeringClient()

# Add the row names as an explicit primary-key column, then copy into Spark
mtcars_id <- mtcars %>% rownames_to_column("car_id")
mtcars_sdf <- sdf_copy_to(spark, mtcars_id, overwrite = TRUE)
# spark_dataframe() returns a reference to the underlying Java DataFrame
mtcars_sdf <- spark_dataframe(mtcars_sdf)

fe$create_table(
    name = "databricks_asn.default.mtcars",
    primary_keys = c("car_id"),
    df = mtcars_sdf,
    description = "MTCARS from R"
)

1 REPLY

BigRoux
Databricks Employee
Here's what I can tell you:
  • Creating Databricks feature tables with the create_table() function is well documented for PySpark DataFrames. However, passing a PySpark DataFrame produced in R via sparklyr to create_table() through reticulate is not directly documented or supported.
  • The primary challenge is compatibility: sparklyr's spark_dataframe() hands back a reference to the Java-side DataFrame, not the pyspark.sql.DataFrame that the Feature Engineering client expects, and bridging the two is not described in the available documentation.
  • To work around this limitation, consider creating the feature table directly within PySpark after exporting the relevant data from R. Alternatively, save the DataFrame from R in the Delta table format and load it into a PySpark DataFrame in Python before invoking create_table(); see the sketch below.
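
Here is a minimal, untested sketch of that second workaround, staying entirely in R via reticulate. The staging table name (databricks_asn.default.mtcars_staging) is hypothetical, and this assumes the Python side can attach to the cluster's existing Spark session through SparkSession.builder.getOrCreate():

library(tidyverse)
library(sparklyr)
library(reticulate)

spark <- spark_connect(method = "databricks")

# 1. Stage the data as a table from sparklyr
#    (tables default to the Delta format on Databricks)
mtcars_id <- mtcars %>% rownames_to_column("car_id")
mtcars_sdf <- sdf_copy_to(spark, mtcars_id, overwrite = TRUE)
spark_write_table(mtcars_sdf, "databricks_asn.default.mtcars_staging", mode = "overwrite")

# 2. Re-read the staged table on the Python side; spark.read.table()
#    returns a genuine pyspark.sql.DataFrame, which is what
#    create_table() expects
pyspark_sql <- import("pyspark.sql")
py_spark <- pyspark_sql$SparkSession$builder$getOrCreate()
py_df <- py_spark$read$table("databricks_asn.default.mtcars_staging")

# 3. Create the feature table from the PySpark DataFrame
fe_module <- import("databricks.feature_engineering")
fe <- fe_module$FeatureEngineeringClient()
fe$create_table(
    name = "databricks_asn.default.mtcars",
    primary_keys = "car_id",
    df = py_df,
    description = "MTCARS from R"
)

The round trip through a Delta table sidesteps the object-handoff problem entirely: both runtimes read the same underlying storage instead of trying to share an in-memory DataFrame reference across languages.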
Hope this helps, Lou.
