<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Combine Python + R in data manipulation in Databricks Notebook in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/combine-python-r-in-data-manipulation-in-databricks-notebook/m-p/6757#M2779</link>
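    <description>&lt;P&gt;@Oscar CENTENO MORA:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;To combine Python and R in a Databricks notebook, use the magic commands %python and %r as the first line of a cell to set that cell's language. Variables are not shared between languages, so a DataFrame created in Python has to be handed to R through Spark, for example as a temporary view. Here's an example that creates a Spark DataFrame in Python and then reads it in R:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;%python
from pyspark.sql import SparkSession
&amp;nbsp;
# In Databricks, getOrCreate() returns the notebook's existing Spark session
spark = SparkSession.builder.appName("CreateDataFrame").getOrCreate()
&amp;nbsp;
# Create a sample DataFrame in Python
data = [("Alice", 25), ("Bob", 30), ("Charlie", 35), ("Oscar", 36), ("Hiromi", 41), ("Alejandro", 42)]
df = spark.createDataFrame(data, ["Name", "Age"])
&amp;nbsp;
# Python variables are not visible in R cells, so register the DataFrame as a temporary view
df.createOrReplaceTempView("people")
&amp;nbsp;
# New cell: the %r magic must be the first line of the cell
%r
library(sparklyr)
library(dplyr)
&amp;nbsp;
# Attach to the notebook's existing Spark session instead of starting a new one
sc &amp;lt;- spark_connect(method = "databricks")
&amp;nbsp;
# Reference the temporary view registered in the Python cell
sdf &amp;lt;- tbl(sc, "people")
&amp;nbsp;
# Collect the rows into a local R data frame (fine for small data like this)
rdf &amp;lt;- sdf %&amp;gt;% collect()
&amp;nbsp;
# Print the R data frame
print(rdf)&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Note that the sparklyr package must be installed in the R environment using the install.packages() function, as shown in your example. The gateway error in your post most likely comes from spark_connect(master = "yarn-client", ...), which tries to launch a new Spark connection instead of attaching to the notebook's session; the version argument of spark_connect() is also the Spark version, not the sparklyr version. Finally, make sure that the cluster is running and attached to your notebook.&lt;/P&gt;</description>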
    <pubDate>Sun, 02 Apr 2023 16:11:44 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2023-04-02T16:11:44Z</dc:date>
    <item>
      <title>Combine Python + R in data manipulation in Databricks Notebook</title>
      <link>https://community.databricks.com/t5/data-engineering/combine-python-r-in-data-manipulation-in-databricks-notebook/m-p/6756#M2778</link>
      <description>&lt;P&gt;I want to combine Python and R in a Databricks notebook. First, in Python:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;from pyspark.sql import SparkSession&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;spark = SparkSession.builder.appName("CreateDataFrame").getOrCreate()&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;# Create a sample DataFrame&lt;/P&gt;&lt;P&gt;data = [("Alice", 25), ("Bob", 30), ("Charlie", 35), ("Oscar",36), ("Hiromi",41), ("Alejandro", 42)]&lt;/P&gt;&lt;P&gt;df = spark.createDataFrame(data, ["Name", "Age"])&lt;/P&gt;&lt;P&gt;display(df)&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;And then in R:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;%r&lt;/P&gt;&lt;P&gt;install.packages("sparklyr", version ="1.8.0")&lt;/P&gt;&lt;P&gt;library(sparklyr)&lt;/P&gt;&lt;P&gt;# Connect to the same Spark cluster&lt;/P&gt;&lt;P&gt;sc &amp;lt;- spark_connect(master = "yarn-client", version = "1.8.0"&lt;/P&gt;&lt;P&gt;               )&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;But I get this error:&lt;/P&gt;&lt;P&gt;**Error in spark_connect_gateway(gatewayAddress, gatewayPort, sessionId, : Gateway in localhost:8880 did not respond.&lt;/P&gt;&lt;P&gt;Try running&amp;nbsp;&lt;/P&gt;&lt;P&gt;options(sparklyr.log.console = TRUE)&lt;/P&gt;&lt;P&gt;&amp;nbsp;followed by&amp;nbsp;&lt;/P&gt;&lt;P&gt;sc &amp;lt;- spark_connect(...)&lt;/P&gt;&lt;P&gt;&amp;nbsp;for more debugging info. Some( Error in spark_connect_gateway(gatewayAddress, gatewayPort, sessionId, : Gateway in localhost:8880 did not respond. )**&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Any idea how I can combine both programming languages in a Databricks notebook?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 30 Mar 2023 17:24:40 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/combine-python-r-in-data-manipulation-in-databricks-notebook/m-p/6756#M2778</guid>
      <dc:creator>Osky_Rosky</dc:creator>
      <dc:date>2023-03-30T17:24:40Z</dc:date>
    </item>
    <item>
      <title>Re: Combine Python + R in data manipulation in Databricks Notebook</title>
      <link>https://community.databricks.com/t5/data-engineering/combine-python-r-in-data-manipulation-in-databricks-notebook/m-p/6757#M2779</link>
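      <description>&lt;P&gt;@Oscar CENTENO MORA:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;To combine Python and R in a Databricks notebook, use the magic commands %python and %r as the first line of a cell to set that cell's language. Variables are not shared between languages, so a DataFrame created in Python has to be handed to R through Spark, for example as a temporary view. Here's an example that creates a Spark DataFrame in Python and then reads it in R:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;%python
from pyspark.sql import SparkSession
&amp;nbsp;
# In Databricks, getOrCreate() returns the notebook's existing Spark session
spark = SparkSession.builder.appName("CreateDataFrame").getOrCreate()
&amp;nbsp;
# Create a sample DataFrame in Python
data = [("Alice", 25), ("Bob", 30), ("Charlie", 35), ("Oscar", 36), ("Hiromi", 41), ("Alejandro", 42)]
df = spark.createDataFrame(data, ["Name", "Age"])
&amp;nbsp;
# Python variables are not visible in R cells, so register the DataFrame as a temporary view
df.createOrReplaceTempView("people")
&amp;nbsp;
# New cell: the %r magic must be the first line of the cell
%r
library(sparklyr)
library(dplyr)
&amp;nbsp;
# Attach to the notebook's existing Spark session instead of starting a new one
sc &amp;lt;- spark_connect(method = "databricks")
&amp;nbsp;
# Reference the temporary view registered in the Python cell
sdf &amp;lt;- tbl(sc, "people")
&amp;nbsp;
# Collect the rows into a local R data frame (fine for small data like this)
rdf &amp;lt;- sdf %&amp;gt;% collect()
&amp;nbsp;
# Print the R data frame
print(rdf)&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Note that the sparklyr package must be installed in the R environment using the install.packages() function, as shown in your example. The gateway error in your post most likely comes from spark_connect(master = "yarn-client", ...), which tries to launch a new Spark connection instead of attaching to the notebook's session; the version argument of spark_connect() is also the Spark version, not the sparklyr version. Finally, make sure that the cluster is running and attached to your notebook.&lt;/P&gt;</description>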
      <pubDate>Sun, 02 Apr 2023 16:11:44 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/combine-python-r-in-data-manipulation-in-databricks-notebook/m-p/6757#M2779</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2023-04-02T16:11:44Z</dc:date>
    </item>
    <item>
      <title>Re: Combine Python + R in data manipulation in Databricks Notebook</title>
      <link>https://community.databricks.com/t5/data-engineering/combine-python-r-in-data-manipulation-in-databricks-notebook/m-p/6758#M2780</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I did exactly that, and it still fails: starting each cell with %r or %python to set its language still gives an error. This is the error:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="imagen"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/441i807AA9CE59854296/image-size/large?v=v2&amp;amp;px=999" role="button" title="imagen" alt="imagen" /&gt;&lt;/span&gt;What you suggested is also what the guides and forums say, but when I test it I still don't get a correct result.&lt;/P&gt;</description>
      <pubDate>Mon, 03 Apr 2023 15:25:18 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/combine-python-r-in-data-manipulation-in-databricks-notebook/m-p/6758#M2780</guid>
      <dc:creator>Osky_Rosky</dc:creator>
      <dc:date>2023-04-03T15:25:18Z</dc:date>
    </item>
  </channel>
</rss>

