<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic mapInPandas not working in serverless compute in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/mapinpandas-not-working-in-serverless-compute/m-p/128123#M48163</link>
    <description>&lt;P&gt;Using `mapInPandas` in serverless compute (Environment version 2) gives the following error,&lt;BR /&gt;```&lt;BR /&gt;&lt;SPAN class=""&gt;Py4JError: &lt;/SPAN&gt;&lt;SPAN&gt;An error occurred while calling o543.mapInPandas. Trace: py4j.Py4JException: Method mapInPandas([class org.apache.spark.sql.catalyst.expressions.PythonUDF, class java.lang.Boolean]) does not exist at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:344) at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:352) at py4j.Gateway.invoke(Gateway.java:297) at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132) at py4j.commands.CallCommand.execute(CallCommand.java:79) at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:197) at py4j.ClientServerConnection.run(ClientServerConnection.java:117) at java.lang.Thread.run(Thread.java:750) &lt;/SPAN&gt;&lt;BR /&gt;```&lt;/P&gt;</description>
    <pubDate>Tue, 12 Aug 2025 05:53:37 GMT</pubDate>
    <dc:creator>chinmay0924</dc:creator>
    <dc:date>2025-08-12T05:53:37Z</dc:date>
    <item>
      <title>mapInPandas not working in serverless compute</title>
      <link>https://community.databricks.com/t5/data-engineering/mapinpandas-not-working-in-serverless-compute/m-p/128123#M48163</link>
      <description>&lt;P&gt;Using `mapInPandas` in serverless compute (Environment version 2) gives the following error,&lt;BR /&gt;```&lt;BR /&gt;&lt;SPAN class=""&gt;Py4JError: &lt;/SPAN&gt;&lt;SPAN&gt;An error occurred while calling o543.mapInPandas. Trace: py4j.Py4JException: Method mapInPandas([class org.apache.spark.sql.catalyst.expressions.PythonUDF, class java.lang.Boolean]) does not exist at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:344) at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:352) at py4j.Gateway.invoke(Gateway.java:297) at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132) at py4j.commands.CallCommand.execute(CallCommand.java:79) at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:197) at py4j.ClientServerConnection.run(ClientServerConnection.java:117) at java.lang.Thread.run(Thread.java:750) &lt;/SPAN&gt;&lt;BR /&gt;```&lt;/P&gt;</description>
      <pubDate>Tue, 12 Aug 2025 05:53:37 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/mapinpandas-not-working-in-serverless-compute/m-p/128123#M48163</guid>
      <dc:creator>chinmay0924</dc:creator>
      <dc:date>2025-08-12T05:53:37Z</dc:date>
    </item>
    <item>
      <title>Re: mapInPandas not working in serverless compute</title>
      <link>https://community.databricks.com/t5/data-engineering/mapinpandas-not-working-in-serverless-compute/m-p/128125#M48164</link>
      <description>&lt;P&gt;Hello &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/122867"&gt;@chinmay0924&lt;/a&gt;&lt;/P&gt;&lt;P&gt;Good day&lt;/P&gt;&lt;P&gt;According to the documentation (&lt;A href="https://learn.microsoft.com/en-us/azure/databricks/compute/serverless/limitations" target="_blank"&gt;https://learn.microsoft.com/en-us/azure/databricks/compute/serverless/limitations&lt;/A&gt;), this is a limitation of Databricks Connect. Unfortunately, you have to work with spark.sql or DataFrames, or switch to a standard (non-serverless) all-purpose cluster or job cluster.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;I am open to other contributions on this issue.&lt;/STRONG&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 12 Aug 2025 06:26:58 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/mapinpandas-not-working-in-serverless-compute/m-p/128125#M48164</guid>
      <dc:creator>Khaja_Zaffer</dc:creator>
      <dc:date>2025-08-12T06:26:58Z</dc:date>
    </item>
    <item>
      <title>Re: mapInPandas not working in serverless compute</title>
      <link>https://community.databricks.com/t5/data-engineering/mapinpandas-not-working-in-serverless-compute/m-p/128126#M48165</link>
      <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/173840"&gt;@Khaja_Zaffer&lt;/a&gt;&amp;nbsp;&lt;BR /&gt;The documentation you linked does not mention anywhere that mapInPandas is not supported. It says `&lt;SPAN&gt;Only Spark connect APIs are supported. Spark RDD APIs are not supported`. I have not used Spark RDD APIs. All I am trying to do is `dataframe.mapInPandas()` on a spark dataframe.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 12 Aug 2025 06:33:52 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/mapinpandas-not-working-in-serverless-compute/m-p/128126#M48165</guid>
      <dc:creator>chinmay0924</dc:creator>
      <dc:date>2025-08-12T06:33:52Z</dc:date>
    </item>
    <item>
      <title>Re: mapInPandas not working in serverless compute</title>
      <link>https://community.databricks.com/t5/data-engineering/mapinpandas-not-working-in-serverless-compute/m-p/128323#M48209</link>
      <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/122867"&gt;@chinmay0924&lt;/a&gt;&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;Are you using&amp;nbsp;&lt;BR /&gt;serverless compute via the notebook UI, or&lt;/P&gt;&lt;P&gt;serverless compute via Databricks Connect?&lt;/P&gt;</description>
      <pubDate>Wed, 13 Aug 2025 11:30:37 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/mapinpandas-not-working-in-serverless-compute/m-p/128323#M48209</guid>
      <dc:creator>Khaja_Zaffer</dc:creator>
      <dc:date>2025-08-13T11:30:37Z</dc:date>
    </item>
    <item>
      <title>Re: mapInPandas not working in serverless compute</title>
      <link>https://community.databricks.com/t5/data-engineering/mapinpandas-not-working-in-serverless-compute/m-p/138563#M50962</link>
      <description>&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;The error you are seeing when using&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;mapInPandas&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;in serverless compute with Environment version 2 is due to an incompatibility in the environment's supported Spark features. Specifically, Environment version 2 on serverless compute does&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;not support&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;mapInPandas&lt;/CODE&gt;&lt;/STRONG&gt;, which triggers the&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;Py4JException&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;indicating that the method does not exist on the JVM side of Spark in your environment.&lt;/P&gt;
&lt;H2 id="why-this-happens" class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0 md:text-lg [hr+&amp;amp;]:mt-4"&gt;Why This Happens&lt;/H2&gt;
&lt;UL class="marker:text-quiet list-disc"&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;&lt;STRONG&gt;Serverless Environment Restrictions&lt;/STRONG&gt;&lt;BR /&gt;The version of Spark or configuration for serverless pools (especially with certain environments like Databricks Runtime 11.x or higher) may not expose certain Apache Spark features, including&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;mapInPandas&lt;/CODE&gt;, for security and resource isolation reasons.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;&lt;STRONG&gt;Method Not Available&lt;/STRONG&gt;&lt;BR /&gt;The error message means the Spark JVM backend does not recognize or export the&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;mapInPandas&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;method for remote invocation. It's not your code—it's the compute environment not supporting direct Python UDFs with this Spark construct.&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
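&lt;P&gt;A Py4J "method does not exist" error can also appear when the Python-side PySpark package and the Spark backend come from different release lines, for example when a pip-installed `pyspark` shadows the one provided by the environment. That cause is an assumption worth ruling out, not something confirmed in this thread. The sketch below uses only the Python standard library; the helper names (`installed_version`, `major_minor`, `same_release_line`) are hypothetical and exist only for illustration.&lt;/P&gt;
&lt;PRE&gt;
```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def major_minor(version):
    """Reduce a version string such as '3.5.1' to the tuple (3, 5)."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def same_release_line(client_version, server_version):
    """True when both versions share the same major.minor release line."""
    return major_minor(client_version) == major_minor(server_version)


if __name__ == "__main__":
    # Version of the pyspark distribution visible to this Python process.
    client = installed_version("pyspark")
    print("Python-side pyspark package:", client or "no pip metadata found")
    # In a notebook, spark.version reports the backend's Spark version.
    # Assumption: comparing the two, for example with
    # same_release_line(client, spark.version), hints at a mismatch.
```
&lt;/PRE&gt;
&lt;P&gt;If the two versions differ in their major.minor part, removing the extra `pyspark` pin from the environment's dependencies is one thing to try before switching compute.&lt;/P&gt;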
&lt;H2 id="what-you-can-do" class="mb-2 mt-4 font-display font-semimedium text-base first:mt-0 md:text-lg [hr+&amp;amp;]:mt-4"&gt;What You Can Do&lt;/H2&gt;
&lt;UL class="marker:text-quiet list-disc"&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;&lt;STRONG&gt;Switch to Standard Compute&lt;/STRONG&gt;&lt;BR /&gt;If possible, use a non-serverless, standard compute cluster or an environment version (like Databricks Runtime 9.x or below) where&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;mapInPandas&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;is supported.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;&lt;STRONG&gt;Use Supported APIs&lt;/STRONG&gt;&lt;BR /&gt;In environments where&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;CODE&gt;mapInPandas&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;is not supported, use alternatives such as:&lt;/P&gt;
&lt;UL class="marker:text-quiet list-disc"&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;&lt;CODE&gt;applyInPandas&lt;/CODE&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;(on newer versions/environments that support only some Pandas UDFs)&lt;/P&gt;
&lt;/LI&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;Explicit Spark SQL or DataFrame operations&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;/LI&gt;
&lt;LI class="py-0 my-0 prose-p:pt-0 prose-p:mb-2 prose-p:my-0 [&amp;amp;&amp;gt;p]:pt-0 [&amp;amp;&amp;gt;p]:mb-2 [&amp;amp;&amp;gt;p]:my-0"&gt;
&lt;P class="my-2 [&amp;amp;+p]:mt-4 [&amp;amp;_strong:has(+br)]:inline-block [&amp;amp;_strong:has(+br)]:pb-2"&gt;&lt;STRONG&gt;Environment Upgrade/Change&lt;/STRONG&gt;&lt;BR /&gt;Check if a newer version of the serverless environment, or a configuration update, supports this method since feature support evolves frequently in managed Spark environments.&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;</description>
      <pubDate>Tue, 11 Nov 2025 10:46:06 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/mapinpandas-not-working-in-serverless-compute/m-p/138563#M50962</guid>
      <dc:creator>mark_ott</dc:creator>
      <dc:date>2025-11-11T10:46:06Z</dc:date>
    </item>
  </channel>
</rss>

