<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic CREATE FUNCTION from Python file in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/create-function-from-python-file/m-p/13367#M8066</link>
    <description>&lt;P&gt;Is it somehow possible to create an SQL external function using Python code?&lt;/P&gt;&lt;P&gt;The examples only show how to use JARs:&lt;/P&gt;&lt;P&gt;&lt;A href="https://docs.databricks.com/spark/latest/spark-sql/language-manual/sql-ref-syntax-ddl-create-function.html" target="test_blank"&gt;https://docs.databricks.com/spark/latest/spark-sql/language-manual/sql-ref-syntax-ddl-create-function.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Something like:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;CREATE TEMPORARY FUNCTION simple_temp_udf AS 'SimpleUdf' USING FILE '/tmp/SimpleUdf.py';&lt;/CODE&gt;&lt;/PRE&gt;</description>
    <pubDate>Thu, 14 Oct 2021 20:12:48 GMT</pubDate>
    <dc:creator>gbrueckl</dc:creator>
    <dc:date>2021-10-14T20:12:48Z</dc:date>
    <item>
      <title>CREATE FUNCTION from Python file</title>
      <link>https://community.databricks.com/t5/data-engineering/create-function-from-python-file/m-p/13367#M8066</link>
      <description>&lt;P&gt;Is it somehow possible to create an SQL external function using Python code?&lt;/P&gt;&lt;P&gt;The examples only show how to use JARs:&lt;/P&gt;&lt;P&gt;&lt;A href="https://docs.databricks.com/spark/latest/spark-sql/language-manual/sql-ref-syntax-ddl-create-function.html" target="test_blank"&gt;https://docs.databricks.com/spark/latest/spark-sql/language-manual/sql-ref-syntax-ddl-create-function.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Something like:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;CREATE TEMPORARY FUNCTION simple_temp_udf AS 'SimpleUdf' USING FILE '/tmp/SimpleUdf.py';&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 14 Oct 2021 20:12:48 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/create-function-from-python-file/m-p/13367#M8066</guid>
      <dc:creator>gbrueckl</dc:creator>
      <dc:date>2021-10-14T20:12:48Z</dc:date>
    </item>
    <item>
      <title>Re: CREATE FUNCTION from Python file</title>
      <link>https://community.databricks.com/t5/data-engineering/create-function-from-python-file/m-p/13368#M8067</link>
      <description>&lt;P&gt;I would think that USING FILE would work, as long as you follow the class_name requirements:&lt;/P&gt;&lt;P&gt;&lt;I&gt;The implementing class should extend one of the base classes as follows:&lt;/I&gt;&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;I&gt;Should extend UDF or UDAF in org.apache.hadoop.hive.ql.exec package.&lt;/I&gt;&lt;/LI&gt;&lt;LI&gt;&lt;I&gt;Should extend AbstractGenericUDAFResolver, GenericUDF, or GenericUDTF in org.apache.hadoop.hive.ql.udf.generic package.&lt;/I&gt;&lt;/LI&gt;&lt;LI&gt;&lt;I&gt;Should extend UserDefinedAggregateFunction in org.apache.spark.sql.expressions package.&lt;/I&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;Also, the docs explicitly state that Python is possible:&lt;/P&gt;&lt;P&gt;&lt;I&gt;In addition to the SQL interface, Spark allows you to create custom user defined scalar and aggregate functions using Scala, Python, and Java APIs. See &lt;/I&gt;&lt;A href="https://docs.databricks.com/spark/latest/spark-sql/language-manual/sql-ref-functions-udf-scalar.html" alt="https://docs.databricks.com/spark/latest/spark-sql/language-manual/sql-ref-functions-udf-scalar.html" target="_blank"&gt;&lt;I&gt;User-defined scalar functions (UDFs)&lt;/I&gt;&lt;/A&gt;&lt;I&gt; and &lt;/I&gt;&lt;A href="https://docs.databricks.com/spark/latest/spark-sql/language-manual/sql-ref-functions-udf-aggregate.html" alt="https://docs.databricks.com/spark/latest/spark-sql/language-manual/sql-ref-functions-udf-aggregate.html" target="_blank"&gt;&lt;I&gt;User-defined aggregate functions (UDAFs)&lt;/I&gt;&lt;/A&gt;&lt;I&gt; for more information.&lt;/I&gt;&lt;/P&gt;&lt;P&gt;So it should be possible. Maybe your Python class does not meet the requirements?&lt;/P&gt;</description>
      <pubDate>Fri, 15 Oct 2021 11:55:06 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/create-function-from-python-file/m-p/13368#M8067</guid>
      <dc:creator>-werners-</dc:creator>
      <dc:date>2021-10-15T11:55:06Z</dc:date>
    </item>
    <item>
      <title>Re: CREATE FUNCTION from Python file</title>
      <link>https://community.databricks.com/t5/data-engineering/create-function-from-python-file/m-p/13369#M8068</link>
      <description>&lt;P&gt;Which class should I extend for Python, then? All of the listed parent classes are Java classes.&lt;/P&gt;</description>
      <pubDate>Thu, 27 Jan 2022 22:31:41 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/create-function-from-python-file/m-p/13369#M8068</guid>
      <dc:creator>Mumu</dc:creator>
      <dc:date>2022-01-27T22:31:41Z</dc:date>
    </item>
    <item>
      <title>Re: CREATE FUNCTION from Python file</title>
      <link>https://community.databricks.com/t5/data-engineering/create-function-from-python-file/m-p/13370#M8069</link>
      <description>&lt;P&gt;@Wugang Xu​&amp;nbsp;- My name is Piper, and I'm a moderator here for Databricks. Thanks for coming to us with your question. We'll give the members a bit longer to respond and come back if we need to. Thanks in advance for your patience. &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt; &lt;/P&gt;</description>
      <pubDate>Mon, 31 Jan 2022 23:20:16 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/create-function-from-python-file/m-p/13370#M8069</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2022-01-31T23:20:16Z</dc:date>
    </item>
    <item>
      <title>Re: CREATE FUNCTION from Python file</title>
      <link>https://community.databricks.com/t5/data-engineering/create-function-from-python-file/m-p/13371#M8070</link>
      <description>&lt;P&gt;For PySpark you can use udf().&lt;/P&gt;&lt;P&gt;&lt;A href="https://sparkbyexamples.com/pyspark/pyspark-udf-user-defined-function/" alt="https://sparkbyexamples.com/pyspark/pyspark-udf-user-defined-function/" target="_blank"&gt;Here is an example&lt;/A&gt; of how to do this.&lt;/P&gt;</description>
      <pubDate>Tue, 01 Feb 2022 07:08:04 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/create-function-from-python-file/m-p/13371#M8070</guid>
      <dc:creator>-werners-</dc:creator>
      <dc:date>2022-02-01T07:08:04Z</dc:date>
    </item>
    <item>
      <title>Re: CREATE FUNCTION from Python file</title>
      <link>https://community.databricks.com/t5/data-engineering/create-function-from-python-file/m-p/13372#M8071</link>
      <description>&lt;P&gt;Thanks for your response. What I am looking for is a way to define a view that uses the UDF. However, a session-level UDF, as described in the example you provided, does not seem to allow that. Maybe I should clarify my question: I want to define an external UDF like the Hive ones.&lt;/P&gt;</description>
      <pubDate>Tue, 01 Feb 2022 14:43:13 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/create-function-from-python-file/m-p/13372#M8071</guid>
      <dc:creator>Mumu</dc:creator>
      <dc:date>2022-02-01T14:43:13Z</dc:date>
    </item>
    <item>
      <title>Re: CREATE FUNCTION from Python file</title>
      <link>https://community.databricks.com/t5/data-engineering/create-function-from-python-file/m-p/13373#M8072</link>
      <description>&lt;P&gt;As a user of your code, I'd find it a less pleasant API because I'd have to call some_module.some_func.some_func() rather than just some_module.some_func().&lt;/P&gt;&lt;P&gt;There is no reason to have "some_func" exist twice in the hierarchy; it's redundant. If some_func is so large that adding any more code to the file seems crazy, maybe some_func is too large and you should refactor and simplify it.&lt;/P&gt;&lt;P&gt;Having one file serve one purpose makes sense. Having it contain literally a single function and nothing else is pretty unusual.&lt;/P&gt;</description>
      <pubDate>Sat, 05 Feb 2022 02:11:28 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/create-function-from-python-file/m-p/13373#M8072</guid>
      <dc:creator>pts</dc:creator>
      <dc:date>2022-02-05T02:11:28Z</dc:date>
    </item>
  </channel>
</rss>

