<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Read Write Stream Data from Event Hub to Databricks Delta Lake in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/read-write-stream-data-from-event-hub-to-databricks-delta-lake/m-p/112711#M44304</link>
    <description>&lt;P&gt;I am trying to read streaming data from EventHub which is in JSON Format, able to read data in a data frame but the body type was coming as binary I have converted it to string and decoded but while implementing the write stream I am facing an ongoing error as follows:&amp;nbsp;&lt;/P&gt;&lt;DIV&gt;&lt;DIV class=""&gt;&lt;DIV&gt;&lt;DIV class=""&gt;terminated with exception: Illegal base64 character 7b SQLSTATE: XXKST and another error as Input byte array has wrong 4-byte ending unit&lt;/DIV&gt;&lt;DIV class=""&gt;It also does not accept a JSON input array. I have also tried implementing UTF-8 and base64 decoding logic, but it didn't work.&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;DIV class=""&gt;&amp;nbsp;&lt;/DIV&gt;&lt;/DIV&gt;</description>
    <pubDate>Sat, 15 Mar 2025 21:46:45 GMT</pubDate>
    <dc:creator>Meghana89</dc:creator>
    <dc:date>2025-03-15T21:46:45Z</dc:date>
    <item>
      <title>Read Write Stream Data from Event Hub to Databricks Delta Lake</title>
      <link>https://community.databricks.com/t5/data-engineering/read-write-stream-data-from-event-hub-to-databricks-delta-lake/m-p/112711#M44304</link>
      <description>&lt;P&gt;I am trying to read streaming data from EventHub which is in JSON Format, able to read data in a data frame but the body type was coming as binary I have converted it to string and decoded but while implementing the write stream I am facing an ongoing error as follows:&amp;nbsp;&lt;/P&gt;&lt;DIV&gt;&lt;DIV class=""&gt;&lt;DIV&gt;&lt;DIV class=""&gt;terminated with exception: Illegal base64 character 7b SQLSTATE: XXKST and another error as Input byte array has wrong 4-byte ending unit&lt;/DIV&gt;&lt;DIV class=""&gt;It also does not accept a JSON input array. I have also tried implementing UTF-8 and base64 decoding logic, but it didn't work.&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;DIV class=""&gt;&amp;nbsp;&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Sat, 15 Mar 2025 21:46:45 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/read-write-stream-data-from-event-hub-to-databricks-delta-lake/m-p/112711#M44304</guid>
      <dc:creator>Meghana89</dc:creator>
      <dc:date>2025-03-15T21:46:45Z</dc:date>
    </item>
    <item>
      <title>Re: Read Write Stream Data from Event Hub to Databricks Delta Lake</title>
      <link>https://community.databricks.com/t5/data-engineering/read-write-stream-data-from-event-hub-to-databricks-delta-lake/m-p/112713#M44305</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/153610"&gt;@Meghana89&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;&lt;P&gt;Can you please share your code snippet?&lt;/P&gt;</description>
      <pubDate>Sun, 16 Mar 2025 01:42:48 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/read-write-stream-data-from-event-hub-to-databricks-delta-lake/m-p/112713#M44305</guid>
      <dc:creator>SantoshJoshi</dc:creator>
      <dc:date>2025-03-16T01:42:48Z</dc:date>
    </item>
    <item>
      <title>Re: Read Write Stream Data from Event Hub to Databricks Delta Lake</title>
      <link>https://community.databricks.com/t5/data-engineering/read-write-stream-data-from-event-hub-to-databricks-delta-lake/m-p/112801#M44334</link>
      <description>&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/152565"&gt;@SantoshJoshi&lt;/a&gt;&amp;nbsp; Thanks for reply please find the code snippet below&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;from&lt;/SPAN&gt;&lt;SPAN&gt; pyspark.sql &lt;/SPAN&gt;&lt;SPAN&gt;import&lt;/SPAN&gt;&lt;SPAN&gt; functions &lt;/SPAN&gt;&lt;SPAN&gt;as&lt;/SPAN&gt;&lt;SPAN&gt; F&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;from&lt;/SPAN&gt;&lt;SPAN&gt; pyspark.sql.types &lt;/SPAN&gt;&lt;SPAN&gt;import&lt;/SPAN&gt;&lt;SPAN&gt; StringType&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;import&lt;/SPAN&gt;&lt;SPAN&gt; base64&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;# Define the Event Hubs connection string&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;connectionString &lt;/SPAN&gt;&lt;SPAN&gt;= endpoint (replace with endpoint from event hub)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;ehConf &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; {&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;'eventhubs.connectionString'&lt;/SPAN&gt;&lt;SPAN&gt;: connectionString&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;}&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;# Function to decode base64 with proper error handling&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;def&lt;/SPAN&gt; &lt;SPAN&gt;try_base64_decode&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;data&lt;/SPAN&gt;&lt;SPAN&gt;&lt;span class="lia-unicode-emoji" title=":disappointed_face:"&gt;😞&lt;/span&gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt;&lt;SPAN&gt; data &lt;/SPAN&gt;&lt;SPAN&gt;is&lt;/SPAN&gt; &lt;SPAN&gt;None&lt;/SPAN&gt;&lt;SPAN&gt;:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;return&lt;/SPAN&gt; &lt;SPAN&gt;None&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;try&lt;/SPAN&gt;&lt;SPAN&gt;:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# Replace URL-safe characters&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; data &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; data.&lt;/SPAN&gt;&lt;SPAN&gt;replace&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;'-'&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;'+'&lt;/SPAN&gt;&lt;SPAN&gt;).&lt;/SPAN&gt;&lt;SPAN&gt;replace&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;'_'&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;'/'&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;# Handle padding issues&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; padding_needed &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;4&lt;/SPAN&gt; &lt;SPAN&gt;-&lt;/SPAN&gt;&lt;SPAN&gt; (&lt;/SPAN&gt;&lt;SPAN&gt;len&lt;/SPAN&gt;&lt;SPAN&gt;(data) &lt;/SPAN&gt;&lt;SPAN&gt;%&lt;/SPAN&gt; &lt;SPAN&gt;4&lt;/SPAN&gt;&lt;SPAN&gt;) &lt;/SPAN&gt;&lt;SPAN&gt;if&lt;/SPAN&gt; &lt;SPAN&gt;len&lt;/SPAN&gt;&lt;SPAN&gt;(data) &lt;/SPAN&gt;&lt;SPAN&gt;%&lt;/SPAN&gt; &lt;SPAN&gt;4&lt;/SPAN&gt; &lt;SPAN&gt;!=&lt;/SPAN&gt; &lt;SPAN&gt;0&lt;/SPAN&gt; &lt;SPAN&gt;else&lt;/SPAN&gt; &lt;SPAN&gt;0&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; data &lt;/SPAN&gt;&lt;SPAN&gt;+=&lt;/SPAN&gt; &lt;SPAN&gt;'='&lt;/SPAN&gt; &lt;SPAN&gt;*&lt;/SPAN&gt;&lt;SPAN&gt; padding_needed&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; decoded_bytes &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; base64.&lt;/SPAN&gt;&lt;SPAN&gt;b64decode&lt;/SPAN&gt;&lt;SPAN&gt;(data)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;return&lt;/SPAN&gt;&lt;SPAN&gt; decoded_bytes.&lt;/SPAN&gt;&lt;SPAN&gt;decode&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;'utf-8'&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;errors&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt;'replace'&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;except&lt;/SPAN&gt; &lt;SPAN&gt;Exception&lt;/SPAN&gt; &lt;SPAN&gt;as&lt;/SPAN&gt;&lt;SPAN&gt; e:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;return&lt;/SPAN&gt; &lt;SPAN&gt;f&lt;/SPAN&gt;&lt;SPAN&gt;"Failed to decode: &lt;/SPAN&gt;&lt;SPAN&gt;{&lt;/SPAN&gt;&lt;SPAN&gt;str&lt;/SPAN&gt;&lt;SPAN&gt;(e)&lt;/SPAN&gt;&lt;SPAN&gt;}&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;# Apply the UDF for base64 decoding&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;decode_udf &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; F.&lt;/SPAN&gt;&lt;SPAN&gt;udf&lt;/SPAN&gt;&lt;SPAN&gt;(try_base64_decode, &lt;/SPAN&gt;&lt;SPAN&gt;StringType&lt;/SPAN&gt;&lt;SPAN&gt;())&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;# Read raw data from Event Hubs&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;df &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; spark \&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; .read \&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; .&lt;/SPAN&gt;&lt;SPAN&gt;format&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"eventhubs"&lt;/SPAN&gt;&lt;SPAN&gt;) \&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; .&lt;/SPAN&gt;&lt;SPAN&gt;options&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;**&lt;/SPAN&gt;&lt;SPAN&gt;ehConf) \&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; .&lt;/SPAN&gt;&lt;SPAN&gt;load&lt;/SPAN&gt;&lt;SPAN&gt;()&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;# Show the schema of the raw data to inspect the types&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;df.&lt;/SPAN&gt;&lt;SPAN&gt;printSchema&lt;/SPAN&gt;&lt;SPAN&gt;()&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;# Cast the body column to string and apply base64 decoding if necessary&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;df &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; df.&lt;/SPAN&gt;&lt;SPAN&gt;withColumn&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"body"&lt;/SPAN&gt;&lt;SPAN&gt;, df[&lt;/SPAN&gt;&lt;SPAN&gt;"body"&lt;/SPAN&gt;&lt;SPAN&gt;].&lt;/SPAN&gt;&lt;SPAN&gt;cast&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"string"&lt;/SPAN&gt;&lt;SPAN&gt;)) &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;# Cast body to string&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;df &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; df.&lt;/SPAN&gt;&lt;SPAN&gt;withColumn&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"decoded_body"&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;decode_udf&lt;/SPAN&gt;&lt;SPAN&gt;(F.&lt;/SPAN&gt;&lt;SPAN&gt;col&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"body"&lt;/SPAN&gt;&lt;SPAN&gt;))) &amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;# Apply the decoding UDF&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;# Show the results (first few rows of the raw and decoded body)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;df.&lt;/SPAN&gt;&lt;SPAN&gt;select&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"body"&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;"decoded_body"&lt;/SPAN&gt;&lt;SPAN&gt;).&lt;/SPAN&gt;&lt;SPAN&gt;show&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;truncate&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt;False&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;error :&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;Input byte array has wrong 4-byte ending unit&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Mon, 17 Mar 2025 12:48:26 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/read-write-stream-data-from-event-hub-to-databricks-delta-lake/m-p/112801#M44334</guid>
      <dc:creator>Meghana89</dc:creator>
      <dc:date>2025-03-17T12:48:26Z</dc:date>
    </item>
  </channel>
</rss>

