<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Lakebridge Transpiler Fails with UnicodeDecodeError while Analyzer Works Successfully in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/lakebridge-transpiler-fails-with-unicodedecodeerror-while/m-p/131193#M49001</link>
    <description>&lt;P&gt;Hello Team,&lt;/P&gt;&lt;P&gt;I am facing an issue with the Lakebridge transpiler.&lt;BR /&gt;The Analyzer step runs successfully and produces the expected analysis files. However, when I run the Transpiler, it fails with the following error:&lt;/P&gt;&lt;P&gt;&lt;FONT color="#FF0000"&gt;ERROR [src/databricks/labs/Lakebridge.transpile] UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 71: character maps to &amp;lt;undefined&amp;gt;&lt;BR /&gt;Error: unexpected end of JSON input&lt;BR /&gt;Lakebridge Transpile failed with exit code 1&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;Command I executed:&lt;/P&gt;&lt;P&gt;&lt;FONT color="#FF0000"&gt;databricks labs lakebridge transpile --input-source "C:\Users\user_name\Downloads\segment_pioneer" --source-dialect synapse --output-folder "C:\Users\user_name\Downloads\segment_pioneer\output\Converted_Code"&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;What confuses me is that:&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;The Analyzer works fine and completes successfully.&lt;/P&gt;&lt;P&gt;The Transpiler fails immediately with an encoding-related error.&lt;/P&gt;&lt;P&gt;If there were a problem in the SQL code itself, I would expect the Analyzer to fail as well. So the failure seems related to how the transpiler reads files or paths (possibly an encoding issue on Windows).&lt;/P&gt;&lt;P&gt;Could you please help clarify:&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;Why does the Analyzer run while the Transpiler fails on the same input?&lt;/LI&gt;&lt;LI&gt;Is there a known workaround for the UnicodeDecodeError on Windows (e.g., forcing UTF-8)?&lt;/LI&gt;&lt;LI&gt;Should I try running this with a different CLI encoding setting?&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;Thanks in advance.&lt;/P&gt;</description>
    <pubDate>Mon, 08 Sep 2025 07:00:37 GMT</pubDate>
    <dc:creator>shashankB</dc:creator>
    <dc:date>2025-09-08T07:00:37Z</dc:date>
    <item>
      <title>Lakebridge Transpiler Fails with UnicodeDecodeError while Analyzer Works Successfully</title>
      <link>https://community.databricks.com/t5/data-engineering/lakebridge-transpiler-fails-with-unicodedecodeerror-while/m-p/131193#M49001</link>
      <description>&lt;P&gt;Hello Team,&lt;/P&gt;&lt;P&gt;I am facing an issue with the Lakebridge transpiler.&lt;BR /&gt;The Analyzer step runs successfully and produces the expected analysis files. However, when I run the Transpiler, it fails with the following error:&lt;/P&gt;&lt;P&gt;&lt;FONT color="#FF0000"&gt;ERROR [src/databricks/labs/Lakebridge.transpile] UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 71: character maps to &amp;lt;undefined&amp;gt;&lt;BR /&gt;Error: unexpected end of JSON input&lt;BR /&gt;Lakebridge Transpile failed with exit code 1&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;Command I executed:&lt;/P&gt;&lt;P&gt;&lt;FONT color="#FF0000"&gt;databricks labs lakebridge transpile --input-source "C:\Users\user_name\Downloads\segment_pioneer" --source-dialect synapse --output-folder "C:\Users\user_name\Downloads\segment_pioneer\output\Converted_Code"&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;What confuses me is that:&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;The Analyzer works fine and completes successfully.&lt;/P&gt;&lt;P&gt;The Transpiler fails immediately with an encoding-related error.&lt;/P&gt;&lt;P&gt;If there were a problem in the SQL code itself, I would expect the Analyzer to fail as well. So the failure seems related to how the transpiler reads files or paths (possibly an encoding issue on Windows).&lt;/P&gt;&lt;P&gt;Could you please help clarify:&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;Why does the Analyzer run while the Transpiler fails on the same input?&lt;/LI&gt;&lt;LI&gt;Is there a known workaround for the UnicodeDecodeError on Windows (e.g., forcing UTF-8)?&lt;/LI&gt;&lt;LI&gt;Should I try running this with a different CLI encoding setting?&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;Thanks in advance.&lt;/P&gt;</description>
      <pubDate>Mon, 08 Sep 2025 07:00:37 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/lakebridge-transpiler-fails-with-unicodedecodeerror-while/m-p/131193#M49001</guid>
      <dc:creator>shashankB</dc:creator>
      <dc:date>2025-09-08T07:00:37Z</dc:date>
    </item>
    <item>
      <title>Re: Lakebridge Transpiler Fails with UnicodeDecodeError while Analyzer Works Successfully</title>
      <link>https://community.databricks.com/t5/data-engineering/lakebridge-transpiler-fails-with-unicodedecodeerror-while/m-p/131197#M49003</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/121720"&gt;@shashankB&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;Maybe the analyzer uses encoding-tolerant methods to read files? The code is open source, so you can check it when you have time.&lt;BR /&gt;&lt;BR /&gt;Could you open your input file in VS Code and check its encoding? Also, does the input file contain any unusual characters, perhaps in comments?&lt;/P&gt;</description>
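      <content:encoded><![CDATA[
The encoding check suggested above can also be done for a whole folder at once. The following is a minimal Python sketch, not part of Lakebridge: it reads each file as raw bytes and reports the first byte that fails to decode as UTF-8 and as cp1252. The helper names `first_decode_error` and `scan_folder`, and the `*.sql` pattern, are illustrative assumptions.

```python
from pathlib import Path


def first_decode_error(data: bytes, encoding: str):
    """Return (offset, byte value) of the first undecodable byte, or None."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError as exc:
        return exc.start, data[exc.start]
    return None


def scan_folder(root: str, pattern: str = "*.sql"):
    """Yield (path, utf8_problem, cp1252_problem) for each matching file.

    The "*.sql" default is an assumption; adjust it to the input files.
    """
    for path in sorted(Path(root).rglob(pattern)):
        data = path.read_bytes()  # raw bytes, so no implicit decoding happens
        yield (
            path,
            first_decode_error(data, "utf-8"),
            first_decode_error(data, "cp1252"),
        )
```

A file that decodes cleanly as UTF-8 but reports a cp1252 problem at byte 0x8f would be consistent with the error in the original post. For example, the two bytes C3 8F (the UTF-8 encoding of Ï) are valid UTF-8, yet 0x8f has no mapping in cp1252.
]]></content:encoded>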
      <pubDate>Mon, 08 Sep 2025 07:17:08 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/lakebridge-transpiler-fails-with-unicodedecodeerror-while/m-p/131197#M49003</guid>
      <dc:creator>szymon_dybczak</dc:creator>
      <dc:date>2025-09-08T07:17:08Z</dc:date>
    </item>
    <item>
      <title>Re: Lakebridge Transpiler Fails with UnicodeDecodeError while Analyzer Works Successfully</title>
      <link>https://community.databricks.com/t5/data-engineering/lakebridge-transpiler-fails-with-unicodedecodeerror-while/m-p/131227#M49013</link>
      <description>&lt;P&gt;&lt;STRONG&gt;Root Cause&lt;/STRONG&gt;&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;The 'charmap' codec in the error points to a mismatch between the file content (likely UTF-8 or containing special characters) and the default Windows decoding (typically cp1252, in which byte 0x8f is undefined).&lt;/LI&gt;&lt;LI&gt;The trailing “unexpected end of JSON input” suggests the process aborted midway and left incomplete JSON behind.&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;STRONG&gt;Suggested Solutions&lt;/STRONG&gt;&lt;BR /&gt;1. Force UTF-8 decoding in the Transpiler&lt;/P&gt;&lt;P&gt;If you have control over the CLI or the transpiler's Python code, ensure files are opened with an explicit encoding:&lt;BR /&gt;open(filename, 'r', encoding='utf-8')&lt;/P&gt;&lt;P&gt;2. Set Python's environment to use UTF-8 by default&lt;BR /&gt;You can try running the transpiler in UTF-8 mode using:&lt;/P&gt;&lt;P&gt;py -Xutf8 -m databricks.labs.lakebridge transpile ...&lt;/P&gt;&lt;P&gt;Alternatively, setting the environment variable PYTHONUTF8=1 before running the command enables the same UTF-8 mode in Python 3.7 and later.&lt;/P&gt;&lt;P&gt;3. Convert files to UTF-8 before transpiling&lt;/P&gt;&lt;P&gt;If possible, ensure your source files are encoded in UTF-8. For files that are in cp1252, a conversion could look like this:&lt;/P&gt;&lt;P&gt;# src and dst are the input and output file paths.&lt;BR /&gt;# Assumes src is cp1252; do not run this on files that are already UTF-8.&lt;BR /&gt;# errors='ignore' silently drops any byte that cp1252 cannot decode.&lt;BR /&gt;with open(src, 'r', encoding='cp1252', errors='ignore') as f_in, \&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;open(dst, 'w', encoding='utf-8') as f_out:&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;f_out.write(f_in.read())&lt;/P&gt;&lt;P&gt;Please let me know if any of the above works.&lt;/P&gt;</description>
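      <content:encoded><![CDATA[
The conversion in step 3 assumes every file is cp1252. A more cautious variant, sketched below under the assumption that each file is either UTF-8 already or in a single legacy code page, tries UTF-8 first and only falls back when that fails. The function name `to_utf8` and its `fallback` parameter are hypothetical and not part of Lakebridge.

```python
from pathlib import Path


def to_utf8(src: str, dst: str, fallback: str = "cp1252") -> str:
    """Write a UTF-8 copy of src to dst; return the encoding src was read with."""
    data = Path(src).read_bytes()
    try:
        # "utf-8-sig" also strips a leading byte order mark if one is present.
        text, used = data.decode("utf-8-sig"), "utf-8"
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 is in the fallback code page.
        # errors="replace" marks undecodable bytes with U+FFFD instead of dropping them.
        text, used = data.decode(fallback, errors="replace"), fallback
    # newline="" writes line endings exactly as they appeared in the source.
    with open(dst, "w", encoding="utf-8", newline="") as f_out:
        f_out.write(text)
    return used
```

The returned encoding name makes it possible to log which files actually needed conversion, so files that were already valid UTF-8 can be left out of further investigation.
]]></content:encoded>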
      <pubDate>Mon, 08 Sep 2025 11:29:13 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/lakebridge-transpiler-fails-with-unicodedecodeerror-while/m-p/131227#M49013</guid>
      <dc:creator>ManojkMohan</dc:creator>
      <dc:date>2025-09-08T11:29:13Z</dc:date>
    </item>
  </channel>
</rss>

