<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Re: Bamboolib with databricks, low-code programming is now available on #databricks - Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/bamboolib-with-databricks-low-code-programming-is-now-available/m-p/36161#M26063</link>
    <description>&lt;PRE&gt;I have tried to load a parquet file using the bamboolib menu, and I am getting the error below that the path does not exist.&lt;BR /&gt;I can load the same file without any problem using Spark or pandas with the following path:&lt;BR /&gt;citi_pdf = pd.read_parquet(f'/dbfs/mnt/orbify-sales-raw/WideWorldImportersDW/Dimension_City_new.parquet', engine='pyarrow')&lt;BR /&gt;&lt;BR /&gt;Does it work already, or does it still have some bugs?&lt;/PRE&gt;&lt;PRE&gt;AnalysisException: [PATH_NOT_FOUND] Path does not exist: dbfs:/dbfs/mnt/orbify-sales-raw/WideWorldImportersDW/Dimension_City_new.parquet.



Full stack trace:
-----------------------------
Traceback (most recent call last):
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/bamboolib/helper/gui_outlets.py", line 346, in safe_execution
hide_outlet = execute_function(self, *args, **kwargs)
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/bamboolib/setup/module_view.py", line 365, in open_parquet
df = exec_code(code, symbols=self.symbols, result_name=df_name)
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/bamboolib/helper/utils.py", line 446, in exec_code
exec(code, exec_symbols, exec_symbols)
File "", line 1, in
File "/databricks/spark/python/pyspark/instrumentation_utils.py", line 48, in wrapper
res = func(*args, **kwargs)
File "/databricks/spark/python/pyspark/sql/readwriter.py", line 533, in parquet
return self._df(self._jreader.parquet(_to_seq(self._spark._sc, paths)))
File "/databricks/spark/python/lib/py4j-0.10.9.5-src.zip/py4j/java_gateway.py", line 1321, in __call__
return_value = get_return_value(
File "/databricks/spark/python/pyspark/errors/exceptions.py", line 234, in deco
raise converted from None
pyspark.errors.exceptions.AnalysisException: [PATH_NOT_FOUND] Path does not exist: dbfs:/dbfs/mnt/orbify-sales-raw/WideWorldImportersDW/Dimension_City_new.parquet.&lt;/PRE&gt;</description>
    <pubDate>Thu, 29 Jun 2023 10:52:12 GMT</pubDate>
    <dc:creator>Palkers</dc:creator>
    <dc:date>2023-06-29T10:52:12Z</dc:date>
    <item>
      <title>Bamboolib with databricks, low-code programming is now available on #databricks Now you can prepare your databricks code without ... coding. Low code ...</title>
      <link>https://community.databricks.com/t5/data-engineering/bamboolib-with-databricks-low-code-programming-is-now-available/m-p/26527#M18560</link>
      <description>&lt;P&gt;Bamboolib with Databricks, low-code programming is now available on #databricks&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Now you can prepare your Databricks code without ... coding. A low-code solution is now available on Databricks. Install and import bamboolib to start (requires DBR 11 for Azure and AWS, 11.1 for GCP). You can install it with %pip or from the cluster settings -&amp;gt; “Libraries” tab:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="Picture2"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/1332i6769199CA5736BE4/image-size/large?v=v2&amp;amp;px=999" role="button" title="Picture2" alt="Picture2" /&gt;&lt;/span&gt;As we can see on the screen above, we have a few options:&lt;/P&gt;&lt;P&gt;-&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;read CSV files,&lt;/P&gt;&lt;P&gt;-&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;read Parquet files,&lt;/P&gt;&lt;P&gt;-&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;read a table from the metastore (I bet it will be the most popular option),&lt;/P&gt;&lt;P&gt;-&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;or use some example dataset with ***** data&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;We will use the example Titanic dataset.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="Picture3"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/1334iD9EF47066FC03770/image-size/large?v=v2&amp;amp;px=999" role="button" title="Picture3" alt="Picture3" /&gt;&lt;/span&gt;Now we can perform transformations and actions using the wizard. 
Below we can see the auto-generated code:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="bamboolib"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/1341i0B3FAE1E248457C3/image-size/large?v=v2&amp;amp;px=999" role="button" title="bamboolib" alt="bamboolib" /&gt;&lt;/span&gt;So, let’s assume that we select only two fields:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="Picture4"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/1347iB397608A46CEA619/image-size/large?v=v2&amp;amp;px=999" role="button" title="Picture4" alt="Picture4" /&gt;&lt;/span&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Put age into bins (0-10 years old, 10-20, 20-30, etc.):&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="Picture5"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/1353i5E4D2FF60EF6FA9C/image-size/large?v=v2&amp;amp;px=999" role="button" title="Picture5" alt="Picture5" /&gt;&lt;/span&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Group by and see the result together with the code:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="Picture6"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/1345iCCC38AAC99DA2946/image-size/large?v=v2&amp;amp;px=999" role="button" title="Picture6" alt="Picture6" /&gt;&lt;/span&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Now we can copy our code and use it in our projects. 
Remember to replace pandas with pandas on Spark so it will run in a distributed way.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;These are the example transformations available:&lt;/P&gt;&lt;P&gt;-&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;Select or drop columns,&lt;/P&gt;&lt;P&gt;-&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;Filter rows,&lt;/P&gt;&lt;P&gt;-&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;Sort rows,&lt;/P&gt;&lt;P&gt;-&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;Group by and aggregate,&lt;/P&gt;&lt;P&gt;-&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;Join / merge,&lt;/P&gt;&lt;P&gt;-&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;Change data types,&lt;/P&gt;&lt;P&gt;-&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;Change names,&lt;/P&gt;&lt;P&gt;-&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;Find and replace,&lt;/P&gt;&lt;P&gt;-&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;Conditional replace / if else,&lt;/P&gt;&lt;P&gt;-&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;Change DateTime frequency,&lt;/P&gt;&lt;P&gt;-&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;Extract DateTime,&lt;/P&gt;&lt;P&gt;-&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;Move column,&lt;/P&gt;&lt;P&gt;-&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;Bin 
column,&lt;/P&gt;&lt;P&gt;-&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;Concatenate,&lt;/P&gt;&lt;P&gt;-&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;Pivot,&lt;/P&gt;&lt;P&gt;-&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;Unpivot,&lt;/P&gt;&lt;P&gt;-&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;Window functions,&lt;/P&gt;&lt;P&gt;-&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;Plot creators&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks to the plot creator, we can visualize our data easily.&lt;/P&gt;&lt;P&gt;In the example below, we used a&amp;nbsp;bar plot.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="Picture7"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/1342i1EDBC25B46AFBD77/image-size/large?v=v2&amp;amp;px=999" role="button" title="Picture7" alt="Picture7" /&gt;&lt;/span&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The auto-generated code from the above example is shown below:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;import pandas as pd; import numpy as np
df = pd.read_csv(bam.titanic_csv)
&amp;nbsp;
# Step: Select columns
df = df[['Age', 'Sex']]
&amp;nbsp;
# Step: Bin column
df['Age'] = pd.cut(df['Age'], bins=[0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0], right=False, precision=0)
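# A quick aside on this binning step (a sketch with made-up sample ages, not
# from the original post): with right=False the bins are left-closed, i.e.
# [0, 10), [10, 20), ..., [70, 80), so an age equal to the last edge (80)
# falls outside every bin and becomes NaN.

```python
import pandas as pd

# Same bin edges as the generated code above, with right=False
ages = pd.Series([5.0, 10.0, 79.0, 80.0])
binned = pd.cut(
    ages,
    bins=[0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0],
    right=False,
    precision=0,
)
# A passenger aged exactly 80 is not binned (NaN); extend the last
# edge if that matters for your data.
print(binned.isna().tolist())
```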
&amp;nbsp;
# Step: Group by and aggregate
df = df.groupby(['Age', 'Sex']).agg(Sex_count=('Sex', 'count')).reset_index()
&amp;nbsp;
# Step: Change data type of Age to String/Text
df['Age'] = df['Age'].astype('string')
&amp;nbsp;
import plotly.express as px
fig = px.bar(df, x='Age', y='Sex_count', color='Sex')
fig&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 19 Oct 2022 13:22:45 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/bamboolib-with-databricks-low-code-programming-is-now-available/m-p/26527#M18560</guid>
      <dc:creator>Hubert-Dudek</dc:creator>
      <dc:date>2022-10-19T13:22:45Z</dc:date>
    </item>
    <item>
      <title>Re: Bamboolib with databricks, low-code programming is now available on #databricks Now you can prepare your databricks code without ... coding. Low code ...</title>
      <link>https://community.databricks.com/t5/data-engineering/bamboolib-with-databricks-low-code-programming-is-now-available/m-p/26528#M18561</link>
      <description>&lt;P&gt;@Hubert Dudek​&amp;nbsp;Informative article, thanks for creating &lt;/P&gt;</description>
      <pubDate>Wed, 19 Oct 2022 14:13:09 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/bamboolib-with-databricks-low-code-programming-is-now-available/m-p/26528#M18561</guid>
      <dc:creator>karthik_p</dc:creator>
      <dc:date>2022-10-19T14:13:09Z</dc:date>
    </item>
    <item>
      <title>Re: Bamboolib with databricks, low-code programming is now available on #databricks Now you can prepare your databricks code without ... coding. Low code ...</title>
      <link>https://community.databricks.com/t5/data-engineering/bamboolib-with-databricks-low-code-programming-is-now-available/m-p/26529#M18562</link>
      <description>&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Wed, 19 Oct 2022 19:56:30 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/bamboolib-with-databricks-low-code-programming-is-now-available/m-p/26529#M18562</guid>
      <dc:creator>Hubert-Dudek</dc:creator>
      <dc:date>2022-10-19T19:56:30Z</dc:date>
    </item>
    <item>
      <title>Re: Bamboolib with databricks, low-code programming is now available on #databricks Now you can prep</title>
      <link>https://community.databricks.com/t5/data-engineering/bamboolib-with-databricks-low-code-programming-is-now-available/m-p/36161#M26063</link>
      <description>&lt;PRE&gt;I have tried to load a parquet file using the bamboolib menu, and I am getting the error below that the path does not exist.&lt;BR /&gt;I can load the same file without any problem using Spark or pandas with the following path:&lt;BR /&gt;citi_pdf = pd.read_parquet(f'/dbfs/mnt/orbify-sales-raw/WideWorldImportersDW/Dimension_City_new.parquet', engine='pyarrow')&lt;BR /&gt;&lt;BR /&gt;Does it work already, or does it still have some bugs?&lt;/PRE&gt;&lt;PRE&gt;AnalysisException: [PATH_NOT_FOUND] Path does not exist: dbfs:/dbfs/mnt/orbify-sales-raw/WideWorldImportersDW/Dimension_City_new.parquet.
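For reference: the doubled prefix in dbfs:/dbfs/... suggests the local fuse-mount path (/dbfs/mnt/...) was handed to the Spark reader, which addresses DBFS directly and prepends the dbfs: scheme itself. The /dbfs prefix only applies to local file APIs such as pandas. A minimal sketch of the conversion (helper name hypothetical, not part of bamboolib):

```python
def to_spark_path(fuse_path: str) -> str:
    # Strip the /dbfs fuse-mount prefix, which is only valid for local
    # file APIs (pandas, open, ...), before handing the path to spark.read.
    path = fuse_path
    if path.startswith("/dbfs/"):
        path = path[len("/dbfs"):]
    return "dbfs:" + path

print(to_spark_path("/dbfs/mnt/orbify-sales-raw/WideWorldImportersDW/Dimension_City_new.parquet"))
# dbfs:/mnt/orbify-sales-raw/WideWorldImportersDW/Dimension_City_new.parquet
```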



Full stack trace:
-----------------------------
Traceback (most recent call last):
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/bamboolib/helper/gui_outlets.py", line 346, in safe_execution
hide_outlet = execute_function(self, *args, **kwargs)
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/bamboolib/setup/module_view.py", line 365, in open_parquet
df = exec_code(code, symbols=self.symbols, result_name=df_name)
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/bamboolib/helper/utils.py", line 446, in exec_code
exec(code, exec_symbols, exec_symbols)
File "", line 1, in
File "/databricks/spark/python/pyspark/instrumentation_utils.py", line 48, in wrapper
res = func(*args, **kwargs)
File "/databricks/spark/python/pyspark/sql/readwriter.py", line 533, in parquet
return self._df(self._jreader.parquet(_to_seq(self._spark._sc, paths)))
File "/databricks/spark/python/lib/py4j-0.10.9.5-src.zip/py4j/java_gateway.py", line 1321, in __call__
return_value = get_return_value(
File "/databricks/spark/python/pyspark/errors/exceptions.py", line 234, in deco
raise converted from None
pyspark.errors.exceptions.AnalysisException: [PATH_NOT_FOUND] Path does not exist: dbfs:/dbfs/mnt/orbify-sales-raw/WideWorldImportersDW/Dimension_City_new.parquet.&lt;/PRE&gt;</description>
      <pubDate>Thu, 29 Jun 2023 10:52:12 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/bamboolib-with-databricks-low-code-programming-is-now-available/m-p/36161#M26063</guid>
      <dc:creator>Palkers</dc:creator>
      <dc:date>2023-06-29T10:52:12Z</dc:date>
    </item>
  </channel>
</rss>

