<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Urgent - Use Python Variable in shell command in databricks notebook in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/urgent-use-python-variable-in-shell-command-in-databricks/m-p/12329#M7144</link>
    <description>&lt;P&gt;Yeah, I don't think it will work.&lt;/P&gt;&lt;P&gt;When you execute a shell command, you are working at the OS level. Linux does not know about DBFS or Azure or AWS or ...&lt;/P&gt;&lt;P&gt;If you want to do this, you have to mount the data lake in Linux, and that is not easy.&lt;/P&gt;&lt;P&gt;Databricks (or the Spark application) bootstraps all this and gives you DBFS so you do not have to worry about connectivity.&lt;/P&gt;&lt;P&gt;May I ask why you want to do this with a shell command?&lt;/P&gt;</description>
    <pubDate>Thu, 12 Jan 2023 15:33:01 GMT</pubDate>
    <dc:creator>-werners-</dc:creator>
    <dc:date>2023-01-12T15:33:01Z</dc:date>
    <item>
      <title>Urgent - Use Python Variable in shell command in databricks notebook</title>
      <link>https://community.databricks.com/t5/data-engineering/urgent-use-python-variable-in-shell-command-in-databricks/m-p/12328#M7143</link>
      <description>&lt;P&gt;I am trying to read a CSV from an Azure storage account and process it with a shell script in a Databricks notebook. I want to add this shell script to my larger Python code so it can be reused for other sources. I have created widgets for the file path in Python, and I read the widget values into variables such as file_datepath. I want to use the file_datepath variable in a shell command to read the CSV from the Azure storage account.&lt;/P&gt;&lt;P&gt;I tried the code below, but it's not working.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;dbutils.widgets.text("source_storage_path", "datafolder/data/")&lt;/P&gt;&lt;P&gt;dbutils.widgets.text("file_datepath", "2022/12/")&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;source_storage_path = dbutils.widgets.get("source_storage_path")&lt;/P&gt;&lt;P&gt;file_datepath = dbutils.widgets.get("file_datepath")&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;%sh tr '\n' ' ' &amp;lt;'/dbfs/mnt/{source_storage_path}/{file_datepath}/*.csv' &amp;gt; '/dbfs/mnt/{source_storage_path}/{file_datepath}/*_new.csv'&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;But this gives me the error "No such file or directory". I tried using $ and everything else I could think of, and nothing seems to work. Please help @Sherinus @Hubert Dudek @Werner Stinckens&lt;/P&gt;</description>
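As far as I know, a %sh cell passes its text to the shell unchanged, so {file_datepath} is never filled in by Python, and a glob inside single quotes is not expanded by the shell either. One possible workaround, sketched below, is to build the path in Python, expand the glob in Python, and call tr through subprocess. The helper name flatten_newlines is hypothetical, and the sketch assumes the mount is visible under /dbfs on the driver.

```python
import glob
import os
import subprocess


def flatten_newlines(folder):
    """Replace newlines with spaces in every CSV in folder, using tr."""
    written = []
    for src in sorted(glob.glob(os.path.join(folder, "*.csv"))):
        if src.endswith("_new.csv"):
            continue  # skip outputs left over from an earlier run
        dst = src[: -len(".csv")] + "_new.csv"
        # File handles stand in for the shell's input/output redirection.
        with open(src, "rb") as fin, open(dst, "wb") as fout:
            subprocess.run(["tr", r"\n", " "], stdin=fin, stdout=fout, check=True)
        written.append(dst)
    return written


# In the notebook, with the widget values from the post:
# folder = f"/dbfs/mnt/{source_storage_path}/{file_datepath}"
# flatten_newlines(folder)
```

Because the command is passed as a list, no shell quoting is involved and the Python variables are used directly.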
      <pubDate>Thu, 12 Jan 2023 15:10:55 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/urgent-use-python-variable-in-shell-command-in-databricks/m-p/12328#M7143</guid>
      <dc:creator>shamly</dc:creator>
      <dc:date>2023-01-12T15:10:55Z</dc:date>
    </item>
    <item>
      <title>Re: Urgent - Use Python Variable in shell command in databricks notebook</title>
      <link>https://community.databricks.com/t5/data-engineering/urgent-use-python-variable-in-shell-command-in-databricks/m-p/12329#M7144</link>
      <description>&lt;P&gt;Yeah, I don't think it will work.&lt;/P&gt;&lt;P&gt;When you execute a shell command, you are working at the OS level. Linux does not know about DBFS or Azure or AWS or ...&lt;/P&gt;&lt;P&gt;If you want to do this, you have to mount the data lake in Linux, and that is not easy.&lt;/P&gt;&lt;P&gt;Databricks (or the Spark application) bootstraps all this and gives you DBFS so you do not have to worry about connectivity.&lt;/P&gt;&lt;P&gt;May I ask why you want to do this with a shell command?&lt;/P&gt;</description>
      <pubDate>Thu, 12 Jan 2023 15:33:01 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/urgent-use-python-variable-in-shell-command-in-databricks/m-p/12329#M7144</guid>
      <dc:creator>-werners-</dc:creator>
      <dc:date>2023-01-12T15:33:01Z</dc:date>
    </item>
    <item>
      <title>Re: Urgent - Use Python Variable in shell command in databricks notebook</title>
      <link>https://community.databricks.com/t5/data-engineering/urgent-use-python-variable-in-shell-command-in-databricks/m-p/12330#M7145</link>
      <description>&lt;P&gt;Hi Werners,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I have a CSV file with a double-dagger (‡‡) delimiter and UTF-16 encoding. It has extra lines and spaces; some rows end with CRLF and some end with LF. So I created a shell script to handle this. Now I want to integrate this shell script with my larger Python code.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;%sh tr '\n' ' ' &amp;lt;'/dbfs/mnt/datalake/data/file.csv' &amp;gt; '/dbfs/mnt/datalake/data/file_new.csv'&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;from pyspark.sql.functions import regexp_replace&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;dff = spark.read.option("header", "true") \&lt;/P&gt;&lt;P&gt;.option("inferSchema", "true") \&lt;/P&gt;&lt;P&gt;.option('encoding', 'UTF-16') \&lt;/P&gt;&lt;P&gt;.option("delimiter", "‡‡,‡‡") \&lt;/P&gt;&lt;P&gt;.option("multiLine", True) \&lt;/P&gt;&lt;P&gt;.csv("/mnt/datalake/data/file_new.csv")&lt;/P&gt;&lt;P&gt;dffs_headers = dff.dtypes&lt;/P&gt;&lt;P&gt;for i in dffs_headers:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;columnLabel = i[0]&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;# strip the ‡‡ qualifier from the column name&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;newColumnLabel = columnLabel.replace('‡‡','')&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;dff=dff.withColumn(newColumnLabel,regexp_replace(columnLabel,'^\\‡‡|\\‡‡$|\\ ‡‡',''))&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;if columnLabel != newColumnLabel:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;dff = dff.drop(columnLabel)&lt;/P&gt;&lt;P&gt;display(dff)&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Now I want to parameterise every path, which is why I added the widgets and the dbutils.widgets.get calls.&lt;/P&gt;</description>
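The mixed CRLF/LF endings and the extra blank lines could also be cleaned up in plain Python before Spark reads the file. The sketch below is only one way to do it: normalize_lines and rewrite_utf16 are hypothetical names, it assumes the file fits in driver memory and that each record sits on a single line, and unlike the tr call it keeps one record per line instead of joining everything with spaces.

```python
def normalize_lines(text):
    """Convert CRLF and lone CR to LF, then drop blank lines and edge whitespace."""
    unified = text.replace("\r\n", "\n").replace("\r", "\n")
    kept = [line.strip() for line in unified.split("\n") if line.strip()]
    return "\n".join(kept) + "\n"


def rewrite_utf16(src, dst):
    """Read a UTF-16 file, normalize its lines, and write the result as UTF-16."""
    # newline="" stops Python from translating line endings on its own.
    with open(src, encoding="utf-16", newline="") as fin:
        cleaned = normalize_lines(fin.read())
    with open(dst, "w", encoding="utf-16", newline="") as fout:
        fout.write(cleaned)
```

Decoding first means the replacement works on characters, not on the individual bytes of the UTF-16 stream.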
      <pubDate>Thu, 12 Jan 2023 16:39:53 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/urgent-use-python-variable-in-shell-command-in-databricks/m-p/12330#M7145</guid>
      <dc:creator>shamly</dc:creator>
      <dc:date>2023-01-12T16:39:53Z</dc:date>
    </item>
    <item>
      <title>Re: Urgent - Use Python Variable in shell command in databricks notebook</title>
      <link>https://community.databricks.com/t5/data-engineering/urgent-use-python-variable-in-shell-command-in-databricks/m-p/12331#M7146</link>
      <description>&lt;P&gt;No need for a shell script. With Spark and regex you can handle even the most messed-up files.&lt;/P&gt;</description>
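As a rough illustration of the regex route, this sketch splits one raw line on the ‡‡,‡‡ delimiter from the thread and strips the ‡‡ at both ends. It uses plain Python re so it runs anywhere; in Spark, similar patterns could probably be passed to split and regexp_replace from pyspark.sql.functions. The name split_dagger_line and the sample line are made up for the example.

```python
import re

DAGGER = "\u2021\u2021"  # the double-dagger pair wrapped around each field


def split_dagger_line(line):
    """Split a line shaped like ‡‡a‡‡,‡‡b‡‡ into its bare field values."""
    # Strip the qualifier at both ends (and any trailing line ending) first.
    core = re.sub(rf"^{DAGGER}|{DAGGER}\s*$", "", line)
    # Then split on the qualifier-comma-qualifier delimiter.
    return re.split(rf"{DAGGER},{DAGGER}", core)


# Made-up sample line with a CRLF ending.
fields = split_dagger_line("\u2021\u2021id\u2021\u2021,\u2021\u2021name\u2021\u2021\r\n")
```

Splitting on the full three-part delimiter keeps commas inside a field value intact.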
      <pubDate>Fri, 13 Jan 2023 08:11:47 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/urgent-use-python-variable-in-shell-command-in-databricks/m-p/12331#M7146</guid>
      <dc:creator>-werners-</dc:creator>
      <dc:date>2023-01-13T08:11:47Z</dc:date>
    </item>
    <item>
      <title>Re: Urgent - Use Python Variable in shell command in databricks notebook</title>
      <link>https://community.databricks.com/t5/data-engineering/urgent-use-python-variable-in-shell-command-in-databricks/m-p/12332#M7147</link>
      <description>&lt;P&gt;You can mount the storage account, set an environment variable, and then do the operation you want.&lt;/P&gt;</description>
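A minimal sketch of the environment-variable idea: put the Python value into the child process's environment and let the shell expand it. FILE_DATEPATH and run_with_env are hypothetical names, and the path only mirrors the widget defaults from the question. Whether a variable set through os.environ in one cell is visible to a later %sh cell is worth checking on your own cluster.

```python
import os
import subprocess


def run_with_env(command, **variables):
    """Run a shell command with extra environment variables and return its stdout."""
    env = {**os.environ, **variables}
    result = subprocess.run(
        ["sh", "-c", command],
        env=env,
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


# FILE_DATEPATH is a made-up name; the shell expands it, not Python.
expanded = run_with_env(
    'echo "/dbfs/mnt/datafolder/data/$FILE_DATEPATH"',
    FILE_DATEPATH="2022/12",
)
```

Note the double quotes in the shell command: single quotes would stop the shell from expanding the variable.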
      <pubDate>Wed, 05 Apr 2023 08:31:20 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/urgent-use-python-variable-in-shell-command-in-databricks/m-p/12332#M7147</guid>
      <dc:creator>SS2</dc:creator>
      <dc:date>2023-04-05T08:31:20Z</dc:date>
    </item>
  </channel>
</rss>

