Re: Urgent - Use Python Variable in shell command ...

shamly · ‎01-12-2023

Hi Werners,

I have a CSV file that uses a double-dagger delimiter and UTF-16 encoding. It has extra lines and spaces; some rows end with CRLF and some end with LF. So I created a shell script to handle this. Now I want to integrate this shell script with my larger Python code.

%sh tr '\n' ' ' <'/dbfs/mnt/datalake/data/file.csv' > '/dbfs/mnt/datalake/data/file_new.csv'

from pyspark.sql.functions import regexp_replace

# Read the flattened file as UTF-16 with the double-dagger delimiter.
dff = spark.read.option("header", "true") \
    .option("inferSchema", "true") \
    .option("encoding", "UTF-16") \
    .option("delimiter", "‡‡,‡‡") \
    .option("multiLine", True) \
    .csv("/mnt/datalake/data/file_new.csv")

dffs_headers = dff.dtypes

# Strip leftover double daggers from each column name and its values.
for i in dffs_headers:
    columnLabel = i[0]
    newColumnLabel = columnLabel.replace('‡‡', '')
    dff = dff.withColumn(newColumnLabel, regexp_replace(columnLabel, '^\\‡‡|\\‡‡$|\\ ‡‡', ''))
    # Drop the original column only when it was actually renamed.
    if columnLabel != newColumnLabel:
        dff = dff.drop(columnLabel)

display(dff)

Now I want to parameterise every path, which is why I wrote the widgets and the calls to get their values.
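
For what it's worth, one possible way to get Python variables into that `tr` step is to call it from Python with `subprocess` instead of a `%sh` cell. This is only a sketch under assumptions: the helper name `flatten_newlines` and the widget names in the comments are made up for illustration, and it assumes `tr` is available on the driver.

```python
import subprocess


def flatten_newlines(src_path: str, dst_path: str) -> None:
    # Equivalent of: tr '\n' ' ' < src_path > dst_path
    # The paths are ordinary Python strings, so they can come from anywhere.
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        subprocess.run(["tr", r"\n", " "], stdin=src, stdout=dst, check=True)


# Hypothetical use in a Databricks notebook (widget names are assumptions):
# dbutils.widgets.text("input_path", "/mnt/datalake/data/file.csv")
# dbutils.widgets.text("output_path", "/mnt/datalake/data/file_new.csv")
# flatten_newlines("/dbfs" + dbutils.widgets.get("input_path"),
#                  "/dbfs" + dbutils.widgets.get("output_path"))
```

The `/dbfs` prefix in the commented usage mirrors the paths above: the shell step uses the `/dbfs/mnt/...` form, while `spark.read` uses `/mnt/...`.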