I get an error when writing a dataframe to an s3 location: `Found invalid character(s) among " ,;{}()\n\t=" in the column names of your...` I have gone through all the columns and none of them have any special characters. Any idea how to fix this?
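One cause worth checking: the rejected characters can live in *nested* struct field names rather than top-level columns, which is easy to miss when scanning `df.columns`. As a minimal sketch of sanitizing top-level names before the write (the dataframe, column names, and s3 path below are hypothetical):

```python
import re

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Hypothetical frame with a column name Delta/Parquet rejects.
df = spark.createDataFrame([(1, "x")], ["id", "amount (usd)"])

def sanitize(name: str) -> str:
    # Replace every character from the rejected set " ,;{}()\n\t=" with "_".
    return re.sub(r"[ ,;{}()\n\t=]", "_", name)

clean_df = df.toDF(*[sanitize(c) for c in df.columns])
clean_df.write.format("delta").mode("append").save("s3://bucket/path")  # hypothetical location
```

Nested struct fields need a deeper rename (e.g. rebuilding the struct with aliased fields), but the same character set applies to them.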
I am reading data from a Kafka topic, say topic_a. I have an application, app_one, which uses Spark Streaming to read data from topic_a, and a checkpoint location, loc_a, to store the checkpoint. Now, app_one has read data up to offset 90. Can I creat...
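For context: a Structured Streaming query tracks its consumed offsets in its own checkpoint, not in a Kafka consumer group, so a second query pointed at a *different* checkpoint location starts from wherever `startingOffsets` says, not from app_one's offset 90. A minimal sketch, with hypothetical broker, checkpoint, and sink paths:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# A new query with its own checkpoint begins at startingOffsets,
# independent of what app_one has already consumed from topic_a.
stream = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "host:9092")  # hypothetical broker
    .option("subscribe", "topic_a")
    .option("startingOffsets", "earliest")
    .load()
)

query = (
    stream.writeStream.format("delta")
    .option("checkpointLocation", "s3://bucket/loc_b")  # hypothetical, separate from loc_a
    .start("s3://bucket/output")  # hypothetical sink path
)
```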
I have a table `demo_table_one` into which I want to upsert the following values:

data = [
(11111, 'CA', '2020-01-26'),
(11111, 'CA', '2020-02-26'),
(88888, 'CA', '2020-06-10'),
(88888, 'CA', '2020-05-10'),
(88888, 'WA', '2020-07-10'),
...
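An upsert into a Delta table is usually a MERGE. One caveat: MERGE fails when two source rows match the same target row, and the sample above has duplicate keys, so a sketch would first keep the latest date per key. The column names (`id`, `state_code`, `sell_date`) and the merge keys below are assumptions:

```python
from delta.tables import DeltaTable
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

data = [
    (11111, "CA", "2020-01-26"),
    (11111, "CA", "2020-02-26"),
    (88888, "CA", "2020-06-10"),
    (88888, "CA", "2020-05-10"),
    (88888, "WA", "2020-07-10"),
]
updates = spark.createDataFrame(data, ["id", "state_code", "sell_date"])  # assumed schema

# MERGE allows at most one source row per target row, so keep only the
# latest sell_date for each (id, state_code) pair.
latest = updates.groupBy("id", "state_code").agg(F.max("sell_date").alias("sell_date"))

target = DeltaTable.forName(spark, "demo_table_one")
(
    target.alias("t")
    .merge(latest.alias("u"), "t.id = u.id AND t.state_code = u.state_code")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)
```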
I have data in a Spark DataFrame and I write it to an s3 location. It has some complex datatypes like structs. When I create the table on top of the s3 location by using:

CREATE TABLE IF NOT EXISTS table_name
USING DELTA
LOCATION 's3://.../...';

Th...
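Since the question is cut off, here is the shape that normally works: Delta stores the full nested schema in its transaction log, so the CREATE TABLE over the location needs no column list. A sketch with a hypothetical struct column and path:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()
s3_location = "s3://bucket/path/demo"  # hypothetical

# Hypothetical frame with a nested struct column.
df = spark.range(3).withColumn(
    "info", F.struct(F.lit("x").alias("a"), F.col("id").alias("b"))
)
df.write.format("delta").mode("overwrite").save(s3_location)

# No schema needed: Delta reads it, struct fields included, from the log.
spark.sql(f"""
    CREATE TABLE IF NOT EXISTS table_name
    USING DELTA
    LOCATION '{s3_location}'
""")
```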
I write data to s3 like:

data.write.format("delta").mode("append").option("mergeSchema", "true").save(s3_location)

and create a partitioned table like:

CREATE TABLE IF NOT EXISTS demo_table
USING DELTA
PARTITIONED BY (column_a)
LOCATION {s3_location};

whi...
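Since the snippet cuts off before the error, one likely cause: Delta records the partition layout in the transaction log at write time, so a `PARTITIONED BY` clause over an existing location must match how the data was actually written (or be omitted). A sketch under that assumption, with a hypothetical dataframe and path:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
s3_location = "s3://bucket/path/demo_table"  # hypothetical

# Hypothetical stand-in for the question's dataframe.
data = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "column_a"])

# Write the data already partitioned by column_a; Delta records this
# layout in the transaction log at the location.
(
    data.write.format("delta")
    .mode("append")
    .option("mergeSchema", "true")
    .partitionBy("column_a")
    .save(s3_location)
)

# The CREATE TABLE can then omit PARTITIONED BY entirely; if kept,
# it must match the layout already recorded in the log.
spark.sql(f"""
    CREATE TABLE IF NOT EXISTS demo_table
    USING DELTA
    LOCATION '{s3_location}'
""")
```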
Can't I do something like this in PySpark?

deltaTable.as("original_table")
    .merge(df.as("update_table"), "original_table.state_code = update_table.state_code and original_table.attom_id = update_table.attom_id")
    .whenMatched("original_table.sell_da...