Data Engineering

Forum Posts

by naveen123 • New Contributor II

12-23-2022 10:59:06 PM

2554 Views
3 replies
3 kudos

Previous data is getting wiped off for delta tables

I am using only an INSERT SQL query to insert the historical load, but previous data is getting deleted. I tried a Python query as well, but the same issue persists. I am reading the data from a GCP bucket (Parquet file) and writing the data into a GCP bucket (Delta file). The deleted f...

Latest Reply

jose_gonzalez
Databricks Employee

12-27-2022 3:04:40 PM

3 kudos

Share your query and also look for any error messages in the driver logs. This might help us better understand what is happening.

2 More Replies

by KKo • Contributor III

12-07-2022 7:18:38 AM

2738 Views
2 replies
2 kudos

Not seeing rewards on Canary

Hi @Christy Seto, I received the fundamental certificate and joined the community group, but I am still not seeing any rewards on Canary, and it has been a week since I did both. Could you please have a look at it? Thanks in advance!

Latest Reply

Aviral-Bhardwaj
Esteemed Contributor III

12-20-2022 6:19:38 AM

2 kudos

Check today; it updates every Monday for me.

1 More Replies

by db-avengers2rul • Contributor II

12-25-2022 11:57:58 AM

2175 Views
1 reply
0 kudos

Connect to PostgreSQL to Databricks community edition error

Dear Team, I am trying to establish connectivity from Databricks Community Edition to PostgreSQL using a SQL notebook; however, I am encountering the error below. Error in SQL statement: IllegalArgumentException: requirement failed: Host name should not cont...

Latest Reply

db-avengers2rul
Contributor II

12-27-2022 12:52:28 PM

0 kudos

@Team any suggestions?
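The error message in the preview above is cut off, so the exact rule that failed is not visible here. One common cause of host-validation errors is passing a full URL where a bare host name is expected. The sketch below is only an illustration under that assumption: `build_postgres_jdbc_url` is a hypothetical helper, not part of any Databricks or Spark API, and `db.example.com` and `sales` are placeholder values. It keeps the host, port, and database separate and assembles the standard PostgreSQL JDBC URL from them.

```python
def build_postgres_jdbc_url(host: str, port: int = 5432, database: str = "postgres") -> str:
    """Assemble a PostgreSQL JDBC URL from a bare host name, a port, and a database.

    Hypothetical helper for illustration. The host check is an assumption about
    what a connector might reject; IPv6 literals are deliberately not handled.
    """
    host = host.strip()
    if not host:
        raise ValueError("host must not be empty")
    # A bare host name has no scheme, no path, and no embedded port.
    if "/" in host or ":" in host:
        raise ValueError(
            f"expected a bare host name such as 'db.example.com', got {host!r}"
        )
    return f"jdbc:postgresql://{host}:{port}/{database}"


# Placeholder values, not taken from the original post.
url = build_postgres_jdbc_url("db.example.com", 5432, "sales")
print(url)
```

Passing something like `jdbc:postgresql://db.example.com` as the host raises a `ValueError` in this sketch, which makes the mistake visible before any connection is attempted.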

by KuldeepChitraka • New Contributor III

12-27-2022 6:22:00 AM

1815 Views
1 reply
5 kudos

Lakehouse table structure design

We are in the process of implementing a lakehouse using Azure Databricks. We already have a data lake in place. Azure Storage Datalake – Contains containers which have data in its native format. How we are planning: Build Bronze layer by creating bronze tables by...

Latest Reply

Rishabh-Pandey
Databricks MVP

12-27-2022 9:23:38 AM

5 kudos

Hey @Kuldeep Chitrakar, as you say, you do not have partitioning in the bronze tables, and that is okay. But since you are going to implement partitioning in silver, what I would recommend is that for bett...

by Rishabh-Pandey • Databricks MVP

12-26-2022 11:53:27 PM

5684 Views
0 replies
6 kudos

To connect Delta Lake with Microsoft Excel, you can use the Microsoft Power Query for Excel add-in. Power Query is a data connection tool that allows ...

To connect Delta Lake with Microsoft Excel, you can use the Microsoft Power Query for Excel add-in. Power Query is a data connection tool that allows you to connect to various data sources, including Delta Lake. Here's how to do it: Install the Micros...

by supremefist • New Contributor III

03-16-2022 4:51:08 AM

8798 Views
3 replies
2 kudos

Resolved! New spark cluster being configured in local mode

Hi, We have two workspaces on Databricks, prod and dev. On prod, if we create a new all-purpose cluster through the web interface and go to Environment in the Spark UI, the spark.master setting is correctly set to be the host IP. This results in a...

Latest Reply

scottb
New Contributor II

12-26-2022 7:05:48 PM

2 kudos

I hit the same issue after choosing the default cluster configuration during initial setup: when I went to edit the cluster to add an instance profile, I could not save until I fixed this. Thanks for the tip.

2 More Replies

by SIRIGIRI • Databricks Partner

12-26-2022 8:07:02 AM

1443 Views
2 replies
2 kudos

sharikrishna26.medium.com

Spark Dataframe Metadata. Spark Dataframe is structurally the same as the table. However, it does not store any schema information in the metadata store. Instead, we have a runtime metadata catalog to store the Dataframe schema information. It is simil...

Latest Reply

Aviral-Bhardwaj
Esteemed Contributor III

12-26-2022 5:00:51 PM

2 kudos

this is awesome thanks

1 More Replies

by SIRIGIRI • Databricks Partner

12-17-2022 6:32:47 AM

3401 Views
3 replies
5 kudos

Availability Zone in Azure

Please find the content here: https://medium.com/@sharikrishna26/availability-zone-in-azure-52e7764357b6

Latest Reply

Aviral-Bhardwaj
Esteemed Contributor III

12-17-2022 10:02:39 PM

5 kudos

Yeah, this is awesome, thanks. But have you ever thought about what will happen if no instances are left in that region? How will our jobs start? Any ideas?

2 More Replies

by CBull • New Contributor III

03-17-2022 11:38:31 AM

2832 Views
3 replies
2 kudos

Spark Notebook to import data into Excel

Is there a way to create a notebook that will take the SQL that I want to put into the Notebook and populate Excel daily and send it to a particular person?

Latest Reply

Meghala
Valued Contributor II

12-26-2022 8:32:00 AM

2 kudos

@Aviral Bhardwaj thanks for this, I needed this info.

2 More Replies
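For the Excel question in this thread, one possible shape of a solution is sketched below. It is only an illustration, not the approach from the hidden replies: it assumes the query result has already been collected into plain Python rows (in a notebook that might come from something like `spark.sql(...).collect()`), writes them to a CSV file that Excel can open, and builds an email with the file attached. The helper names, addresses, and sample rows are all hypothetical, and actually sending the message (for example with `smtplib`) and scheduling the notebook as a daily job are left out.

```python
import csv
import io
from email.message import EmailMessage


def rows_to_csv_bytes(header, rows):
    """Serialize a header and rows into CSV bytes that Excel can open."""
    buf = io.StringIO(newline="")
    writer = csv.writer(buf)
    writer.writerow(header)
    writer.writerows(rows)
    # utf-8-sig prepends a byte order mark, which helps Excel detect UTF-8.
    return buf.getvalue().encode("utf-8-sig")


def build_report_email(sender, recipient, subject, csv_bytes, filename="report.csv"):
    """Build (but do not send) an email carrying the CSV as an attachment."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content("The daily report is attached.")
    msg.add_attachment(csv_bytes, maintype="text", subtype="csv", filename=filename)
    return msg


# Hypothetical sample data standing in for a collected query result.
header = ["id", "name"]
rows = [(1, "alpha"), (2, "beta")]

report = rows_to_csv_bytes(header, rows)
message = build_report_email(
    "reports@example.com", "analyst@example.com", "Daily report", report
)
print(message["Subject"], len(report))
```

A CSV attachment keeps the sketch dependency-free; producing a native `.xlsx` file would need an additional library.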

by bradm0 • New Contributor III

09-20-2022 7:51:42 AM

5145 Views
3 replies
3 kudos

Resolved! Use of badRecordsPath in COPY INTO SQL command

I'm trying to use the badRecordsPath to catch improperly formed records in a CSV file and continue loading the remainder of the file. I can get the option to work using Python like this: df = spark.read\ .format("csv")\ .option("header","true")\ .op...

Latest Reply

bradm0
New Contributor III

09-20-2022 10:10:10 AM

3 kudos

Thanks. It was the inferSchema setting. I tried it with and without the SELECT and it worked both ways when I added inferSchema. Both of these worked: drop table my_db.t2; create table my_db.t2 (col1 int,col2 int); copy into my_db.t2 from (SELECT cast(...

2 More Replies

by Meghala • Valued Contributor II

12-26-2022 6:27:48 AM

2321 Views
2 replies
2 kudos

Resolved! Good evening, everyone. Can anyone guide me on the flow and how to work with a Databricks notebook for a sample project, such as PySpark?

Latest Reply

Aviral-Bhardwaj
Esteemed Contributor III

12-26-2022 6:57:54 AM

2 kudos

Hi @S Meghala, please go through this GitHub link; you will find a good amount of material there, and you can learn more this way: https://github.com/AlexIoannides/pyspark-example-project Please select my answer as the best answer if your query is fulfilled. Thanks, Avira...

1 More Replies

by hello_world • Databricks Partner

12-24-2022 6:48:37 PM

6064 Views
7 replies
3 kudos

What happens if I have both DLTs and normal tables in a single notebook?

I've just learned Delta Live Tables on Databricks Academy and have no environment to try it out. I'm wondering what happens to the pipeline if the notebook consists of both normal tables and DLTs. For example: Table A; DLT A that reads and cleans Table A; T...

Latest Reply

Rishabh-Pandey
Databricks MVP

12-25-2022 11:15:14 PM

3 kudos

Hey @S L, as you describe it, you have a normal table, Table A, and a DLT table, Table B. That will throw an error saying your upstream table is not a streaming live table, and you need to create a streaming live table for Table A if you want to use the ou...

6 More Replies

by THIAM_HUATTAN • Valued Contributor

12-26-2022 4:59:22 AM

3730 Views
2 replies
2 kudos

Subquery does not work in Databricks Community version?

I am testing some SQL code based on the book SQL Cookbook Second Edition, available from https://downloads.yugabyte.com/marketing-assets/O-Reilly-SQL-Cookbook-2nd-Edition-Final.pdf Based on Page 43, I am OK with the left join, as shown here: However, w...

Latest Reply

Aviral-Bhardwaj
Esteemed Contributor III

12-26-2022 5:07:57 AM

2 kudos

The book must have a GitHub link; check there, or you can share your code and data and we can help you.

1 More Replies

by MC006 • New Contributor III

12-19-2022 4:45:42 AM

12284 Views
4 replies
2 kudos

Resolved! java.lang.NoSuchMethodError after upgrade to Databricks Runtime 11.3 LTS

Hi, I am using Databricks and want to upgrade to Databricks Runtime version 11.3 LTS, which now uses Spark 3.3. Current system environment: Operating System: Ubuntu 20.04.4 LTS; Java: Zulu 8.56.0.21-CA-linux64; Python: 3.8.10; Delta Lake: 1.1.0. Target system ...

Latest Reply

Meghala
Valued Contributor II

12-26-2022 6:22:10 AM

2 kudos

Hi everyone, this information helped me, thanks.

3 More Replies

by uzadude • New Contributor III

12-13-2022 3:11:20 AM

14437 Views
5 replies
3 kudos

Adding to PYTHONPATH in interactive Notebooks

I'm trying to set the PYTHONPATH env variable in the cluster configuration: `PYTHONPATH=/dbfs/user/blah`. But in the driver and executor envs it is probably getting overridden and I don't see it. `%sh echo $PYTHONPATH` outputs: `PYTHONPATH=/databricks/spar...

Latest Reply

uzadude
New Contributor III

12-13-2022 11:50:44 PM

3 kudos

Update: At last found a (hacky) solution! In the driver I can dynamically set the sys.path in the workers with: `spark._sc._python_includes.append("/dbfs/user/blah/")`. Combine that with, in the driver:
```
%load_ext autoreload
%autoreload 2
```
and setting: `...

4 More Replies
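The reply above covers the worker side through a private Spark attribute. For the driver side only, a minimal sketch of extending the module search path at runtime could look like the following. It uses nothing beyond the standard `sys` module; `add_to_sys_path` is a hypothetical helper name, and `/dbfs/user/blah` is the example path from the post. It does not change what the executors see.

```python
import sys


def add_to_sys_path(path: str, prepend: bool = False) -> bool:
    """Add `path` to sys.path once; return True if it was added, False if already present."""
    if path in sys.path:
        return False
    if prepend:
        # Prepending gives modules under `path` priority over installed packages.
        sys.path.insert(0, path)
    else:
        sys.path.append(path)
    return True


# Example path taken from the post; the directory does not need to exist
# for the entry to be added, but imports only work if it does.
added = add_to_sys_path("/dbfs/user/blah")
print(added)
```

Guarding against duplicates matters in notebooks, where the same cell is often run many times in one session.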

Databricks Community
