Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

naveen123
by New Contributor II
  • 2554 Views
  • 3 replies
  • 3 kudos

Previous data is getting wiped off for delta tables

I am using only an INSERT SQL query to insert the historical load, but the previous data is getting deleted. I tried a Python query as well, but the same issue persists. I am reading the data from a GCP bucket (Parquet file) and writing it to a GCP bucket (Delta file). The deleted f...

Latest Reply
jose_gonzalez
Databricks Employee
  • 3 kudos

Share your query and also look for any error messages in the driver logs. This might help us understand better what is happening.

2 More Replies
KKo
by Contributor III
  • 2738 Views
  • 2 replies
  • 2 kudos

Not seeing rewards on Canary

Hi @Christy Seto, I received the fundamentals certificate and joined the community group, but no rewards are showing on Canary yet, and it has been a week since I did both. Could you please have a look at it? Thanks in advance!

Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 2 kudos

Check today; it updates every Monday for me.

1 More Replies
db-avengers2rul
by Contributor II
  • 2175 Views
  • 1 reply
  • 0 kudos

Connect PostgreSQL to Databricks Community Edition error

Dear Team, I am trying to establish connectivity between PostgreSQL and Databricks Community Edition using a SQL notebook; however, I am encountering the error below: Error in SQL statement: IllegalArgumentException: requirement failed: Host name should not cont...

Latest Reply
db-avengers2rul
Contributor II
  • 0 kudos

@Team any suggestions?

KuldeepChitraka
by New Contributor III
  • 1815 Views
  • 1 reply
  • 5 kudos

Lakehouse table structure design

We are in the process of implementing a lakehouse using Azure Databricks. We already have a data lake in place: Azure Storage Datalake – contains containers which hold data in its native format. How we are planning: build the Bronze layer by creating bronze tables by...

Latest Reply
Rishabh-Pandey
Databricks MVP
  • 5 kudos

Hey @Kuldeep Chitrakar, since you say you do not have partitioning in the bronze tables, that is okay. But as you are going to implement partitioning in silver, what I would recommend is that for bett...

Rishabh-Pandey
by Databricks MVP
  • 5684 Views
  • 0 replies
  • 6 kudos

To connect Delta Lake with Microsoft Excel, you can use the Microsoft Power Query for Excel add-in. Power Query is a data connection tool that allows ...

To connect Delta Lake with Microsoft Excel, you can use the Microsoft Power Query for Excel add-in. Power Query is a data connection tool that allows you to connect to various data sources, including Delta Lake. Here's how to do it: Install the Micros...

supremefist
by New Contributor III
  • 8798 Views
  • 3 replies
  • 2 kudos

Resolved! New spark cluster being configured in local mode

Hi, we have two workspaces on Databricks, prod and dev. On prod, if we create a new all-purpose cluster through the web interface and go to Environment in the Spark UI, the spark.master setting is correctly set to the host IP. This results in a...

Latest Reply
scottb
New Contributor II
  • 2 kudos

I hit the same issue with the default cluster configuration on first setup: when I went to edit the cluster to add an instance profile, I was not able to save without fixing this. Thanks for the tip.

2 More Replies
SIRIGIRI
by Databricks Partner
  • 1443 Views
  • 2 replies
  • 2 kudos

sharikrishna26.medium.com

Spark DataFrame Metadata: A Spark DataFrame is structurally the same as a table. However, it does not store any schema information in the metadata store. Instead, we have a runtime metadata catalog to store the DataFrame schema information. It is simil...

Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 2 kudos

This is awesome, thanks.

1 More Replies
SIRIGIRI
by Databricks Partner
  • 3401 Views
  • 3 replies
  • 5 kudos

Availability Zone in Azure

Please find the content here: https://medium.com/@sharikrishna26/availability-zone-in-azure-52e7764357b6

Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 5 kudos

Yeah, this is awesome, thanks. But have you ever thought about what happens if no instances are left in that region? How will our jobs start? Any ideas?

2 More Replies
CBull
by New Contributor III
  • 2832 Views
  • 3 replies
  • 2 kudos

Spark Notebook to import data into Excel

Is there a way to create a notebook that will take the SQL that I want to put into the Notebook and populate Excel daily and send it to a particular person?

Latest Reply
Meghala
Valued Contributor II
  • 2 kudos

@Aviral Bhardwaj thanks for this, I needed this info.

2 More Replies
bradm0
by New Contributor III
  • 5145 Views
  • 3 replies
  • 3 kudos

Resolved! Use of badRecordsPath in COPY INTO SQL command

I'm trying to use the badRecordsPath to catch improperly formed records in a CSV file and continue loading the remainder of the file. I can get the option to work using Python like this: df = spark.read\ .format("csv")\ .option("header","true")\ .op...

Latest Reply
bradm0
New Contributor III
  • 3 kudos

Thanks. It was the inferSchema setting. I tried it with and without the SELECT and it worked both ways when I added inferSchema. Both of these worked: drop table my_db.t2; create table my_db.t2 (col1 int,col2 int); copy into my_db.t2 from (SELECT cast(...

2 More Replies
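
For the badRecordsPath thread above, the underlying idea (set malformed rows aside and keep loading the rest) can be illustrated outside Spark. What follows is a minimal plain-Python sketch using the standard `csv` module; it is not Spark's implementation of badRecordsPath or COPY INTO. The function name `split_bad_records` and the sample data are hypothetical, and the two integer columns simply mirror the `col1 int, col2 int` table from the reply.

```python
import csv
import io

# Hypothetical sample input: a header plus four data rows, two of them malformed.
SAMPLE = "col1,col2\n1,2\nx,4\n5\n6,7\n"


def split_bad_records(text, n_cols):
    """Split CSV rows into well-formed integer rows and rejected rows.

    A row is rejected when it has the wrong number of fields or when any
    field cannot be parsed as an integer. Loading continues past bad rows.
    """
    good, bad = [], []
    reader = csv.reader(io.StringIO(text))
    header = next(reader)  # first line is treated as the header
    for row in reader:
        try:
            if len(row) != n_cols:
                raise ValueError("unexpected number of fields")
            good.append([int(value) for value in row])
        except ValueError:
            bad.append(row)  # keep the raw row for later inspection
    return header, good, bad


header, good, bad = split_bad_records(SAMPLE, 2)
```

Here `good` holds the rows that parsed cleanly and `bad` holds the raw rejected rows, which is loosely analogous to the records that would be diverted to a bad-records location.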
Meghala
by Valued Contributor II
  • 2321 Views
  • 2 replies
  • 2 kudos
Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 2 kudos

Hi @S Meghala, please go through this GitHub link; you will find a good amount of material there, and you can learn more this way: https://github.com/AlexIoannides/pyspark-example-project Please select my answer as the best answer if your query is resolved. Thanks, Avira...

1 More Replies
hello_world
by Databricks Partner
  • 6064 Views
  • 7 replies
  • 3 kudos

What happens if I have both DLTs and normal tables in a single notebook?

I've just learned Delta Live Tables on Databricks Academy and have no environment to try it out. I'm wondering what happens to the pipeline if the notebook consists of both normal tables and DLTs. For example: Table A; DLT A that reads and cleans Table A; T...

Latest Reply
Rishabh-Pandey
Databricks MVP
  • 3 kudos

Hey @S L, from what you describe, you have a normal table, Table A, and a DLT table, Table B, so it will throw an error saying that your upstream table is not a streaming live table, and you need to create a streaming live table for Table A if you want to use the ou...

6 More Replies
THIAM_HUATTAN
by Valued Contributor
  • 3730 Views
  • 2 replies
  • 2 kudos

Subquery does not work in Databricks Community version?

I am testing some SQL code based on the book SQL Cookbook Second Edition, available from https://downloads.yugabyte.com/marketing-assets/O-Reilly-SQL-Cookbook-2nd-Edition-Final.pdf Based on Page 43, I am OK with the left join, as shown here: However, w...

Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 2 kudos

There should be a GitHub link for it; check there, or you can share your code and data and we can help you.

1 More Replies
MC006
by New Contributor III
  • 12284 Views
  • 4 replies
  • 2 kudos

Resolved! java.lang.NoSuchMethodError after upgrade to Databricks Runtime 11.3 LTS

Hi, I am using Databricks and want to upgrade to Databricks Runtime version 11.3 LTS, which now uses Spark 3.3. Current system environment: Operating System: Ubuntu 20.04.4 LTS; Java: Zulu 8.56.0.21-CA-linux64; Python: 3.8.10; Delta Lake: 1.1.0. Target system ...

Latest Reply
Meghala
Valued Contributor II
  • 2 kudos

Hi everyone, this information helped me, thanks.

3 More Replies
uzadude
by New Contributor III
  • 14437 Views
  • 5 replies
  • 3 kudos

Adding to PYTHONPATH in interactive Notebooks

I'm trying to set the PYTHONPATH env variable in the cluster configuration: `PYTHONPATH=/dbfs/user/blah`. But in the driver and executor envs it is probably getting overridden and I don't see it. `%sh echo $PYTHONPATH` outputs: `PYTHONPATH=/databricks/spar...

Latest Reply
uzadude
New Contributor III
  • 3 kudos

Update: At last found a (hacky) solution! In the driver I can dynamically set the sys.path in the workers with: `spark._sc._python_includes.append("/dbfs/user/blah/")`. Combine that with, in the driver:
```
%load_ext autoreload
%autoreload 2
```
and setting: `...

4 More Replies
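
The reply above adjusts the worker-side import path through a private attribute. For the driver process only, a plain `sys.path` update is a common alternative. The sketch below shows one way that might look; `add_to_sys_path` is a hypothetical helper, `/dbfs/user/blah` is the directory from the post, and nothing here changes what the executors see.

```python
import sys


def _normalize(path):
    """Drop trailing slashes so '/a/b/' and '/a/b' compare equal."""
    if not path:
        return path
    return path.rstrip("/") or "/"


def add_to_sys_path(path):
    """Append a directory to sys.path once; return True if it was added.

    Only the current (driver) Python process is affected.
    """
    target = _normalize(path)
    for entry in sys.path:
        if isinstance(entry, str) and _normalize(entry) == target:
            return False  # already importable from this directory
    sys.path.append(target)
    return True


# Directory taken from the post; it does not need to exist for this call.
added = add_to_sys_path("/dbfs/user/blah/")
```

Appending rather than prepending keeps the runtime's own packages ahead of the custom directory, which is a deliberate but debatable choice; use `sys.path.insert(0, ...)` instead if local modules should take precedence.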