Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

anibose
by New Contributor III
  • 4749 Views
  • 3 replies
  • 7 kudos

Resolved! Hands-On exercise material

Hi Friends, I am following the Databricks Customer Academy training material. I created a Databricks service in an Azure trial account and was able to launch a single-node cluster there. Could you please guide me on how to do all the hands-on exercises?

Latest Reply
anibose
New Contributor III
  • 7 kudos

Thanks Doug, I was able to locate the .dbc file; I appreciate your response. Best Regards, Anindya

2 More Replies
benydc
by New Contributor II
  • 1074 Views
  • 0 replies
  • 2 kudos

Is it possible to connect to the IPython kernel from a local client outside the Databricks cluster?

When looking in the standard output of a notebook run in a cluster, we get this message: "To connect another client to this kernel, use: /databricks/kernel-connections-dj8dj93d3d3.json". Is it possible to connect to the Databricks IPython kernel and ma...

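For context, the path the kernel prints is a standard Jupyter kernel connection file. Below is a minimal plain-Python sketch (all field values are made up) of what such a file holds and why a client outside the cluster would also need network access to the driver, not just the file:

```python
import json
import os
import tempfile

# Hypothetical contents modeled on a Jupyter kernel connection file; the real
# /databricks/kernel-connections-<id>.json on the driver has the same shape.
connection_info = {
    "shell_port": 54321,
    "iopub_port": 54322,
    "stdin_port": 54323,
    "control_port": 54324,
    "hb_port": 54325,
    "ip": "127.0.0.1",
    "key": "hypothetical-hmac-key",
    "transport": "tcp",
    "signature_scheme": "hmac-sha256",
}

path = os.path.join(tempfile.mkdtemp(), "kernel-connection.json")
with open(path, "w") as f:
    json.dump(connection_info, f)

# A Jupyter client (e.g. jupyter_client's BlockingKernelClient, via
# load_connection_file) reads these ports back. Note the ip/ports belong to
# the driver, so an external client would also need a network path to the
# driver (for instance an SSH tunnel) for this to work.
with open(path) as f:
    loaded = json.load(f)
print(loaded["transport"], loaded["shell_port"])
```

This only illustrates the file format; whether Databricks permits inbound connections to the driver's kernel ports is a separate (cluster networking) question.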
dataexplorer
by New Contributor III
  • 8031 Views
  • 6 replies
  • 5 kudos

Resolved! COPY INTO generating duplicate rows in Delta table

Hello Everyone, I'm trying to bulk load tables from a SQL Server database into ADLS as Parquet files and then load these files into Delta tables (raw/bronze). I had done a one-off history/base load, but my subsequent incremental loads (which had a d...

Latest Reply
dataexplorer
New Contributor III
  • 5 kudos

thanks for the guidance!

5 More Replies
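For anyone hitting the same duplicates: COPY INTO is idempotent at the file level, so re-loading a file it has already ingested is skipped, but an incremental extract that writes the same rows into new files will load those rows again. A plain-Python sketch of that behaviour (file names and rows are made up):

```python
# Sketch of COPY INTO's file-level idempotency (hypothetical data, plain
# Python): files already ingested are remembered and skipped, so re-running a
# load is safe -- but the *same rows* exported into a *new* file load again,
# which is one way duplicates reach a bronze table.
loaded_files = set()   # COPY INTO persists this in the target table's metadata
table_rows = []

def copy_into(files):
    """files: mapping of file path -> list of rows in that file."""
    for path, rows in files.items():
        if path in loaded_files:
            continue           # already ingested: skipped, no duplicates
        table_rows.extend(rows)
        loaded_files.add(path)

copy_into({"base/part-000.parquet": [{"id": 1}, {"id": 2}]})
copy_into({"base/part-000.parquet": [{"id": 1}, {"id": 2}]})  # re-run: skipped
copy_into({"incr/part-000.parquet": [{"id": 2}]})   # new file: id 2 loads again

print(len(table_rows))  # 3 -> id 2 is now duplicated
```

The usual fix is to make the incremental extract emit only genuinely new/changed rows, or to land into bronze and deduplicate/MERGE into the next layer.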
User16826994223
by Honored Contributor III
  • 10418 Views
  • 2 replies
  • 3 kudos

How to prevent duplicate entries from entering a Delta lake on Azure Storage

I have a DataFrame stored in Delta format in ADLS. When I try to append new updated rows to that Delta lake, duplicates appear. Is there any way I can delete the old existing record in Delta and add the new updated record? There is a uni...

Latest Reply
Ryan_Chynoweth
Esteemed Contributor
  • 3 kudos

You should use a MERGE command on this table to match records on the unique column. Delta Lake does not enforce primary keys, so if you only append, the duplicate IDs will appear. MERGE will provide the functionality you desire. https://docs.databr...

1 More Replies
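The upsert semantics described above can be sketched in plain Python (the real statement is Delta Lake's MERGE INTO; the table and column names here are made up): rows whose key matches are updated in place, rows with a new key are inserted, so the same id can never appear twice.

```python
# Plain-Python sketch of MERGE-style upsert semantics, keyed on a unique id.
target = {1: {"id": 1, "val": "old"}, 2: {"id": 2, "val": "b"}}

def merge(target, updates):
    for row in updates:
        key = row["id"]
        if key in target:
            target[key] = row   # WHEN MATCHED THEN UPDATE: replace old record
        else:
            target[key] = row   # WHEN NOT MATCHED THEN INSERT: add new record
    return target

merge(target, [{"id": 1, "val": "new"}, {"id": 3, "val": "c"}])
print(len(target), target[1]["val"])  # 3 new
```

A plain append, by contrast, would have left both the old and new id-1 rows in the table, which is exactly the duplication the question describes.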
vanessafvg
by New Contributor III
  • 3659 Views
  • 4 replies
  • 5 kudos
Latest Reply
Anonymous
Not applicable
  • 5 kudos

We're always here, even for newbie errors, @Vanessa Van Gelder! Thanks for posting, and thanks @Hubert Dudek for always being so helpful.

3 More Replies
db-avengers2rul
by Contributor II
  • 4214 Views
  • 1 reply
  • 2 kudos

Resolved! Unable to replace null with 0 in DataFrame using PySpark in Databricks notebook (Community Edition)

Hello Experts, I am unable to replace nulls with 0 in a DataFrame; please refer to the screenshot. Code used: from pyspark.sql.functions import col; emp_csv_df = emp_csv_df.na.fill(0).withColumn("Total_Sal", col('sal') + col('comm')); display(emp_csv_df). Error, desired ...

[Screenshot: unable to fill nulls with 0 in DataFrame using PySpark in Databricks, 2022-10-03 at 20.26.23]
Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 2 kudos

I bet it is not a real null but the string "null". Please check what is in the source and try replacing that string.

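Hubert's point can be illustrated without Spark: na.fill(0) only fills real nulls (None), not the literal string "null", so both cases have to be handled before arithmetic. A plain-Python sketch with made-up rows:

```python
# Why na.fill(0) can appear not to work: it only fills real nulls (None),
# not the string "null". Hypothetical emp rows, plain Python.
rows = [
    {"sal": 800, "comm": None},     # real null: na.fill(0) would handle this
    {"sal": 1600, "comm": "null"},  # the string "null": fill(0) leaves it
]

def total_sal(row):
    comm = row["comm"]
    if comm is None or comm == "null":  # cover both the null and the string
        comm = 0
    return row["sal"] + int(comm)

totals = [total_sal(r) for r in rows]
print(totals)  # [800, 1600]
```

In PySpark the string case would analogously need a replace (or a cast that turns "null" into a real null) before na.fill(0) takes effect.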
db-avengers2rul
by Contributor II
  • 2019 Views
  • 3 replies
  • 2 kudos

Resolved! Documentation - notebook not working

Dear Team, while practising a few examples I noticed the notebook below is not fetching the full dataset, and no schema is fetched: https://docs.databricks.com/_static/notebooks/widget-demo.html Can you please retry and let me know the results? N...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 2 kudos

I think https://health.data.ny.gov/api/views/myeu-hzra/rows.csv was a public dataset, but now it shows authentication_required, so this error is independent of Databricks. But the good news is that I was able to generate a new URL on the New York heal...

2 More Replies
subhransu02
by New Contributor II
  • 1172 Views
  • 2 replies
  • 2 kudos

Databricks Lakehouse Fundamentals badge not received

I have completed and passed the short assessment for Lakehouse fundamentals but I didn't receive any badge. I have also checked in credentials.databricks.com but I don't see any badge.

Latest Reply
Vartika
Databricks Employee
  • 2 kudos

Hey @Subhransu Ranjan Sankhua, thank you for reaching out! Please submit a ticket to our Training Team here: https://help.databricks.com/s/contact-us?ReqType=training and our team will get back to you shortly.

1 More Replies
Anonymous
by Not applicable
  • 6407 Views
  • 4 replies
  • 1 kudos

Constructor public com.databricks.backend.daemon.dbutils.FSUtilsParallel is not whitelisted when mounting an S3 bucket

Hello all, I'm experiencing this issue: "Constructor public com.databricks.backend.daemon.dbutils.FSUtilsParallel is not whitelisted" when I'm trying to mount an S3 bucket. %python dbutils.fs.mount("s3a://dd-databricks-staging-storage/data/staging/datalak...

Latest Reply
leonids2005
New Contributor II
  • 1 kudos

We have this problem running a cluster with 11.2 and shared access mode. Setting spark.databricks.pyspark.enablePy4JSecurity false does not help, because it says spark.databricks.pyspark.enablePy4JSecurity is not allowed when choosing this access mode. Here is ...

3 More Replies
sgarcia
by New Contributor II
  • 3584 Views
  • 4 replies
  • 1 kudos

Call scala application jar in notebook

Hi, is there any way to execute a Scala-Spark application JAR inside a notebook, without using jobs? I have different JARs for different intakes, and I want to call them from a notebook so I can call them in a parameterized way. Thanks

Latest Reply
jose_gonzalez
Databricks Employee
  • 1 kudos

Hi @Sergio Garccia, just a friendly follow-up. Do you still need help? Have you checked our docs? This might help: https://docs.databricks.com/workflows/jobs/jobs.html#jar-jobs-1

3 More Replies
djfliu
by New Contributor III
  • 1215 Views
  • 0 replies
  • 3 kudos

Getting a com.databricks.s3commit.S3CommitRejectException error when using Structured Streaming to write to a Delta table on S3

The full error is below: An error occurred while calling o95098.execute. : com.databricks.s3commit.S3CommitRejectException: rejected by server 26 times at com.databricks.s3commit.S3CommitClientImpl.commit(S3CommitClient.scala:303). It was a one-off inci...

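This thread has no replies; for transient commit rejections like this one ("rejected by server ... times"), a bounded retry with exponential backoff and jitter is one general-purpose pattern. The names below are illustrative, plain Python, not a Databricks API:

```python
import random
import time

# Hypothetical exception standing in for a transient commit rejection.
class CommitRejected(Exception):
    pass

def commit_with_retry(commit, max_attempts=5, base_delay=0.01):
    """Retry a commit callable with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return commit()
        except CommitRejected:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            # back off 2x longer each attempt, with random jitter
            time.sleep(base_delay * (2 ** attempt) * (1 + random.random()))

# Simulated commit that is rejected twice, then succeeds.
attempts = {"n": 0}
def flaky_commit():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise CommitRejected("rejected by server")
    return "committed"

result = commit_with_retry(flaky_commit)
print(result, attempts["n"])  # committed 3
```

If the rejection recurs rather than being one-off, the root cause (e.g. concurrent writers to the same table path) is worth investigating instead of retrying harder.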

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group