Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

kskistad
by New Contributor III
  • 1953 Views
  • 0 replies
  • 1 kudos

Set and use variables in DLT pipeline notebooks

Using DLT, I have two streaming sources coming from Autoloader. Source1 contains a single row of data in the file and Source2 has thousands of rows. There is a common key column between the two sources to join them together. So far, so good. I have a ...

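A minimal sketch of the pattern described above, assuming a DLT pipeline fed by Autoloader. The paths, table names, and the key column (`key_col`) are hypothetical, and this requires a Databricks DLT environment to run:

```python
import dlt
from pyspark.sql import functions as F

@dlt.table
def source1():
    # Hypothetical Autoloader stream: one row per file
    return (spark.readStream.format("cloudFiles")
            .option("cloudFiles.format", "json")
            .load("/mnt/landing/source1"))

@dlt.table
def source2():
    # Hypothetical Autoloader stream: many rows per file
    return (spark.readStream.format("cloudFiles")
            .option("cloudFiles.format", "json")
            .load("/mnt/landing/source2"))

@dlt.table
def joined():
    # Stream-stream inner join on the shared key column
    return dlt.read_stream("source1").join(
        dlt.read_stream("source2"), on="key_col", how="inner")
```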
mikaellognseth
by New Contributor III
  • 12562 Views
  • 7 replies
  • 0 kudos

Resolved! Databricks cluster start-up: Self Bootstrap Failure

When attempting to deploy/start an Azure Databricks cluster through the UI, the following error consistently occurs: { "reason": { "code": "SELF_BOOTSTRAP_FAILURE", "parameters": { "databricks_error_message": "Self-bootstrap failure d...

Latest Reply
mikaellognseth
New Contributor III
  • 0 kudos

Hi, in our case the issue turned out to be DNS: the DNS servers set on the Databricks workspace vnet are only reachable when the "management" vnet in our setup is peered. It took a while to figure out, as the error didn't exactly give a lot of clarity...

6 More Replies
NavyaD
by New Contributor III
  • 2573 Views
  • 2 replies
  • 4 kudos

How to read a sql notebook in python notebook on workspace

I have a notebook named ecom_sellout.sql under the path notebooks/python/dataloader/queries. I have another notebook (named dataloader, under the path notebooks/python/dataloader) in which I am calling this SQL notebook. My code runs perfectly fine on re...

Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 4 kudos

Use magic commands: you can execute the SQL notebook from the Python notebook with %run, or mix Python and SQL cells in the same notebook with the %sql magic. It will work.

1 More Replies
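A sketch of the two approaches, assuming a Databricks notebook (the relative path is taken from the question; `dbutils` and `%run` only exist inside Databricks):

```python
# Option 1: %run must be alone in its own cell; it inlines the other
# notebook, so anything it defines becomes available here.
# %run ./queries/ecom_sellout

# Option 2: run the notebook as a separate job-like invocation
# (path and timeout are illustrative)
result = dbutils.notebook.run("./queries/ecom_sellout", 600)

# Option 3: skip the SQL notebook entirely and issue SQL from Python
spark.sql("SELECT 1").show()
```

The key difference: `%run` shares the calling notebook's Spark session and scope, while `dbutils.notebook.run` executes in a fresh context and only returns a string result.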
rami-lv
by New Contributor II
  • 3946 Views
  • 3 replies
  • 3 kudos

What gets overwritten when overwriting a Delta Lake table?

I just tried to write to a Delta Lake table using overwrite mode, and I found that the history is preserved. It's unclear to me how the data is overwritten, and how long the history is preserved. As they say, code is better than a thousand words: my...

Latest Reply
Ajay-Pandey
Esteemed Contributor III
  • 3 kudos

Hi @Rami ALZEBAK​, overwrite means the existing data is logically removed and the whole new dataset is written in its place; prior versions remain in the transaction log. If you want to see the history, you can use the DESCRIBE HISTORY command.

2 More Replies
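A short sketch of the overwrite-then-inspect flow, assuming a Spark session with Delta Lake (the table name `my_table` and the DataFrame `df` are hypothetical):

```python
# Overwrite replaces the table's current data, but earlier versions stay
# in the Delta transaction log until VACUUM removes their files.
df.write.format("delta").mode("overwrite").saveAsTable("my_table")

# Inspect the version history (each overwrite adds a new version)
spark.sql("DESCRIBE HISTORY my_table").show(truncate=False)

# If needed, roll back to an earlier version
spark.sql("RESTORE TABLE my_table TO VERSION AS OF 0")
```

How long history is queryable is governed by the table's retention settings (e.g. `delta.logRetentionDuration`), not by the overwrite itself.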
Chris_Shehu
by Valued Contributor III
  • 1422 Views
  • 1 replies
  • 2 kudos

What are the options for extracting data from the delta lake for a vendor?

Our vendor is looking to use Microsoft API Manager to retrieve data from a variety of sources. Is it possible to extract records from the Delta Lake by using an API? What I've tried: I reviewed the available Databricks APIs; it looks like most of them ...

Latest Reply
Chris_Shehu
Valued Contributor III
  • 2 kudos

Another possibility is to stand up a cluster and run a notebook with Flask to create an API interface. I'm still looking into options, but it seems like there should be a built-in solution besides the JDBC connector. I'm not ...

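A minimal sketch of the Flask-in-a-notebook idea, assuming a Databricks notebook where `spark` is available. The endpoint and table name are hypothetical, and this is not a hardened service (no auth, no paging):

```python
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/sales")  # hypothetical endpoint over a hypothetical Delta table
def sales():
    # Query the Delta table via the notebook's Spark session
    rows = spark.sql("SELECT id, amount FROM sales LIMIT 100").collect()
    return jsonify([row.asDict() for row in rows])

# app.run(host="0.0.0.0", port=8080)
```

For production use, Databricks SQL endpoints (via the SQL Statement Execution surface or the JDBC/ODBC drivers) are usually a better fit than a long-running notebook.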
gauthamchettiar
by New Contributor II
  • 1928 Views
  • 0 replies
  • 1 kudos

Spark always performs broadcasts irrespective of spark.sql.autoBroadcastJoinThreshold during a streaming merge operation with DeltaTable.

I am trying to do a streaming merge between delta tables using this guide - https://docs.delta.io/latest/delta-update.html#upsert-from-streaming-queries-using-foreachbatch. Our Code Sample (Java): Dataset<Row> sourceDf = sparkSession ...

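A sketch of disabling size-based automatic broadcasting for the session (requires a Spark session; shown in Python, though the setting is the same from Java):

```python
# -1 disables Spark's automatic broadcast-hash-join selection
spark.conf.set("spark.sql.autoBroadcastJoinThreshold", -1)

# Note: the MERGE executed inside foreachBatch plans its own join of the
# source micro-batch against the target table; in some Delta versions that
# internal join can still be broadcast regardless of this threshold, so
# verify with the query plan / Spark UI after changing the setting.
```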
same213
by New Contributor III
  • 5353 Views
  • 4 replies
  • 8 kudos

Is it possible to create a sqlite database and export it?

I am trying to create a sqlite database in databricks and add a few tables to it. Ultimately, I want to export this using Azure. Is this possible?

Latest Reply
same213
New Contributor III
  • 8 kudos

@Hubert Dudek​  We currently have a process in place that reads in a SQLite file. We recently transitioned to using Databricks. We were hoping to be able to create a SQLite file so we didn't have to alter the current process we have in place.

3 More Replies
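Creating the SQLite file itself needs nothing beyond the standard library, since `sqlite3` ships with Python. A minimal sketch (table name and columns are hypothetical):

```python
import sqlite3

def build_sqlite(path):
    """Create a small SQLite database file at `path`.

    On Databricks, write to local driver storage (e.g. /tmp/export.db) and
    then copy it out, e.g. dbutils.fs.cp("file:/tmp/export.db",
    "dbfs:/mnt/azure/export.db") -- those paths are hypothetical.
    """
    con = sqlite3.connect(path)
    con.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, amount REAL)")
    con.executemany("INSERT INTO sales (amount) VALUES (?)",
                    [(9.5,), (12.0,), (3.25,)])
    con.commit()
    con.close()
    return path
```

Rows from a Spark DataFrame can be fed into `executemany` after a `collect()`, provided the data fits on the driver.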
Aj2
by New Contributor III
  • 13654 Views
  • 1 replies
  • 5 kudos
Latest Reply
Ajay-Pandey
Esteemed Contributor III
  • 5 kudos

A live table or view always reflects the results of the query that defines it, including when the query defining the table or view is updated, or an input data source is updated. Like a traditional materialized view, a live table or view may be entir...

URJ24
by New Contributor II
  • 1480 Views
  • 3 replies
  • 1 kudos

I attended Data + AI World Tour Asia Pacific this week but did not receive a post-event confirmation email.

I attended Data + AI World Tour Asia Pacific this week but did not receive a post-event confirmation email. After the webinar I received a short survey and then a thank-you note for participation. But unexpectedly I did not receive any email with a feedback link ...

Latest Reply
URJ24
New Contributor II
  • 1 kudos

Emailing apacevents@databricks.com helped.

2 More Replies
antonyj453
by New Contributor II
  • 2618 Views
  • 1 replies
  • 3 kudos

How to extract a JSON object from a PySpark data frame. I was able to extract data from another column in array format using the explode function, but explode is not working for the object type; it returns a type mismatch error.

I have tried the below code to extract data that is in an array: df2 = df_deidentifieddocuments_tst.select(F.explode('annotationId').alias('annotationId')).select('annotationId.$oid') It was working fine, but it's not working for the JSON object type. Below is the colu...

Latest Reply
UmaMahesh1
Honored Contributor III
  • 3 kudos

Did you try extracting that column's data using the from_json function?

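A sketch of the `from_json` approach, assuming a Spark session and a hypothetical string column `documentId` holding an object like `{"$oid": "..."}` (the column and schema here are illustrative, not taken from the original post):

```python
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType

# Schema for the single-object column: {"$oid": "..."}
oid_schema = StructType([StructField("$oid", StringType())])

df2 = (df_deidentifieddocuments_tst
       .withColumn("documentId",
                   F.from_json(F.col("documentId").cast("string"), oid_schema))
       .select(F.col("documentId.$oid").alias("oid")))
```

Note that `explode` only applies to arrays and maps; if the column is already a struct (not a JSON string), you can select `documentId.$oid` directly with no `from_json` at all.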
gpzz
by New Contributor II
  • 2632 Views
  • 1 replies
  • 3 kudos

pyspark code error

rdd4 = rdd3.reducByKey(lambda x,y: x+y)
AttributeError: 'PipelinedRDD' object has no attribute 'reducByKey'
Pls help me out with this

Latest Reply
UmaMahesh1
Honored Contributor III
  • 3 kudos

Is it a typo, or are you really using reducByKey instead of reduceByKey?

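For reference, the corrected Spark call is `rdd4 = rdd3.reduceByKey(lambda x, y: x + y)` (note the missing "e" in the original). A pure-Python sketch of what `reduceByKey` computes, with hypothetical data:

```python
def reduce_by_key(pairs, fn):
    # Fold the values of each key with fn, mirroring RDD.reduceByKey
    # semantics on a plain list of (key, value) tuples.
    acc = {}
    for k, v in pairs:
        acc[k] = fn(acc[k], v) if k in acc else v
    return sorted(acc.items())

pairs = [("a", 1), ("b", 2), ("a", 3)]
print(reduce_by_key(pairs, lambda x, y: x + y))  # [('a', 4), ('b', 2)]
```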
Axserv
by New Contributor II
  • 2987 Views
  • 4 replies
  • 1 kudos

How do I "Earn 100 points to the Databricks Community Rewards Store" ? (As advertised on Databricks Academy)

Hello, how do I join the Databricks Community study group for 100 points, as advertised on the Databricks Academy website?

Latest Reply
Harun
Honored Contributor
  • 1 kudos

@Alex Serlovsky​ You need to earn the Lakehouse Fundamentals credential, then you can join this community group. Within 24 to 48 hours you will get 100 reward points. But as per Databricks, you need to earn the credential on or before Nov...

3 More Replies
Dave_Nithio
by Contributor
  • 1735 Views
  • 0 replies
  • 1 kudos

Natively Query Delta Lake with R

I have a large delta table that I need to analyze in native R. The only option I have currently is to query the delta table then use collect() to bring that spark dataframe into an R dataframe. Is there an alternative method that would allow me to qu...

lawrence009
by Contributor
  • 2984 Views
  • 4 replies
  • 4 kudos

Cannot CREATE TABLE with 'No Isolation Shared' cluster

Recently I ran into a number of issues running our notebooks in Interactive Mode. For example, we can't create a (Delta) table. The command runs and then idles with no apparent exception. The path is created on AWS S3 but the Delta log is never create...

Latest Reply
youssefmrini
Databricks Employee
  • 4 kudos

The admin can disable the ability to use the No Isolation Shared cluster. I recommend switching to Single User mode where UC (Unity Catalog) is activated. Don't worry, you won't need to change your code. If you encounter this kind of issue, make sure to open a tick...

3 More Replies

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group