Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

scrimpton
by New Contributor II
  • 4531 Views
  • 0 replies
  • 0 kudos

Permanently delete dropped table (Unity Catalog)

The recommendation before dropping a table is to do a DELETE then VACUUM RETENTION 0 (recommended in DEV). If you DROP the table without doing a DELETE|VACUUM, your table will be soft deleted with your entire data (permanently deleted in 30 days) and y...
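
A minimal sketch of the DELETE, VACUUM, DROP order described above, written as a helper that only builds the SQL strings. The helper name and the table name are hypothetical. `VACUUM ... RETAIN 0 HOURS` is the Delta Lake spelling of a zero-hour retention, and it normally requires turning off the retention-duration safety check first; treat this as something to try in DEV only.

```python
from typing import List


def purge_statements(table_name: str) -> List[str]:
    """Return, in order, the SQL statements that empty a Delta table and
    remove its data files before the table itself is dropped."""
    return [
        # Delta refuses very short retention windows unless this check is off.
        "SET spark.databricks.delta.retentionDurationCheck.enabled = false",
        # Logically remove every row.
        f"DELETE FROM {table_name}",
        # Physically remove files no longer referenced by the table.
        f"VACUUM {table_name} RETAIN 0 HOURS",
        # Finally drop the (now empty) table.
        f"DROP TABLE {table_name}",
    ]


# Hypothetical three-level Unity Catalog name, used only for illustration.
for statement in purge_statements("dev_catalog.sandbox.events"):
    print(statement)  # in a notebook this would be spark.sql(statement)
```

Keeping the statements as plain strings makes the order easy to review before anything destructive is run.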

scrimpton
by New Contributor II
  • 2777 Views
  • 0 replies
  • 0 kudos

Statistics for rearranged columns

The table property dataSkippingNumIndexedCols, which controls how many columns get statistics, counts columns from left to right. I am wondering what will happen to the statistics for both new and old records if we add a column in between using the FIRST|AFTER identifier.
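
To make the question concrete, here is a small pure-Python model of a left-to-right statistics window. It assumes that statistics for newly written files follow the current column order; it says nothing about what happens to statistics already stored for existing files, which is the open part of the question. The column names and helper functions are hypothetical.

```python
from typing import List


def indexed_columns(columns: List[str], num_indexed: int) -> List[str]:
    """Columns that fall inside the positional window (first N, left to right)."""
    return columns[:num_indexed]


def add_column_after(columns: List[str], new_col: str, after: str) -> List[str]:
    """Model of ADD COLUMN ... AFTER: insert new_col right after an existing column."""
    pos = columns.index(after) + 1
    return columns[:pos] + [new_col] + columns[pos:]


schema = ["id", "event_time", "country", "payload"]
window = 3  # stand-in for dataSkippingNumIndexedCols

before = indexed_columns(schema, window)
after = indexed_columns(add_column_after(schema, "tenant", "id"), window)

print(before)  # ['id', 'event_time', 'country']
print(after)   # ['id', 'tenant', 'event_time']
```

In this model the inserted column pushes `country` out of the window, so it would no longer be covered for files written after the change.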

marvin1
by New Contributor III
  • 648 Views
  • 0 replies
  • 0 kudos

Bamboolib error

What is the status of bamboolib? I understand that it is in public preview, but I'm unable to find any support references. I am getting the error below. I've tried installing in a notebook, on a cluster, creating a pandas dataframe and running bam, etc. ...

938452
by New Contributor III
  • 8726 Views
  • 0 replies
  • 0 kudos

Spark is not reading Kinesis Data as fast as specified

Hi Databricks community team, I have code as below:
"""
df = spark.readStream \
    .format("kinesis") \
    .option("endpointUrl", endpoint_url) \
    .option("streamName", stream_name) \
    .option("initialPosition", "latest") \
    .option("consumerMode", "efo") \
    .option("ma...

mbvb_py
by New Contributor II
  • 5199 Views
  • 4 replies
  • 0 kudos

Create cluster error: Backend service unavailable

Hello, I'm new to Databricks (Community Edition account) and encountered a problem just now. When creating a new cluster (default 10.4 LTS) it fails with the following error: Backend service unavailable. I've tried a different runtime > same issue. I've ...

Latest Reply
stefnhuy
New Contributor III
  • 0 kudos

Hey mbvb_py, I'm sorry to hear you're facing this "Backend service unavailable" issue with Databricks. I've encountered similar problems in the past, and it can be frustrating. Don't worry; you're not alone in this! From my experience, this error can o...

3 More Replies
DBEnthusiast
by New Contributor III
  • 2776 Views
  • 2 replies
  • 0 kudos

How does a Job Cluster know how many resources to assign to an application?

Hi All Enthusiasts! As per my understanding, when a user submits an application to a Spark cluster, they specify how much memory, how many executors, etc. it would need. But in Databricks notebooks we never specify that anywhere. If we have submitted the noteboo...

Latest Reply
BilalAslamDbrx
Databricks Employee
  • 0 kudos

@DBEnthusiast great question! Today, with Job Clusters, you have to specify this. As @btafur notes, you do this by setting CPU, memory etc. We are in early preview of Serverless Job Clusters where you no longer specify this configuration, instead Data...
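
As a rough illustration of "setting CPU, memory etc." for a job cluster, the sketch below builds a cluster definition of the kind passed as `new_cluster` in a job's settings. The specific runtime version, node type, and worker count are placeholder assumptions, not values from the thread.

```python
import json

# Hypothetical job-cluster definition. CPU and memory are not set directly:
# they follow from the chosen node type, multiplied by the number of workers.
new_cluster = {
    "spark_version": "13.3.x-scala2.12",  # assumed runtime version
    "node_type_id": "i3.xlarge",          # assumed instance type (fixes cores and RAM per worker)
    "num_workers": 4,                     # assumed fixed cluster size
}

# This is the fragment a job definition would carry for its cluster.
print(json.dumps({"new_cluster": new_cluster}, indent=2))
```

The notebook itself carries no resource request; the resources come from whichever cluster definition the job is configured with.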

1 More Replies
smurug
by New Contributor II
  • 8571 Views
  • 3 replies
  • 1 kudos

Databricks Job scheduling - continuous mode

While scheduling a Databricks job using continuous mode, what will happen if the job is configured to run with a Job cluster? At the end of each run, will the cluster be terminated and re-created for the next run? The official documentation is n...

Latest Reply
Jo5h
New Contributor II
  • 1 kudos

Hello @youssefmrini, so how is the DBU calculated? As the cluster is reused, the DBU should be calculated per hour on all the jobs run in an hour, correct? Or will it be calculated based on each run? I would like to know the cost calculation when runnin...

2 More Replies
Hubert-Dudek
by Esteemed Contributor III
  • 1619 Views
  • 1 replies
  • 1 kudos

Streaming Data Modeling Normalization with Databricks Delta Live Tables

Streamline Data Modeling Normalization with Databricks Delta Live Tables in Just a Few Steps:
- Use the "Apply changes" function to populate tables with slowly changing dimensions using auto-increment IDs.
- Register SQL mapping functions to associate ...

Latest Reply
jose_gonzalez
Databricks Employee
  • 1 kudos

Thank you for sharing this @Hubert-Dudek !!!

BAZA
by New Contributor III
  • 11725 Views
  • 8 replies
  • 2 kudos

Invisible empty spaces when reading .csv files

When importing a .csv file with leading and/or trailing empty spaces around the separators, the output results in strings that appear to be trimmed on the output table or when using .display() but are not actually trimmed. It is possible to identify t...
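
The effect can be reproduced outside Spark with Python's standard `csv` module. This is only an illustration with made-up sample data, not the poster's code: values that look trimmed when printed still carry the padding, which `repr()` and `len()` expose. For Spark's CSV reader, the `ignoreLeadingWhiteSpace` and `ignoreTrailingWhiteSpace` options are the usual place to look.

```python
import csv
import io

# Made-up sample with spaces around the separators.
raw = "name , city\nAlice , Paris \n Bob,Rome\n"

rows = list(csv.reader(io.StringIO(raw)))

# repr() and len() reveal padding that a rendered table tends to hide.
for row in rows[1:]:
    for value in row:
        print(repr(value), len(value), value == value.strip())

# Explicitly trim every field after parsing.
trimmed = [[value.strip() for value in row] for row in rows]
print(trimmed)
```

Comparing each value with its stripped form is a quick way to count how many fields are affected before deciding where to trim.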

Latest Reply
Raluka
New Contributor III
  • 2 kudos

Thank you so much for helping me.

7 More Replies
Nico1
by New Contributor II
  • 15929 Views
  • 11 replies
  • 2 kudos

Resolved! Problems connecting Simba ODBC with an M1 MacBook Pro

Hi, is there a way to make the Simba ODBC Driver work on M1 MacBook Pros? I can run it easily on an old Intel MacBook, but now every time I test the connection with the iODBC Manager, it fails. Definitely, the issue is around no...

Latest Reply
kunalmishra9
Contributor
  • 2 kudos

Things seem to be mostly working for me now. I've added a bit more detail on my connection steps and process in case it's helpful for anyone on Stack Overflow: https://stackoverflow.com/questions/76407426/connecting-rstudio-desktop-to-databricks-comm...

10 More Replies
DanBrown
by New Contributor
  • 3153 Views
  • 0 replies
  • 0 kudos

Remove WHERE 1=0

I am hoping someone can help me remove the WHERE 1=0 that is constantly getting added onto the end of my query (see below). Please let me know if I can provide more info here. This is running in a notebook, in Azure Databricks, against a cluster that has...

zak_k
by New Contributor III
  • 5250 Views
  • 5 replies
  • 1 kudos

com.databricks.spark.safespark.UDFException: UNAVAILABLE: Channel shutdownNow invoked

Trying to determine the root cause of a UDFException that occurs when returning a variable-length ArrayType. If I hardcode the data returned from the UDF to a fixed length, say 19, the error does not occur. Setup code: split_runs_UDF = udf(split_runs_udf, ...

Latest Reply
zak_k
New Contributor III
  • 1 kudos

After further investigation, it reproduces slightly differently in single user mode. Single user mode: runs forever. Shared: gives the above message. I've determined that there was a corner case in the dataset which led to the UDF never returning. I am as...

4 More Replies
RiyuLite
by New Contributor III
  • 2557 Views
  • 0 replies
  • 0 kudos

How to retrieve cluster IDs of deleted All Purpose clusters?

I need to retrieve the event logs of deleted All Purpose clusters of a certain workspace. The Databricks list API ({workspace_url}/api/2.0/clusters/list) provides me with the list of all active/terminated clusters but not the clusters that are deleted. I ...

