Get Started Discussions
Start your journey with Databricks by joining discussions on getting started guides, tutorials, and introductory topics. Connect with beginners and experts alike to kickstart your Databricks experience.

Forum Posts

Henrik_
by New Contributor III
  • 7195 Views
  • 2 replies
  • 0 kudos

Callback bound method error

 When executing a withColumn (running on DBR 14.3 LTS), I get this error: Error in callback <bound method UserNamespaceCommandHook.post_run_cell of <dbruntime.DatasetInfo.UserNamespaceCommandHook object at 0x7feda2b2efb0>> (for post_run_cell): How shoul...

Latest Reply
TjommeV-Vlaio
New Contributor III
  • 0 kudos

We have the same issue using a shared cluster running DBR 14.3. Code executed: dfNew = dfTmp.withColumn(HashKeyColumnName, F.sha2(F.concat_ws("||", *ColumnList), 256)) Error received: Error in callback <bound method UserNamespaceCommandHook.post_run_ce...
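
For readers unsure what the expression in this reply computes, here is a minimal pure-Python sketch of the same hash key. It assumes every column value is already a string (Spark would cast other types) and that null values are skipped, as concat_ws does. The function name hash_key is hypothetical and does not come from the thread. The sketch does not reproduce or fix the callback error.

```python
import hashlib
from typing import Optional, Sequence


def hash_key(values: Sequence[Optional[str]], sep: str = "||") -> str:
    """Approximate F.sha2(F.concat_ws(sep, *cols), 256) for one row.

    Assumption: values are strings or None. None is dropped before joining,
    mirroring how concat_ws skips nulls.
    """
    joined = sep.join(v for v in values if v is not None)
    # sha2(..., 256) returns the SHA-256 digest as a lowercase hex string.
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    print(hash_key(["2024", None, "SE"]))
```

Because nulls are skipped, a row with values ("a", None, "b") and a row with ("a", "b") produce the same key. If that matters, replacing nulls with a sentinel value before hashing is one option.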

1 More Replies
Zavi
by New Contributor
  • 1491 Views
  • 1 replies
  • 0 kudos

When is DLT going to support multiple targets?

Due to the limitation that all output data must be stored in one target, we have stopped using DLT until more flexibility is added. If anyone has a workaround, we are open to suggestions.

Latest Reply
Rafael-Ribeiro
New Contributor II
  • 0 kudos

Hi Zavi, One potential workaround is to establish multiple DLT pipelines, with each pipeline specifically configured to point to a unique target. This approach effectively allows for a diverse range of output data to be stored across various targets. T...

nikhilprajapati
by New Contributor
  • 1275 Views
  • 2 replies
  • 1 kudos

Data in the DataFrame is also deleted when we delete records from the underlying table

  Hi, We are trying to load data from a Delta table into a DataFrame (a copy of the original table). Initially the Delta table has a count of 911. The DataFrame into which the data is loaded also has the same count. Now, we are deleting some records from the delta...

Latest Reply
Hkesharwani
Contributor II
  • 1 kudos

Hi, There is a way to retain a copy of the DataFrame even if the data in the underlying table is manipulated, but it is a memory-expensive operation, so be careful when using it. df1 = spark.createDataFrame(df.rdd.map(lambda x: x), schema=df.schema) Here we a...
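
The behaviour in this thread follows from Spark DataFrames being lazily evaluated: a DataFrame describes how to read the table, and each action reads it again. The plain-Python analogy below is only an illustration of that idea, not Spark code. The names table, lazy_rows and snapshot are hypothetical stand-ins for the Delta table, the lazily evaluated DataFrame, and a materialized copy.

```python
# Stand-in for the underlying table with 911 records.
table = list(range(911))


def lazy_rows():
    """Re-reads the source every time it is called, like a lazy DataFrame action."""
    return [row for row in table]


# A materialized copy: costs extra memory but is decoupled from the source.
snapshot = list(table)

# Delete 100 records from the "table".
del table[:100]

print(len(lazy_rows()))  # reflects the delete
print(len(snapshot))     # still holds the original rows
```

The snapshot keeps all 911 rows only because it holds a full second copy in memory, which is the cost the reply warns about.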

1 More Replies
karola61
by New Contributor II
  • 1164 Views
  • 1 replies
  • 0 kudos

org.apache.spark.SparkException: Job aborted due to stage failure:

org.apache.spark.SparkException: Job aborted due to stage failure:

Latest Reply
rajeshg
New Contributor II
  • 0 kudos

Along with "Job aborted due to stage failure", if you see "slave lost...", then it is due to too little memory being allocated for executors (more cores per executor require more memory), or the other possibility is that you have used the max CPU available in the cluster and the d...

Vanshika
by New Contributor
  • 386 Views
  • 0 replies
  • 0 kudos

Databricks and Cloud Services Pricing

Hi, If I connect Databricks (trial version) with AWS/Azure/Google Cloud and then work on dashboards and Genie, will there be any minimal charges, or is it completely free to use the cloud services?

FerArribas
by Contributor
  • 1271 Views
  • 1 replies
  • 1 kudos
Latest Reply
jacovangelder
Honored Contributor
  • 1 kudos

There is no distinction to make: they are VMs and you can't choose. Databricks SQL Serverless Warehouses use K8s under the hood, though.

fperry
by New Contributor II
  • 937 Views
  • 0 replies
  • 0 kudos

Concurrent State Update from Worker Nodes Possible?

For a data processing pipeline I use structured streaming and arbitrary stateful processing. I was wondering if the partitioning over several worker nodes and thus updating the state from different worker nodes has to be considered (e.g. using a lock...

eimis_pacheco
by Contributor
  • 3039 Views
  • 3 replies
  • 1 kudos

Confused by Databricks Tips and Tricks - Optimizations regarding partitioning

Hello Community, Today I was in the Tips and Tricks - Optimizations webinar and I got confused. They said: "Don't partition tables <1TB in size and plan carefully when partitioning • Partitions should be >=1GB" Now my confusion is if this recommen...

Latest Reply
-werners-
Esteemed Contributor III
  • 1 kudos

That is partitions on disk. Defining the correct number of partitions is not that easy. One would think that more partitions are better because you can process more data in parallel. And that is true if you only have to do local transformations (no shu...
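
The rule of thumb quoted from the webinar can be written down as a small check. This is a rough sketch under stated assumptions, not official guidance: it uses binary units (1 TB = 1024^4 bytes), it compares the average on-disk partition size against the 1 GB threshold, and the function name partitioning_advice is hypothetical.

```python
TB = 1024 ** 4  # assumption: binary units; the webinar did not specify
GB = 1024 ** 3


def partitioning_advice(table_bytes: int, partition_count: int) -> str:
    """Apply the quoted rule of thumb to an on-disk table.

    table_bytes: total table size on disk.
    partition_count: number of on-disk partitions being considered.
    """
    if partition_count < 1:
        raise ValueError("partition_count must be at least 1")
    if table_bytes < TB:
        # "Don't partition tables <1TB in size"
        return "do not partition"
    if table_bytes / partition_count < GB:
        # "Partitions should be >=1GB" (checked on the average size)
        return "too many partitions"
    return "ok"


if __name__ == "__main__":
    print(partitioning_advice(500 * GB, 10))
    print(partitioning_advice(2 * TB, 4096))
    print(partitioning_advice(2 * TB, 1024))
```

Checking the average hides skew: a partition column with a few very large values and many tiny ones can pass this check while still producing partitions far below 1 GB.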

2 More Replies
Newbienewbster
by New Contributor
  • 399 Views
  • 1 replies
  • 1 kudos

How do you analyze performance

Curious to hear how you guys optimize compute. As in how you dig into the details of the Spark execution and improve?

Latest Reply
mhiltner
Databricks Employee
  • 1 kudos

That is it. Usually, people take the time it takes to run a job/query/process as their KPI. Then you start to check which processes are taking more time, drilling down one by one. Sometimes it could be a misplaced .cache(), .collect() or display() t...

Data_Analytics1
by Contributor III
  • 2687 Views
  • 2 replies
  • 0 kudos

Getting secret from Key Vault of previous version

Hi, I have added secrets in Azure Key Vault and have also updated them a few times. I need to access the current as well as a previous version of a secret in a data pipeline. dbutils.secrets.get(scope=SecretScopeName, key=KeyName) This gives me the current version of the secret. How can...

Latest Reply
johnb1
Contributor
  • 0 kudos

Hi @Retired_mod, @Data_Analytics1 Has a solution to this problem been found in the meantime? I need to access a secret with a SPECIFIC VERSION from Azure Key Vault via Databricks Secret Scope. Hence while retrieving the secret I need to pass BOTH AKV sec...

1 More Replies
BricksNewbie
by New Contributor II
  • 1868 Views
  • 1 replies
  • 0 kudos

Data Analysis with Databricks SQL Course Material

Hi Community, I am currently going through Databricks Academy's course for Data Analysis with Databricks SQL. I have downloaded the DBC course material; however, there does not seem to be any material under the docs folder. Can someone please share some l...

Latest Reply
BricksNewbie
New Contributor II
  • 0 kudos

Hi Databricks Team, any pointers would be appreciated!

