Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

skumarrm
by New Contributor II
  • 734 Views
  • 2 replies
  • 0 kudos

DLT PipelineID/PipelineName values from TASK1 should be passed to the TASK2 notebook (non-DLT)

TASK1 (DLT) ---> TASK2 (non-DLT). How do I pass the parameters from TASK1 to TASK2? I need to get the DLT task notebook's pipelineID and pipelineName and pass them to TASK2...

Data Engineering
dlt
DLT parameter
Latest Reply
MuthuLakshmi
Databricks Employee

@skumarrm Please try the below. Set up task parameters: in the job configuration, you can set up task parameters to pass values from one task to another. For TASK1 (DLT), ensure it outputs the PipelineID or PipelineName. Use task parameters in TASK2:...

  • 0 kudos
1 More Replies
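The reply above describes the hand-off in prose; one common mechanism for it is task values. Below is a minimal sketch under stated assumptions: the task key `TASK1` and the value keys are invented for illustration, and since a DLT pipeline task cannot call `dbutils` itself, the `set` calls are assumed to run in a notebook task that knows the pipeline's identifiers.

```python
# Hypothetical sketch: handing pipeline identifiers from one job task to another
# via task values. Task and key names are assumptions, not from the thread.

def pack_pipeline_info(pipeline_id: str, pipeline_name: str) -> dict:
    """Bundle the identifiers the downstream task needs into a plain dict."""
    return {"pipeline_id": pipeline_id, "pipeline_name": pipeline_name}

# In the upstream notebook (runs on Databricks, where `dbutils` exists):
#   info = pack_pipeline_info("<pipeline-id>", "<pipeline-name>")
#   dbutils.jobs.taskValues.set(key="pipeline_id", value=info["pipeline_id"])
#   dbutils.jobs.taskValues.set(key="pipeline_name", value=info["pipeline_name"])
#
# In the TASK2 notebook:
#   pipeline_id = dbutils.jobs.taskValues.get(
#       taskKey="TASK1", key="pipeline_id", default="", debugValue="")
```

Dynamic value references such as `{{tasks.TASK1.values.pipeline_id}}` in TASK2's parameters are an alternative way to consume the same values.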
PKD28
by New Contributor II
  • 440 Views
  • 1 reply
  • 0 kudos

Databricks cluster issue

Jobs within the all-purpose DB cluster are failing with "The Spark driver has stopped unexpectedly and is restarting. Your notebook will be automatically reattached." In the event log it says "Event_type=DRIVER_NOT_RESPONDING & MESSAGE= "Driver is up b...

Latest Reply
MuthuLakshmi
Databricks Employee

@PKD28 The error indicates that the driver memory is not enough to handle the load. Please refer to this doc for more info and for how to fix this: https://kb.databricks.com/en_US/jobs/driver-unavailable

  • 0 kudos
abduldjafar
by New Contributor
  • 725 Views
  • 1 reply
  • 0 kudos

Merge takes too long

Hi all, I performed a merge process on approximately 19 million rows using two i3.4xlarge workers. However, the process took around 20 minutes to complete. How can I further optimize this process? I have already implemented the OPTIMIZE command and us...

Latest Reply
MuthuLakshmi
Databricks Employee

@abduldjafar Use this general doc to optimize your workload based on your job analysis https://www.databricks.com/discover/pages/optimize-data-workloads-guide

  • 0 kudos
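One lever the linked guide discusses is narrowing the MERGE condition so Delta can prune files in the target scan instead of rewriting broadly. A minimal sketch of that idea, with table, column, and date values invented for illustration:

```python
# Hypothetical sketch: adding a pruning predicate to a Delta MERGE's ON clause
# so fewer target files are scanned/rewritten. All names here are assumptions.

def build_merge_sql(target: str, source: str, key: str,
                    date_col: str, min_date: str) -> str:
    """Build a MERGE statement whose ON clause also prunes by a date column."""
    return (
        f"MERGE INTO {target} AS t\n"
        f"USING {source} AS s\n"
        f"ON t.{key} = s.{key} AND t.{date_col} >= '{min_date}'\n"
        f"WHEN MATCHED THEN UPDATE SET *\n"
        f"WHEN NOT MATCHED THEN INSERT *"
    )

# On a cluster:
#   spark.sql(build_merge_sql("main.sales.orders", "updates_view",
#                             "order_id", "order_date", "2024-01-01"))
```

This only helps when the source rows are known to fall within the pruned range; otherwise matched rows outside it would be inserted as duplicates.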
bhanuteja_1
by New Contributor II
  • 324 Views
  • 1 reply
  • 0 kudos

NoClassDefFoundError: scala/Product Caused by: ClassNotFoundException: scala.Product

NoClassDefFoundError: scala/Product Caused by: ClassNotFoundException: scala.Product at the pre-import step itself. Please suggest something.

Latest Reply
VZLA
Databricks Employee

Hi @bhanuteja_1  The scala.Product class is a core class in the Scala standard library used for tuples, case classes, etc. There seems to be a class-loading problem or, more likely, a jar conflict. Are you deploying a job using custom jars, uber jars, and having de...

  • 0 kudos
bhanuteja_1
by New Contributor II
  • 560 Views
  • 1 reply
  • 0 kudos

NoClassDefFoundError: org/apache/spark/sql/SparkSession$

NoClassDefFoundError: org/apache/spark/sql/SparkSession$    at com.microsoft.nebula.common.ConfigProvider.<init>(configProvider.scala:17)    at $linef37a348949c145718a08f6b29642317b35.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$...

Latest Reply
VZLA
Databricks Employee

Hi @bhanuteja_1 , Where are you running this from? Based on the short output, it looks like a Databricks notebook, but it would be a weird error unless you have classpath overrides or jar conflicts leading to this error; it is simply ...

  • 0 kudos
qwerty1
by Contributor
  • 6412 Views
  • 7 replies
  • 19 kudos

Resolved! When will databricks runtime be released for Scala 2.13?

I see that spark fully supports Scala 2.13. I wonder why is there no databricks runtime with Scala 2.13 yet. Any plans on making this available? It would be super useful.

Latest Reply
guersam
New Contributor II

I agree with @777. As Scala 3 is getting mature and there are more real use cases with Scala 3 on Spark now, support for Scala 2.13 will be valuable to users, including us. I think the recent upgrade of the Databricks runtime from JDK 8 to 17 was one of a ...

  • 19 kudos
6 More Replies
sathya08
by New Contributor III
  • 2317 Views
  • 3 replies
  • 0 kudos

Databricks Python function achieving Parallelism

Hello everyone, I have a very basic question about Databricks Spark parallelism. I have a Python function within a for loop, so I believe this is running sequentially. The Databricks cluster is enabled with Photon and with Spark 15x; does that mean the driver...

Latest Reply
sathya08
New Contributor III

Any help here? Thanks.

  • 0 kudos
2 More Replies
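On the question in this thread: Photon accelerates Spark query execution, not driver-side Python, so a plain for loop stays sequential regardless of runtime. A minimal sketch of one way to fan out independent calls yourself (the `process` function and its inputs are invented for illustration):

```python
# Hypothetical sketch: a Python for loop on the driver runs sequentially even
# with Photon enabled. Independent, I/O-bound calls (e.g. launching Spark
# actions or REST requests) can be fanned out with a thread pool instead.
from concurrent.futures import ThreadPoolExecutor

def process(item: int) -> int:
    """Stand-in for the per-item work done inside the loop."""
    return item * item

items = [1, 2, 3, 4]

# Sequential baseline: one call at a time.
sequential = [process(i) for i in items]

# Parallel across threads; pool.map preserves input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(process, items))
```

For CPU-bound Python work, threads will not help; rewriting the loop as DataFrame operations (or a pandas UDF) so the work runs on executors is the usual Spark-native route.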
TeachingWithDat
by New Contributor II
  • 7124 Views
  • 3 replies
  • 2 kudos

I am getting this error: com.databricks.backend.common.rpc.DatabricksExceptions$SQLExecutionException: com.databricks.rpc.UnknownRemoteException: Remote exception occurred:

I am teaching a class for BYU Idaho, and every table in every database has been imploded for my class. We keep getting this error: com.databricks.backend.common.rpc.DatabricksExceptions$SQLExecutionException: com.databricks.rpc.UnknownRemoteException: ...

Latest Reply
aparna123
New Contributor II

I am facing this issue when I try to execute code. Error message: com.databricks.rpc.UnknownRemoteException: Remote exception occurred:

  • 2 kudos
2 More Replies
User16685683696
by Databricks Employee
  • 1819 Views
  • 1 reply
  • 2 kudos

Free Training: Databricks Lakehouse Fundamentals The demand for technology roles is only growing – it's projected that over 150 million jobs will ...

Free Training: Databricks Lakehouse Fundamentals. The demand for technology roles is only growing – it's projected that over 150 million jobs will be added in the next five years. Across industries and regions, this is translating to increased demand f...

Latest Reply
Eddie_AZ
New Contributor II

I watched all 4 videos but I'm getting an error when I try to take the test. How do I complete the test and get my badge?

  • 2 kudos
Gaurav_Lokhande
by New Contributor II
  • 1946 Views
  • 7 replies
  • 3 kudos

We are trying to connect to AWS RDS MySQL instance from DBX with PySpark using JDBC

We are trying to connect to AWS RDS MySQL instance from DBX with PySpark using JDBC: jdbc_df = (spark.read.format("jdbc").options(url=f"jdbc:mysql://{creds['host']}:{creds['port']}/{creds['database']}", driver="com.mysql.cj.jdbc.Driver", dbtable="(SE...

Latest Reply
arjun_kr
Databricks Employee

@Gaurav_Lokhande  With Spark JDBC usage, connectivity happens between your Databricks VPC (in your AWS account) and the RDS VPC, assuming you are using non-serverless clusters. You may need to ensure this connectivity works (e.g., via VPC peering).

  • 3 kudos
6 More Replies
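Alongside the networking check above, the read itself in this thread pushes a subquery down over JDBC. A minimal sketch of assembling those options, with all credential values as placeholders (the helper name and the `creds` keys are assumptions, not from the thread):

```python
# Hypothetical sketch: building spark.read.format("jdbc") options for an
# RDS MySQL read with a pushed-down subquery. Values are placeholders.

def mysql_jdbc_options(creds: dict, query: str) -> dict:
    """Assemble JDBC options; the subquery used as dbtable must be aliased."""
    return {
        "url": f"jdbc:mysql://{creds['host']}:{creds['port']}/{creds['database']}",
        "driver": "com.mysql.cj.jdbc.Driver",
        "dbtable": f"({query}) AS src",
        "user": creds["user"],
        "password": creds["password"],
    }

# On a cluster whose VPC can reach the RDS instance:
#   opts = mysql_jdbc_options(creds, "SELECT ...")
#   jdbc_df = spark.read.format("jdbc").options(**opts).load()
```

If the connection hangs rather than failing fast, that usually points back to the VPC routing/peering issue rather than the options themselves.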
trentlglover
by New Contributor
  • 515 Views
  • 1 reply
  • 0 kudos

Notebooks running long in workflow

I have deployed a new Databricks environment for development. I've copied a workflow from production to this environment with exactly the same compute configuration. Four notebooks that complete within minutes do not complete after 2 hours in develop...

Latest Reply
Alberto_Umana
Databricks Employee

Hi @trentlglover, It sounds like you're experiencing a significant performance issue with your notebooks in the new development environment. Here are a few potential areas to investigate: Cluster Configuration: Even though you mentioned that the comp...

  • 0 kudos
isai-ds
by New Contributor
  • 418 Views
  • 0 replies
  • 0 kudos

Salesforce LakeFlow connect - Deletion Salesforce records

Hello, I am new to Databricks and to data engineering. I am running a POC to sync data between a Salesforce sandbox and Databricks using LakeFlow Connect. I already made the connection and successfully synced data between Salesforce and Databr...

RajeshRK
by Contributor II
  • 13164 Views
  • 7 replies
  • 3 kudos

Resolved! Download event, driver, and executor logs

Hi Team, I can see logs in the Databricks console by navigating workflow -> job name -> logs. These logs are very generic, like stdout, stderr, and log4j-active.log. How do I download event, driver, and executor logs at once for a job? Regards, Rajesh.

Latest Reply
RajeshRK
Contributor II

@Kaniz Fatma​ @John Lourdu​ @Vidula Khanna​ Hi Team, I managed to download logs using the Databricks command line as below: installed the Databricks CLI on my desktop (pip install databricks-cli), then configured the Databricks cluster URL and perso...

  • 3 kudos
6 More Replies
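The accepted approach above uses the Databricks CLI against delivered logs. A minimal sketch of the same idea, assuming the cluster was configured to deliver its logs to a `dbfs:/cluster-logs/` destination (that path, and the cluster ID, are placeholders):

```python
# Hypothetical sketch: copying delivered cluster logs (event/driver/executor)
# in one go with the legacy Databricks CLI's `fs cp -r`. Paths are assumptions;
# cluster log delivery must already be enabled on the cluster.

def cluster_logs_cp_command(cluster_id: str, dest: str = "./logs") -> list:
    """Build the CLI command that recursively copies all delivered log files."""
    return ["databricks", "fs", "cp", "-r",
            f"dbfs:/cluster-logs/{cluster_id}", dest]

# With the CLI installed and configured:
#   import subprocess
#   subprocess.run(cluster_logs_cp_command("0123-456789-abcde"), check=True)
```

Without log delivery configured, only the truncated logs visible in the UI exist, so enabling delivery is the prerequisite step.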
hkmodi
by New Contributor II
  • 1787 Views
  • 3 replies
  • 0 kudos

Perform row_number() filter in autoloader

I have created an Auto Loader job that reads JSON data from S3 (files with no extension) using (cloudFiles.format, text). Now this job is supposed to run every 4 hours and read all the new data that arrived. But before writing into a Delta table...

Latest Reply
szymon_dybczak
Esteemed Contributor III

Hi @hkmodi, basically, as @daniel_sahal said, the bronze layer should reflect the source system. The silver layer is dedicated to deduplication/cleaning/enrichment of the dataset. If you still need to deduplicate at the bronze layer you have 2 options: use me...

  • 0 kudos
2 More Replies
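The row_number() filter the question asks about boils down to keep-the-latest-row-per-key before the write. A minimal sketch of the pattern, with column names (`id`, `event_ts`) assumed for illustration; the PySpark form appears in comments and the same logic is shown in plain Python:

```python
# Hypothetical sketch of the keep-latest-per-key dedup the thread discusses.
# In PySpark (column names are assumptions):
#
#   from pyspark.sql import functions as F, Window
#   w = Window.partitionBy("id").orderBy(F.col("event_ts").desc())
#   deduped = (df.withColumn("rn", F.row_number().over(w))
#                .filter("rn = 1")
#                .drop("rn"))
#
# The equivalent logic in plain Python, for illustration:

def keep_latest(rows, key="id", ts="event_ts"):
    """Return one dict per key, keeping the row with the greatest timestamp."""
    latest = {}
    for row in rows:
        k = row[key]
        if k not in latest or row[ts] > latest[k][ts]:
            latest[k] = row
    return list(latest.values())
```

Note that with a micro-batch Auto Loader job this deduplicates only within each batch; cross-batch dedup needs a MERGE into the target (or `foreachBatch`), as the replies suggest.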
vibhakar
by New Contributor
  • 5247 Views
  • 3 replies
  • 1 kudos

Not able to mount ADLS Gen2 in Data bricks

py4j.security.Py4JSecurityException: Method public com.databricks.backend.daemon.dbutils.DBUtilsCore$Result com.databricks.backend.daemon.dbutils.DBUtilsCore.mount(java.lang.String,java.lang.String,java.lang.String,java.lang.String,java.util.Map) is ...

Latest Reply
cpradeep
New Contributor III

Hi, have you sorted this issue? Can you please let me know the solution?

  • 1 kudos
2 More Replies
