cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Forum Posts

nolanlavender00
by New Contributor
  • 2771 Views
  • 2 replies
  • 0 kudos

How to control garbage collection while using Autoloader File Notification?

I am using Autoloader to load files from a directory. I have set up File Notification with the Event Subscription. I have a backfill interval set to 1 day and have not run the stream for a week. There should only be about ~100 new files to pick up an...

  • 2771 Views
  • 2 replies
  • 0 kudos
Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @nolanlavender008​ Thank you for posting your question in our community! We are happy to assist you.To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answ...

  • 0 kudos
1 More Replies
sagiatul
by New Contributor II
  • 3852 Views
  • 2 replies
  • 3 kudos

Databricks driver logs

I am running jobs on databricks clusters. When the cluster is running I am able to find the executor logs by going to Spark Cluster UI Master dropdown, selecting a worker and going through the stderr logs. However, once the job is finished and cluste...

image
  • 3852 Views
  • 2 replies
  • 3 kudos
Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Atul Arora​ Thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs.Please help us select the best solution by clicking on "Select As Best" if it does.Your feedback w...

  • 3 kudos
1 More Replies
Ossian
by New Contributor
  • 1533 Views
  • 1 replies
  • 0 kudos

Driver restarts and job dies after 10-20 hours (Structured Streaming)

I am running a java/jar Structured Streaming job on a single node cluster (Databricks runtime 8.3). The job contains a single query which reads records from multiple Azure Event Hubs using Spark Kafka functionality and outputs results to a mssql dat...

  • 1533 Views
  • 1 replies
  • 0 kudos
Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 0 kudos

its seems that when your nodes are increasing it is seeking for init script and it is failing so you can use reserve instances for this activity instead of spot instances it will increase your overall costor alternatively, you can use depended librar...

  • 0 kudos
Hila_DG
by New Contributor II
  • 2327 Views
  • 5 replies
  • 4 kudos

Resolved! How to proactively monitor the use of the cache for driver node?

The problem:We have a dataframe which is based on the query:SELECT * FROM Very_Big_TableThis table returns over 4 GB of data, and when we try to push the data to Power BI we get the error message:ODBC: ERROR [HY000] [Microsoft][Hardy] (35) Error from...

  • 2327 Views
  • 5 replies
  • 4 kudos
Latest Reply
Anonymous
Not applicable
  • 4 kudos

Hey @Hila Galapo​ Hope everything is going good. Just wanted to check in if you were able to resolve your issue or do you need more help? We'd love to hear from you.Thanks!

  • 4 kudos
4 More Replies
Anonymous
by Not applicable
  • 4194 Views
  • 1 replies
  • 1 kudos
  • 4194 Views
  • 1 replies
  • 1 kudos
Latest Reply
sajith_appukutt
Honored Contributor II
  • 1 kudos

To access these driver log files from the UI, you could go to the Driver Logs tab on the cluster details page. You can also configure a log delivery location for the cluster. Both worker and cluster logs are delivered to the location you specify.

  • 1 kudos
Labels