Data Engineering

Forum Posts

ashraf1395
by Visitor
  • 5 Views
  • 0 replies
  • 0 kudos

Starting Serverless SQL cluster on GCP

Hello there, I am trying to start a serverless Databricks SQL cluster in GCP. I am following this Databricks doc: https://docs.gcp.databricks.com/en/admin/sql/serverless.html I have checked that all my requirements are fulfilled for activating the clus...

halox6000
by New Contributor III
  • 33 Views
  • 1 replies
  • 0 kudos

How do I stop PySpark from outputting text

I am using a tqdm progress bar to monitor the amount of data records I have collected via API. I am temporarily writing them to a file in the DBFS, then uploading to a Spark DataFrame. Each time I write to a file, I get a message like 'Wrote 8873925 ...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @halox6000, To stop the progress bar output from tqdm, you can use the disable argument. Set it to True to silence any tqdm output. In fact, it will not only hide the display but also skip the progress bar calculations entirely. Here’s an examp...

  • 0 kudos
jainshasha
by New Contributor II
  • 156 Views
  • 8 replies
  • 0 kudos

Job Cluster in Databricks workflow

Hi, I have configured 20 different workflows in Databricks. All of them are configured with a job cluster, each with a different name. All 20 workflows are scheduled to run at the same time. But even after configuring a different job cluster in all of them, they run sequentially w...

Latest Reply
Wojciech_BUK
Contributor III
  • 0 kudos

@jainshasha based on the screenshot you sent, it looks like your jobs start at 12:30 and run in parallel. Why do you think your jobs are waiting for clusters?

  • 0 kudos
7 More Replies
namankhamesara
by New Contributor II
  • 12 Views
  • 0 replies
  • 0 kudos

Error while running Databricks modules

Hi Databricks Community, I am following this course provided by Databricks: https://customer-academy.databricks.com/learn/course/1266/data-engineering-with-databricks?generated_by=575333&hash=6edddab97f2f528922e2d38d8e4440cda4e5302a In this course, when I am ...

Data Engineering
databrickscommunity
MrD
by New Contributor
  • 77 Views
  • 1 replies
  • 0 kudos

Issue with autoscaling the cluster

Hi All, my job is breaking because the cluster is not able to autoscale. Below is the log. Could this be due to AWS VMs not spinning up, or to an issue with the Databricks configuration? Has anyone faced this before? TERMINATING Compute terminated. Reason:...

Latest Reply
koushiknpvs
New Contributor III
  • 0 kudos

Hey MrD, I faced this issue while running Azure VMs. Restarting and reattaching the cluster helped me. Please let me know if that works for you.

  • 0 kudos
smukhi
by New Contributor
  • 113 Views
  • 2 replies
  • 0 kudos

Encountering Error UNITY_CREDENTIAL_SCOPE_MISSING_SCOPE

As of this morning we started receiving the following error message on a Databricks job with a single Pyspark Notebook task. The job has not had any code changes in 2 months. The cluster configuration has also not changed. The last successful run of ...

Latest Reply
smukhi
New Contributor
  • 0 kudos

As advised, I double-checked that no code or cluster configuration was changed (I even got a second set of eyes on it, who confirmed the same). I was able to find a "fix" which puts a bandaid on the issue: I was able to pinpoint that the issue seems to...

  • 0 kudos
1 More Replies
Wolfoflag
by New Contributor II
  • 34 Views
  • 1 replies
  • 0 kudos

Threads vs Processes (Parallel Programming) Databricks

Hi Everyone, I am trying to implement parallel processing in Databricks, and all the resources online point to using ThreadPool from Python's multiprocessing.pool library or the concurrent.futures library. These libraries offer methods for creating async...

Latest Reply
Wojciech_BUK
Contributor III
  • 0 kudos

I am not an expert, but I have been using Databricks for a while, and I can say that Python libraries like asyncio, ThreadPool, and so on are good only for maintenance tasks, small API calls, etc. When you want to leverage s...
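To illustrate the kind of small, I/O-bound work that the thread-based libraries mentioned in this thread suit, here is a minimal sketch using the standard library's concurrent.futures.ThreadPoolExecutor. The function fetch_status and the endpoint names are hypothetical placeholders, not anything from the original post; a real task would make an actual API call.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_status(endpoint):
    # Hypothetical stand-in for a small I/O-bound call (for example, a REST request).
    return f"{endpoint}: ok"


def run_in_threads(endpoints, max_workers=4):
    # Run the calls concurrently on a thread pool.
    # Executor.map returns results in the same order as the inputs.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch_status, endpoints))


# Placeholder endpoint names, chosen only for illustration.
results = run_in_threads(["jobs", "clusters", "tables"])
print(results)
```

Threads like these all run inside one Python process, which is why the approach is usually reserved for lightweight tasks that mostly wait on I/O.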

  • 0 kudos
digui
by New Contributor
  • 1974 Views
  • 4 replies
  • 0 kudos

Issues when trying to modify log4j.properties

Hi y'all. I'm trying to export metrics and logs to AWS CloudWatch, but while following their tutorial to do so, I ended up facing this error when trying to initialize my cluster with an init script they provided. This is the part where the script fail...

Latest Reply
cool_cool_cool
New Contributor II
  • 0 kudos

@digui Did you figure out what to do? We're facing the same issue; the script works for the executors. I was thinking of adding an if that checks whether log4j.properties exists and modifies it only if it does.
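The "modify it only if it exists" idea could be sketched roughly as below. This is an illustration in Python rather than the init script from the thread, which is not shown here; the helper name append_if_exists, the temporary directory, and the property line are all assumptions made for the demo.

```python
import os
import tempfile


def append_if_exists(path, lines):
    # Append the given lines only when the target file is already present.
    # Returns True if the file was modified and False if it was skipped.
    if not os.path.isfile(path):
        return False
    with open(path, "a", encoding="utf-8") as fh:
        for line in lines:
            fh.write(line + "\n")
    return True


# Demo against a temporary directory instead of a real cluster path.
demo_dir = tempfile.mkdtemp()
demo_file = os.path.join(demo_dir, "log4j.properties")
extra = ["log4j.rootCategory=INFO, console"]  # placeholder property line

skipped = append_if_exists(demo_file, extra)   # file does not exist yet
open(demo_file, "w", encoding="utf-8").close()  # create an empty file
modified = append_if_exists(demo_file, extra)  # now the file is appended to
print(skipped, modified)
```

Returning a flag instead of raising lets the same script run on nodes where the file is missing without failing the whole initialization.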

  • 0 kudos
3 More Replies
Menegat
by Visitor
  • 42 Views
  • 1 replies
  • 0 kudos

VACUUM seems to be deleting Autoloader's log files.

Hello everyone, I have a workflow setup that updates a few Delta tables incrementally with Autoloader three times a day. Additionally, I run a separate workflow that performs VACUUM and OPTIMIZE on these tables once a week. The issue I'm facing is that...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @Menegat, It seems you’re encountering an issue with your Delta tables during incremental updates. Let’s dive into this and explore potential solutions. Delta Live Tables and Incremental Updates: Delta Live Tables allow for incremental updates...

  • 0 kudos
georgef
by Visitor
  • 41 Views
  • 1 replies
  • 0 kudos

Cannot import relative python paths

Hello, Some variations of this question have been asked before, but there doesn't seem to be an answer for the following simple use case: I have the following file structure on a Databricks Asset Bundles project: src --dir1 ----file1.py --dir2 ----file2...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @georgef, It appears that you’re encountering issues with importing modules within a Databricks Asset Bundles (DABs) project. Let’s explore some potential solutions to address this problem. Bundle Deployment and Import Paths: When deploying a ...

  • 0 kudos
ChingizK
by New Contributor II
  • 300 Views
  • 1 replies
  • 0 kudos

Workflow Failure Alert Webhooks for OpsGenie

I'm trying to set up a Workflow Job Webhook notification to send an alert to the OpsGenie REST API on job failure. We've set up Teams & Email successfully. We've created the Webhook, and when I configure "On Failure" I can see it in the JSON/YAML view. How...

Data Engineering
jobs
opsgenie
webhooks
Workflows
Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @ChingizK, Configuring the payload for OpsGenie Webhook integration is essential to ensure that the data sent to OpsGenie meets your requirements. Let’s walk through the steps: Create a Webhook Integration in OpsGenie: Go to Settings > Integra...

  • 0 kudos
lindsey
by New Contributor
  • 513 Views
  • 1 replies
  • 0 kudos

"Error: cannot read mws credentials: invalid Databricks Account configuration" on TF Destroy

I have a terraform project that creates a workspace in Databricks, assigns it to an existing metastore, then creates external location/storage credential/catalog. The apply works and all expected resources are created. However, without touching any r...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @lindsey, It seems you’re encountering an issue with Terraform and Databricks when trying to destroy resources. Let’s explore some potential solutions to address this problem: Resource Order in Terraform Configuration: Ensure that the databric...

  • 0 kudos
dlaxminaresh
by New Contributor
  • 286 Views
  • 1 replies
  • 0 kudos

What config do we use to set row groups for Delta tables on Databricks?

I have tried multiple ways to set row groups for Delta tables in a Databricks notebook, but it's not working, whereas I am able to set them properly using Spark. I tried 1. val blockSize = 1024 * 1024 * 60 spark.sparkContext.hadoopConfiguration.setInt( "dfs.bloc...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @dlaxminaresh, Setting row groups for Delta tables in Databricks can be a bit tricky, but let’s explore some options to achieve this. First, let’s address the approaches you’ve tried: Setting Block Sizes: You’ve attempted to set the block size...

  • 0 kudos