Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

JonathanFlint
by New Contributor
  • 2 Views
  • 0 replies
  • 0 kudos

Asset bundle doesn't sync files to workspace

I've created a completely fresh project with a completely empty workspace. Locally I have the Databricks CLI version 0.230.0 installed. I run databricks bundle init default-python. I have auth set up with a PAT generated by an account which has workspace ad...

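If a deploy seems to succeed but nothing shows up, one quick check is to list what actually landed under the bundle's workspace root. A minimal sketch, assuming the databricks-sdk Python package and the same PAT auth; the bundle root path below is a hypothetical placeholder:

```
# Sketch: list files the bundle deployed. Assumes databricks-sdk is
# installed and auth comes from the environment or ~/.databrickscfg.
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

# Hypothetical bundle root - the default target path has this shape.
bundle_root = "/Workspace/Users/me@example.com/.bundle/my_project/dev"
for obj in w.workspace.list(bundle_root, recursive=True):
    print(obj.object_type, obj.path)
```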
L1000
by New Contributor III
  • 63 Views
  • 4 replies
  • 0 kudos

DLT Serverless incremental refresh of materialized view

I have a materialized view that always does a "COMPLETE_RECOMPUTE", but I can't figure out why. I found how I can get the logs:
SELECT * FROM event_log(pipeline_id) WHERE event_type = 'planning_information' ORDER BY timestamp desc;
And for my table...

Latest Reply
L1000
New Contributor III
  • 0 kudos

I split up the materialized view into 3 separate ones:
step 1:
@dlt.table(name="step1", table_properties={"delta.enableRowTracking": "true"})
def step1():
    isolate_names = dlt.read("source").select("Name").groupBy("Name").count()
    return isolate_names
st...

3 More Replies
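For anyone else debugging a stubborn COMPLETE_RECOMPUTE: the event-log query from the post also runs from a notebook, and the approach the reply lands on is enabling row tracking on the upstream tables (as in the reply's table_properties). A minimal sketch of the log query, with a hypothetical pipeline ID:

```
# Sketch: pull the planner's reasoning from the DLT event log.
# '<pipeline-id>' is a hypothetical placeholder.
plan = spark.sql("""
    SELECT timestamp, details:planning_information
    FROM event_log('<pipeline-id>')
    WHERE event_type = 'planning_information'
    ORDER BY timestamp DESC
""")
plan.show(truncate=False)
```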
RobDineen
by New Contributor II
  • 50 Views
  • 2 replies
  • 1 kudos

Resolved! %SQL delete from temp table driving me mad

Hello there, I have a temp table where I want to remove null / empty values (see below). If there are no rows to delete, then shouldn't it just say zero rows affected?

(screenshot attached: RobDineen_0-1729682455777.png)
Latest Reply
daniel_sahal
Esteemed Contributor
  • 1 kudos

@RobDineen This should answer your question: https://community.databricks.com/t5/get-started-discussions/how-to-create-temporary-table-in-databricks/m-p/67774/highlight/true#M2956
Long story short, don't use it.

1 More Replies
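The linked answer boils down to not treating a temp view as a writable table: DELETE needs a Delta table, so either write the data to a real table first or rebuild the view without the bad rows. A minimal sketch of the latter, with hypothetical table and column names:

```
# Sketch: instead of DELETE FROM a temp view, recreate it without
# the null/empty rows. Table and column names are hypothetical.
from pyspark.sql import functions as F

df = spark.table("source_table")
clean = df.filter(F.col("value").isNotNull() & (F.col("value") != ""))
clean.createOrReplaceTempView("my_temp_view")
```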
billykimber
by Visitor
  • 7 Views
  • 0 replies
  • 0 kudos

Datamart creation

In a scenario where multiple teams access overlapping but not identical datasets from a shared data lake, is it better to create separate datamarts for each team (despite data redundancy) or to maintain a single datamart and use views for team-specif...

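One way to frame the trade-off: a single datamart with per-team views keeps one copy of the data and pushes team differences into the view layer, at the cost of shared schema governance. A minimal sketch of the view option; all object names are hypothetical:

```
# Sketch: one shared datamart table, one view per team exposing only
# the rows/columns that team needs. All names are hypothetical.
spark.sql("""
    CREATE OR REPLACE VIEW team_a.orders_v AS
    SELECT order_id, order_date, amount
    FROM shared_mart.orders
    WHERE region = 'EMEA'
""")
```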
FabianGutierrez
by Visitor
  • 17 Views
  • 1 reply
  • 0 kudos

Issue with DAB (Databricks Asset Bundle) requesting Terraform files

Hi community, since two days ago we have been receiving the following error when validating and deploying our DAB (Databricks Asset Bundle): "Error: error downloading Terraform: Get "https://releases.hashicorp.com/terraform/1.5.5/index.json": ...

Latest Reply
daniel_sahal
Esteemed Contributor
  • 0 kudos

@FabianGutierrez From where are you deploying the DAB? The error indicates some kind of connectivity issue.

sandy311
by New Contributor III
  • 2000 Views
  • 7 replies
  • 3 kudos

Resolved! Databricks asset bundle does not create new job if I change configuration of existing Databricks yam

When deploying multiple jobs using the `Databricks.yml` file via the asset bundle, the process either overwrites the same job or renames it, instead of creating separate, distinct jobs.

Latest Reply
Ncolin1999
Visitor
  • 3 kudos

@filipniziol my requirement is just to deploy notebooks to the Databricks workspace. I don't want to create any job. Can I still use Databricks Asset Bundles?

6 More Replies
olivier-soucy
by New Contributor III
  • 60 Views
  • 3 replies
  • 0 kudos

Spark Streaming foreachBatch with Databricks connect

I'm trying to use the foreachBatch method of a Spark Streaming DataFrame with databricks-connect. Given that Spark Connect support was added to `foreachBatch` in 3.5.0, I was expecting this to work. Configuration:
- DBR 15.4 (Spark 3.5.0)
- databrick...

Latest Reply
daniel_sahal
Esteemed Contributor
  • 0 kudos

@olivier-soucy Are you sure that you're using DBR 15.4 and databricks-connect 15.4.2? I've seen this issue when using databricks-connect 15.4.x with DBR 14.3LTS. Anyway, I've just tested that with the same versions you've provided and it works on my en...

2 More Replies
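For reference, the shape that should work on matching versions (per the reply, databricks-connect must line up with the cluster's DBR, 15.4.x here). A minimal sketch; the sink table name is hypothetical:

```
# Sketch: streaming foreachBatch over Databricks Connect. Assumes
# databricks-connect matching the cluster's DBR and configured auth.
from databricks.connect import DatabricksSession

spark = DatabricksSession.builder.getOrCreate()

def process_batch(batch_df, batch_id):
    # Runs per micro-batch on the remote cluster.
    batch_df.write.mode("append").saveAsTable("main.default.rate_sink")

(spark.readStream.format("rate").load()
    .writeStream
    .foreachBatch(process_batch)
    .trigger(availableNow=True)
    .start()
    .awaitTermination())
```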
Tamizh035
by Visitor
  • 37 Views
  • 2 replies
  • 0 kudos

[INSUFFICIENT_PERMISSIONS] Insufficient privileges:

While reading a CSV file using Spark and listing the files under a folder using Databricks utils, I am getting the below error: [INSUFFICIENT_PERMISSIONS] Insufficient privileges: User does not have permission SELECT on any file. SQLSTATE: 42501 File <comma...

Latest Reply
Panda
Valued Contributor
  • 0 kudos

@Tamizh035, is your file in DBFS, an external location, or a local folder? Use dbutils.fs.ls to verify that the path exists and you have access:
files = dbutils.fs.ls("dbfs:/path_to_your_file/")
display(files)

1 More Replies
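The error names the legacy ANY FILE privilege, which gates direct file reads on clusters with table ACLs. If that turns out to be the cause, an admin can grant it (or, better, move the data behind a Unity Catalog external location). A minimal sketch; the user name is a hypothetical placeholder:

```
# Sketch: grant the legacy ANY FILE privilege (admin only, and only
# relevant on hive_metastore/table-ACL setups). User is hypothetical.
spark.sql("GRANT SELECT ON ANY FILE TO `someone@example.com`")
```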
RangaSarangan
by New Contributor
  • 25 Views
  • 1 reply
  • 1 kudos

Asset Bundles pause_status Across Different Environments

Hi. Question probably around best practices, but curious if someone else has dealt with a similar situation. I have 2 Databricks workspaces - one for Dev and one for Prod. Had to be two workspaces because Azure Landing Zones had to be air gapped from e...

Latest Reply
Ajay-Pandey
Esteemed Contributor III
  • 1 kudos

Hi @RangaSarangan, we have faced the same issue and solved it using the Databricks Workflows API and a JSON file of job metadata containing each job and its respective status for each env. You can create an Azure DevOps job that runs after your CI/CD pipeline and changes the...

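A minimal sketch of the pattern the reply describes: a post-deploy step that reads per-environment job metadata and flips each job's schedule pause status through the Jobs API. Assumes the databricks-sdk package; the job ID and cron values are hypothetical:

```
# Sketch: pause/unpause a job's schedule per environment after CI/CD.
# Assumes databricks-sdk; job_id and cron values are hypothetical.
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import jobs

w = WorkspaceClient()
w.jobs.update(
    job_id=123,
    new_settings=jobs.JobSettings(
        schedule=jobs.CronSchedule(
            quartz_cron_expression="0 0 6 * * ?",
            timezone_id="UTC",
            pause_status=jobs.PauseStatus.PAUSED,  # UNPAUSED in prod
        )
    ),
)
```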
Adrianj
by New Contributor III
  • 8031 Views
  • 12 replies
  • 7 kudos

Databricks Bundles - How to select which jobs resources to deploy per target?

Hello, my team and I are experimenting with bundles; we follow the pattern of having one main file, Databricks.yml, with each job definition specified in a separate YAML file for modularization. We wonder if it is possible to select from the main Databricks....

Latest Reply
sergiopolimante
New Contributor II
  • 7 kudos

"This include array can appear only as a top-level mapping." - you can't use include inside targets. You can use sync - exclude to exclude the yml files, but if they are in the include the workflows are going to be created anyway, even if the yml fil...

11 More Replies
Stephanos
by New Contributor
  • 382 Views
  • 1 reply
  • 0 kudos

Sequencing Job Deployments with Databricks Asset Bundles

Hello Databricks Community! I'm working on a project where I need to deploy jobs in a specific sequence using Databricks Asset Bundles. Some of my jobs (let's call them coordination jobs) depend on other jobs (base jobs) and need to look up their job ...

Latest Reply
MohcineRouessi
New Contributor II
  • 0 kudos

Hey Steph, have you found anything here, please? I'm currently stuck trying to achieve the same thing.

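One building block for this kind of sequencing, whatever the orchestration ends up looking like: resolving a base job's ID by name after it is deployed, so a coordination job can reference it. A minimal sketch, assuming the databricks-sdk package; the job name is hypothetical:

```
# Sketch: look up a deployed base job's ID by name so a coordination
# job can reference it. The job name is a hypothetical placeholder.
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()
matches = list(w.jobs.list(name="base-ingestion-job"))
if matches:
    print("base job id:", matches[0].job_id)
```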
amelia1
by New Contributor II
  • 960 Views
  • 2 replies
  • 0 kudos

pyspark read data using jdbc url returns column names only

Hello, I have a remote Azure SQL warehouse serverless instance that I can access using databricks-sql-connector. I can read/write/update tables no problem. But I'm also trying to read/write/update tables using local PySpark + JDBC drivers. But when I ...

Latest Reply
infodeliberatel
  • 0 kudos

I added `UseNativeQuery=0` to the URL. It works for me.

1 More Replies
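A minimal sketch of the reply's fix in context: a plain local PySpark JDBC read against a Databricks SQL warehouse with UseNativeQuery=0 appended to the URL. Host, HTTP path, and token are hypothetical placeholders:

```
# Sketch: local PySpark JDBC read with the reply's UseNativeQuery=0
# workaround. All connection values are hypothetical placeholders.
jdbc_url = (
    "jdbc:databricks://<workspace-host>:443/default;"
    "transportMode=http;ssl=1;"
    "httpPath=<warehouse-http-path>;"
    "AuthMech=3;UID=token;PWD=<personal-access-token>;"
    "UseNativeQuery=0"
)

df = (spark.read.format("jdbc")
      .option("url", jdbc_url)
      .option("driver", "com.databricks.client.jdbc.Driver")
      .option("dbtable", "my_table")
      .load())
df.show()
```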
nengen
by Visitor
  • 30 Views
  • 1 reply
  • 0 kudos

Debugging difference between "task time" and execution time for SQL query

I have a pretty large SQL query that has the following stats from the query profiler:
Tasks total time: 1.93s
Executing: 27s
Based on the information in the query profiler this can be due to tasks waiting for available nodes. How should I approach this t...

Latest Reply
Stefan-Koch
Contributor
  • 0 kudos

Hi nengen, do you have more info to share, so we can help you?


Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group