Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

by chevichenk (New Contributor II)
  • 30 Views
  • 0 replies
  • 0 kudos

No userid, username, or job when making modifications on tables

Hi, everyone! I'm in this situation: I have some jobs that make changes on a particular table. I use only one user to make these modifications, but then there's a process I can't identify that also makes changes on my table. The question is, there's a re...

Labels: Data Engineering, history, jobs, userid, username
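A natural first stop for this question is the Delta transaction log itself. A minimal sketch, assuming a hypothetical table name, of pulling the identity columns out of DESCRIBE HISTORY:

```python
# Inspect the Delta transaction log to see which principal and which workload
# wrote each table version. The table name is a hypothetical placeholder.
history = spark.sql("DESCRIBE HISTORY main.default.my_table")

# userId / userName identify the principal; job, notebook, and clusterId
# (when populated) identify the workload that performed the write.
history.select(
    "version", "timestamp", "operation",
    "userId", "userName", "job", "notebook", "clusterId",
).show(truncate=False)
```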
by aliehs0510 (New Contributor)
  • 37 Views
  • 1 reply
  • 0 kudos

DLT Pipeline does not create the view but it shows up on the DLT graph

I wanted a more filtered data set from a materialized view, so I figured a view might be the solution, but it doesn't get created under the target schema; however, it shows up in the graph as part of the pipeline. Can't we use MVs as a data source for...

Latest Reply by Rishabh264 (Honored Contributor II)

Issue at Hand: You mentioned that a view is not created under the target schema but appears in the DLT graph. This situation arises due to how DLT manages views and materialized views. Possible Causes and Solutions: DLT Execution and Target Schema: In DL...

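This matches documented DLT behavior: views declared with @dlt.view exist only inside the pipeline, while tables and materialized views are published to the target schema. A minimal sketch of the distinction (dataset names are hypothetical):

```python
import dlt
from pyspark.sql import functions as F

# A DLT view is pipeline-scoped: it appears in the DLT graph but is never
# published to the target schema.
@dlt.view(name="active_rows")
def active_rows():
    return dlt.read("source_mv").filter(F.col("status") == "active")

# To get an object that is queryable under the target schema, declare a
# table (materialized view) instead; DLT publishes these to the target.
@dlt.table(name="active_rows_mv")
def active_rows_mv():
    return dlt.read("active_rows")
```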
by Jorge3 (New Contributor III)
  • 1630 Views
  • 4 replies
  • 3 kudos

Dynamic partition overwrite with Streaming Data

Hi, I'm working on a job that propagates updates from a Delta table to Parquet files (a requirement of the consumer). The data is partitioned by day (year > month > day), and the daily data is updated every hour. I'm using table read streaming w...

Latest Reply by JacintoArias (New Contributor III)

We had a similar situation, @Hubert-Dudek. We are using Delta, but we are having some problems when propagating updates via merge, as you cannot read the resulting table as a streaming source anymore... so using complete overwrite over Parquet partition...

3 More Replies
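One common pattern for this combination (a streaming Delta source feeding a partitioned Parquet sink that is refreshed hourly) is foreachBatch with dynamic partition overwrite. A sketch under those assumptions; paths and table names are placeholders:

```python
# Rewrite only the daily Parquet partitions touched by each micro-batch.
def overwrite_touched_partitions(batch_df, batch_id):
    (batch_df.write
        .mode("overwrite")
        # Dynamic mode replaces only the partitions present in this batch,
        # leaving all other year/month/day partitions in place.
        .option("partitionOverwriteMode", "dynamic")
        .partitionBy("year", "month", "day")
        .parquet("/mnt/consumer/events"))

(spark.readStream
    .table("main.default.events")
    .writeStream
    .foreachBatch(overwrite_touched_partitions)
    .option("checkpointLocation", "/mnt/consumer/_chk/events")
    .trigger(availableNow=True)
    .start())
```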
by ajithgaade (New Contributor)
  • 52 Views
  • 1 reply
  • 0 kudos

Databricks Job Params

Hi, job params override the task params (same-name params). Is there a way for task params to override the job params? Use case: job params: a = "param-1". The job has 12 tasks. 10 of them should use the job param (a = "param-1"). 2 of them should override the job param (a...

Latest Reply by Ajay-Pandey (Esteemed Contributor III)

Hi @ajithgaade, you can use the foreach activity in Databricks Workflows to achieve this. Note: the foreach activity is in Private Preview now.

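Independent of that preview feature, a common workaround is to read a task-level parameter first and fall back to the job-level one inside the notebook. A minimal sketch; the widget names are hypothetical:

```python
# Job parameter "a" reaches every task; "a_override" is a task parameter
# set only on the two tasks that need a different value.
dbutils.widgets.text("a", "")
dbutils.widgets.text("a_override", "")

# A non-empty task-level value wins over the job-level one.
a = dbutils.widgets.get("a_override") or dbutils.widgets.get("a")
print(f"effective value of a: {a}")
```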
by jeroaranda (New Contributor II)
  • 352 Views
  • 1 reply
  • 0 kudos

How to pass task name as parameter in scheduled job that will be used as a schema name in query

I want to run a parameterized SQL query in a task. Query: select * from {{client}}.catalog.table, with the client value being {{task.name}}. If client is a string parameter, it is replaced with quotes, which throws an error. If table is a dropdown list parame...

Latest Reply by Kaniz (Community Manager)

Hi @jeroaranda, when you use a string parameter in your SQL query, it’s important to ensure that the parameter value is properly quoted. If you’re directly substituting the parameter value into the query, you might encounter issues with quotes. To a...

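Because string parameters are substituted as quoted values, they cannot stand in directly for identifiers such as a schema name. One workaround in a notebook task is to interpolate the identifier yourself after validating it, as in this sketch (the widget name is hypothetical):

```python
# The task parameter "client" can be wired to {{task.name}} in the job config.
client = dbutils.widgets.get("client")

# String parameters arrive as quoted *values*; to use one as an identifier,
# interpolate it manually, validating first to avoid SQL injection.
if not client.isidentifier():
    raise ValueError(f"unexpected schema name: {client!r}")

df = spark.sql(f"SELECT * FROM {client}.catalog.table")
df.show()
```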
by RobinK (Contributor)
  • 2680 Views
  • 12 replies
  • 14 kudos

Resolved! Databricks Jobs do not run on job compute but on shared compute

Hello, since last night none of our ETL jobs in Databricks are running anymore, although we have not made any code changes. The identical jobs (deployed with Databricks Asset Bundles) run on an all-purpose cluster but fail on a job cluster. We have no...

Latest Reply by jcap (New Contributor II)

I do not believe this is solved, similar to a comment over here: https://community.databricks.com/t5/data-engineering/databrickssession-broken-for-15-1/td-p/70585
We are also seeing this error in 14.3 LTS from a simple example: from pyspark.sql.function...

11 More Replies
by NikhilK1998 (New Contributor II)
  • 967 Views
  • 1 reply
  • 1 kudos

Databricks Certification Exam Got Suspended. Require support for the same.

Hi, I applied for the Databricks Certified: Data Engineer Professional certification on 5th July 2023. The test was going fine for me, but suddenly there was an alert from the system (I think I was at a proper angle in front of the camera and was genuinely givin...

Latest Reply by Kaniz (Community Manager)

Hi @NikhilK1998, I'm sorry to hear your exam was suspended. Thank you for filing a ticket with our support team. Please allow the support team 24-48 hours to resolve it. In the meantime, you can review the following documentation: Room requirements, Beh...

by mh_db (New Contributor III)
  • 804 Views
  • 1 reply
  • 1 kudos

How to get different dynamic value for each task in workflow

I created a workflow with two tasks. It runs the first notebook and then waits for that to finish before starting the second notebook. I want to use the dynamic value {{job.start_time.iso_datetime}} as one of the parameters for both tasks. This should gi...

Latest Reply by lucasrocha (New Contributor III)

Hello @mh_db, the dynamic value {{job.start_time.iso_datetime}} you are using in your workflow is designed to capture the start time of the job run, not of the individual tasks within the job. This is why you are seeing the same date and time for both ...

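Where a per-task timestamp is needed rather than the job-level one, a minimal sketch is to capture the clock at the top of each task's notebook:

```python
from datetime import datetime, timezone

# {{job.start_time.iso_datetime}} resolves once per job run, so every task
# receives the same value. Reading the clock when the task's notebook starts
# yields a distinct, task-level timestamp instead.
task_start = datetime.now(timezone.utc).isoformat()
print(f"this task started at {task_start}")
```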
by dataslicer (Contributor)
  • 398 Views
  • 2 replies
  • 0 kudos

How to export/clone Databricks Notebook without results via web UI?

When a Databricks notebook exceeds the size limit, it suggests to `clone/export without results`. This is exactly what I want to do, but the current web UI does not provide the ability to bypass/skip the results in either the `clone` or `export` context...

Latest Reply by dataslicer (Contributor)

Thank you @Yeshwanth for the response. I am looking for a way that does not clear the current outputs. This is necessary because I want to preserve the existing outputs and fork off another notebook instance to run with a few parameter changes and come...

1 More Reply
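Outside the web UI, the Workspace API can export just the source, which drops cell results from the export without clearing them in the workspace copy. A sketch; host, token, and paths are hypothetical placeholders:

```python
import base64
import requests

host = "https://<workspace-host>"   # placeholder
token = "<personal-access-token>"   # placeholder

# SOURCE format returns only the code, so results are skipped in the export
# while the notebook and its outputs remain untouched in the workspace.
resp = requests.get(
    f"{host}/api/2.0/workspace/export",
    headers={"Authorization": f"Bearer {token}"},
    params={"path": "/Users/me@example.com/my_notebook", "format": "SOURCE"},
)
resp.raise_for_status()

with open("my_notebook.py", "wb") as f:
    f.write(base64.b64decode(resp.json()["content"]))
```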
by pjv (New Contributor II)
  • 620 Views
  • 2 replies
  • 0 kudos

Asynchronous API calls from Databricks Workflow job

Hi all, I have many API calls to run in a Python Databricks notebook, which I then run regularly as a Databricks Workflow job. When I test the following code on an all-purpose cluster locally, i.e. not via a job, it runs perfectly fine. However, when I ...

Latest Reply by pjv (New Contributor II)

I actually got it to work, though I do see that if I run two jobs of the same code in parallel, the async execution time slows down. Does the number of workers of the cluster on which the parallel jobs run affect the execution time of async calls of...

1 More Reply
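For context, async HTTP calls issued from a notebook run entirely on the driver, so worker count mainly matters when parallel jobs contend for the same driver resources. A sketch of the batching pattern, assuming aiohttp is installed and with a hypothetical URL list:

```python
import asyncio
import aiohttp  # assumed to be installed on the cluster

async def fetch(session, url):
    # One GET per URL; failures surface as exceptions from gather().
    async with session.get(url) as resp:
        resp.raise_for_status()
        return await resp.json()

async def fetch_all(urls):
    async with aiohttp.ClientSession() as session:
        return await asyncio.gather(*(fetch(session, u) for u in urls))

urls = [f"https://api.example.com/items/{i}" for i in range(100)]  # placeholder

# asyncio.run() works in a fresh job task; in an interactive notebook that
# already runs an event loop, use `await fetch_all(urls)` instead.
results = asyncio.run(fetch_all(urls))
print(len(results))
```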
by robbe (New Contributor II)
  • 185 Views
  • 1 reply
  • 1 kudos

Get job ID from Asset Bundles

When using Asset Bundles to deploy jobs, how does one get the job ID of the resources that are created? I would like to deploy some jobs through Asset Bundles, get the job IDs, and then trigger these jobs programmatically outside the CI/CD pipeline us...

Latest Reply by mhiltner (New Contributor III)

Hey, not sure if this will do the trick, but I've thought about two workarounds: 1. Check if "databricks bundle run my_job" suits your case. It accepts the name as the key to run. 2. Would it be an option for you to use databricks jobs list...

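The second workaround can be scripted with the Databricks Python SDK: resolve the job ID by name after `databricks bundle deploy`, then trigger it. A sketch; the job name is a hypothetical placeholder, and note that bundles deployed in dev mode prefix the deployed job name:

```python
from databricks.sdk import WorkspaceClient

# Auth is picked up from the environment or a .databrickscfg profile.
w = WorkspaceClient()

# Filter by the deployed job name, then insist on a unique match.
jobs = list(w.jobs.list(name="my_job"))  # placeholder name
if len(jobs) != 1:
    raise RuntimeError(f"expected one job named my_job, found {len(jobs)}")

job_id = jobs[0].job_id
print(f"job_id = {job_id}")

# The resolved ID can then be used to trigger the job outside CI/CD.
run = w.jobs.run_now(job_id=job_id)
```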
by kazinahian (New Contributor III)
  • 632 Views
  • 2 replies
  • 1 kudos

Resolved! Low-code ETL in Databricks

Hello everyone, I work as a Business Intelligence practitioner, employing tools like Alteryx and various low-code solutions to construct ETL processes and develop data pipelines for my dashboards and reports. Currently, I'm delving into Azure Databrick...

Latest Reply by Kaniz (Community Manager)

Hi @kazinahian, in the Azure ecosystem, you have a few options for building ETL (Extract, Transform, Load) data pipelines, including low-code solutions. Let’s explore some relevant tools: Azure Data Factory: Purpose: Azure Data Factory is a clou...

1 More Reply