Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

zmsoft
by Contributor
  • 1030 Views
  • 3 replies
  • 0 kudos

How to set DLT pipeline warning alert?

Hi there, The example description of custom event hooks in the documentation is not clear enough; I do not know how to implement it inside Python functions. event-hooks  My Code: %python # Read the insertion of data raw_user_delta_streaming=spark.rea...

Latest Reply
Priyanka_Biswas
Databricks Employee
  • 0 kudos

Hi @zmsoft  The event hook provided, user_event_hook, must be a Python callable that accepts exactly one parameter - a dictionary representation of the event that triggered the execution of this event hook. The return value of the event hook has no s...
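For anyone landing here from search, a minimal sketch of the shape being described (assuming the dlt.on_event_hook decorator from the event-hooks docs; the event field names such as 'level' and 'message' come from the event log schema and should be verified for your pipeline):

```python
import dlt

# Sketch only: an event hook is a plain Python callable taking one dict argument.
# max_allowable_consecutive_failures and the event field names are assumptions
# to verify against the event-hooks documentation for your DLT runtime.
@dlt.on_event_hook(max_allowable_consecutive_failures=None)
def user_event_hook(event):
    # 'event' is a dictionary representation of a single pipeline event.
    if event.get("level") == "WARN":
        print(f"Pipeline warning: {event.get('message')}")
```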

2 More Replies
NamNguyenCypher
by New Contributor II
  • 925 Views
  • 2 replies
  • 2 kudos

Resolved! Adding column masks to a column using the DLT Python create_streaming_table API

I'm having difficulty adding a mask function to columns while creating streaming tables with the DLT Python method create_streaming_table(). With the code below it does not work: the streaming table is created, but no column is masked: def prepare_column_pro...

Latest Reply
lingareddy_Alva
Honored Contributor III
  • 2 kudos

@NamNguyenCypher Delta Live Tables’ Python API does not currently honor column-mask metadata embedded in a PySpark StructType. Masking (and row filters) on DLT tables are only applied when you define your table with a DDL-style schema that includes a...
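A minimal sketch of the DDL-style approach the reply points to (my_catalog.my_schema.mask_email is a hypothetical Unity Catalog mask function; verify the exact MASK syntax against the current DLT docs):

```python
import dlt

# Sketch: declare the column mask in a DDL-style schema string rather than a
# StructType; the catalog, schema, and function names are placeholders.
dlt.create_streaming_table(
    name="users_masked",
    schema="""
        user_id BIGINT,
        email STRING MASK my_catalog.my_schema.mask_email
    """,
)
```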

1 More Replies
vziog
by New Contributor II
  • 1009 Views
  • 5 replies
  • 1 kudos

Unexpected SKU Names in Usage Table for Job Cost Calculation

I'm trying to calculate the cost of a job using the usage and list_prices system tables, but I'm encountering some unexpected behavior that I can't explain. When I run a job using a shared cluster, the sku_name in the usage table is PREMIUM_JOBS_SERVE...
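For context, a sketch of the kind of usage-to-list-price join being described (column names follow the documented system.billing table schemas; the job_id value is a placeholder, and list prices may differ from an account's negotiated rates):

```python
# Sketch only: estimate a job's cost by joining billing usage to list prices.
# Assumes access to the system.billing tables; <job_id> is a placeholder.
job_cost = spark.sql("""
    SELECT u.sku_name,
           SUM(u.usage_quantity * lp.pricing.default) AS estimated_cost
    FROM system.billing.usage u
    JOIN system.billing.list_prices lp
      ON u.sku_name = lp.sku_name
     AND u.usage_start_time >= lp.price_start_time
     AND (lp.price_end_time IS NULL OR u.usage_start_time < lp.price_end_time)
    WHERE u.usage_metadata.job_id = '<job_id>'
    GROUP BY u.sku_name
""")
job_cost.show(truncate=False)
```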

Latest Reply
vziog
New Contributor II
  • 1 kudos

Thank you all for your replies. @lingareddy_Alva, what about what @Walter_C and @mnorland mentioned about enabling serverless tasks? Is this possible, and if so, how?

4 More Replies
ankit001mittal
by New Contributor III
  • 659 Views
  • 1 reply
  • 0 kudos

DLT Publish event log to metastore

Hi guys, I am trying to use the DLT "Publish event log to metastore" feature, and I noticed it creates a separate log table for each DLT pipeline. Does that mean it maintains a separate log table for all the DLT tables (in our case, we have 100...

Latest Reply
SP_6721
Contributor III
  • 0 kudos

Hi @ankit001mittal Yes, you're right: when you enable the "Publish Event Log to Metastore" option for DLT pipelines, Databricks creates a separate event log table for each pipeline. So, if you have thousands of pipelines, you'll see thousands of log ...
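If the per-pipeline tables become unwieldy, one possible workaround (a sketch only; the table names below are hypothetical placeholders for your published event log tables) is a single UNION ALL view over them:

```python
# Hypothetical sketch: consolidate several per-pipeline event log tables into
# one queryable view. Replace the table names with your own published tables.
log_tables = [
    "main.dlt_logs.pipeline_a_event_log",
    "main.dlt_logs.pipeline_b_event_log",
]
union_sql = " UNION ALL ".join(f"SELECT * FROM {t}" for t in log_tables)
spark.sql(f"CREATE OR REPLACE VIEW main.dlt_logs.all_pipeline_events AS {union_sql}")
```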

holychs
by New Contributor III
  • 584 Views
  • 2 replies
  • 0 kudos

Repairing running workflow with few failed child jobs

I have a parent job that calls multiple child jobs in a workflow. Out of 10 child jobs, one has failed and the other 9 are still running. I want to repair the failed child task. Can I do that while the other child jobs are running?

Latest Reply
Brahmareddy
Esteemed Contributor
  • 0 kudos

Hi holychs, how are you doing today? As per my understanding, yes: in Databricks Workflows, if you're running a multi-task job (like your parent job triggering multiple child tasks), you can repair only the failed task without restarting the entire j...
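For reference, a minimal sketch of triggering such a repair programmatically (assumes the databricks-sdk package and configured authentication; the run id and task key below are placeholders):

```python
# Sketch: repair only the failed task of a multi-task run via the Jobs API.
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()
# 123456789 is the parent run id and "child_task" the failed task key (placeholders).
w.jobs.repair_run(run_id=123456789, rerun_tasks=["child_task"])
```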

1 More Replies
vivi007
by New Contributor
  • 870 Views
  • 1 reply
  • 0 kudos

Can we have a depend-on for jobs to run on two different dabs?

If there are two different DABs, can we have a dependency so that a job from one DAB runs after a job from another DAB? Similar to how tasks can depend on each other to run one after the other in the same DAB, can we have the same for two differ...

Latest Reply
lingareddy_Alva
Honored Contributor III
  • 0 kudos

@vivi007 Yes, you can create dependencies between jobs in different DABs (Databricks Asset Bundles), but this requires a different approach than task dependencies within a single DAB. Since DABs are designed to be independently deployable units, direc...
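One possible pattern (a sketch only, not necessarily the approach the reply goes on to describe): have a final task in the upstream bundle's job trigger the downstream bundle's job through the Jobs API. Assumes the databricks-sdk package; the job id is a hypothetical placeholder you would resolve per workspace:

```python
# Sketch: trigger the job deployed by the other DAB from the end of this one.
from databricks.sdk import WorkspaceClient

DOWNSTREAM_JOB_ID = 987654321  # hypothetical: job id from the other bundle

w = WorkspaceClient()
# run_now starts the downstream job; .result() optionally blocks until it ends.
w.jobs.run_now(job_id=DOWNSTREAM_JOB_ID).result()
```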

ShreevaniRao
by New Contributor III
  • 6767 Views
  • 13 replies
  • 4 kudos

Newbie learning DLT pipelines

Hello, I am learning to create DLT pipelines using different graphs on a 14-day trial of premium Databricks. I currently have one graph: Mat view -> Streaming Table -> Mat view. When I ran this pipeline (serverless compute) the 1st time, ran...

Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 4 kudos

Use this; it will help you a lot: https://www.youtube.com/watch?v=iqf_QHC7tgQ&list=PL2IsFZBGM_IGpBGqxhkiNyEt4AuJXA0Gg

12 More Replies
ktagseth
by New Contributor II
  • 482 Views
  • 3 replies
  • 0 kudos

dbutils.fs.mv inefficient with ADLS

dbutils.fs.mv with ADLS currently copies the file and then deletes the old one. This incurs costs and has a lot of overhead compared with using the rename functionality in ADLS, which is instant and doesn't incur the extra costs involved with writing the 'new' data....

Latest Reply
BigRoux
Databricks Employee
  • 0 kudos

The tool is really meant for DBFS and is only accessible from within Databricks. If I had to guess, the idea is that most folks will not be using DBFS for production or sensitive data (for a host of good reasons) and as such there has not been a big ...
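As a workaround sketch (not an official dbutils feature): the Hadoop FileSystem rename can be called directly from a notebook, and on ABFS with hierarchical namespace enabled a rename is a metadata operation rather than copy-plus-delete. The paths below are placeholders, and this relies on the JVM gateway, so it will not work on serverless or Spark Connect clusters:

```python
# Sketch: server-side rename on ADLS via the Hadoop FileSystem API.
# Replace the abfss:// paths with your own; requires a classic cluster where
# spark._jvm is available.
src = spark._jvm.org.apache.hadoop.fs.Path(
    "abfss://container@account.dfs.core.windows.net/path/old_name.parquet")
dst = spark._jvm.org.apache.hadoop.fs.Path(
    "abfss://container@account.dfs.core.windows.net/path/new_name.parquet")
fs = src.getFileSystem(spark._jsc.hadoopConfiguration())
renamed = fs.rename(src, dst)  # returns True on success
print(renamed)
```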

2 More Replies
fedemgp
by New Contributor
  • 807 Views
  • 1 reply
  • 0 kudos

Configure verbose audit logs through terraform

Hi everyone, I was looking into the databricks_workspace_conf Terraform resource to configure Verbose Audit Logs (and avoid changing it through the UI). However, when I attempted to apply this configuration I encountered the following error: Error: cannot...

Latest Reply
TheRealOliver
Contributor
  • 0 kudos

@fedemgp I was able to turn the desired setting on and off with Terraform using this code: GitHub Gist. I'm using Databricks Terraform provider version 1.74.0 and my Databricks runs on Google Cloud.

gm_co
by New Contributor
  • 1064 Views
  • 1 reply
  • 0 kudos

Bar chart data labels in percent

Hello, I am currently working with bar visualizations in a new workbook editor. When I use labels, I can see the count of rows returned, and hovering over them shows the percentage of the two values returned. How can I make the percentage display on ...

Latest Reply
Advika
Databricks Employee
  • 0 kudos

Hello @gm_co! Were you able to sort this out? You can display % in two ways: in the General settings, check the box for Normalize values to percentage; or, as you have enabled Labels, just set the Data labels to {{ @@yPercent }}. This will show the percent...

MrJava
by New Contributor III
  • 16040 Views
  • 17 replies
  • 13 kudos

How to know, who started a job run?

Hi there! We have different jobs/workflows configured in our Databricks workspace running on AWS and would like to know who actually started a job run. Are they started by a user or by a service principal using curl? Currently one can only see who is t...
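One place to look, sketched below (assumes audit log system tables are enabled in your account; column names follow the documented system.access.audit schema, and the action names are illustrative):

```python
# Sketch: list who triggered job runs according to the audit log system table.
runs_started_by = spark.sql("""
    SELECT event_time,
           user_identity.email AS started_by,
           action_name,
           request_params
    FROM system.access.audit
    WHERE service_name = 'jobs'
      AND action_name IN ('runNow', 'submitRun', 'runTriggered')
    ORDER BY event_time DESC
""")
runs_started_by.show(truncate=False)
```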

Latest Reply
jeremy98
Honored Contributor
  • 13 kudos

News on this feature?

16 More Replies
VicS
by Contributor
  • 1590 Views
  • 6 replies
  • 3 kudos

Resolved! How can I use Terraform to assign an external location to multiple workspaces?

How can I use Terraform to assign an external location to multiple workspaces? When I create an external location with Terraform, I do not see any option to directly link workspaces. It also only links to the workspace of the Databricks profile that I...

Latest Reply
TheRealOliver
Contributor
  • 3 kudos

@Walter_C I think you need to use the databricks_workspace_binding resource for that multi-workspace binding. I was able to achieve it in Terraform. The resource docs seem to agree with the result that I have. My Databricks runs on Google Cloud. My Terraform...

5 More Replies
oakhill
by New Contributor III
  • 1577 Views
  • 4 replies
  • 0 kudos

Unable to read Delta Table using external tools

I am using the new credential vending API to get tokens and URLs for my tables in Unity Catalog. I get the token and URL, and I am able to scan the folder using read_parquet, but NOT with any Delta Lake functions: not TableExists, scan_delta, or delta_scan from ...

Latest Reply
matovitch
New Contributor II
  • 0 kudos

When copying a problematic Delta table and reading the copy, the issue disappears. It seems to be related to the new Delta checkpointPolicy (v2), which is not supported by the Rust implementation of Delta but works fine with the Scala/Java one (deltalake vs delta-spar...
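A quick way to check whether a table is on the v2 checkpoint policy mentioned above (a sketch; the table name is a placeholder):

```python
# Sketch: inspect the Delta table's checkpoint policy property.
spark.sql(
    "SHOW TBLPROPERTIES my_catalog.my_schema.my_table"
).filter("key = 'delta.checkpointPolicy'").show(truncate=False)
```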

3 More Replies
carlos_tasayco
by Contributor
  • 542 Views
  • 1 reply
  • 1 kudos

Power bi connection

https://databrickster.medium.com/databricks-will-refresh-your-powerbi-semantic-model-both-dataset-metadata-and-data-4e8279e10b8e The above is what I am trying to do. I already created the connection and apparently all looks good; I added the task to my workfl...

Latest Reply
Renu_
Valued Contributor II
  • 1 kudos

Hi @carlos_tasayco, it looks like the issue is due to missing permissions in Power BI. If the role is set to Viewer, try updating it to Contributor or higher. You may need help from the Power BI admin to adjust the access. This should resolve the err...

anand010210
by New Contributor
  • 588 Views
  • 1 reply
  • 0 kudos

AWS account databricks account / subscriptions disable before 14days trial over

I had registered for the 14-day trial through AWS, and it has been deactivated before the 14 days were over. When I go through manage subscription, it asks me to register the product and redirects me to create and link an account, which was already there, and...

Latest Reply
Advika
Databricks Employee
  • 0 kudos

Hello @anand010210! Were you able to resolve the issue? It seems that your AWS account is still linked to an active Databricks subscription, even though it's been disabled. I recommend checking your AWS Marketplace subscription to see if Databricks i...

