Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Steve_Harrison
by New Contributor III
  • 1491 Views
  • 2 replies
  • 0 kudos

Invalid Path when getting Notebook Path

The undocumented feature to get a notebook path is great, but it does not actually return a valid path that can be used in Python, e.g.: from pathlib import Path; print(Path(dbutils.notebook.entry_point.getDbutils().notebook().getContext().notebookPat...
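
For illustration, a minimal sketch of the call being discussed (assuming a Databricks notebook context; as the thread notes, this API is undocumented and unsupported):

from pathlib import Path

# Undocumented context API discussed in the post.
ctx = dbutils.notebook.entry_point.getDbutils().notebook().getContext()
notebook_path = ctx.notebookPath().get()  # workspace path, e.g. /Users/<user>/my_notebook

# Wrapping it in Path() yields a path object, but it is a workspace path,
# not a local filesystem path, which is the mismatch described above.
print(Path(notebook_path))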

Latest Reply
Steve_Harrison
New Contributor III
  • 0 kudos

I actually think the major issue is that the above is undocumented and not supported. A supported and documented way of doing this would be much appreciated.

1 More Replies
Phani1
by Valued Contributor II
  • 8102 Views
  • 10 replies
  • 10 kudos

Delta Live Table name dynamically

Hi Team, can we pass a Delta Live Table name dynamically (from a configuration file, instead of hardcoding the table name)? We would like to build a metadata-driven pipeline.
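
For illustration, a minimal sketch of one way to do this (assuming a DLT pipeline where the table name is supplied via the pipeline configuration; the configuration key and source path below are illustrative, not from the thread):

import dlt

# Read the target table name from the pipeline configuration instead of hardcoding it.
table_name = spark.conf.get("table_name", "bronze_default")

@dlt.table(name=table_name)  # name resolved when the pipeline starts
def load_source():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/Volumes/raw/landing")  # illustrative source location
    )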

Latest Reply
bmhardy
New Contributor III
  • 10 kudos

Is this post referring to Direct Publishing Mode? As we are multi-tenanted, we have to have a separate schema per client, which currently means a single pipeline per client. This is not cost-effective at all, so we are very much reliant on DPM. I believ...

9 More Replies
maikl
by New Contributor III
  • 554 Views
  • 4 replies
  • 0 kudos

Resolved! DABs job name must start with a letter or underscore

Hi, in the UI I used the pipeline name 00101_source_bronze. I wanted to do the same in Databricks Asset Bundles, but when the configuration is refreshed against the Databricks workspace I see this error: I found that this issue can be connected to Terraform v...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

As mentioned above, this is a limitation directly with Terraform; because of this, our engineering team is limited in the actions that can be taken. You can find more information about this limitation in the Terraform documentation: https://developer.hashic...

3 More Replies
Anonymous
by Not applicable
  • 1474 Views
  • 1 reply
  • 1 kudos

Resolved! workflow set maximum queued items

Hi all, I have a question regarding Workflows and queuing of job runs. I'm running into a case where jobs are running longer than expected, resulting in job runs being queued, which is expected and desired. However, in this particular case we only nee...

Latest Reply
Walter_C
Databricks Employee
  • 1 kudos

Unfortunately there is no way to control the number of jobs that will be moved to queue status when queuing is enabled.

alcatraz96
by New Contributor II
  • 1791 Views
  • 3 replies
  • 0 kudos

Guidance Needed for Developing CI/CD Process in Databricks Using Azure DevOps

Hi everyone, I am working on setting up a complete end-to-end CI/CD process for my Databricks environment using Azure DevOps. So far, I have developed a build pipeline to create a Databricks artifact (DAB). Now, I need to create a release pipeline to ...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @alcatraz96, one question: why don't you use Databricks Asset Bundles? Then the whole process would be much simpler. Here you have a good end-to-end example: CI/CD Integration with Databricks Workflows - Databricks Community - 81821

2 More Replies
skarpeck
by New Contributor III
  • 601 Views
  • 3 replies
  • 0 kudos

Update set in foreachBatch

I need to track codes of records that were ingested in the foreachBatch function and pass them as a task value, so downstream tasks can take actions based on this output. What would be the best approach to achieve that? Now, I have the following solution, b...
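
For illustration, a minimal sketch of the pattern described (df stands for the streaming DataFrame from the post; the table and key names are illustrative, and dbutils.jobs.taskValues assumes the code runs as a job task):

collected_codes = set()

def transform_and_upsert(batch_df, batch_id):
    # foreachBatch runs on the driver, so a driver-side set can accumulate codes.
    codes = [r["code"] for r in batch_df.select("code").distinct().collect()]
    collected_codes.update(codes)
    batch_df.write.format("delta").mode("append").saveAsTable("bronze.ingested_records")

query = (
    df.writeStream
    .option("checkpointLocation", "/tmp/checkpoints/ingest")
    .foreachBatch(transform_and_upsert)
    .trigger(availableNow=True)
    .start()
)
query.awaitTermination()

# Task values must be JSON-serializable, so convert the set before setting it.
dbutils.jobs.taskValues.set(key="ingested_codes", value=sorted(collected_codes))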

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

Another approach is to persist the collected codes in a Delta table and then read from this table in downstream tasks. Make sure to add ample logging and counts. Checkpointing would also help if you suspect the counts in the set are not the same as what ...

2 More Replies
JKR
by Contributor
  • 2974 Views
  • 1 reply
  • 0 kudos

Databricks sql variables and if/else workflow

I have 2 tasks in a Databricks job workflow; the first task is of type SQL, and the SQL task is a query. In that query I've declared 2 variables and SET the values by running a query, e.g.: DECLARE VARIABLE max_timestamp TIMESTAMP DEFAULT '1970-01-01'; SET VARIABLE max_...
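
For illustration, a minimal sketch of reading such a value in a downstream Python task via task values (the task key and default are illustrative; see the task-values docs linked in the reply below):

# Read the value the SQL task exposed as a task value.
max_timestamp = dbutils.jobs.taskValues.get(
    taskKey="sql_task_1", key="max_timestamp", debugValue="1970-01-01"
)

# Branch the workflow logic on the retrieved value.
if max_timestamp == "1970-01-01":
    print("No watermark found; running a full load.")
else:
    print(f"Running an incremental load since {max_timestamp}.")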

Data Engineering
databricks-sql
Workflows
Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

Please try with max_timestamp = dbutils.jobs.taskValues.get(taskKey="sql_task_1", key="max_timestamp") to read the value in the downstream task, and dbutils.jobs.taskValues.set(key="max_timestamp", value=max_timestamp) in the task that produces it. Reference: https://docs.databricks.com/en/jobs/task-values.html

willie_nelson
by New Contributor II
  • 756 Views
  • 3 replies
  • 1 kudos

ABFS Authentication with a SAS token -> 403!

Hi guys, I'm running a streamReader/Writer with Auto Loader from StorageV2 (general purpose v2) over abfss instead of wasbs. My checkpoint location is valid, the reader properly reads the file schema, and Auto Loader is able to sample 105 files to do so....
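
For illustration, a minimal sketch of SAS-based auth for abfss with Auto Loader (the storage account, container, paths, and secret scope are placeholders, not the poster's setup):

storage_account = "mystorageacct"
container = "raw"
sas_token = dbutils.secrets.get(scope="ingest", key="sas-token")

# SAS authentication for the ABFS driver.
spark.conf.set(f"fs.azure.account.auth.type.{storage_account}.dfs.core.windows.net", "SAS")
spark.conf.set(f"fs.azure.sas.token.provider.type.{storage_account}.dfs.core.windows.net",
               "org.apache.hadoop.fs.azurebfs.sas.FixedSASTokenProvider")
spark.conf.set(f"fs.azure.sas.fixed.token.{storage_account}.dfs.core.windows.net", sas_token)

base = f"abfss://{container}@{storage_account}.dfs.core.windows.net"
df = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", f"{base}/_schemas")
    .load(f"{base}/input")
)

A 403 with an otherwise working reader often points at the SAS token itself lacking the required permissions (for example, list or read on the container) rather than at the reader configuration.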

Latest Reply
BricksGuy
New Contributor III
  • 1 kudos

Would you mind pasting the sample code, please? I am trying to use abfs with Auto Loader and getting an error like yours.

2 More Replies
Vetrivel
by Contributor
  • 1921 Views
  • 3 replies
  • 1 kudos

Resolved! SSIS packages migration to Databricks Workflows

We are doing a POC to migrate SSIS packages to Databricks Workflows as part of our effort to build the analytics layer, including dimension and fact tables. How can we accelerate or automate the SSIS package migration to the Databricks environment?

Latest Reply
BlakeHill
New Contributor II
  • 1 kudos

Thank you so much for the solution.

2 More Replies
GabrieleMuciacc
by New Contributor III
  • 5252 Views
  • 5 replies
  • 2 kudos

Resolved! Support for kwargs parameter in `/2.1/jobs/create` endpoint for `python_wheel_task`

If I create a job from the web UI and I select Python wheel, I can add kwargs parameters. Judging from the generated JSON job description, they appear under a section named `namedParameters`. However, if I use the REST APIs to create a job, it appears...
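
For illustration, a minimal sketch of a /2.1/jobs/create payload where the kwargs are passed as named_parameters on the python_wheel_task (the host/token environment variables, package, entry point, and cluster ID are illustrative):

import os
import requests

payload = {
    "name": "wheel-job-with-kwargs",
    "tasks": [{
        "task_key": "run_wheel",
        "existing_cluster_id": "1234-567890-abcdefgh",  # illustrative
        "python_wheel_task": {
            "package_name": "my_package",   # illustrative
            "entry_point": "main",          # illustrative
            "named_parameters": {"env": "dev", "run_date": "2024-12-01"},
        },
    }],
}

resp = requests.post(
    f"{os.environ['DATABRICKS_HOST']}/api/2.1/jobs/create",
    headers={"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"},
    json=payload,
)
resp.raise_for_status()
print(resp.json()["job_id"])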

Latest Reply
manojpatil04
New Contributor III
  • 2 kudos

@GabrieleMuciacc, in the case of a serverless compute job this can be passed as an external dependency; you can't use libraries. "tasks": [{ "task_key": task_id, "spark_python_task": { "python_file": py_file, ...

4 More Replies
radix
by New Contributor II
  • 1330 Views
  • 1 reply
  • 0 kudos

Pool clusters and init scripts

Hey, just trying out pool clusters and providing the instance_pool_type and driver_instance_pool_id configuration to the Airflow new_cluster field. I also pass the init_scripts field with an S3 link as usual, but in this case of pool clusters it doesn't...
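
For illustration, a minimal sketch of an Airflow new_cluster spec combining instance pools with an S3 init script (the pool IDs, bucket, Spark version, and paths are placeholders, not the poster's values):

from airflow.providers.databricks.operators.databricks import DatabricksSubmitRunOperator

new_cluster = {
    "spark_version": "15.4.x-scala2.12",
    "num_workers": 2,
    "instance_pool_id": "pool-1234567890",           # worker pool
    "driver_instance_pool_id": "pool-0987654321",    # driver pool
    "init_scripts": [
        {"s3": {"destination": "s3://my-bucket/init/install_libs.sh", "region": "us-east-1"}}
    ],
}

run_task = DatabricksSubmitRunOperator(
    task_id="run_notebook",
    databricks_conn_id="databricks_default",
    new_cluster=new_cluster,
    notebook_task={"notebook_path": "/Workspace/Shared/ingest"},
)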

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

When using a non-pool cluster, are you able to see the init script being deployed? You could enable init script logging to see whether it is being called at all: https://docs.databricks.com/en/init-scripts/logs.html

Direo
by Contributor II
  • 1659 Views
  • 1 reply
  • 0 kudos

Managing Secrets for Different Groups in a Databricks Workspace

Hi everyone, I'm looking for some advice on how people are managing secrets within Databricks when you have different groups (or teams) in the same workspace, each requiring access to different sets of secrets. Here’s the challenge: we have multiple gro...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

Managing secrets within Databricks when you have different groups or teams in the same workspace can be approached in several ways, each with its own advantages. Here are some best practices and methods based on the context provided: Using Azure Key...
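
For illustration, a minimal sketch of the per-team secret-scope pattern using the Python SDK (the scope, key, and group names are illustrative, not from the thread):

from databricks.sdk import WorkspaceClient
from databricks.sdk.service.workspace import AclPermission

w = WorkspaceClient()

# One scope per team, with READ granted only to that team's group.
w.secrets.create_scope(scope="team-a")
w.secrets.put_secret(scope="team-a", key="db-password", string_value="<placeholder>")
w.secrets.put_acl(scope="team-a", principal="team-a-users", permission=AclPermission.READ)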

mjedy78
by New Contributor II
  • 738 Views
  • 3 replies
  • 0 kudos

How to enable AQE in foreachbatch mode

I am processing the daily data from checkpoint to checkpoint every day by using foreachBatch in a streaming way: df.writeStream.format("delta") .option("checkpointLocation", "dbfs/loc") .foreachBatch(transform_and_upsert) .outpu...

Latest Reply
mjedy78
New Contributor II
  • 0 kudos

@MuthuLakshmi any idea?

2 More Replies
niruban
by New Contributor II
  • 2608 Views
  • 3 replies
  • 0 kudos

Databricks Asset Bundle to deploy only one workflow

Hello Community - I am trying to deploy only one workflow from my CI/CD. But whenever I try to deploy one workflow using "databricks bundle deploy - prod", it is deleting all the existing workflows in the target environment. Is there any option av...

Data Engineering
CICD
DAB
Databricks Asset Bundle
DevOps
Latest Reply
nvashisth
New Contributor III
  • 0 kudos

Hi Team, deployment via DAB (Databricks Asset Bundle) reads all YAML files present, and based on that the workflows are generated. In versions of the Databricks CLI prior to 0.236 (or the latest one), it used to delete all the workflows by making dele...

2 More Replies
sangwan
by New Contributor
  • 631 Views
  • 1 reply
  • 0 kudos

Issue: 'Catalog hive_metastore doesn't exist. Create it?' Error When Installing Reconcile

Utility: Remorph (Databricks). Issue: 'Catalog hive_metastore doesn't exist. Create it?' error when installing Reconcile. I am encountering an issue while installing Reconcile on Databricks. Despite the hive_metastore catalog being present by default in the Da...

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

Hi @sangwan, it's not very clear. Does the error come with a stack trace? If so, could you please share it? Also, any WARN/ERROR messages in the driver log, by any chance?

