Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Dimitry
by Contributor III
  • 34 Views
  • 3 replies
  • 0 kudos

databricks notebook parameter works in interactive mode but not in the job

Hi guys, I've added a parameter "files_mask " to a notebook, with a default value. The job running this notebook broke with the error: com.databricks.dbutils_v1.InputWidgetNotDefined: No input widget named files_mask is defined. Code: mask = dbutils.widgets....

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @Dimitry, do you use Python or Scala in your notebook?

2 More Replies
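A minimal sketch of the usual fix for the InputWidgetNotDefined error discussed in this thread, assuming a Python notebook: declare the widget with a default before reading it, so the same code runs interactively and as a job task (the parameter name matches the post; the default value is an assumption).

# Declare the widget up front; when run as a job, a task parameter named
# "files_mask" overrides this default instead of raising InputWidgetNotDefined.
dbutils.widgets.text("files_mask", "*")
mask = dbutils.widgets.get("files_mask")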
janm2
by New Contributor II
  • 1157 Views
  • 5 replies
  • 1 kudos

Autoloader cleansource option does not take any effect

Hello everyone, I was very keen to try out the Autoloader's new cleanSource option so we can clean up our landing folder easily. However, I found out it does not have any effect whatsoever. As I cannot create a support case, I am creating this post. A sim...

Latest Reply
SanthoshU
New Contributor II
  • 1 kudos

Any solution?

4 More Replies
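For readers of this thread, a hedged sketch of how the cleanSource option is wired into an Auto Loader stream. The paths, the JSON format, and the exact option spellings are assumptions based on my reading of the docs; note also that processed files are subject to a retention window before they are moved or deleted, which can make the option look like it "does nothing".

df = (spark.readStream
      .format("cloudFiles")
      .option("cloudFiles.format", "json")                                   # assumed source format
      .option("cloudFiles.cleanSource", "MOVE")                              # or "DELETE"
      .option("cloudFiles.cleanSource.moveDestination", "/landing/archive")  # assumed archive path
      .load("/landing/incoming"))                                            # assumed landing path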
jorperort
by Contributor
  • 51 Views
  • 4 replies
  • 2 kudos

Resolved! Spark JDBC Write Fails for Record Not Present - PK error

Good afternoon everyone, I'm writing this post to see if anyone has encountered this problem and whether there is a way to resolve it or understand why it happens. I'm working in a Databricks Runtime 15.4 LTS environment, which includes Apache Spark 3.5.0 ...

Latest Reply
ManojkMohan
Honored Contributor
  • 2 kudos

@jorperort When writing to SQL Server tables with composite primary keys from Databricks using JDBC, unique constraint violations are often caused by Spark's distributed retry logic (https://docs.databricks.com/aws/en/archive/connectors/jdbc). Solution...

3 More Replies
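A hedged sketch of the staging-table pattern threads like this usually end with: de-duplicate on the composite key and append into a staging table without the primary-key constraint, then merge into the target on the database side so Spark task retries cannot trip the constraint. Connection details, table, and key columns are placeholders.

jdbc_url = "jdbc:sqlserver://<host>:1433;databaseName=<db>"   # placeholder connection string
(df.dropDuplicates(["key_col1", "key_col2"])                  # assumed composite key columns
   .write
   .format("jdbc")
   .option("url", jdbc_url)
   .option("dbtable", "dbo.target_staging")                   # staging table without the PK constraint
   .option("user", "<user>")
   .option("password", "<password>")
   .mode("append")
   .save())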
ashraf1395
by Honored Contributor
  • 2735 Views
  • 4 replies
  • 1 kudos

Resolved! How to capture dlt pipeline id / name using dynamic value reference

Hi there, I have a use case where I want to set the DLT pipeline id in the configuration parameters of that DLT pipeline. The way we can use workspace ids or task ids in a notebook task, task_id = {{task.id}} / {{task.name}}, and can save them as parameters a...

Latest Reply
CaptainJack
New Contributor III
  • 1 kudos

Was someone able to get pipeline_id programmatically?

3 More Replies
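Since the follow-up asks how to get the pipeline id programmatically, one hedged workaround (separate from dynamic value references) is to look the pipeline up by name with the Databricks SDK; the pipeline name below is hypothetical.

from databricks.sdk import WorkspaceClient

w = WorkspaceClient()
pipeline_id = next(
    p.pipeline_id
    for p in w.pipelines.list_pipelines()
    if p.name == "my_dlt_pipeline"   # hypothetical pipeline name
)
print(pipeline_id)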
Sergecom
by New Contributor III
  • 13 Views
  • 0 replies
  • 0 kudos

Migrating from on-premises HDFS to Unity Catalog - Looking for advice on on-prem options

Hi, we're currently running a Databricks installation with an on-premises HDFS file system. As we're looking to adopt Unity Catalog, we've realized that our current HDFS setup has limited support and compatibility with Unity Catalog. Our requirement: W...

shadowinc
by New Contributor III
  • 3252 Views
  • 1 reply
  • 0 kudos

Call SQL Function via API

Background: I created a SQL function named schema.function_name, which returns a table. In a notebook the function works perfectly; however, I want to execute it via the API using a SQL endpoint. Through the API I got an insufficient privileges error, so gr...

Latest Reply
AbhaySingh
Databricks Employee
  • 0 kudos

Do you know if the API service principal / user has USAGE on the database itself? This seems like the most likely issue based on the information in the question. Quick fix checklist: run these commands in order (replace api_user with the actual user from ...

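The reply's checklist is cut off above; a hedged reconstruction of the kind of grants it points at, runnable from a notebook. The object names and the api_user principal are placeholders, and on Unity Catalog the schema-level privilege is spelled USE SCHEMA rather than USAGE.

# Grant the calling principal access to the schema and to the function itself.
spark.sql("GRANT USAGE ON SCHEMA my_schema TO `api_user`")                    # USE SCHEMA on Unity Catalog
spark.sql("GRANT EXECUTE ON FUNCTION my_schema.function_name TO `api_user`")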
Pw76
by New Contributor II
  • 2267 Views
  • 4 replies
  • 1 kudos

CDC with Snapshot - next_snapshot_and_version() function

I am trying to use create_auto_cdc_from_snapshot_flow (formerly apply_changes_from_snapshot()) (see: https://docs.databricks.com/aws/en/dlt/cdc#cdc-from-snapshot). I am attempting to do SCD type 2 changes using historic snapshot data. In the first coup...

Data Engineering
CDC
dlt
Snapshot
Latest Reply
fabdsp
Visitor
  • 1 kudos

I have the same issue. This is a very big limitation of create_auto_cdc_from_snapshot_flow, and there is no solution.

3 More Replies
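For context on the API this thread is about, a hedged sketch based on the linked docs, written against the older apply_changes_from_snapshot name; the snapshot layout, key column, and versioning scheme are assumptions.

import dlt

dlt.create_streaming_table("customers_scd2")

def next_snapshot_and_version(latest_version):
    # Return (snapshot_df, version) for the next snapshot, or None when caught up.
    version = 1 if latest_version is None else latest_version + 1
    path = f"/landing/snapshots/v{version}"          # assumed snapshot layout
    try:
        return spark.read.format("parquet").load(path), version
    except Exception:
        return None

dlt.apply_changes_from_snapshot(
    target="customers_scd2",
    snapshot_and_version=next_snapshot_and_version,
    keys=["customer_id"],                            # assumed business key
    stored_as_scd_type=2,
)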
jeremy98
by Honored Contributor
  • 839 Views
  • 3 replies
  • 0 kudos

how to pass secrets keys using a spark_python_task

Hello community, I was searching for a way to pass secrets to a spark_python_task. Using a notebook file is easy, you just use dbutils.secrets.get(...), but how do you do the same thing with a spark_python_task set to use serverless compute? Kind regards,

Latest Reply
analytics_eng
New Contributor III
  • 0 kudos

@Renu_ But passing them as spark_env will not work with serverless, I guess? See also the limitations in the docs: Serverless compute limitations | Databricks on AWS

2 More Replies
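A hedged sketch for the question in this thread: in a plain .py file run as a spark_python_task (including on serverless), the notebook-style dbutils global is not injected, but the SDK's runtime module exposes an equivalent handle; the scope and key names are assumptions.

from databricks.sdk.runtime import dbutils

# Read the secret exactly as you would in a notebook.
api_token = dbutils.secrets.get(scope="my_scope", key="api_token")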
nefflev1
by Visitor
  • 28 Views
  • 0 replies
  • 0 kudos

VS Code Python file execution

Hi everyone, I'm using the Databricks VS Code extension to develop and deploy Asset Bundles. Usually we work with notebooks and use the "Run File as Workflow" function. Now I'm trying to use a pure Python file for a new use case and tried to use the "Up...

abhirupa7
by New Contributor
  • 25 Views
  • 0 replies
  • 0 kudos

Databricks Workflow

I have a query. I have multiple jobs (workflows) in my workspace that run regularly, each containing multiple tasks. A few tasks run notebooks that contain for-each code. Now, when a job runs, that particular task executes the for ...

dpc
by Contributor
  • 133 Views
  • 5 replies
  • 3 kudos

Resolved! Pass parameters between jobs

Hello, I have a job. In that job, a task (GetGid) executes a notebook and obtains some value using dbutils.jobs.taskValuesSet, e.g. dbutils.jobs.taskValuesSet(key = "gid", value = gid). As a result, I can use this and pass it to another task for ...

Latest Reply
dpc
Contributor
  • 3 kudos

Thanks @Hubert-Dudek and @ilir_nuredini, I see this now. I'm setting the value using dbutils.jobs.taskValues.set(), passing it to the job task using Key: gid, Value: {{tasks.GetGid.values.gid}}, then reading it with pid = dbutils.widgets.get().

4 More Replies
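A compact sketch of the pattern the accepted answer describes, with the task and key names taken from the thread; the example value is an assumption.

# In the producing task's notebook (GetGid):
gid = "12345"                                        # assumed example value
dbutils.jobs.taskValues.set(key="gid", value=gid)

# In the downstream task, define a parameter named "gid" whose value is the
# dynamic reference {{tasks.GetGid.values.gid}}, then read it as a widget:
gid = dbutils.widgets.get("gid")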
AlleyCat
by New Contributor II
  • 792 Views
  • 3 replies
  • 0 kudos

To identify deleted Runs in Workflow.Job UI in "system.lakeflow"

Hi, I executed a few runs in the Workflows / Jobs UI. I then deleted some of them. I am seeing the deleted runs in "system.lakeflow.job_run_timeline". How do I know which runs are the deleted ones? Thanks

Latest Reply
Ayushi_Suthar
Databricks Employee
  • 0 kudos

Hi @AlleyCat, hope you are doing well! The jobs table includes a delete_time column that records the time when the job was deleted by the user. So to identify deleted jobs, you can run a query like the following: SELECT * FROM system.lakeflow.jobs ...

2 More Replies
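Building on the reply, a hedged sketch that joins the runs back to deleted jobs via the delete_time column mentioned there; note that system.lakeflow.jobs keeps change history, so the result may need further de-duplication.

deleted_runs = spark.sql("""
    SELECT r.*
    FROM system.lakeflow.job_run_timeline AS r
    JOIN system.lakeflow.jobs AS j
      ON r.job_id = j.job_id
    WHERE j.delete_time IS NOT NULL
""")
display(deleted_runs)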
Adam_Borlase
by New Contributor III
  • 42 Views
  • 2 replies
  • 0 kudos

Error trying to edit Job Cluster via Databricks CLI

Good day all, after having issues with cloud resources allocated to Lakeflow jobs and gateways, I am trying to apply a policy to the cluster that is allocated to the job. I am very new to a lot of the Databricks platform and its administration, so all h...

Latest Reply
Adam_Borlase
New Contributor III
  • 0 kudos

Good afternoon Louis, thank you for the detailed answer. The issue I face is that the default gateway is allocating virtual CPUs, which are not in our quotas, so I need to apply the compute policy at the creation stage. At this point in the pipelines I c...

1 More Replies
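For the job-cluster half of this thread, a hedged sketch of attaching a compute policy through the Python SDK rather than hand-editing CLI JSON. The job id, cluster key, policy id, and node settings are all placeholders, and Lakeflow gateway compute is configured separately from this.

from databricks.sdk import WorkspaceClient
from databricks.sdk.service import compute, jobs

w = WorkspaceClient()
w.jobs.update(
    job_id=123,                                   # placeholder job id
    new_settings=jobs.JobSettings(
        job_clusters=[
            jobs.JobCluster(
                job_cluster_key="main",           # placeholder cluster key
                new_cluster=compute.ClusterSpec(
                    policy_id="ABC1234567890",    # the compute policy to apply
                    spark_version="15.4.x-scala2.12",
                    node_type_id="Standard_D4ds_v5",
                    num_workers=2,
                ),
            )
        ]
    ),
)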
DM0341
by New Contributor
  • 71 Views
  • 2 replies
  • 1 kudos

Resolved! SQL Stored Procedures - Notebook to always run the CREATE query

I have a stored procedure that is saved as a query file. I can run it and the proc is created. However, I want to take this one step further. I want my notebook to run the query file called sp_Remit.sql so that, if there are any changes to the proc between t...

Latest Reply
DM0341
New Contributor
  • 1 kudos

Thank you. I did find this about an hour after I posted. Thank you Kevin

1 More Replies
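A hedged sketch of the approach the thread settles on: have the notebook read the saved query file and execute it, so the procedure is recreated on every run. The workspace path is an assumption, and spark.sql expects the file to contain a single statement.

# Read the stored-procedure definition from the workspace file and (re)create it.
with open("/Workspace/Shared/queries/sp_Remit.sql") as f:
    ddl = f.read()

spark.sql(ddl)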
SuMiT1
by New Contributor III
  • 32 Views
  • 1 reply
  • 0 kudos

Databricks to snowflake data load

Hi Team, I’m trying to load data from Databricks into Snowflake using the Snowflake Spark connector. I’m using a generic username and password, but I’m unable to log in using these credentials directly. In the Snowflake UI, I can only log in through ...

Latest Reply
nayan_wylde
Honored Contributor III
  • 0 kudos

@SuMiT1 The recommended method to connect to Snowflake from Databricks is OAuth with the client credentials flow. This method uses a registered Azure AD application to obtain an OAuth token without user interaction. Steps: register an app in Azure AD and c...

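A hedged sketch of the flow the reply outlines: obtain an Azure AD client-credentials token, then pass it to the Snowflake Spark connector as an OAuth token. Every endpoint, scope, secret, and option value below is a placeholder to adapt.

import requests

tenant_id = "<tenant-id>"
client_id = "<app-client-id>"
client_secret = dbutils.secrets.get("my_scope", "sf_oauth_secret")   # assumed secret scope/key

# Client-credentials token request against Azure AD.
token = requests.post(
    f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token",
    data={
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": "<snowflake-oauth-resource>/.default",
    },
).json()["access_token"]

# Hand the token to the Snowflake Spark connector.
(df.write
   .format("snowflake")
   .options(
       sfURL="<account>.snowflakecomputing.com",
       sfUser=client_id,
       sfAuthenticator="oauth",
       sfToken=token,
       sfDatabase="MY_DB",
       sfSchema="PUBLIC",
       sfWarehouse="MY_WH",
   )
   .option("dbtable", "TARGET_TABLE")
   .mode("append")
   .save())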
