Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

jeremy98
by Honored Contributor
  • 1637 Views
  • 3 replies
  • 0 kudos

how to pass secrets keys using a spark_python_task

Hello community, I was searching for a way to pass secrets to a spark_python_task. Using a notebook file is easy: just call dbutils.secrets.get(...). But how do you do the same thing with a spark_python_task running on serverless compute? Kind regards,

Latest Reply
analytics_eng
New Contributor III
  • 0 kudos

@Renu_ but passing them as spark_env will not work with serverless, I guess? See also the limitations in the docs: Serverless compute limitations | Databricks on AWS
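One approach that should work on serverless is to fetch the secret inside the Python file itself rather than via environment variables, since dbutils can be imported from the Databricks SDK runtime shim. A minimal sketch (scope and key names are hypothetical):

```python
# Sketch for the file referenced by the spark_python_task. Assumes it
# runs on a Databricks runtime where the SDK shim resolves dbutils;
# the scope/key names below are made up.

def get_db_password():
    # Imported lazily so the module can be inspected off-cluster too.
    from databricks.sdk.runtime import dbutils
    return dbutils.secrets.get(scope="my-scope", key="db-password")
```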

2 More Replies
dpc
by Contributor III
  • 1118 Views
  • 5 replies
  • 3 kudos

Resolved! Pass parameters between jobs

Hello, I have a job. In that job, a task (GetGid) executes a notebook and obtains some value using dbutils.jobs.taskValues.set, e.g. dbutils.jobs.taskValues.set(key = "gid", value = gid). As a result, I can use this and pass it to another task for ...

Latest Reply
dpc
Contributor III
  • 3 kudos

Thanks @Hubert-Dudek and @ilir_nuredini, I see this now. I'm setting the value with dbutils.jobs.taskValues.set(), passing it to the downstream task as a parameter with key gid and value {{tasks.GetGid.values.gid}}, then reading it with pid = dbutils.widgets.get("gid").
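A sketch of both ends of the hand-off described in this thread (dbutils is passed in explicitly here so the functions can be exercised off-cluster; the task and key names follow the thread, and note the method name is lower-case set()):

```python
# Upstream task (GetGid): publish the value for downstream tasks.
def publish_gid(dbutils, gid):
    dbutils.jobs.taskValues.set(key="gid", value=gid)

# Downstream task, option 1: read the task value directly.
def read_gid(dbutils):
    return dbutils.jobs.taskValues.get(taskKey="GetGid", key="gid", default=None)

# Downstream task, option 2: if the value was wired through a task
# parameter as {{tasks.GetGid.values.gid}}, read it as a widget.
def read_gid_param(dbutils):
    return dbutils.widgets.get("gid")
```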

4 More Replies
AlleyCat
by New Contributor II
  • 1345 Views
  • 3 replies
  • 0 kudos

How to identify deleted runs from the Workflows > Jobs UI in "system.lakeflow"

Hi, I executed a few runs in the Workflows > Jobs UI. I then deleted some of them. I am still seeing the deleted runs in "system.lakeflow.job_run_timeline". How do I know which runs are the deleted ones? Thanks

Latest Reply
Ayushi_Suthar
Databricks Employee
  • 0 kudos

Hi @AlleyCat, hope you are doing well! The jobs table includes a delete_time column that records the time when the job was deleted by the user. So to identify deleted jobs, you can run a query like the following: SELECT * FROM system.lakeflow.jobs ...
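Spelled out, the query pattern from the reply would look something like this (run on Databricks, e.g. via spark.sql; the WHERE clause is the key part, since deleted jobs carry a non-null delete_time):

```python
# Query sketch: list deleted jobs via the system.lakeflow.jobs table.
DELETED_JOBS_SQL = """
SELECT job_id, name, delete_time
FROM system.lakeflow.jobs
WHERE delete_time IS NOT NULL
ORDER BY delete_time DESC
"""
# On a cluster: spark.sql(DELETED_JOBS_SQL).display()
```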

2 More Replies
DM0341
by New Contributor II
  • 552 Views
  • 2 replies
  • 1 kudos

Resolved! SQL Stored Procedures - Notebook to always run the CREATE query

I have a stored procedure that is saved as a query file. I can run it and the proc is created. However, I want to take this one step further. I want my notebook to run the query file called sp_Remit.sql, so if there are any changes to the proc between t...
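One way to do this, assuming the notebook can read the workspace file, is simply to execute the file's contents on every run (the path in the usage comment is hypothetical, and a file containing multiple statements would need splitting first):

```python
# Sketch: re-run the CREATE statement from a saved .sql file on each
# notebook run, so the proc always reflects the latest file contents.
def run_sql_file(spark, path):
    with open(path, "r") as f:
        sql_text = f.read()
    return spark.sql(sql_text)

# On Databricks: run_sql_file(spark, "/Workspace/Shared/sp_Remit.sql")
```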

Latest Reply
DM0341
New Contributor II
  • 1 kudos

Thank you. I did find this about an hour after I posted. Thank you Kevin

1 More Replies
SuMiT1
by New Contributor III
  • 655 Views
  • 1 replies
  • 1 kudos

Databricks to snowflake data load

Hi Team, I’m trying to load data from Databricks into Snowflake using the Snowflake Spark connector. I’m using a generic username and password, but I’m unable to log in using these credentials directly. In the Snowflake UI, I can only log in through ...

Latest Reply
nayan_wylde
Esteemed Contributor II
  • 1 kudos

@SuMiT1 The recommended method to connect to Snowflake from Databricks is OAuth with the Client Credentials flow. This method uses a registered Azure AD application to obtain an OAuth token without user interaction. Steps: Register an app in Azure AD and c...
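The connector side of that flow might look roughly like this once a token has been acquired from Azure AD (all option values are placeholders; the option names are the Snowflake Spark connector's OAuth settings):

```python
# Sketch: Snowflake Spark connector options for OAuth token auth.
# `access_token` would come from the Azure AD client-credentials request.
def snowflake_options(access_token):
    return {
        "sfUrl": "<account>.snowflakecomputing.com",  # placeholder
        "sfDatabase": "MY_DB",
        "sfSchema": "PUBLIC",
        "sfWarehouse": "MY_WH",
        "sfAuthenticator": "oauth",
        "sfToken": access_token,
    }

# On Databricks (sketch):
# df.write.format("snowflake").options(**snowflake_options(token)) \
#   .option("dbtable", "TARGET_TABLE").mode("append").save()
```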

StephanieAlba
by Databricks Employee
  • 3543 Views
  • 2 replies
  • 0 kudos

Is it possible to turn off the redaction of secrets? Is there a better way to solve this?

As part of our Azure Data Factory pipeline, we utilize Databricks to run some scripts that identify which files we need to load from a certain source. This list of files is then passed back into Azure Data Factory utilizing the Exit status from the n...

Latest Reply
joanafloresc
New Contributor II
  • 0 kudos

Hello, as of today, is it still not possible to unredact secret names?

1 More Replies
GANAPATI_HEGDE
by New Contributor III
  • 1420 Views
  • 8 replies
  • 1 kudos

Unable to run SQL alert task in a Databricks job as a service principal

I am trying to run a SQL alert task in a Databricks job. A service principal is set as the job's run_as, and the task fails with the error message below. I also checked whether the SPN can be granted permission on the SQL alert and SQL query; it turns out only a user id or...

[screenshot attachment: GANAPATI_HEGDE_0-1761623911949.jpeg]
Latest Reply
GANAPATI_HEGDE
New Contributor III
  • 1 kudos

Unfortunately, only these options are available in my UI. Is this the new UI?

7 More Replies
crami
by New Contributor III
  • 392 Views
  • 1 replies
  • 0 kudos

Resolved! Declarative Pipelines: Can a pipeline or job be deployed with run_as using an asset bundle?

Hi, I have a very interesting scenario. I am trying to use Declarative Pipelines for the first time. The platform team has made workspace artefacts DevOps-based deployments [infra as code], meaning I cannot create compute. I have to create compute with ...

Latest Reply
donna567taylor
New Contributor III
  • 0 kudos

@crami wrote: Hi, I have a very interesting scenario. I am trying to use Declarative Pipelines for the first time. The platform team has made workspace artefacts DevOps-based deployments [infra as code], meaning I cannot create compute. I have to create ...

Alessandro
by New Contributor II
  • 2007 Views
  • 2 replies
  • 0 kudos

Resolved! Update jobs parameter, when running, from API

Hi, when a job is running, I would like to change its parameters with an API call. I know that I can set parameter values from the API when I start a job, and that I can update the default values if the job isn't running, but I didn't find an API c...

Latest Reply
XueChunmei
New Contributor II
  • 0 kudos

Hi Alessandro, I am trying to set job parameter values from the API when I start a job via an API call within a Python notebook. However, it has never succeeded: the job can be triggered, but always with the job parameters' default values instead of the values from the AP...
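One thing worth checking in that situation: with the Jobs 2.1 API, job-level parameters go under the job_parameters field of the run-now payload (notebook_params and friends are the older task-level fields, which don't override job parameters). A sketch of the trigger call, where host, token and job_id are placeholders:

```python
# Sketch: trigger a run with explicit job parameter values via
# POST /api/2.1/jobs/run-now.
import json
import urllib.request

def run_now_payload(job_id, params):
    # Job-level parameters must be passed as `job_parameters`.
    return {"job_id": job_id, "job_parameters": params}

def trigger_run(host, token, job_id, params):
    req = urllib.request.Request(
        f"{host}/api/2.1/jobs/run-now",
        data=json.dumps(run_now_payload(job_id, params)).encode(),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```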

1 More Replies
Sakthi0311
by New Contributor
  • 652 Views
  • 2 replies
  • 1 kudos

How to enable Liquid Clustering on an existing Delta Live Table (DLT) and syntax for enabling it

Hi all, I'm working with Delta Live Tables (DLT) and want to enable Liquid Clustering on an existing DLT table that was already created without it. Could someone please clarify: how can I enable Liquid Clustering on an existing DLT table (without recre...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @Sakthi0311, in SQL you can enable LC for materialized views and streaming tables by adding a CLUSTER BY clause to the table definition. If you want to use automatic clustering, use CLUSTER BY AUTO instead.
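For completeness, the SQL shape being described is a CLUSTER BY clause on the streaming table (or materialized view) definition; the table, column and source names below are made up:

```python
# SQL sketches for enabling liquid clustering in a pipeline definition.
ST_EXPLICIT = """
CREATE OR REFRESH STREAMING TABLE orders
CLUSTER BY (order_date)
AS SELECT * FROM STREAM source_table
"""

# Automatic clustering: let Databricks choose the clustering keys.
ST_AUTO = """
CREATE OR REFRESH STREAMING TABLE orders
CLUSTER BY AUTO
AS SELECT * FROM STREAM source_table
"""
```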

1 More Replies
excavator-matt
by Contributor III
  • 1902 Views
  • 6 replies
  • 3 kudos

Resolved! How do you use Databricks Lakeflow Declarative Pipelines on AWS DMS data?

Hi! I am trying to replicate an AWS RDS PostgreSQL database in Databricks. I have successfully managed to enable CDC using AWS DMS, which writes an initial load file and continuous CDC files in Parquet. I have been trying to follow the official guide Repl...

Data Engineering
AUTO CDC
AWS DMS
declarative pipelines
LakeFlow
Latest Reply
mmayorga
Databricks Employee
  • 3 kudos

Hey @excavator-matt, let's remember that the Bronze layer is for raw ingestion; this provides a baseline for auditing and a starting point for the transformations required by the different use cases you need to serve. Systems and their requirements change...

5 More Replies
Adam_Borlase
by New Contributor III
  • 1980 Views
  • 4 replies
  • 5 kudos

Resolved! Quota Limit Exhausted Error when Creating Data Ingestion with SQL Server Connector (Azure)

Good day all, I am having an issue with our first data ingestion pipeline. I want to connect to our Azure SQL Server with our Unity connector (I can access the data in Unity Catalog). When I am on step 3 of the process (Source), when it is scann...

Latest Reply
Adam_Borlase
New Contributor III
  • 5 kudos

Thank you for all of your assistance!

3 More Replies
ghofigjong
by New Contributor
  • 14555 Views
  • 5 replies
  • 3 kudos

Resolved! How does partition pruning work on a merge into statement?

I have a delta table that is partitioned by Year, Date and Month. I'm trying to merge data into it on all three partition columns plus an extra column (an ID). My merge statement is below: MERGE INTO delta.<path of delta table> oldData USING df newData ...

Latest Reply
Umesh_S
New Contributor II
  • 3 kudos

Isn't the suggested idea only filtering the input dataframe (resulting in a smaller amount of data to match across the whole delta table), rather than pruning the delta table down to the relevant partitions to scan?
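That reading matches how pruning on MERGE generally works: the target is only pruned statically when the ON clause contains literal predicates on the partition columns; equality to the source's columns alone doesn't let the planner skip target partitions up front. A sketch with explicit literals added (the path placeholder is from the thread; the values are examples):

```python
# Sketch: merge with literal partition predicates so the target scan
# can be pruned to a single partition before matching.
PRUNED_MERGE_SQL = """
MERGE INTO delta.`<path of delta table>` oldData
USING df newData
ON  oldData.Year  = 2024 AND oldData.Month = 6 AND oldData.Date = 15
AND oldData.Year  = newData.Year
AND oldData.Month = newData.Month
AND oldData.Date  = newData.Date
AND oldData.ID    = newData.ID
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *
"""
```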

4 More Replies
yit
by Databricks Partner
  • 821 Views
  • 3 replies
  • 3 kudos

Resolved! Does Auto Loader support loading PDF files?

I need to process PDF files already ingested. Based on the documentation, Auto Loader does not support PDFs, or am I missing something? Also, I've found this sparkPDF library in other discussions in the community, but from what I see it's only for bat...

Latest Reply
yit
Databricks Partner
  • 3 kudos

Any suggestions how to handle PDFs? @szymon_dybczak 
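One option for the ingestion half, at least: Auto Loader can pick PDFs up as raw bytes with the binaryFile format, leaving actual parsing (e.g. a PDF library inside a UDF) as a separate step. A sketch, with a hypothetical path argument:

```python
# Sketch: stream PDFs in as binary with Auto Loader; parsing comes later.
def read_pdfs(spark, path):
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "binaryFile")
        .option("pathGlobFilter", "*.pdf")
        # Yields path, modificationTime, length and content (bytes) columns.
        .load(path)
    )
```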

2 More Replies
Filip
by New Contributor II
  • 8983 Views
  • 7 replies
  • 0 kudos

How to assign a user-assigned managed identity to a DBR cluster so I can use it for querying ADLS Gen2?

Hi, I'm trying to figure out if we can switch from Entra ID SPNs to user-assigned managed identities. Everything works except that I can't figure out how to access the lake files from a Python notebook. I've tried the code below, running it on a ...

Latest Reply
Coffee77
Honored Contributor II
  • 0 kudos

Besides, this only works on dedicated clusters, not on shared ones. Why? No idea at all. In the latest case, IMDS (Instance Metadata Service) is used by Azure to inject a token endpoint inside resources as a unique, secure and valid channel to get tokens ...
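For the original question, the Hadoop ABFS driver does ship a managed-identity token provider; a sketch of the relevant Spark conf keys (storage account, tenant and client IDs are placeholders, and on shared clusters these settings may be blocked, as noted above):

```python
# Sketch: Spark conf keys for ADLS Gen2 access via a user-assigned
# managed identity (Hadoop ABFS OAuth settings).
def msi_conf(account, tenant_id, client_id):
    prefix = "fs.azure.account"
    suffix = f"{account}.dfs.core.windows.net"
    return {
        f"{prefix}.auth.type.{suffix}": "OAuth",
        f"{prefix}.oauth.provider.type.{suffix}":
            "org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider",
        f"{prefix}.oauth2.msi.tenant.{suffix}": tenant_id,
        f"{prefix}.oauth2.client.id.{suffix}": client_id,
    }

# On a cluster:
# for k, v in msi_conf("mylake", "<tenant>", "<msi-client-id>").items():
#     spark.conf.set(k, v)
```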

6 More Replies