Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Paxi
by New Contributor
  • 1065 Views
  • 1 replies
  • 0 kudos

Maven libs often failed during installation

Dear Community, I have a Databricks compute where I added two Maven libs from a custom repository on Nexus (because of a company policy, Databricks cannot communicate with the public internet, so I must use a private Nexus repo behind a firewall). Sin...

Latest Reply
Satyadeepak
Databricks Employee
  • 0 kudos

@Paxi Not sure if you are still looking for a solution, but Databricks has a Git server proxy you can configure, which lets you proxy Git commands from Databricks Git folders to your on-premises Git repositories: https://docs.databricks....

Prathik
by New Contributor II
  • 310 Views
  • 2 replies
  • 1 kudos

Exam got suspended in the middle

My Databricks Certified Data Engineer Associate exam was suspended on 31 Jan 2025 and is currently in a "SUSPENDED" state. I remained in front of the camera throughout the exam, and suddenly an alert appeared. The support person asked me to show th...

Data Engineering
@Cert-Team
Latest Reply
Prathik
New Contributor II
  • 1 kudos

@Cert-TeamOPS @Cert-Team, thanks for your quick response. Please find the latest open request number: #00610129

1 More Replies
brickster_2018
by Databricks Employee
  • 5558 Views
  • 4 replies
  • 2 kudos

Resolved! Databricks Spark vs Spark on YARN

I am moving my Spark workloads from an EMR/on-premises Spark cluster to Databricks. I understand Databricks Spark is different from YARN. How is the Databricks architecture different from YARN?

Latest Reply
de-qrosh
New Contributor III
  • 2 kudos

What about the disadvantages? How can I cleanly separate multiple jobs running on the same cluster in the logs, and likewise in the Spark UI?

3 More Replies
MartinB
by Contributor III
  • 11640 Views
  • 5 replies
  • 3 kudos

Resolved! Interoperability Spark ↔ Pandas: can't convert Spark dataframe to Pandas dataframe via df.toPandas() when it contains datetime value in distant future

Hi, I have multiple datasets in my data lake that feature valid_from and valid_to columns indicating the validity of rows. If a row is currently valid, this is indicated by valid_to = 9999-12-31 00:00:00. Example: loading this into a Spark dataframe works fine...

Latest Reply
ThePhil
New Contributor II
  • 3 kudos

Be aware that in Databricks 15.2 LTS this behavior is broken. I cannot find the code, but it is most likely related to the following option: https://github.com/apache/spark/commit/c1c710e7da75b989f4d14e84e85f336bc10920e0#diff-f9ddcc6cba651c6ebfd34e29ef049c3...

4 More Replies
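The thread above comes down to a range limit in pandas, not in Spark: a minimal, Spark-free sketch of the failure, assuming pandas' default nanosecond datetime64 dtype (which is what toPandas() targets for TimestampType columns):

```python
import pandas as pd

# pandas' default datetime64[ns] dtype only covers roughly
# 1677-09-21 .. 2262-04-11, so a "distant future" sentinel like
# 9999-12-31 cannot be represented at nanosecond resolution.
print(pd.Timestamp.max)  # 2262-04-11 23:47:16.854775807

try:
    # This is effectively what the Spark -> pandas conversion attempts
    # for a timestamp column containing 9999-12-31.
    pd.to_datetime(["9999-12-31 00:00:00"])
except ValueError as exc:  # OutOfBoundsDatetime subclasses ValueError
    print("conversion failed:", exc)
```

A common workaround (a sketch, not the thread's confirmed fix) is to clamp or replace the sentinel before calling toPandas(), e.g. capping valid_to at a date inside the representable range; exact behavior can vary across DBR and pandas versions.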
Mahesh_Yadav
by New Contributor II
  • 1680 Views
  • 1 replies
  • 1 kudos

How to export lineage data directly from Unity Catalog without using system tables

I have been trying to check if there is any direct way to export lineage hierarchy data in Databricks. I have tried to build a workaround solution by accessing system tables per this link: Monitor usage with system tables - Azure Databricks | Micro...

Latest Reply
bturnwald39
New Contributor II
  • 1 kudos

I have a similar use case.  The Databricks Lineage Graph is nice but only zooms out enough for the most basic lineages.  We have lineages/data flows with hundreds of tables.  I'd like more flexibility on showing the entire flow in one screen and expo...

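For readers landing here, the system-tables workaround the post refers to can be scripted. A minimal sketch, assuming lineage system tables are enabled in the workspace; the catalog filter main.% is a hypothetical example:

```python
# Build the lineage-export query. In a Databricks notebook you would run:
#   spark.sql(query).toPandas().to_csv("lineage_export.csv", index=False)
query = """
SELECT source_table_full_name,
       target_table_full_name,
       event_time
FROM system.access.table_lineage
WHERE target_table_full_name LIKE 'main.%'
"""
print(query)
```

This exports the raw edges of the lineage graph; rebuilding a multi-hundred-table flow diagram from them would still need an external graph tool.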
Henrik_
by New Contributor III
  • 2351 Views
  • 9 replies
  • 5 kudos

Can't use GraphFrames on DBR 14.3

I get the following error when trying to run GraphFrames on DBR 14.3. Does anyone have an idea of how I can solve this? import pyspark.sql.functions as F; from graphframes import GraphFrame; vertices = spark.createDataFrame([("a", "Alice", 34), ("b"...

Latest Reply
Snag
New Contributor II
  • 5 kudos

Hi guys, even with classic compute on 14.3 LTS, I'm getting the _sc error mentioned above. Please let me know if you were able to fix the issue.

8 More Replies
hardeeksharma
by New Contributor II
  • 148 Views
  • 1 replies
  • 1 kudos

Data ingestion issue with Thai data

I have a use case where my file contains Thai characters. The source location is Azure Blob Storage, where files are stored in text format. I am using the following code to read the file, but when I download the data from the catalog it encloses ...

Latest Reply
Lakshay
Databricks Employee
  • 1 kudos

Do the quotes exist in the original data?

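On the quotes question above: CSV writers quote a field only when it contains the delimiter or quote character, which is a common source of "unexpected" quotes in exported text, Thai or otherwise. A Spark-free sketch with Python's csv module:

```python
import csv
import io

# First field contains the "," delimiter, second does not.
rows = [["สวัสดี, โลก"], ["สวัสดี"]]

buf = io.StringIO()
csv.writer(buf, quoting=csv.QUOTE_MINIMAL).writerows(rows)
print(buf.getvalue())
# Only the field containing the delimiter is wrapped in quotes.
```

If the download really is adding quotes to every value, checking the export format's quote/escape options (and that the file is read as UTF-8 so Thai characters survive) would be the first things to try; the exact Spark reader options depend on the file format in use.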
pradeepvatsvk
by New Contributor III
  • 414 Views
  • 6 replies
  • 1 kudos

Too many small files from updates

Hi, I am updating data in a Delta table, and each time I only need to update one row, so every update statement creates a new file. How do I tackle this issue? It doesn't make sense to run the OPTIMIZE command after every upda...

Latest Reply
Lakshay
Databricks Employee
  • 1 kudos

If you are performing hundreds of update operations on the Delta table, you can run an OPTIMIZE operation after each batch of 100 updates. There should be no significant performance issue for up to 100 such updates.

5 More Replies
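Alongside the batched OPTIMIZE suggested above, Delta's auto-compaction table properties can reduce small files without manual maintenance. A sketch of the statements you would run via spark.sql(...) in a notebook; the table name is hypothetical, and property support can vary by DBR version:

```python
TABLE = "my_catalog.my_schema.my_table"  # hypothetical three-part name

# Let Delta coalesce small files automatically on write.
enable_auto_compaction = f"""
ALTER TABLE {TABLE} SET TBLPROPERTIES (
  'delta.autoOptimize.optimizeWrite' = 'true',
  'delta.autoOptimize.autoCompact'   = 'true'
)
"""

# Periodic compaction as a fallback, e.g. once per batch of updates.
periodic_compaction = f"OPTIMIZE {TABLE}"

print(enable_auto_compaction)
print(periodic_compaction)
```

With these properties set, single-row updates still write new files, but the engine compacts them in the background instead of requiring OPTIMIZE after every statement.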
erigaud
by Honored Contributor
  • 7660 Views
  • 2 replies
  • 2 kudos

Dynamically specify pivot column in SQL

Hello everyone! I am looking for a way to dynamically specify pivot columns in a SQL query so it can be used in a view. However, we don't want to hard-code the values that need to become columns, and would rather extract them from another table. I've se...

Latest Reply
Wikram
New Contributor II
  • 2 kudos

Did you find an answer? Can you share your thoughts on meeting this use case?

1 More Replies
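Since the IN list of a SQL PIVOT must be a literal list, the usual workaround is to fetch the distinct values first and build the statement programmatically. A hedged sketch; the table and column names (sales, year, amount) are hypothetical:

```python
# In a notebook the values would come from the lookup table, e.g.:
#   pivot_values = [r[0] for r in
#                   spark.sql("SELECT DISTINCT year FROM lookup").collect()]
pivot_values = ["2021", "2022", "2023"]

# Build the IN (...) clause with one aliased literal per pivot column.
in_list = ", ".join(f"'{v}' AS `y{v}`" for v in pivot_values)
query = (
    "SELECT * FROM sales "
    f"PIVOT (SUM(amount) FOR year IN ({in_list}))"
)
print(query)
```

Because the statement is assembled at runtime, it cannot live inside a plain view definition, which is the crux of the thread; one option is a scheduled job that regenerates and re-creates the view whenever the lookup values change.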
dc-rnc
by New Contributor II
  • 320 Views
  • 1 replies
  • 1 kudos

Resolved! How to deploy an asset bundle job that triggers another one

Hello everyone. Using DAB, is there a dynamic value reference or something equivalent to get a job_id to use inside the YAML definition of another Databricks job? I'd like to trigger that job from another one, but if I'm using a CI/CD pipeline to ...

Latest Reply
NandiniN
Databricks Employee
  • 1 kudos

resources:
  jobs:
    my-first-job:
      name: my-first-job
      tasks:
        - task_key: my-first-job-task
          new_cluster:
            spark_version: "13.3.x-scala2.12"
            node_type_id: "i3.xlarge"
            num_workers: 2
          ...

eballinger
by Contributor
  • 381 Views
  • 2 replies
  • 1 kudos

Resolved! How to grant all tables in schema except 1

Hi guys, I am trying to grant access to all tables in a schema to a user group in Databricks. The only catch is that there is one table I do not want granted. I currently grant schema access to the group, so the benefit is that as tables are added in the fu...

Latest Reply
NandiniN
Databricks Employee
  • 1 kudos

What you are facing is due to privilege inheritance: https://docs.databricks.com/en/data-governance/unity-catalog/manage-privileges/upgrade-privilege-model.html I would say this is by design, but please feel free to suggest it as an idea here - https://do...

1 More Replies
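Given the inheritance behavior described above, one workaround is to drop the schema-level grant and grant table-by-table, skipping the sensitive table. A sketch with hypothetical names; note this loses the automatic coverage of future tables that the schema-level grant provided:

```python
# Generate one GRANT per allowed table; in a notebook you would run each
# statement via spark.sql(g). Table and principal names are hypothetical.
tables = ["orders", "customers", "salaries"]  # e.g. from SHOW TABLES
excluded = {"salaries"}

grants = [
    f"GRANT SELECT ON TABLE my_catalog.my_schema.{t} TO `analysts`"
    for t in tables
    if t not in excluded
]
for g in grants:
    print(g)
```

To keep new tables covered, the loop would need to be re-run (or scheduled) whenever tables are added, since there is no built-in "grant all except one" primitive.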
jspehar
by New Contributor
  • 221 Views
  • 2 replies
  • 0 kudos

JDBC Error Trying to Connect erwin Data Modeler to Databricks

I am trying to connect erwin Data Modeler to Databricks to reverse-engineer a physical data model. I am connecting manually per the erwin and Databricks instructions, but I am getting the following error: [Databricks][DatabricksJDBCDriver][500593] C...

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

I hope you have referred to https://docs.databricks.com/en/partners/data-governance/erwin.html It is also possible that this is a library issue; I hope you are using the Databricks JDBC driver.

1 More Replies
AlexCancioBedon
by New Contributor II
  • 166 Views
  • 1 replies
  • 1 kudos
Latest Reply
Advika_
Databricks Employee
  • 1 kudos

Congratulations, @AlexCancioBedon! This is a great milestone that showcases your expertise in Data engineering with Databricks. We’d love to have you share your insights with the community, whether by sharing best practices or helping others. Keep up...

Sayeed
by New Contributor II
  • 242 Views
  • 1 replies
  • 0 kudos

Missing dbc for Databricks Associate Engineer certification

Hi, I am unable to find the dbc for https://customer-academy.databricks.com/learn/courses/2963/data-ingestion-with-delta-lake/lessons/25622/demo-set-up-and-load-delta-tables or anything else related to the Databricks Associate Engineer certification. Any help ...

Latest Reply
Advika_
Databricks Employee
  • 0 kudos

Hello @Sayeed! I see that you're currently going through a self-paced course, which does not include hands-on labs (dbc files). To access the labs, you can either purchase the ILT course, which will grant you access to the labs for 7 days, or get the...

SaraCorralLou
by New Contributor III
  • 17071 Views
  • 3 replies
  • 2 kudos

Resolved! Differences between lit(None) or lit(None).cast('string')

I want to define a column with null values in my dataframe using PySpark. This column will later be used for other calculations. What is the difference between creating it in these two different ways? df.withColumn("New_Column", lit(None)) df.withColumn...

Latest Reply
shadowinc
New Contributor III
  • 2 kudos

For me, df.withColumn("New_Column", lit(None).cast(StringType())) didn't work in PySpark. I used the Scala equivalent instead: df.withColumn("New_Column", lit(null).cast(StringType))

2 More Replies
