Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Anske
by New Contributor III
  • 562 Views
  • 4 replies
  • 0 kudos

How to stop a dataframe with a federated table source from being re-evaluated when referenced (cache?)

Hi, Would anyone happen to know whether it's possible to cache in memory a dataframe that is the result of a query on a federated table? I have a notebook that queries a federated table, does some transformations on the dataframe and then writes this data...

Latest Reply
Anske
New Contributor III
  • 0 kudos

@daniel_sahal, this is the code snippet:
lsn_incr_batch = spark.sql(f"""
select start_lsn, tran_begin_time, tran_end_time, tran_id, tran_begin_lsn,
       cast('{current_run_ts}' as timestamp) as appended
from externaldb.cdc.lsn_time_mapping
where tran_end_time > '...
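One common way to avoid re-running the source query each time the dataframe is referenced is to persist it and trigger a single action straight away. The sketch below is only an illustration and not the resolution reached in the thread; `materialize_once` is a hypothetical helper name, and it assumes its argument is a PySpark DataFrame, using only its `persist()` and `count()` methods.

```python
def materialize_once(df):
    """Mark a Spark DataFrame for caching and force a single evaluation.

    `df` is assumed to be a pyspark.sql.DataFrame; only its persist()
    and count() methods are used here.
    """
    cached = df.persist()  # lazy: nothing is read from the source yet
    cached.count()         # action: runs the query once and fills the cache
    return cached
```

In the notebook this would be used as `lsn_incr_batch = materialize_once(lsn_incr_batch)`, followed by `lsn_incr_batch.unpersist()` once the writes are done. Caching is best-effort: evicted partitions are recomputed from the source, so writing the batch to an intermediate table is the sturdier option when the federated source must be queried exactly once.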

CarstenWeber
by New Contributor II
  • 950 Views
  • 4 replies
  • 1 kudos

Resolved! Invalid configuration fs.azure.account.key trying to load ML Model with OAuth

Hi Community, I was trying to load an ML model from an Azure storage account (abfss://....) with: model = PipelineModel.load(path). I set the Spark config: spark.conf.set("fs.azure.account.auth.type", "OAuth") spark.conf.set("fs.azure.account.oauth.provi...

Latest Reply
CarstenWeber
New Contributor II
  • 1 kudos

@daniel_sahal using the settings above did indeed work. 

amar1995
by New Contributor II
  • 1250 Views
  • 4 replies
  • 0 kudos

Performance Issue with XML Processing in Spark Databricks

I am reaching out to bring attention to a performance issue we are encountering while processing XML files using Spark-XML, particularly with the configuration spark.read().format("com.databricks.spark.xml"). Currently, we are experiencing significant...

Latest Reply
shan_chandra
Esteemed Contributor
  • 0 kudos

@amar1995 - Can you try this streaming approach and see if it works for your use case (using autoloader) - https://kb.databricks.com/streaming/stream-xml-auto-loader

johnp
by New Contributor II
  • 551 Views
  • 1 reply
  • 0 kudos

Call Databricks notebook from Azure Flask app

I have an Azure web app running a Flask web server. From the Flask server, I want to run some queries on the data stored in ADLS Gen2 storage. I already created Databricks notebooks running these queries. The Flask server will pass some parameters in ...

Latest Reply
feiyun0112
Contributor III
  • 0 kudos

You can use the Databricks SDK: https://docs.databricks.com/en/dev-tools/sdk-python.html#create-a-job
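The reply points to the Databricks SDK for Python. As a dependency-free alternative, the sketch below builds the equivalent Jobs REST API call (`POST /api/2.1/jobs/run-now`) with the standard library only, assuming the notebook has already been wrapped in a job. The helper name `build_run_now_request`, the host, the token, the job ID and the parameter names are all hypothetical placeholders.

```python
import json
import urllib.request


def build_run_now_request(host, token, job_id, notebook_params):
    """Build (but do not send) a Jobs API 2.1 run-now request."""
    body = {"job_id": job_id, "notebook_params": notebook_params}
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/api/2.1/jobs/run-now",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical values, for illustration only.
request = build_run_now_request(
    host="https://example.cloud.databricks.com",
    token="dapi-EXAMPLE",
    job_id=123,
    notebook_params={"start_date": "2024-01-01"},
)
# Inside a Flask handler, the run would be triggered with:
#   with urllib.request.urlopen(request) as resp:
#       run = json.load(resp)
```

The values in `notebook_params` reach the notebook as widget values, so the Flask server can pass its query parameters through that dictionary.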

Kanti1989
by New Contributor II
  • 897 Views
  • 4 replies
  • 0 kudos

PySpark execution error

I am getting an error message when executing some simple PySpark code. Can anyone help me with this?

Latest Reply
AmanSehgal
Honored Contributor III
  • 0 kudos

Could you please share the entire error message? Are you running the code locally or on Databricks?

data-grassroots
by New Contributor III
  • 1395 Views
  • 6 replies
  • 1 kudos

Resolved! Ingesting Files - Same file name, modified content

We have a data feed with files whose filenames stay the same but whose contents change over time (brand_a.csv, brand_b.csv, brand_c.csv ....). COPY INTO seems to ignore the files when they change. If we set the force flag to true and run it, we end up w...

Latest Reply
data-grassroots
New Contributor III
  • 1 kudos

Thanks for the validation, Werners! That's the path we've been heading down (copy + merge). I still have some DLT experiments planned but - at least for this situation - copy + merge works just fine.

miaomia123
by New Contributor
  • 470 Views
  • 1 reply
  • 0 kudos

LLM using Databricks

Is there a coding example showing how to use an LLM?

Latest Reply
jose_gonzalez
Moderator
  • 0 kudos

I would like to share the following links: https://www.databricks.com/product/machine-learning/large-language-models and https://docs.databricks.com/en/large-language-models/index.html

BrianJ
by New Contributor II
  • 1597 Views
  • 5 replies
  • 4 kudos

{{job.trigger.type}} not working and throws error on Edit Parameter from Job page

Following the instructions for job parameter dynamic values, I am able to use {{job.id}}, {{job.name}}, {{job.run_id}}, {{job.repair_count}}, and {{job.start_time.[argument]}}. However, when I set trigger_type as trigger_type: {{job.trigger.type}} and hit SAVE, ...

Latest Reply
BrianJ
New Contributor II
  • 4 kudos

Thanks everyone, I decided to read the notebook context instead: dbutils.notebook.entry_point.getDbutils().notebook().getContext().toJson()
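The `toJson()` call returns a JSON string, so the value of interest still has to be extracted from it. The sketch below is a minimal illustration of that step. It assumes the context carries a `tags` object and that the trigger type sits under a key such as `jobTriggerType`; both names are assumptions that should be checked by printing the real context in a job run. `get_context_tag` is a hypothetical helper, and the sample payload is made up.

```python
import json


def get_context_tag(context_json, tag, default=None):
    """Return one entry from the "tags" object of a notebook-context JSON string."""
    context = json.loads(context_json)
    return context.get("tags", {}).get(tag, default)


# Made-up payload; in a notebook the string would come from
# dbutils.notebook.entry_point.getDbutils().notebook().getContext().toJson()
sample_context = json.dumps({"tags": {"jobId": "42", "jobTriggerType": "manual"}})
trigger_type = get_context_tag(sample_context, "jobTriggerType", default="unknown")
```

Passing a default keeps the notebook usable in interactive runs, where job-related tags may be absent.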

niruban
by New Contributor II
  • 800 Views
  • 2 replies
  • 0 kudos

Databricks Asset Bundle to deploy only one workflow

Hello Community - I am trying to deploy only one workflow from my CI/CD pipeline. But whenever I try to deploy one workflow using "databricks bundle deploy - prod", it deletes all the existing workflows in the target environment. Is there any option av...

Data Engineering
CICD
DAB
Databricks Asset Bundle
DevOps
Latest Reply
niruban
New Contributor II
  • 0 kudos

@Rajani: This is what I am doing. I have GitHub Actions kick off a step that runs:
- name: bundle-deploy
  run: |
    cd ${{ vars.HOME }}/dev-ops/databricks_cicd_deployment
    databricks bundle deploy --debug
Before running this step, I am creatin...

Espenol1
by New Contributor II
  • 2046 Views
  • 4 replies
  • 2 kudos

Resolved! Using managed identities to access SQL server - how?

Hello! My company wants us to only use managed identities for authentication. We have set up Databricks using Terraform, got Unity Catalog and everything, but we're a very small team and I'm struggling to control permissions outside of Unity Catalog....

Latest Reply
Espenol1
New Contributor II
  • 2 kudos

Thanks a lot. Then I guess we will try to use dbmanagedidentity for most of our needs, and create service principals + secret scopes when there are more specific needs, such as for limiting access to sensitive data. A bit of a hassle to scale, probabl...

SenthilJ
by New Contributor III
  • 956 Views
  • 1 reply
  • 1 kudos

Resolved! Unity Catalog Metastore Details

Hi, I would like answers to the following questions regarding the Unity Catalog metastore's path. While configuring a metastore, designating a metastore storage account (in the case of Azure, it's ADLS Gen2) seems to be optional. In case I confi...

Data Engineering
Unity Catalog
Latest Reply
PL_db
New Contributor III
  • 1 kudos

The storage container you configure for the metastore will contain the files of managed tables and volumes. The metadata is stored in a database of the Databricks control plane.

Snoonan
by Contributor
  • 2647 Views
  • 6 replies
  • 0 kudos

Resolved! Unity catalog issues

Hi all, I have recently enabled Unity Catalog in my DBX workspace. I have created a new catalog with an external location on Azure data storage. I can create new schemas (databases) in the new catalog but I can't create a table. I get the below error wh...

Latest Reply
daniel_sahal
Esteemed Contributor
  • 0 kudos

@Snoonan First of all, check the networking tab on the storage account to see if it's behind a firewall. If it is, make sure that Databricks/Storage networking is properly configured (https://learn.microsoft.com/en-us/azure/databricks/security/network/...

Carlton
by New Contributor III
  • 352 Views
  • 1 reply
  • 0 kudos

Help Refactor T-SQL Code to Databricks SQL

Hello Community, Can someone help refactor the following T-SQL code to Databricks SQL?
CONVERT(DECIMAL(26, 8), ISNULL(xxx.xxxxxxx * ISNULL(RH.xxxxx, 1 / NULLIF(ST.xxxxxx, 0)), ST.xxxxx)) AS Amount
When I attempt to execute the above code I get the followi...

Latest Reply
AmanSehgal
Honored Contributor III
  • 0 kudos

You can use CAST instead. E.g.: SELECT cast('2024' as int);
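Applied to the expression in the question, the CAST suggestion might look like the sketch below. It also swaps the two-argument T-SQL ISNULL for COALESCE, since ISNULL in Databricks SQL is a one-argument null test; that swap goes beyond what the reply states. The column names are hypothetical stand-ins for the masked ones, and Python's built-in sqlite3 is used only as a local stand-in to check the null handling, not as a Databricks connection.

```python
import sqlite3

# Hypothetical column names replace the masked ones from the post.
REFACTORED_SQL = """
SELECT CAST(
         COALESCE(amount * COALESCE(rate, 1 / NULLIF(divisor, 0)), fallback)
         AS DECIMAL(26, 8)
       ) AS Amount
FROM t
ORDER BY id
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER, amount REAL, rate REAL, divisor REAL, fallback REAL)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?)",
    [
        (1, 10.0, 2.0, 4.0, 99.0),   # rate present: amount * rate
        (2, 10.0, None, 4.0, 99.0),  # rate missing: amount * (1 / divisor)
        (3, 10.0, None, 0.0, 99.0),  # divisor is 0: NULLIF yields NULL, fallback wins
    ],
)
amounts = [row[0] for row in conn.execute(REFACTORED_SQL)]
```

SQLite treats DECIMAL(26, 8) as a loose numeric affinity, so this check covers only the COALESCE/NULLIF logic; precision and scale behave as declared only when the statement runs on Databricks.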

jfvizoso
by New Contributor II
  • 5741 Views
  • 4 replies
  • 0 kudos

Can I pass parameters to a Delta Live Table pipeline at running time?

I need to execute a DLT pipeline from a Job, and I would like to know if there is any way of passing a parameter. I know you can have settings in the pipeline that you use in the DLT notebook, but it seems you can only assign values to them when crea...

Latest Reply
Mustafa_Kamal
New Contributor II
  • 0 kudos

Hi @jfvizoso, I also have the same scenario. Did you find any workaround? Thanks in advance.

LorenRD
by Contributor
  • 6477 Views
  • 8 replies
  • 9 kudos
Latest Reply
miranda_luna_db
Contributor II
  • 9 kudos

Hi friends -  To confirm, with new lakeview dashboards you can share dashboards to users and groups in your organization without having to provide any workspace and/or compute access.  https://docs.databricks.com/en/dashboards/index.html#what-is-shar...
