Data Engineering

Forum Posts

Anske
by New Contributor II
  • 107 Views
  • 4 replies
  • 1 kudos

Resolved! DLT apply_changes applies only deletes and inserts not updates

Hi, I have a DLT pipeline that applies changes from a source table (cdctest_cdc_enriched) to a target table (cdctest), using the following code: dlt.apply_changes(target = "cdctest", source = "cdctest_cdc_enriched", keys = ["ID"], sequence_by...

Data Engineering
Delta Live Tables
Latest Reply
Kaniz
Community Manager
  • 1 kudos

Hi @Anske, It seems you’re encountering an issue with your Delta Live Tables (DLT) pipeline where updates from the source table are not being correctly applied to the target table. Let’s troubleshoot this together! Pipeline Update Process: Whe...

3 More Replies
jainshasha
by New Contributor
  • 92 Views
  • 6 replies
  • 0 kudos

Job Cluster in Databricks workflow

Hi, I have configured 20 different workflows in Databricks. Each of them is configured with a job cluster under a different name. All 20 workflows are scheduled to run at the same time. But even with a different job cluster configured in each of them, they run sequentially w...

Latest Reply
Wojciech_BUK
Contributor III
  • 0 kudos

Hi @jainshasha, I tried to replicate your problem, but in my case I was able to run the jobs in parallel (the only difference is that I am running the notebook from the workspace, not from a repo). As you can see, the jobs did not start at exactly the same time, but they ran in par...

5 More Replies
Ameshj
by New Contributor
  • 282 Views
  • 7 replies
  • 0 kudos

Dbfs init script migration

I need help with migrating from DBFS on Databricks to workspace. I am new to Databricks and am struggling with what is on the links provided. My workspace.yml also has dbfs hard-coded. Included is a full deployment with Great Expectations. This was don...

Data Engineering
Azure Databricks
dbfs
Great expectations
python
Latest Reply
NandiniN
Valued Contributor II
  • 0 kudos

One of the other suggestions is to use Lakehouse Federation. It may also be a driver issue (the logs will tell us).

6 More Replies
ashraf1395
by Visitor
  • 43 Views
  • 3 replies
  • 2 kudos

Resolved! Optimising Clusters in Databricks on GCP

Hi there everyone, We are trying to get hands-on with Databricks Lakehouse for a prospective client's project. Our major aim for the project is to compare the data lakehouse on Databricks and the BigQuery data warehouse in terms of costs and time to set up and run que...

Latest Reply
Kaniz
Community Manager
  • 2 kudos

Hi @ashraf1395, Comparing Databricks Lakehouse and Google BigQuery is essential to make an informed decision for your project. Let’s address your questions: Cluster Configurations for Databricks: Databricks provides flexibility in configuring com...

2 More Replies
tanjil
by New Contributor III
  • 8688 Views
  • 8 replies
  • 6 kudos

Resolved! Downloading sharepoint lists using python

Hello, I am trying to download lists from SharePoint into a pandas DataFrame. However, I cannot retrieve any information successfully. I have attempted many solutions mentioned on Stack Overflow. Below is one of those attempts: # https://pypi.org/project/sha...

Latest Reply
huntaccess
Visitor
  • 6 kudos

The error "<urlopen error [Errno -2] Name or service not known>" suggests that there's an issue with the server URL or network connectivity. Double-check the server URL to ensure it's correct and accessible. Also, verify that your network connection ...
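One way to act on this advice is to check whether the hostname in the SharePoint URL resolves at all before attempting any authenticated call. The following is a minimal sketch using only the Python standard library; `host_from_url` and `can_resolve` are hypothetical helper names, not part of any SharePoint package, and the check covers DNS resolution only, not reachability or authentication.

```python
import socket
from urllib.parse import urlparse


def host_from_url(url):
    """Return the hostname part of a URL, or None if it has none."""
    return urlparse(url).hostname


def can_resolve(hostname):
    """Return True if DNS can resolve the hostname, False otherwise."""
    if not hostname:
        return False
    try:
        # getaddrinfo raises socket.gaierror when the name cannot be
        # resolved, which is the condition behind "[Errno -2] Name or
        # service not known".
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False
```

If `can_resolve(host_from_url(site_url))` returns `False` for your own site URL, the problem most likely lies in the URL or in the cluster's DNS and network configuration rather than in the SharePoint credentials.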

7 More Replies
RabahO
by New Contributor III
  • 26 Views
  • 2 replies
  • 0 kudos

Dashboard always display truncated data

Hello, we're working with a serverless SQL cluster to query Delta tables and display some analytics in dashboards. We have some basic GROUP BY queries that return around 36k rows, and they are executed without the "limit" keyword. So in the data ...

Latest Reply
mhiltner
New Contributor II
  • 0 kudos

Hey @RabahO, this is likely a memory issue. The current behavior is that Databricks will only attempt to display the first 64000 rows of data. If the first 64000 rows of data are larger than 2187 MB, then it will fail to display anything. In your cas...

1 More Replies
pragarwal
by New Contributor II
  • 32 Views
  • 2 replies
  • 0 kudos

Adding Member to group using account databricks rest api

Hi All, I want to add a member to a group at the Databricks account level using the REST API (https://docs.databricks.com/api/azure/account/accountgroups/patch), as described in this link. I was able to authenticate but am not able to add the member while using the belo...

Latest Reply
pragarwal
New Contributor II
  • 0 kudos

Hi @Kaniz, I have also tried the suggested body, but the member is still not added to the group. Is there any other method I can use to add a member to the group at the account level? Thanks, Phani.

1 More Replies
smedegaard
by New Contributor III
  • 604 Views
  • 3 replies
  • 0 kudos

DLT run fails with "com.databricks.cdc.spark.DebeziumJDBCMicroBatchProvider not found"

I've created a streaming live table from a foreign catalog. When I run the DLT pipeline it fails with "com.databricks.cdc.spark.DebeziumJDBCMicroBatchProvider not found". I haven't seen any documentation that suggests I need to install Debezium manuall...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @smedegaard, The error message you’re encountering, “com.databricks.cdc.spark.DebeziumJDBCMicroBatchProvider not found,” indicates that the specified class is not available in your classpath. To address this issue, follow these steps: Verif...

2 More Replies
Chengzhu
by New Contributor
  • 124 Views
  • 1 reply
  • 0 kudos

Databricks Model Registry Notification

Hi community, Currently, I am training models on a Databricks cluster and using MLflow to log and register models. My goal is to receive a notification when a new version of a registered model is created (if the new run achieves some model performance baselin...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @Chengzhu, It seems like you’re using MLflow’s Model Registry to manage the lifecycle of your machine learning models. Let’s explore this further. The MLflow Model Registry provides a centralized model store, APIs, and a UI to collaboratively m...

EWhitley
by New Contributor II
  • 254 Views
  • 1 reply
  • 0 kudos

Custom ENUM input as parameter for SQL UDF?

Hello - We're migrating from T-SQL to Spark SQL. We're migrating a significant number of queries. "datediff(unit, start, end)" is different between these two implementations (in a good way). For the purpose of migration, we'd like to stay as consiste...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @EWhitley, You’re on the right track with creating a custom UDF in Python for your migration. To achieve similar behaviour to the T-SQL DATEDIFF function with an enum-like unit parameter, you can follow these steps: Create a Custom UDF: Define...
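The idea of an enum-like unit parameter can be sketched in plain Python before it is wrapped in a UDF. This is an illustration only, not the code from the reply: `DateUnit` and `datediff` are hypothetical names, only three units are covered, and the sketch assumes T-SQL-style semantics in which DATEDIFF counts the unit boundaries crossed between the two values. Registering the function as a Spark UDF is not shown.

```python
from datetime import datetime
from enum import Enum


class DateUnit(Enum):
    """Allowed units; anything else is rejected with a ValueError."""
    YEAR = "year"
    MONTH = "month"
    DAY = "day"


def datediff(unit, start, end):
    """Count unit boundaries crossed between start and end.

    Assumption: this mirrors T-SQL DATEDIFF, where moving from
    2005-12-31 to 2006-01-01 counts as 1 year, 1 month and 1 day.
    """
    # Accept either a DateUnit member or a case-insensitive string.
    if isinstance(unit, str):
        unit = DateUnit(unit.lower())
    if unit is DateUnit.YEAR:
        return end.year - start.year
    if unit is DateUnit.MONTH:
        return (end.year - start.year) * 12 + (end.month - start.month)
    # Remaining case is DateUnit.DAY: compare calendar dates only.
    return (end.date() - start.date()).days
```

Validating the unit through an `Enum` means a misspelled unit fails immediately with a `ValueError` instead of silently returning a wrong number, which is the main benefit of the enum-like parameter during a large migration.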

YannLevavasseur
by New Contributor
  • 341 Views
  • 1 reply
  • 0 kudos

SQL function refactoring into Databricks environment

Hello all, I'm currently working on importing some SQL functions from an Informix database into Databricks, using an Asset Bundle that deploys a Delta Live Table to Unity Catalog. I'm struggling to import a recursive one; here is the code: CREATE FUNCTION "info...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @YannLevavasseur, It looks like you’re dealing with a recursive SQL function for calculating the weight of articles in a Databricks environment. Handling recursion in SQL can be tricky, especially when translating existing Informix code to Data...

Sambit_S
by New Contributor II
  • 270 Views
  • 1 reply
  • 0 kudos

Error during deserializing protobuf data

I am receiving protobuf data in a JSON attribute, and along with it I receive a descriptor file. I am using from_protobuf to deserialize the data as below. It works most of the time but gives an error when there are some recursive fields within the protob...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @Sambit_S, Handling recursive fields in Protobuf can indeed be tricky, especially when deserializing data. Let’s explore some potential solutions to address this issue: Casting Issue with Recursive Fields: The error you’re encountering might b...

Skr7
by New Contributor II
  • 34 Views
  • 1 reply
  • 0 kudos

Databricks Asset Bundles

Hi, I'm implementing Databricks Asset Bundles. My scripts are in GitHub, and my /resource has all the .yml files of my Databricks workflows, which point to the main branch: git_source: git_url: https://github.com/xxxx git_provider: ...

Data Engineering
Databricks
Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @Skr7, Let’s break down your requirements: Dynamically Changing Git Branch for Databricks Asset Bundles (DABs): When deploying and running your DAB, you want the Databricks workflows to point to your feature branch instead of the main branch....

madhumitha
by Visitor
  • 58 Views
  • 4 replies
  • 0 kudos

Connect power bi desktop semantic model output to databricks

Hello, I am trying to connect the Power BI semantic model output (basically the data that has already been preprocessed) to Databricks. Does anybody know how to do this? I would like it to be an automated process, so I would like to know any way to p...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @madhumitha, Connecting Power BI semantic model output to Databricks can be done in a few steps. Here are a couple of options: Databricks Power Query Connector: The new Databricks connector is natively integrated into Power BI. You can configu...

3 More Replies
dbdude
by New Contributor II
  • 4580 Views
  • 7 replies
  • 0 kudos

AWS Secrets Works In One Cluster But Not Another

Why can I use boto3 to retrieve a secret from Secrets Manager on a personal cluster, but get an error on a shared cluster? NoCredentialsError: Unable to locate credentials

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @dbdude and @drii_cavalcanti, The NoCredentialsError you’re encountering when using Boto3 to retrieve a secret from AWS Secrets Manager typically indicates that the AWS SDK is unable to find valid credentials for your API request. Let’s explor...

6 More Replies