Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

Raja_Databricks
by New Contributor III
  • 3403 Views
  • 6 replies
  • 7 kudos

Resolved! Liquid Clustering With Merge

Hi there, I'm working with a large Delta table (2TB) and I'm looking for the best way to efficiently update it with new data (10GB). I'm particularly interested in using Liquid Clustering for faster queries, but I'm unsure if it supports updates effic...

Latest Reply
RV-Gokul
New Contributor II
  • 7 kudos

@youssefmrini @erigaud I have a similar issue, and I've pretty much tried the solution mentioned above. However, I'm not noticing any changes when I use a temporary table or persist the table. My main table contains 3.1 terabytes of data with 42 billi...

5 More Replies
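The pattern this thread asks about can be sketched as follows. This is a minimal illustration, not the thread's actual solution; the table names (`target`, `updates`) and clustering column (`id`) are placeholders. Clustering on the merge key helps MERGE skip files, and an incremental `OPTIMIZE` clusters newly written data.

```python
# Sketch: Delta table with Liquid Clustering on the merge key,
# kept compact with OPTIMIZE after each MERGE.
# All table/column names below are placeholders.
ddl = """
CREATE TABLE IF NOT EXISTS target (id BIGINT, payload STRING)
CLUSTER BY (id)
"""

merge_sql = """
MERGE INTO target t
USING updates u
ON t.id = u.id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *
"""

# Incrementally clusters files written since the last OPTIMIZE:
optimize_sql = "OPTIMIZE target"

# On Databricks these would be run as, e.g.:
# for stmt in (ddl, merge_sql, optimize_sql):
#     spark.sql(stmt)
```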
robbe
by New Contributor III
  • 439 Views
  • 2 replies
  • 1 kudos

Resolved! Get job ID from Asset Bundles

When using Asset Bundles to deploy jobs, how does one get the job ID of the resources that are created? I would like to deploy some jobs through asset bundles, get the job IDs, and then trigger these jobs programmatically outside the CI/CD pipeline us...

Latest Reply
robbe
New Contributor III
  • 1 kudos

Thanks @mhiltner. I don't need to run jobs, just to get the ID. So I think that solution 2) is the way to go here. I'll accept the solution.

1 More Replies
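One way to look up a deployed job's ID (assuming the bundle gives the job a known name) is the Jobs API `jobs/list` endpoint, which accepts a `name` filter. A minimal sketch; the host, job name, and auth handling are placeholders you would adapt:

```python
# Sketch: resolve a bundle-deployed job's ID by name via the Jobs API 2.1.
import json
import urllib.parse


def jobs_list_url(host: str, job_name: str) -> str:
    # GET /api/2.1/jobs/list?name=<job_name> filters by exact job name.
    query = urllib.parse.urlencode({"name": job_name})
    return f"{host}/api/2.1/jobs/list?{query}"


def extract_job_id(response_body: str, job_name: str):
    # Return the job_id of the first listed job whose settings.name matches.
    for job in json.loads(response_body).get("jobs", []):
        if job.get("settings", {}).get("name") == job_name:
            return job["job_id"]
    return None
```

The returned `job_id` can then be passed to `jobs/run-now` from outside the CI/CD pipeline.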
Eiki
by New Contributor
  • 108 Views
  • 1 replies
  • 0 kudos

How to use the same job cluster across different tasks within one workflow

I created a workflow with notebooks and several tasks, but I would like to use only one job cluster to run all of them, without creating a new job cluster for each task, because I don't want to increase the execution time with each new job cluster ...

Latest Reply
brockb
Valued Contributor
  • 0 kudos

Hi, if I understand correctly, you are hoping to reduce overall job execution time by reducing the Cloud Service Provider instance provisioning time. Is that correct? If so, you may want to consider using a Pool of instances: https://docs.databricks.c...

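If the workflow's steps are tasks within a single job (rather than separately triggered jobs, where the pool suggestion above applies), tasks can share one cluster by defining it once under `job_clusters` and referencing its `job_cluster_key`. A sketch of the job settings; the names, paths, and cluster spec are placeholders:

```python
# Sketch: one job cluster reused by every task in a workflow.
# "shared-cluster", notebook paths, and the cluster spec are placeholders.
job_settings = {
    "name": "my-workflow",
    "job_clusters": [
        {
            "job_cluster_key": "shared-cluster",
            "new_cluster": {
                "spark_version": "14.3.x-scala2.12",
                "num_workers": 2,
            },
        }
    ],
    "tasks": [
        {
            "task_key": "step1",
            "job_cluster_key": "shared-cluster",
            "notebook_task": {"notebook_path": "/Workspace/notebooks/step1"},
        },
        {
            "task_key": "step2",
            "depends_on": [{"task_key": "step1"}],
            "job_cluster_key": "shared-cluster",
            "notebook_task": {"notebook_path": "/Workspace/notebooks/step2"},
        },
    ],
}
```

Because both tasks reference the same `job_cluster_key`, the cluster is provisioned once for the run instead of per task.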
diego_poggioli
by Contributor
  • 277 Views
  • 1 replies
  • 0 kudos

Streaming foreachBatch _jdf jvm attribute not supported

I'm trying to perform a merge inside a streaming foreachBatch using the command: microBatchDF._jdf.sparkSession().sql(self.merge_query). Streaming runs fine if I use a Personal cluster, while if I use a Shared cluster streaming fails with the following ...

Latest Reply
holly
Contributor II
  • 0 kudos

Can you share what runtime your cluster is using? This error doesn't surprise me; Unity Catalog shared clusters have many security limitations, but the list is shrinking over time. https://docs.databricks.com/en/compute/access-mode-limitations.html#s...

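One workaround worth noting: the private `_jdf` JVM handle is what shared clusters block, but PySpark exposes a public `DataFrame.sparkSession` attribute (since 3.3) that avoids it. A hedged sketch; the view name `updates` and the query are placeholders, not from the thread:

```python
# Sketch: run a MERGE inside foreachBatch without touching the private
# _jdf JVM handle, using the public DataFrame.sparkSession attribute.
def merge_batch(micro_batch_df, batch_id, merge_query: str):
    # Expose the micro-batch to SQL under a placeholder view name,
    # then run the MERGE through the batch DataFrame's own session.
    micro_batch_df.createOrReplaceTempView("updates")
    micro_batch_df.sparkSession.sql(merge_query)

# Wired up as, e.g.:
# stream.writeStream.foreachBatch(
#     lambda df, bid: merge_batch(df, bid, merge_query)
# )
```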
nehaa
by New Contributor II
  • 212 Views
  • 2 replies
  • 0 kudos

Databricks dashboard publish

How do I publish a dashboard created from a notebook? I don't see the Publish option within the File menu anymore. The old video I referred to seems to show an option to publish the dashboard.

Latest Reply
Walter_C
Honored Contributor
  • 0 kudos

Can you share the link to the video you are referring to? As per the docs, no publish option is currently available; you can use Present Dashboard to view it. https://docs.databricks.com/en/notebooks/dashboards.html

1 More Replies
jenitjain
by New Contributor
  • 215 Views
  • 2 replies
  • 0 kudos

Certifications questions

What are the timings and days between which we can get certified? Can we purchase a certification at the location or are we supposed to purchase it beforehand?

Latest Reply
Cert-Team
Honored Contributor III
  • 0 kudos

Online exams can be purchased and taken anytime. Is this question related to DAIS?

1 More Replies
manish1987c
by New Contributor III
  • 126 Views
  • 1 replies
  • 0 kudos

Python project || write_micro_batch in Structured Streaming

Hi Team, we are in the process of developing a framework in Python using databricks-connect in VSO. However, when we try to run micro batches in the foreachBatch function, it gives us an error message saying that "few objects are not serializable: ...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @manish1987c,  Ensure that the Python version you’re using locally matches the one on your Databricks cluster. Minor version differences are usually acceptable (e.g., 3.10.11 versus 3.10.10).

swathiG
by New Contributor III
  • 506 Views
  • 7 replies
  • 1 kudos

Databricks IP address

I'm trying to call an API in a Databricks notebook, but the call fails with a "403 Forbidden" error. I think it is an issue with the IP address; can anyone help me find out which Databricks IP needs to be whitelis...

Latest Reply
swathiG
New Contributor III
  • 1 kudos

@jacovangelder can you please let me know where I can get the compute IP address?

6 More Replies
mysteryuser000
by New Contributor
  • 143 Views
  • 1 replies
  • 0 kudos

dlt pipeline will not create live tables

I have created a DLT pipeline based on four SQL notebooks, each containing between 1 and 3 queries. Each query begins with "create or refresh live table ..." yet each one outputs a materialized view. I have tried deleting the materialized views and ru...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @mysteryuser000,  In Databricks SQL, materialized views are Unity Catalog managed tables that allow you to precompute results based on the latest data in source tables. Unlike other implementations, the results returned reflect the state of data w...

laudhon
by New Contributor II
  • 599 Views
  • 6 replies
  • 3 kudos

Why is My MIN MAX Query Still Slow on a 29TB Delta Table After Liquid Clustering and Optimization?

Hello,I have a large Delta table with a size of 29TB. I implemented Liquid Clustering on this table, but running a simple MIN MAX query on the set cluster column is still extremely slow. I have already optimized the table. Am I missing something in m...

Latest Reply
LuisRSanchez
New Contributor III
  • 3 kudos

Hi, this operation should take seconds because it uses the precomputed statistics for the table. A few elements to verify: if the data type is datetime or integer it should work; if it is a string data type then it needs to read all the data. Verify the column ...

5 More Replies
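Building on the reply above: MIN/MAX can only use precomputed file-level statistics if Delta actually collects them for that column, and by default only the first 32 columns are indexed (the `delta.dataSkippingNumIndexedCols` table property). A hedged sketch; the table name `events` and the count are placeholders:

```python
# Sketch: widen the set of columns Delta collects min/max statistics for,
# so a clustered column beyond the default first 32 gets stats.
def widen_stats_sql(table: str, num_indexed_cols: int) -> str:
    return (
        f"ALTER TABLE {table} SET TBLPROPERTIES "
        f"('delta.dataSkippingNumIndexedCols' = '{num_indexed_cols}')"
    )

# Statistics are collected at write time, so after widening you may need
# to recompute stats / rewrite existing files for them to take effect:
# spark.sql(widen_stats_sql("events", 40))
```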
dat_77
by New Contributor
  • 196 Views
  • 1 replies
  • 0 kudos

Change Default Parallelism ?

Hi, I attempted to parallelize my Spark read process by setting the default parallelism using spark.conf.set("spark.default.parallelism", "X"). However, despite setting this configuration, when I checked sc.defaultParallelism in my notebook, it display...

Latest Reply
irfan_elahi
New Contributor III
  • 0 kudos

sc.defaultParallelism is based on the number of worker cores in the cluster. It can't be overridden. The reason you are seeing 200 tasks is because of spark.sql.shuffle.partitions (whose default value is 200). This determines the number of shuffle pa...

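The distinction in the reply can be sketched as follows: `sc.defaultParallelism` is derived from the cluster's worker cores and is effectively read-only at runtime, while the 200 tasks come from the shuffle-partition setting, which can be changed per session (the value 64 below is just an example):

```python
# Sketch: tune shuffle task count instead of spark.default.parallelism.
def set_shuffle_partitions(spark, n: int) -> None:
    # Controls the number of partitions (tasks) after a shuffle; default 200.
    spark.conf.set("spark.sql.shuffle.partitions", str(n))


def enable_aqe(spark) -> None:
    # Alternatively, let Adaptive Query Execution coalesce shuffle
    # partitions dynamically based on data size.
    spark.conf.set("spark.sql.adaptive.enabled", "true")

# Usage in a notebook, e.g.:
# set_shuffle_partitions(spark, 64)
```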
Awoke101
by New Contributor III
  • 659 Views
  • 8 replies
  • 0 kudos

Resolved! Pandas_UDF not working on shared access mode but works on personal cluster

 The "dense_vector" column does not output on show(). Instead I get the error below. Any idea why it doesn't work on the shared access mode? Any alternatives? from fastembed import TextEmbedding, SparseTextEmbedding from pyspark.sql.pandas.functions ...

Data Engineering
pandas_udf
shared_access
udf
Latest Reply
jacovangelder
Contributor III
  • 0 kudos

For some reason a moderator is removing my pip freeze? No idea why; maybe too long/spammy for a comment. Anyway, I am using DBR 14.3 LTS with Shared Access Mode. I haven't installed any other version apart from fastembed==0.3.1. Included a screenshot ...

7 More Replies
DavidS1
by New Contributor
  • 154 Views
  • 1 replies
  • 0 kudos

Cost comparison of DLT to custom pipeline

Hello, our company currently has a number of custom pipelines written in Python for ETL, and I want to evaluate DLT to see if it will make things more efficient. A problem is that there is a restriction on using DLT "because it is too e...

Latest Reply
Zume
New Contributor II
  • 0 kudos

DLT is expensive in my opinion. I tried to run a simple notebook that just reads a parquet file into a dataframe and writes it out to cloud storage, and I got an error that I hit my CPU instance limit for my Azure subscription. I just gave up after t...

dbx_687_3__1b3Q
by New Contributor III
  • 4777 Views
  • 5 replies
  • 4 kudos

Resolved! Databricks Asset Bundle (DAB) from a Git repo?

My earlier question was about creating a Databricks Asset Bundle (DAB) from an existing workspace. I was able to get that working but after further consideration and some experimenting, I need to alter my question. My question is now "how do I create...

Latest Reply
nicole_lu_PM
New Contributor III
  • 4 kudos

We are very close to having an end-to-end solution for deploying DABs from a Git folder (Repo) in the Workspace! Check out my talk on DAIS24 here https://github.com/databricks/dais-cow-bff (video link on README). We are waiting for the feature that a...

4 More Replies
ksenija
by Contributor
  • 1512 Views
  • 8 replies
  • 1 kudos

DLT pipeline error key not found: user

When I try to create a DLT pipeline from a foreign catalog (BigQuery), I get this error: java.util.NoSuchElementException: key not found: user.I used the same script to copy Salesforce data and that worked completely fine.

Latest Reply
ksenija
Contributor
  • 1 kudos

Hi @lucasrocha, any luck with this error? I guess it's something with the connection to BigQuery, but I didn't find anything. Best regards, Ksenija

7 More Replies
