Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Phani1
by Databricks MVP
  • 2207 Views
  • 1 reply
  • 0 kudos

cloud fetch Qlik Sense

Hi Team, Cloud Fetch will improve data transfer efficiency from Databricks to Power BI; is it compatible with Qlik Sense as well?

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

Cloud Fetch is a feature introduced in Databricks Runtime 8.3 and Simba ODBC 2.6.17 driver that significantly improves data transfer efficiency from Databricks to BI tools like Power BI. It achieves this by fetching data in parallel via cloud storage...

Vishwanath_Rao
by New Contributor II
  • 3038 Views
  • 1 reply
  • 0 kudos

Same path producing different counts on Databricks and EMR

We're in the middle of migrating to Databricks and found that the same S3 path produces different counts on EMR (Spark 2.4.4) and Databricks (Spark 3.4.1). It is a simple spark.read.parquet().count(); we tried multiple solutions like making t...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

The discrepancy in counts between EMR (Spark 2.4.4) and Databricks (Spark 3.4.1) could be due to several reasons: 1. Different versions of Spark: the two environments are running different versions of Spark, which might have different optimizations or ...
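A practical way to narrow this down is to count rows per input file on each platform (for example, grouping by input_file_name() in Spark) and then diff the two result sets to see which files disagree. Below is a minimal sketch of the diffing step in plain Python; the function name and the sample counts are illustrative, not from the thread:

```python
def diff_file_counts(emr_counts, dbx_counts):
    """Return {file: (emr_count, dbx_count)} for every file whose
    per-file row count differs between the two platforms, using
    None when a file is missing from one side entirely."""
    mismatches = {}
    for path in sorted(set(emr_counts) | set(dbx_counts)):
        emr = emr_counts.get(path)
        dbx = dbx_counts.get(path)
        if emr != dbx:
            mismatches[path] = (emr, dbx)
    return mismatches

# Hypothetical per-file counts collected on each cluster.
emr = {"part-0000.parquet": 100, "part-0001.parquet": 250}
dbx = {"part-0000.parquet": 100, "part-0002.parquet": 250}
print(diff_file_counts(emr, dbx))
# -> {'part-0001.parquet': (250, None), 'part-0002.parquet': (None, 250)}
```

Files that appear on only one side usually point at listing or path-filter differences rather than Spark-version behavior, which helps decide where to dig next.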

DmitriyLamzin
by New Contributor II
  • 8281 Views
  • 1 reply
  • 0 kudos

applyInPandas hangs on runtime 13.3 LTS ML and above

Hello, recently I tried to upgrade my runtime environment to 13.3 LTS ML and found that it breaks my workload during applyInPandas. My job started to hang during applyInPandas execution. A thread dump shows that it hangs on direct memory allocation: ...
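If the hang really is tied to the size of each direct-memory allocation, one hedged first experiment (a workaround, not a root-cause fix) is to shrink the Arrow batch size so each transfer to the pandas workers allocates smaller direct buffers. A configuration sketch; the value is illustrative:

```python
# Shrink the Arrow record batch size used when shipping partitions to
# the pandas workers (default is 10000). Smaller batches mean smaller
# direct-memory allocations per transfer; the value 2000 is only an
# example starting point.
spark.conf.set("spark.sql.execution.arrow.maxRecordsPerBatch", "2000")
```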

Data Engineering
pandas PythonRunner
Latest Reply
Debayan
Databricks Employee
  • 0 kudos

Same post: https://community.databricks.com/t5/data-engineering/applyinpandas-function-hangs-in-runtime-13-3-lts-ml-and-above/td-p/56795

MarsSu
by New Contributor II
  • 11071 Views
  • 3 replies
  • 0 kudos

How to implement merge multiple rows in single row with array and do not result in OOM?

Hi, everyone. Currently I am trying to implement Spark Structured Streaming with PySpark, and I would like to merge multiple rows into a single row with an array and sink it to a downstream message queue for another service to use. A related example: * Befor...
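The reshape being asked about, collapsing rows that share a key into one row holding an array, corresponds to groupBy plus collect_list in PySpark; in streaming, bounding state (e.g. with a watermark) is the usual guard against OOM. The shape of the transformation itself can be sketched in plain Python; the field names are illustrative:

```python
from collections import defaultdict

def merge_rows(rows):
    """Collapse rows sharing a key into one row per key whose 'values'
    field is an array -- the same shape Spark produces with
    df.groupBy('key').agg(collect_list('value'))."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["key"]].append(row["value"])
    return [{"key": k, "values": v} for k, v in grouped.items()]

rows = [
    {"key": "a", "value": 1},
    {"key": "a", "value": 2},
    {"key": "b", "value": 3},
]
print(merge_rows(rows))
# -> [{'key': 'a', 'values': [1, 2]}, {'key': 'b', 'values': [3]}]
```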

Latest Reply
917074
Databricks Partner
  • 0 kudos

Is there any solution to this? @MarsSu, were you able to solve it? Kindly shed some light on this if you resolved it.

2 More Replies
uncle_rufus
by Databricks Partner
  • 3699 Views
  • 0 replies
  • 0 kudos

ipywidgets

I am having an issue getting a display upon interaction with the ipywidgets dropdown menu. Once I've selected an option from the dropdown, nothing happens. I am inclined to believe it has to do with how I've structured my on_select function and n...
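Two frequent causes of a "nothing happens" dropdown are attaching the handler without names='value' and writing output that is never routed into a displayed Output widget. The handler contract can be sketched without the widget machinery; on_select and the selections list are illustrative names:

```python
selections = []

def on_select(change):
    """Handler meant for dropdown.observe(on_select, names='value').
    ipywidgets delivers a change dict; the newly chosen option
    arrives under the 'new' key."""
    if change.get("name") == "value":
        selections.append(change["new"])

# Simulate the change dict ipywidgets sends when an option is picked.
on_select({"name": "value", "old": "a", "new": "b"})
print(selections)  # -> ['b']
```

In the notebook this would be wired up with dropdown.observe(on_select, names='value'), and anything the handler prints should happen inside a displayed ipywidgets Output context so it actually renders.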

Jake2
by New Contributor III
  • 8116 Views
  • 4 replies
  • 1 kudos

Resolved! Z-Ordering a Unity Catalog Materialized View

Hey everyone, We're making the move to Unity Catalog from Hive_Metastore and we're running into some issues performing Z-order optimizations on some of our tables. These tables are, in either place, materialized views created with a "create or refres...

Latest Reply
Jake2
New Contributor III
  • 1 kudos

For anyone who's reading this later: you can still Z-order your materialized views, but you can't run it as a SQL command. Instead, you can set it as one of the TBLPROPERTIES when you define the table. Here's an example: create or refresh live table {...
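For pipelines defined in Python rather than SQL, the same idea can be expressed through the dlt decorator, with the Z-order columns carried by the pipelines.autoOptimize.zOrderCols table property. A pipeline-definition sketch only runnable inside a Delta Live Tables pipeline; the table and column names are illustrative:

```python
import dlt  # available inside a Delta Live Tables pipeline

@dlt.table(
    table_properties={
        # Columns to Z-order during the pipeline's auto-optimize runs.
        "pipelines.autoOptimize.zOrderCols": "event_date,customer_id"
    }
)
def my_materialized_view():
    # 'spark' is provided by the pipeline runtime; the source table
    # name here is hypothetical.
    return spark.read.table("source_table")
```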

3 More Replies
DManowitz-BAH
by New Contributor II
  • 4965 Views
  • 4 replies
  • 1 kudos

Apparent bug with dbutils.fs.cp on S3 using DBR 13.3LTS

If I use dbutils.fs.cp on a cluster running DBR 13.3 LTS to try to copy an object on S3 from one prefix to another, I don't get the expected results. For example, if I try the following command: dbutils.fs.cp('s3://some-bucket/some/prefix/some_file.gz',...

Latest Reply
Lakshay
Databricks Employee
  • 1 kudos

Hi @DManowitz-BAH, the correct syntax for the dbutils.fs.cp command is to include the file name in the destination path. Please check the document here: https://docs.databricks.com/en/dev-tools/databricks-utils.html#cp-command-dbutilsfscp

3 More Replies
JagadishMori
by New Contributor II
  • 2501 Views
  • 1 reply
  • 0 kudos

Need to set parameter order and Note(Tooltip) on Databricks notebook task parameter

Hi Team, I have created a workflow job on Databricks which has 5 parameters. I created the parameters using deployment.json, like: "tasks": [{"task_key": "Test1","notebook_task": {"notebook_path": "Notebooks/F/UF/FileUpload","base_parameters": {"File_name":...

Latest Reply
JagadishMori
New Contributor II
  • 0 kudos

Thanks for your reply @Retired_mod. Will you help me understand what ordering Databricks uses to arrange parameters in the UI, so I can use that as a prefix to order parameters? Is it ASCII, binary, or something else?

Rags98
by New Contributor II
  • 3391 Views
  • 1 reply
  • 0 kudos

Undrop a table from built-in catalogs Azure Databricks

How can I undrop a table from a built-in catalog in Azure Databricks?

Latest Reply
Lakshay
Databricks Employee
  • 0 kudos

If you are using Unity Catalog, you can simply run the UNDROP command. Ref doc: https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-ddl-undrop-table.html

SenthilJ
by New Contributor III
  • 4199 Views
  • 1 reply
  • 2 kudos

Unity Catalog and Data Accessibility

Hi, I have a few questions about the internals of #Unity Catalog in #Databricks. 1. I understand that we can customize the UC metastore at different levels (catalog/schema). I am wondering where the information about the UC permission model is stored for every data ...

Data Engineering
Databricks
Unity Catalog
Latest Reply
SenthilJ
New Contributor III
  • 2 kudos

Thank you @Retired_mod, your response really helps. A quick follow-up: when Unity Catalog uses its permissions to access objects across workspaces, what kind of connection method does it use to access the data object, i.e. in this case, when User Y q...

Simon_T
by New Contributor III
  • 3810 Views
  • 1 reply
  • 0 kudos

CURL API - Error while parsing token: io.jsonwebtoken.ExpiredJwtException: JWT expired

I am running this code: curl -X --request GET -H "Authorization: Bearer <databricks token>" "https://adb-1817728758721967.7.azuredatabricks.net/api/2.0/clusters/list" And I am getting this error: 2024-01-17T13:21:41.4245092Z </head> 2024-01-17T13:21:41.4...

Latest Reply
Debayan
Databricks Employee
  • 0 kudos

Hi, could you please renew the token and confirm whether the error persists?

jborn
by New Contributor III
  • 9378 Views
  • 6 replies
  • 1 kudos

Resolved! Connecting an Azure Databricks to Azure Gen 2 storage stuck on "Running Command..."

I recently had an Azure Databricks setup done behind a VPN. I'm trying to connect to my Azure Storage Account Gen 2. Using the following code, I haven't been able to connect and keep getting stuck on reading the file. What should I be checking? #i...

Latest Reply
jborn
New Contributor III
  • 1 kudos

I ended up opening a ticket with Microsoft support about this issue, and they walked us through debugging it. In the end, the route table was not attached to the subnet. Once it was attached, everything worked.

5 More Replies
VJ3
by Contributor
  • 5393 Views
  • 3 replies
  • 2 kudos

Best Practice to use/implement SQL Persona using Azure Databricks

Hello, I am looking for details of the security controls needed to use/implement the SQL persona in Azure Databricks.

Latest Reply
Debayan
Databricks Employee
  • 2 kudos

Hi, there are several documents covering this; let me know if the links below help. https://learn.microsoft.com/en-us/answers/questions/1039176/whitelist-databricks-to-read-and-write-into-azure https://www.databricks.com/blog/2020/03/2...

2 More Replies
Twilight
by Contributor
  • 9275 Views
  • 5 replies
  • 4 kudos

Resolved! Bug - Databricks requires extra escapes in repl string in regexp_replace (compared to Spark)

In Spark (but not Databricks), these work: regexp_replace('1234567890abc', '^(?<one>\\w)(?<two>\\w)(?<three>\\w)', '$3$2$1') and regexp_replace('1234567890abc', '^(?<one>\\w)(?<two>\\w)(?<three>\\w)', '${three}${two}${one}'). In Databricks, you have to use ...

Latest Reply
Anonymous
Not applicable
  • 4 kudos

@Stephen Wilcoxon: No, it is not a bug. Databricks uses a different flavor of regular expression syntax than Apache Spark. In particular, Databricks uses Java's regular expression syntax, whereas Apache Spark uses Scala's regular expression syntax....
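The underlying point, that replacement-string syntax varies by regex flavor, is easy to see even outside Spark: in Python's re module, for instance, group references in the replacement use backslash syntax rather than $ or ${name}. A small illustration using the same input string from the thread:

```python
import re

s = "1234567890abc"

# Python's re uses \g<name> (or \1) in the replacement string, where
# the Java flavor behind regexp_replace uses ${name} (or $1).
out = re.sub(r"^(?P<one>\w)(?P<two>\w)(?P<three>\w)",
             r"\g<three>\g<two>\g<one>", s)
print(out)  # -> 3214567890abc
```

So a replacement string that works in one engine often needs re-escaping (or a different group-reference syntax) in another, which is exactly the symptom the thread describes.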

4 More Replies