Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

rami1
by New Contributor III
  • 13912 Views
  • 2 replies
  • 6 kudos

METASTORE_DOWN: Cannot connect to metastore

I am trying to view databases and tables, default as well as user-created, but it looks like the cluster is not able to connect. I am using the Databricks default Hive metastore. Viewing the cluster event log provides the following event: METASTORE_DOWN Metastore is...

Latest Reply
Anonymous
Not applicable

@rami: If the metastore is down, it means that the Databricks cluster is not able to connect to the metastore. Here are a few things you can try to resolve the issue: check if the Hive metastore is up and running. You can try to connect to the metast...

1 More Replies
Mumrel
by Contributor
  • 3773 Views
  • 2 replies
  • 2 kudos

Resolved! Error 95 when importing one Notebook into another

When I follow the instructions in "Modularize your code using files" I get the following error. I am on Azure, use DBR 12.2 LTS, and use ADLS as storage; I am happy to provide more details if needed. My research suggests that the reason is that the DBFS FUSE...

Latest Reply
-werners-
Esteemed Contributor III

import works for .py files; %run is for notebooks. Is lib a .py file or a notebook?

1 More Replies
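A minimal illustration of the distinction in the reply: `import` resolves real `.py` files on `sys.path`, while notebooks need `%run`. The temp folder here stands in for a Repos/Workspace directory:

```python
import os
import sys
import tempfile

# A throwaway folder standing in for a Repos/Workspace directory that
# holds a plain .py file (a notebook would NOT work with import).
workdir = tempfile.mkdtemp()
with open(os.path.join(workdir, "lib.py"), "w") as f:
    f.write("def greet(name):\n    return 'hello ' + name\n")

# `import` resolves real .py files on sys.path; %run is the notebook
# counterpart and cannot be replaced by import.
sys.path.insert(0, workdir)
import lib

print(lib.greet("databricks"))  # prints: hello databricks
```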
Thijs
by New Contributor III
  • 4612 Views
  • 3 replies
  • 4 kudos

How do I define & run jobs that execute scripts that are copied inside a custom DataBricks container?

Hi all, we are building custom Databricks containers (https://docs.databricks.com/clusters/custom-containers.html). During the container build process we install dependencies and also python source code scripts. We now want to run some of these scrip...

Latest Reply
Anonymous
Not applicable

Hi @Thijs van den Berg, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best an...

2 More Replies
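One common pattern for this (a sketch; the `file:/` script path, node type, and registry image below are hypothetical, so verify against the custom-containers docs) is a Jobs API 2.1 spec whose `spark_python_task` points at the file the Dockerfile copied into the image:

```python
# A minimal Jobs API 2.1 payload (sketch; the script path and image name
# are hypothetical) that runs a Python file baked into a custom container.
job_spec = {
    "name": "container-script-job",
    "tasks": [
        {
            "task_key": "run_baked_script",
            "spark_python_task": {
                # file:/ points at the container's local filesystem,
                # where the Dockerfile copied the script.
                "python_file": "file:/app/scripts/main.py",
            },
            "new_cluster": {
                "spark_version": "12.2.x-scala2.12",
                "node_type_id": "Standard_DS3_v2",
                "num_workers": 1,
                "docker_image": {
                    "url": "myregistry.azurecr.io/my-databricks:latest"
                },
            },
        }
    ],
}
```

You would then POST this to `/api/2.1/jobs/create` with your usual authentication.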
frank7
by Databricks Partner
  • 4957 Views
  • 2 replies
  • 1 kudos

Resolved! Is it possible to write a pyspark dataframe to a custom log table in Log Analytics workspace?

I have a pyspark dataframe that contains information about the tables that I have on a SQL database (creation date, number of rows, etc.). Sample data: { "Day":"2023-04-28", "Environment":"dev", "DatabaseName":"default", "TableName":"discount"...

Latest Reply
Anonymous
Not applicable

@Bruno Simoes: Yes, it is possible to write a PySpark DataFrame to a custom log table in a Log Analytics workspace using the Azure Log Analytics Workspace API. Here's a high-level overview of the steps you can follow: create an Azure Log Analytics Works...

1 More Replies
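The reply above mentions the Azure Log Analytics API; the signing step it relies on (the HTTP Data Collector API's SharedKey scheme) can be sketched as below. The string-to-sign layout follows Microsoft's documented scheme as I recall it, so verify against the current docs before use:

```python
import base64
import hashlib
import hmac

def build_signature(workspace_id, shared_key, body, date_rfc1123):
    """Authorization header for the Azure Log Analytics HTTP Data
    Collector API (sketch of Microsoft's documented signing scheme).

    `body` is the raw JSON payload as bytes; `shared_key` is the
    base64-encoded workspace key from the portal.
    """
    string_to_sign = (
        f"POST\n{len(body)}\napplication/json\n"
        f"x-ms-date:{date_rfc1123}\n/api/logs"
    )
    decoded_key = base64.b64decode(shared_key)
    digest = hmac.new(decoded_key, string_to_sign.encode("utf-8"),
                      hashlib.sha256).digest()
    return f"SharedKey {workspace_id}:{base64.b64encode(digest).decode()}"
```

On Databricks you would typically serialize the DataFrame rows (for example via `df.toJSON()`), then POST each batch with this Authorization header plus `x-ms-date` and a `Log-Type` header naming the custom table.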
Long
by New Contributor II
  • 1957 Views
  • 1 reply
  • 1 kudos

Connecting to Azure SQL database using R in Databricks and Azure Key Vaults

I'm trying to connect to an Azure SQL database using R in Databricks. I want to read the credentials stored in Azure Key Vault secrets rather than hard-coding them in R code. I've seen some examples of it being done in Scala; however, I'm after an R solutio...

Latest Reply
ArturoNuor
New Contributor III

Did you find a solution for this, @Long Pham? I am having the same issue.

Chalki
by New Contributor III
  • 5567 Views
  • 2 replies
  • 4 kudos

Resolved! Delta Table Merge statement is not accepting broadcast hint

I have a statement like this with pyspark:
target_tbl.alias("target")\
    .merge(stage_df.hint("broadcast").alias("source"), merge_join_expr)\
    .whenMatchedUpdateAll()\
    .whenNotMatchedInsertAll()\
    .w...

Latest Reply
Anonymous
Not applicable

Hi @Nikolay Chalkanov, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best ans...

1 More Replies
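For reference, the hint placement under discussion looks like this. A chainable stub stands in for the DeltaTable/DataFrame so the sketch runs without a cluster, and (per the thread) the MERGE planner may still ignore the hint; this only shows where it goes syntactically:

```python
def merge_with_broadcast(target_tbl, stage_df, merge_join_expr):
    """Attach a broadcast hint to the MERGE *source* before merging.

    Sketch only: the thread above suggests Delta's MERGE may not honor
    the hint, so treat this as syntax, not a performance guarantee.
    """
    return (
        target_tbl.alias("target")
        .merge(stage_df.hint("broadcast").alias("source"), merge_join_expr)
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
        .execute()
    )

# A tiny chainable stub records the call order so we can exercise the
# sketch without Spark; in a notebook, target_tbl would be a
# delta.tables.DeltaTable and stage_df a DataFrame.
class _Chain:
    def __init__(self, log):
        self._log = log
    def __getattr__(self, name):
        def _call(*args, **kwargs):
            self._log.append(name)
            return self
        return _call

calls = []
merge_with_broadcast(_Chain(calls), _Chain(calls), "t.id = s.id")
```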
720677
by New Contributor III
  • 14671 Views
  • 2 replies
  • 0 kudos

S3 write to bucket - best performance tips

I'm writing big DataFrames as Delta tables into S3 buckets.
df.write\
    .format("delta")\
    .mode("append")\
    .partitionBy(partitionColumns)\
    .option("mergeSchema", "true")\
    .save(target...

Latest Reply
Anonymous
Not applicable

@Pablo (Ariel): There are several ways to improve the performance of writing data to S3 using Spark. Here are some tips and recommendations: increase the size of the write buffer. By default, Spark writes data in 1 MB batches. You can increase the si...

1 More Replies
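One concrete, low-risk tuning related to the tips above is controlling output file size. A back-of-envelope helper (the 128 MB default is a common rule of thumb for Delta/Parquet files, not a Databricks-prescribed value):

```python
def target_partition_count(total_bytes, target_file_mb=128):
    """Estimate how many output partitions to aim for so each written
    file lands near `target_file_mb`.

    ~128 MB to 1 GB per file is a common rule of thumb; the exact sweet
    spot is workload-specific. Returns at least 1 (ceiling division).
    """
    target = target_file_mb * 1024 * 1024
    return max(1, (total_bytes + target - 1) // target)

# Usage sketch before writing (inside a notebook):
# df.repartition(target_partition_count(10 * 1024**3), *partitionColumns) \
#   .write.format("delta").mode("append") \
#   .partitionBy(*partitionColumns) \
#   .option("mergeSchema", "true").save(target_path)
```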
WillHeyer
by New Contributor II
  • 9274 Views
  • 1 reply
  • 2 kudos

Resolved! Best Practices for PowerBI Connectivity w/ Partner Connect. Access Token w/ Service Principal, Databricks Username w/ Service account, or OAuth?

I'm aware all are possible methods, but are they all equal? Or is the matter trivial? Thank you so much!

Latest Reply
Anonymous
Not applicable

@Will Heyer: The best method for Power BI connectivity with Partner Connect depends on your specific use case and requirements. Here are some factors to consider for each method. Access Token with Service Principal: this method uses a client ID and s...

DavideCagnoni
by Contributor
  • 17683 Views
  • 1 reply
  • 4 kudos

Resolved! How to use multi-cursor and rectangular selection for notebooks and query editor in Linux ?

The documentation explains how to use multicursor in notebooks. However, it only describes it for Windows and macOS. The Windows way worked in Linux (Ubuntu) up to a few days ago, but it does not work anymore.

Latest Reply
Anonymous
Not applicable

@Davide Cagnoni: Multicursor support in Databricks notebooks is implemented using the Ace editor, which is a web-based code editor. Therefore, the behavior of multicursor support may depend on the specific browser and operating system you are using....

vnc001
by New Contributor
  • 2808 Views
  • 1 reply
  • 1 kudos

Resolved! Clusters API 2.0 - Unable to execute cluster events api

Details: I keep getting "Missing required field: cluster_id" even though you can see it is supplied. Is this a bug, or am I missing something? I am testing this in Postman. Error: {"error_code":"INVALID_PARAMETER_VALUE","message":"Missing required fi...

Latest Reply
SUMI1
New Contributor III

Hi guys, I'm sorry to hear that the Clusters API 2.0 cluster events call is giving you trouble. I advise getting in touch with the support staff for guidance on quickly fixing the problem.

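One frequent cause of this exact error is sending cluster_id as a query or form parameter: /api/2.0/clusters/events is a POST endpoint that expects cluster_id in the JSON body. A sketch (the hostname, token, and cluster ID below are placeholders):

```python
import json
import urllib.request

def events_request(host, token, cluster_id):
    """Build a POST request for /api/2.0/clusters/events.

    The endpoint expects cluster_id in the JSON *body*; supplying it as
    a query or form parameter yields "Missing required field: cluster_id".
    """
    body = json.dumps({"cluster_id": cluster_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"https://{host}/api/2.0/clusters/events",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )

# Placeholder values; send with urllib.request.urlopen(req) in practice.
req = events_request("adb-123.azuredatabricks.net", "dapiXXXX", "0423-xyz")
```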
Phani1
by Databricks MVP
  • 2573 Views
  • 1 reply
  • 1 kudos

Resolved! DLT best practices

Hi Team, could you please recommend the best practices to implement Delta Live Tables? Regards, Phanindra

Latest Reply
Ryan_Chynoweth
Databricks Employee

Hi Phani, what exactly are you looking for with best practices? At a high level:
  • Always provide an external storage location (S3, ADLS, GCS) for your pipeline
  • Use Auto Scaling!
  • Python imports can be leveraged to reuse code
With regards to providing a st...

NOOR_BASHASHAIK
by Databricks Partner
  • 7296 Views
  • 1 reply
  • 2 kudos

Resolved! Azure Databricks PATs expire even before validity

Hi all, we have this issue in our environment: even though we give 365 days' validity when generating Databricks PATs, the PATs expire every now and then. Is there any problem with the command we use: curl --location --request POST 'https://<<HOST_NA...

Latest Reply
karthik_p
Databricks Partner

@NOOR BASHA SHAIK It looks like you are providing 365 days; can you please post your response? If you don't provide any lifetime, the token should be valid indefinitely. Can you please add 90 days' validity and test?

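For reference, the lifetime_seconds field on POST /api/2.0/token/create is expressed in seconds, so 365 days must be passed as 31536000; a payload sketch (the comment string is hypothetical):

```python
import json

# 365 days expressed in seconds, as the token-create API expects.
lifetime_seconds = 365 * 24 * 60 * 60  # 31536000

# Payload for POST /api/2.0/token/create (sketch; omit lifetime_seconds
# entirely if you want a token that does not expire).
payload = json.dumps({
    "lifetime_seconds": lifetime_seconds,
    "comment": "terraform-ci token",  # hypothetical comment
})
```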
Chinu
by New Contributor III
  • 1583 Views
  • 1 reply
  • 1 kudos

API to get Databricks Status AWS.

Hi, do you have an API endpoint to call to get the Databricks status for AWS? Thanks,

Latest Reply
karthik_p
Databricks Partner

@Chinu Lee: you have a webhook/Slack integration that can be used to fetch status: https://docs.databricks.com/resources/status.html#webhook. Are you specifically looking for your account's workspace, or the one above?

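The Databricks status page appears to be hosted on Statuspage, which exposes a standard JSON API; a sketch under that assumption (verify the URL and schema against the status documentation linked above):

```python
import json
import urllib.request

# Assumed Statuspage-style endpoint for the Databricks status page.
STATUS_URL = "https://status.databricks.com/api/v2/status.json"

def overall_status(payload):
    """Extract the overall indicator from a Statuspage-style response.

    The standard Statuspage schema is assumed here; typical indicator
    values are "none", "minor", "major", and "critical".
    """
    return payload["status"]["indicator"]

# Live call (requires network access):
# with urllib.request.urlopen(STATUS_URL) as r:
#     print(overall_status(json.load(r)))
```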
marcin-sg
by New Contributor III
  • 2145 Views
  • 1 reply
  • 3 kudos

Create (account wide) groups without account admin permissions

The use case is quite simple: each environment's Databricks workspace (prod, test, dev) will be created by a separate service principal (which, for isolation purposes, should not have account-wide admin permissions) with Terraform, but will belong to the...

Latest Reply
marcin-sg
New Contributor III

Another thing would be to assign a workspace to a metastore without account admin permission, for a similar reason.

Anonymous
by Not applicable
  • 8422 Views
  • 6 replies
  • 2 kudos

Resolved! Delta Sharing - Unity Catalog difference

Delta Sharing and Unity Catalog both have elements of data sharing. Can you please explain when one would use Delta Sharing vs. Unity Catalog?

Latest Reply
DBXC
Contributor

Based on the Databricks reply from the post below: "Unity Catalog does not currently support separating data by workspace or Azure subscription. As you noted, data from all catalogs within a region can be accessed by any workspace within that region,...

5 More Replies