Data Engineering

Forum Posts

by isaac_gritz • Valued Contributor II

08-23-2022 12:14:12 AM

703 Views
1 reply
3 kudos

Connecting Applications and BI Tools to Databricks SQL

Access Data in Databricks Using an Application or Your Favorite BI Tool

You can leverage Partner Connect for easy, low-configuration connections to some of the most popular BI tools through our optimized connectors. Alternatively, you can follow these...

Latest Reply

Kaniz
Community Manager

08-25-2022 5:54:01 AM

3 kudos

Thank you, @Isaac Gritz, for sharing this fantastic post!

by Chandana • New Contributor II

06-28-2022 3:57:48 PM

478 Views
1 reply
3 kudos

What’s the USP? DB SQL serverless? When is it coming to Azure?

Latest Reply

User16752242622
Valued Contributor

08-25-2022 3:26:03 AM

3 kudos

Hi @Chandana Basani, DB SQL Serverless on Azure is planned for GA in release quarter FY23 Q1, with a release month of 2023-01. Best, Akash

by sawya • New Contributor II

08-19-2022 12:56:21 AM

1249 Views
3 replies
0 kudos

Migrate workspaces to another AWS account

Hi everyone, I have a Databricks workspace in an AWS account that I have to migrate to a new AWS account. Do you know how I can do it? Or is it better to create a new one and move all the workbooks? And if I choose to create a new one, how can you export ...

Latest Reply

Abishek
Valued Contributor

08-25-2022 2:00:24 AM

0 kudos

@AMADOU THIOUNE Can you check the link below to export the job runs: https://docs.databricks.com/jobs.html#export-job-runs. Try to reuse the same job_id with the /update and /reset endpoints; it should allow you much better access to previous run re...

by rdobbss • New Contributor II

06-17-2022 10:38:05 AM

794 Views
2 replies
0 kudos

RPC Disassociate error due to container threshold exceeding and garbage collector error when reading a 23 GB multiline JSON file.

I am reading a 23 GB multiline JSON file, flattening it using a UDF, and writing the DataFrame as Parquet using PySpark. The cluster I am using has 3 nodes (8 cores, 64 GB memory) and is allowed to scale up to 8 nodes. I am able to process a 7 GB file with no issue, and it takes ar...

Latest Reply

Vidula
Honored Contributor

08-25-2022 1:48:31 AM

0 kudos

Hi @Ravi Dobariya, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Than...

by Azeez • New Contributor II

07-27-2022 11:13:11 PM

2623 Views
8 replies
1 kudos

Resolved! BAD_REQUEST:Failed to get oauth access token.Please try logout and login again

We deployed a test Databricks workspace cluster on GCP. A single cluster was spun up. Later we deleted the workspace. Now, when we try to create a new one, it gives this error: "BAD_REQUEST:Failed to get oauth access token.Please try logout ...

Latest Reply

Prabakar
Esteemed Contributor III

08-02-2022 4:31:51 AM

1 kudos

@Azeez Sayyad you can try this workaround. Remove the Databricks app from your Google account: in Google account settings, go to "Manage third-party access", and remove Databricks from both Third-party app with account access and Sign-in with Googl...

by Philip_Budbee • New Contributor II

08-24-2022 11:50:39 PM

738 Views
0 replies
1 kudos

GitHub workflow integration error

Hi, We have a working GitHub integration in place for our production workspace, which runs 14 different jobs scheduled at different intervals throughout the entire day. The issue that we have encountered over the past 3-4 weeks i...

by kirankv • New Contributor

08-24-2022 12:46:14 PM

430 Views
0 replies
0 kudos

How to get the notebook ID programmatically using R

Hi, I would like to log the notebook ID programmatically in R. Is there a command in R that I can use to grab the notebook ID? I tried with Python using the command below and grabbed it without any issues, and I am looking for similar f...

by Sunny • New Contributor III

06-17-2022 5:04:18 AM

3841 Views
6 replies
1 kudos

Using Thread.sleep in Scala

We need to hit a REST web service every 5 minutes until a success message is received. The Scala object is inside a JAR file and gets invoked by a Databricks task within a workflow. Thread.sleep(5000) is working fine, but I am not sure if it is a safe practice or is t...
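
A minimal sketch of the polling pattern described above, written in Python rather than the Scala from the post. One thing worth checking: `Thread.sleep` takes milliseconds, so `Thread.sleep(5000)` pauses for 5 seconds, and a 5-minute pause would be 300000 ms. The names `poll_until_success` and `check` are hypothetical, and `check` stands in for the REST call, which is not shown. The cap on attempts is an added assumption so that the task cannot wait forever.

```python
import time


def poll_until_success(check, interval_seconds=300, max_attempts=12, sleep=time.sleep):
    """Call check() until it returns True, pausing between attempts.

    Returns the number of attempts used, or raises TimeoutError once
    max_attempts is exhausted.
    """
    for attempt in range(1, max_attempts + 1):
        if check():
            return attempt
        # Skip the pause after the final failed attempt.
        if attempt < max_attempts:
            sleep(interval_seconds)
    raise TimeoutError(f"no success after {max_attempts} attempts")


# Stand-in for the REST call (hypothetical): succeeds on the third try.
responses = iter([False, False, True])
# The sleep function is swapped for a no-op so the demo finishes immediately.
attempts = poll_until_success(lambda: next(responses), sleep=lambda s: None)
print(attempts)
```

Passing the sleep function as a parameter keeps the loop testable without real waiting; in a job, the default `time.sleep` with a 300-second interval would match the 5-minute requirement.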

Latest Reply

Vartika
Moderator

08-24-2022 9:18:18 AM

1 kudos

Hey there @Sundeep P, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. C...

by KaushikMaji • New Contributor II

02-21-2022 12:12:52 AM

4516 Views
4 replies
3 kudos

Resolved! Connecting with Azure AD token in Power BI

Hello, We are trying to connect to a Databricks SQL endpoint from Power BI using an Azure AD service principal, which has been added to the Databricks workspace using SCIM APIs. Now, when we open a connection to Databricks in Power BI Desktop and provide Azure AD acc...

Latest Reply

-werners-
Esteemed Contributor III

02-21-2022 1:55:08 AM

3 kudos

At the moment I do not think that is possible. The help page mentions: "An Azure Active Directory token (recommended), an Azure Databricks personal access token, or your Azure Active Directory account credentials." These methods are all user bound, so no ...

by gmartinez • New Contributor III

08-15-2022 2:41:09 PM

2571 Views
6 replies
1 kudos

How do I downgrade my subscription from Premium to Standard?

Hello, I have tried getting in touch with support and sales, but I still have no answer. I tried Databricks and I wish to continue, but with a Standard subscription. However, it won't let me do it by myself since I need to reach out to the sales tea...

Latest Reply

gmartinez
New Contributor III

08-24-2022 6:27:45 AM

1 kudos

Hello @Mohan Mathews, have you received any news from the support team? I canceled the previous subscription and acquired a new one on a Standard plan. However, when signing in to my Databricks account, my plan still shows up as "Premium". Is there a wa...

by Harsh1 • New Contributor II

08-23-2022 9:36:02 AM

745 Views
2 replies
1 kudos

Query on DBFS migration

We are doing a DBFS migration. We have a folder 'user' in the root DBFS holding 5.8 TB of data in the legacy workspace. We ran AWS CLI sync/cp from legacy to target, and then ran the same from the target bucket to the target DBFS. While implemen...

Latest Reply

Harsh1
New Contributor II

08-24-2022 5:43:27 AM

1 kudos

Thanks for the quick response. Regarding the suggested AWS data sync approach: we have tried data sync in multiple ways, but it creates folders in the S3 bucket itself, not on DBFS, and our task is to copy from the bucket to DBFS. It seems that it only supports ...

by Srishti • New Contributor

08-24-2022 5:24:37 AM

596 Views
0 replies
0 kudos

Want to extract the start time and end time of tasks under a particular job.

I have one job that was rerun multiple times and took 101 hours. Ideally, the job executes in 8 hours. Using the Jobs 2.1 API, I am able to extract the start and end time for the job ID and run ID. This only helps me get the duration of 101 hours, bu...
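
For the question above, a minimal sketch in Python of how per-task timings might be pulled out of a Jobs API 2.1 `runs/get` response. It assumes the response carries a `tasks` array whose entries have `task_key`, `start_time`, and `end_time` as epoch milliseconds. The sample payload and the helper name `task_durations` are made up for illustration, and no request is sent.

```python
from datetime import datetime, timezone


def task_durations(run):
    """Return one row per task: key, UTC start, UTC end, and duration in hours."""
    rows = []
    for task in run.get("tasks", []):
        start_ms = task["start_time"]
        end_ms = task.get("end_time", 0)
        # Assumption: an end_time of 0 (or a missing one) means the task has not finished.
        finished = end_ms > 0
        rows.append({
            "task_key": task["task_key"],
            "start": datetime.fromtimestamp(start_ms / 1000, tz=timezone.utc),
            "end": datetime.fromtimestamp(end_ms / 1000, tz=timezone.utc) if finished else None,
            "hours": round((end_ms - start_ms) / 3_600_000, 2) if finished else None,
        })
    return rows


# Hypothetical payload shaped like a runs/get response (not real data).
sample_run = {
    "job_id": 1,
    "run_id": 100,
    "tasks": [
        {"task_key": "ingest", "start_time": 1661300000000, "end_time": 1661307200000},
        {"task_key": "transform", "start_time": 1661307200000, "end_time": 1661336000000},
    ],
}

for row in task_durations(sample_run):
    print(row["task_key"], row["start"], row["end"], row["hours"])
```

Collecting these rows for every run ID of the job and summing `hours` per `task_key` would show which task accounts for the extra time.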

by as999 • New Contributor III

04-02-2022 8:59:09 AM

6804 Views
7 replies
4 kudos

Databricks Hive metastore location?

In Databricks, where is the Hive metastore located: in the control plane or the data plane? For production systems, what precautions should be taken to secure the Hive metastore?

Latest Reply

Prabakar
Esteemed Contributor III

05-18-2022 2:58:00 AM

4 kudos

@as999 The default metastore is managed by Databricks. If you are concerned about security and would like to have your own metastore, you can go for the external metastore setup. The doc below has the detailed steps for setting up the external...

by Niha1 • New Contributor III

08-23-2022 10:06:33 PM

596 Views
0 replies
1 kudos

Not able to install the Airbnb dataset when running the "Scalable ML" notebook. I am getting the error below: AnalysisException: Path does not exist:

file_path = f"{datasets_dir}/airbnb/sf-listings/sf-listings-2019-03-06-clean.parquet/"
airbnb_df = spark.read.format("parquet").load(file_path)

display(airbnb_df)

AnalysisException: Path does not exist: dbfs:/user/nniha9188@gmail.com/dbacademy/machi...

by Somi • New Contributor III

08-23-2022 10:47:47 AM

443 Views
0 replies
1 kudos

How to set SparkTrials? I am receiving this TypeError: cannot pickle '_thread.lock' object

Hey Sara, this is Somayeh from VINN Automotive. As I had already shared with you, I am trying to distribute hyperparameter tuning using Hyperopt on a tensorflow.keras model. I am using SparkTrials in my fmin: spark_trials = SparkTrials(parallelism=4)...be...
