Data Engineering

Forum Posts

Sorted by:

by nhgil • New Contributor II

06-16-2022 11:16:18 PM

3399 Views
3 replies
0 kudos

Accessing Workspace?

Trying to follow along with the instructions for the 'AWS and Databricks' training yet the documentation gives no indication of how to access the workspace that is ostensibly generated in AWS. I subscribed through the marketplace and cloud formation...

Data Engineering

3399 Views
3 replies
0 kudos

06-16-2022 11:16:18 PM

View Replies

Latest Reply

nhgil
New Contributor II

08-25-2022 7:29:57 AM

0 kudos

Unfortunately by the time I received a response my trial had long lapsed so I had already unsubscribed in aws.

0 kudos

08-25-2022 7:29:57 AM

2 More Replies

by Chandana • New Contributor II

06-28-2022 3:57:48 PM

1555 Views
1 replies
3 kudos

What’s the USP? DB SQL serverless? When is it coming to Azure?

Data Engineering

1555 Views
1 replies
3 kudos

06-28-2022 3:57:48 PM

View Replies

Latest Reply

User16752242622
Databricks Employee

08-25-2022 3:26:03 AM

3 kudos

Hi @Chandana Basani DB SQL Serverless in Azure is planned for GA on the Release quarter FY23 Q1 in the release month 2023-01Best,Akash

3 kudos

08-25-2022 3:26:03 AM

by sawya • New Contributor II

08-19-2022 12:56:21 AM

4900 Views
3 replies
0 kudos

Migrate workspaces to another AWS account

Hi everyone,I have a Databricks workspace in an AWS account that I have to migrate to a new AWS accountDo you know how I can do it ? Or it's better to recreate a new one and move all the workbooks and if I choose to create one new how can you export ...

Data Engineering

4900 Views
3 replies
0 kudos

08-19-2022 12:56:21 AM

View Replies

Latest Reply

Abishek
Databricks Employee

08-25-2022 2:00:24 AM

0 kudos

@AMADOU THIOUNE Can you check the below link to export the run jobs? https://docs.databricks.com/jobs.html#export-job-runs. Try to reuse the same job_id with the /update and /reset endpoints, it should allow you much better access to previous run re...

0 kudos

08-25-2022 2:00:24 AM

2 More Replies

by rdobbss • New Contributor II

06-17-2022 10:38:05 AM

2503 Views
2 replies
0 kudos

RPC Disassociate error due to container threshold exceeding and garbage collector error when reading 23 gb multiline JSON file.

I am reading 23 gb multi line json file and flattening it using udf and writing datframe as parquet using psypark.Cluster I am using is 3 node (8 core) 64gb memory with limit to go upto 8 nodes.I am able to process 7gb file with no issue and takes ar...

Data Engineering

2503 Views
2 replies
0 kudos

06-17-2022 10:38:05 AM

View Replies

Latest Reply

Vidula
Databricks Partner

08-25-2022 1:48:31 AM

0 kudos

Hi @Ravi Dobariya Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you.Than...

0 kudos

08-25-2022 1:48:31 AM

1 More Replies

by Philip_Budbee • New Contributor II

08-24-2022 11:50:39 PM

2088 Views
0 replies
1 kudos

Github workflow integration error

Hi,We have a working Github integration in place for our production workspace which is running 14 different jobs that are scheduled during different intervals, but throughout the entire day.The issue over the past 3-4 weeks that we have encountered i...

Data Engineering

2088 Views
0 replies
1 kudos

08-24-2022 11:50:39 PM

by kirankv • New Contributor

08-24-2022 12:46:14 PM

1212 Views
0 replies
0 kudos

How to get notebookid programmatically using R

Hi, I would like to log the notebook id programmatically in R, Is there any command that exists in R so that I can leverage to grab the notebook id, I tried with python using the below command and grab it without any issues, and looking for similar f...

Data Engineering

1212 Views
0 replies
0 kudos

08-24-2022 12:46:14 PM

by Sunny • New Contributor III

06-17-2022 5:04:18 AM

12382 Views
6 replies
1 kudos

Using Thread.sleep in Scala

We need to hit REST web service every 5 mins until success message is received. The Scala object is inside a Jar file and gets invoked by Databricks task within a workflow.Thread.sleep(5000) is working fine but not sure if it is safe practice or is t...

Data Engineering

12382 Views
6 replies
1 kudos

06-17-2022 5:04:18 AM

View Replies

Latest Reply

Vartika
Databricks Employee

08-24-2022 9:18:18 AM

1 kudos

Hey there @Sundeep P Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you.C...

1 kudos

08-24-2022 9:18:18 AM

5 More Replies

by KaushikMaji • New Contributor II

02-21-2022 12:12:52 AM

10465 Views
4 replies
3 kudos

Resolved! Connecting with Azure AD token in PowerBI

Hello,We are trying to connect Databricks SQL endpoint from PowerBi using Azure AD service principal, which has been added to Databricks workspace using SCIM APIs. Now, when we open connection to Databricks in powerbi desktop and provide Azure AD acc...

Data Engineering

10465 Views
4 replies
3 kudos

02-21-2022 12:12:52 AM

View Replies

Latest Reply

-werners-
Esteemed Contributor III

02-21-2022 1:55:08 AM

3 kudos

At the moment I do not think that is possible.The help page mentions:An Azure Active Directory token (recommended), an Azure Databricks personal access token, or your Azure Active Directory account credentials.These methods are all user bound, so no ...

3 kudos

02-21-2022 1:55:08 AM

3 More Replies

by gmartinez • New Contributor III

08-15-2022 2:41:09 PM

12738 Views
6 replies
1 kudos

How do I downgrade my subscription from Premium to Standard?

Hello,I have tried getting in touch with the support and sales but I still have no answer. I tried Databricks and I wish to continue but with a Standard subscription. However, It won't let me do it by myself since I need to reach out to the sales tea...

Data Engineering

12738 Views
6 replies
1 kudos

08-15-2022 2:41:09 PM

View Replies

Latest Reply

gmartinez
New Contributor III

08-24-2022 6:27:45 AM

1 kudos

Hello @Mohan Mathews, Have you received any news from the support team?I canceled the previous subscription and acquired a new one on a standard plan. However, when signing into my Databricks account my plan still shows up as "Premium".Is there a wa...

1 kudos

08-24-2022 6:27:45 AM

5 More Replies

by Harsh1 • New Contributor II

08-23-2022 9:36:02 AM

2376 Views
2 replies
1 kudos

Query on DBFS migration

We are doing DBFS migration. In that we have a folder 'user' in Root DBFS having data 5.8 TB in legacy workspace. We performed AWS CLi Sync/cp between Legacy to Target and again performed the same between Target bucket to Target dbfs While implemen...

Data Engineering

2376 Views
2 replies
1 kudos

08-23-2022 9:36:02 AM

View Replies

Latest Reply

Harsh1
New Contributor II

08-24-2022 5:43:27 AM

1 kudos

Thanks for the quick response.Regarding the suggested AWS data sync approach, we have tried data sync in multiple ways, it is creating folders in s3 bucket itself not on DBFS. As our task is to copy from bucket to DBFS.It seems that it only supports ...

1 kudos

08-24-2022 5:43:27 AM

1 More Replies

by Srishti • New Contributor

08-24-2022 5:24:37 AM

1747 Views
0 replies
0 kudos

Want to extract the start time and end time of tasks under particular job.

I have one job that reran multiple times which took 101 hours. Ideally the execution of that job is 8 hours. Using jobs 2.1 API I am able to extract the start and end time of job ID and run ID. This only helps me to get the duration of 101 hours , bu...

Data Engineering

1747 Views
0 replies
0 kudos

08-24-2022 5:24:37 AM

by Niha1 • New Contributor III

08-23-2022 10:06:33 PM

1673 Views
0 replies
1 kudos

Not able to install the AIRBNB dataset when trying to run in the notebook-"Scalable ML". I am getting the error as below-:AnalysisException: Path does not exist:

file_path = f"{datasets_dir}/airbnb/sf-listings/sf-listings-2019-03-06-clean.parquet/"2airbnb_df = spark.read.format("parquet").load(file_path)34display(airbnb_df)AnalysisException: Path does not exist: dbfs:/user/nniha9188@gmail.com/dbacademy/machi...

Data Engineering

1673 Views
0 replies
1 kudos

08-23-2022 10:06:33 PM

by trang_le • Databricks Employee

08-23-2022 4:57:42 PM

1141 Views
0 replies
1 kudos

Are you a university student or faculty member? Are you interested in getting trained by Databricks experts and getting Databricks accredited? The Dat...

Are you a university student or faculty member? Are you interested in getting trained by Databricks experts and getting Databricks accredited?The Databricks Lakehouse Platform Fundamentals Learning Plan will give you an overview of the platform and p...

Data Engineering

1141 Views
0 replies
1 kudos

08-23-2022 4:57:42 PM

by Somi • New Contributor III

08-23-2022 10:47:47 AM

1448 Views
0 replies
1 kudos

How to set sparkTrials? I am receiving this TypeError: cannot pickle '_thread.lock' object

Hey Sara, this Somayeh from VINN Automotive.As I had already shared with you, I am trying to distribute hyperparameter tuning using hyperopt on a tensorflow.keras model. I am using sparkTrials in my fmin:spark_trials = SparkTrials(parallelism=4)...be...

Data Engineering

1448 Views
0 replies
1 kudos

08-23-2022 10:47:47 AM

by saira1122 • New Contributor

08-23-2022 10:11:57 AM

790 Views
0 replies
0 kudos

bit.ly

If you are having trouble with any study problem, you should first find the source of the problem.https://bit.ly/3AoiotQ

Data Engineering

790 Views
0 replies
0 kudos

08-23-2022 10:11:57 AM

Databricks Community

Forum Posts

Accessing Workspace?

What’s the USP? DB SQL serverless? When is it coming to Azure?

Migrate workspaces to another AWS account

RPC Disassociate error due to container threshold exceeding and garbage collector error when reading 23 gb multiline JSON file.

Github workflow integration error

How to get notebookid programmatically using R

Using Thread.sleep in Scala

Resolved! Connecting with Azure AD token in PowerBI

How do I downgrade my subscription from Premium to Standard?

Query on DBFS migration

Want to extract the start time and end time of tasks under particular job.

Not able to install the AIRBNB dataset when trying to run in the notebook-"Scalable ML". I am getting the error as below-:AnalysisException: Path does not exist:

Are you a university student or faculty member? Are you interested in getting trained by Databricks experts and getting Databricks accredited? The Dat...

How to set sparkTrials? I am receiving this TypeError: cannot pickle '_thread.lock' object

bit.ly

File Arrival Trigger - Multiple tables

Issue while handling Deletes and Inserts in Struct...

DLT with CDC and schema changes in streaming pipel...

how to update not tracked column only in new row v...

Databricks Cost Estimation Template