Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

isaac_gritz
by Valued Contributor II
  • 1696 Views
  • 1 reply
  • 5 kudos

Resolved! CI/CD Best Practices

Best Practices for CI/CD on Databricks: For CI/CD and software engineering best practices with Databricks notebooks, we recommend checking out this best practices guide (AWS, Azure, GCP). For CI/CD and local development using an IDE, we recommend dbx, a ...

Latest Reply
Kaniz_Fatma
Community Manager
  • 5 kudos

Thank you, @Isaac Gritz, for sharing such a fantastic post!

isaac_gritz
by Valued Contributor II
  • 1010 Views
  • 1 reply
  • 3 kudos

Connecting Applications and BI Tools to Databricks SQL

Access Data in Databricks Using an Application or Your Favorite BI Tool: You can leverage Partner Connect for easy, low-configuration connections to some of the most popular BI tools through our optimized connectors. Alternatively, you can follow these...

Latest Reply
Kaniz_Fatma
Community Manager
  • 3 kudos

Thank you, @Isaac Gritz, for sharing this fantastic post!

Chandana
by New Contributor II
  • 672 Views
  • 1 reply
  • 3 kudos

What’s the USP? DB SQL serverless? When is it coming to Azure?

Latest Reply
User16752242622
Valued Contributor
  • 3 kudos

Hi @Chandana Basani, DB SQL Serverless in Azure is planned for GA in release quarter FY23 Q1 (release month 2023-01). Best, Akash

sawya
by New Contributor II
  • 1728 Views
  • 3 replies
  • 0 kudos

Migrate workspaces to another AWS account

Hi everyone, I have a Databricks workspace in an AWS account that I have to migrate to a new AWS account. Do you know how I can do it? Or is it better to recreate a new one and move all the notebooks? And if I choose to create a new one, how can you export ...

Latest Reply
Abishek
Valued Contributor
  • 0 kudos

@AMADOU THIOUNE, can you check the link below to export the job runs? https://docs.databricks.com/jobs.html#export-job-runs. Try to reuse the same job_id with the /update and /reset endpoints; it should allow you much better access to previous run re...

2 More Replies
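The export approach suggested in the reply above can be sketched against the Jobs API 2.1 `runs/list` endpoint. This is only an illustration, not an official client: `host`, `token`, and `job_id` are placeholders for your workspace values, and the helper names are invented for the sketch.

```python
import json
import urllib.parse
import urllib.request

def runs_list_url(host, job_id, limit=25):
    """Build the full URL for the Jobs 2.1 runs/list endpoint
    (pure function, so it is easy to inspect and test)."""
    query = urllib.parse.urlencode({"job_id": job_id, "limit": limit})
    return f"{host}/api/2.1/jobs/runs/list?{query}"

def export_job_runs(host, token, job_id):
    """Fetch recent run metadata for one job. `host`, `token`, and
    `job_id` are placeholders; the call requires network access to
    your workspace."""
    req = urllib.request.Request(
        runs_list_url(host, job_id),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp).get("runs", [])
```

Paging through the results and persisting them (e.g. as JSON files) gives a durable export of run history before the old workspace is torn down.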
rdobbss
by New Contributor II
  • 1171 Views
  • 2 replies
  • 0 kudos

RPC disassociation error due to the container threshold being exceeded, and a garbage collector error, when reading a 23 GB multiline JSON file

I am reading a 23 GB multiline JSON file, flattening it using a UDF, and writing the dataframe as Parquet using PySpark. The cluster I am using has 3 nodes (8 cores, 64 GB memory) with a limit to scale up to 8 nodes. I am able to process a 7 GB file with no issue, and it takes ar...

Latest Reply
Vidula
Honored Contributor
  • 0 kudos

Hi @Ravi Dobariya, hope all is well! Just wanted to check in on whether you were able to resolve your issue, and whether you would be happy to share the solution or mark an answer as best. Else please let us know if you need more help. We'd love to hear from you. Than...

1 More Replies
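The flattening step described in the question can be illustrated in plain Python. The post itself uses a PySpark UDF, so this recursive helper is only a sketch of the idea, not the poster's code:

```python
def flatten(record, parent_key="", sep="."):
    """Recursively flatten one nested JSON object into a
    single-level dict with dotted keys, e.g. {"a": {"b": 1}}
    becomes {"a.b": 1}."""
    flat = {}
    for key, value in record.items():
        new_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Descend into nested objects, extending the key path.
            flat.update(flatten(value, new_key, sep))
        else:
            flat[new_key] = value
    return flat
```

Note that Spark treats each multiline JSON file as a single unsplittable unit when parsing, which is often why a single very large file exhausts executor memory even though many smaller files of the same total size process fine.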
Azeez
by New Contributor II
  • 3791 Views
  • 8 replies
  • 1 kudos

Resolved! BAD_REQUEST: Failed to get oauth access token. Please try logout and login again

We deployed a test Databricks workspace on GCP, and a single cluster got spun up. Later we deleted the workspace. Now, when we are trying to create a new one, it is giving this error: "BAD_REQUEST: Failed to get oauth access token. Please try logout ...

Latest Reply
Prabakar
Esteemed Contributor III
  • 1 kudos

@Azeez Sayyad, you can try this workaround: remove the Databricks app from your Google account. In Google account settings, go to "Manage third-party access", and remove Databricks from both "Third-party apps with account access" and "Sign in with Googl...

7 More Replies
Philip_Budbee
by New Contributor II
  • 992 Views
  • 0 replies
  • 1 kudos

Github workflow integration error

Hi, we have a working GitHub integration in place for our production workspace, which runs 14 different jobs that are scheduled at different intervals throughout the entire day. The issue we have encountered over the past 3-4 weeks i...

kirankv
by New Contributor
  • 593 Views
  • 0 replies
  • 0 kudos

How to get notebookid programmatically using R

Hi, I would like to log the notebook ID programmatically in R. Is there any command in R that I can leverage to grab the notebook ID? I tried with Python using the command below and grabbed it without any issues, and am looking for similar f...

Sunny
by New Contributor III
  • 5142 Views
  • 6 replies
  • 1 kudos

Using Thread.sleep in Scala

We need to hit a REST web service every 5 minutes until a success message is received. The Scala object is inside a JAR file and gets invoked by a Databricks task within a workflow. Thread.sleep(5000) is working fine, but I am not sure if it is safe practice or is t...

Latest Reply
Vartika
Moderator
  • 1 kudos

Hey there @Sundeep P, hope all is well! Just wanted to check in on whether you were able to resolve your issue, and whether you would be happy to share the solution or mark an answer as best. Else please let us know if you need more help. We'd love to hear from you. C...

5 More Replies
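The blocking-retry pattern asked about above can be sketched as follows. The question concerns Scala's Thread.sleep inside a JAR; this Python sketch shows the same idea, with the `call` argument standing in for the REST request (a hypothetical placeholder, not the poster's code):

```python
import time

def poll_until_success(call, interval_s=300.0, max_attempts=12):
    """Invoke `call` until it returns truthy, sleeping between
    attempts. A blocking sleep (like Thread.sleep) only parks the
    calling thread, so it is functionally safe on the driver, but
    it keeps the cluster allocated for the whole wait."""
    for attempt in range(1, max_attempts + 1):
        if call():
            return attempt          # number of attempts it took
        if attempt < max_attempts:
            time.sleep(interval_s)  # blocks here, like Thread.sleep
    raise TimeoutError("no success message after max_attempts")
```

For long waits, scheduling the check as a recurring job (or using a workflow-level retry policy) can be cheaper than holding a cluster idle between polls.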
KaushikMaji
by New Contributor II
  • 5475 Views
  • 4 replies
  • 3 kudos

Resolved! Connecting with Azure AD token in PowerBI

Hello, we are trying to connect to a Databricks SQL endpoint from Power BI using an Azure AD service principal, which has been added to the Databricks workspace using the SCIM APIs. Now, when we open a connection to Databricks in Power BI Desktop and provide the Azure AD acc...

Latest Reply
-werners-
Esteemed Contributor III
  • 3 kudos

At the moment I do not think that is possible. The help page mentions: an Azure Active Directory token (recommended), an Azure Databricks personal access token, or your Azure Active Directory account credentials. These methods are all user-bound, so no ...

3 More Replies
gmartinez
by New Contributor III
  • 3730 Views
  • 6 replies
  • 1 kudos

How do I downgrade my subscription from Premium to Standard?

Hello, I have tried getting in touch with support and sales but I still have no answer. I tried Databricks and I wish to continue, but with a Standard subscription. However, it won't let me do it by myself, since I need to reach out to the sales tea...

Latest Reply
gmartinez
New Contributor III
  • 1 kudos

Hello @Mohan Mathews, have you received any news from the support team? I canceled the previous subscription and acquired a new one on a Standard plan. However, when signing into my Databricks account my plan still shows up as "Premium". Is there a wa...

5 More Replies
Harsh1
by New Contributor II
  • 1029 Views
  • 2 replies
  • 1 kudos

Query on DBFS migration

We are doing a DBFS migration. We have a folder 'user' in the root DBFS with 5.8 TB of data in the legacy workspace. We performed an AWS CLI sync/cp from the legacy to the target bucket, and again from the target bucket to the target DBFS. While implemen...

Latest Reply
Harsh1
New Contributor II
  • 1 kudos

Thanks for the quick response. Regarding the suggested AWS DataSync approach, we have tried DataSync in multiple ways, but it creates folders in the S3 bucket itself, not on DBFS, whereas our task is to copy from the bucket to DBFS. It seems that it only supports ...

1 More Replies
Niha1
by New Contributor III
  • 835 Views
  • 0 replies
  • 1 kudos

Not able to install the Airbnb dataset when trying to run the "Scalable ML" notebook; I am getting the error below: AnalysisException: Path does not exist

file_path = f"{datasets_dir}/airbnb/sf-listings/sf-listings-2019-03-06-clean.parquet/"
airbnb_df = spark.read.format("parquet").load(file_path)
display(airbnb_df)

AnalysisException: Path does not exist: dbfs:/user/nniha9188@gmail.com/dbacademy/machi...

Somi
by New Contributor III
  • 607 Views
  • 0 replies
  • 1 kudos

How to set sparkTrials? I am receiving this TypeError: cannot pickle '_thread.lock' object

Hey Sara, this is Somayeh from VINN Automotive. As I had already shared with you, I am trying to distribute hyperparameter tuning using Hyperopt on a tensorflow.keras model. I am using SparkTrials in my fmin: spark_trials = SparkTrials(parallelism=4) ...be...

