Community Platform Discussions
Connect with fellow community members to discuss general topics related to the Databricks platform, industry trends, and best practices. Share experiences, ask questions, and foster collaboration within the community.

Forum Posts

priyakant1
by New Contributor II
  • 664 Views
  • 1 reply
  • 0 kudos

Suspension of Data Engineer Professional exam

Hi Databricks Team, I had scheduled my exam on 6th Sep 2023. During the exam the same pop-up came up, stating that I was looking in some other direction. I told them that my laptop mouse was not working properly, so I was looking at it. But they still suspended ...

Latest Reply
sirishavemula20
New Contributor III
  • 0 kudos

Hi @priyakant1, have you got any response from the team? Did they reschedule your exam?

sirishavemula20
by New Contributor III
  • 1869 Views
  • 3 replies
  • 1 kudos

Resolved! My exam has suspended , Need help Urgently (21/08/2023)

Hello Team, I had a pathetic experience while attempting my 1st Databricks certification. Abruptly, the proctor asked me to show my desk; after I showed it, he/she asked multiple times, wasted my time, and then suspended my exam. I want to file a complain...

Latest Reply
sirishavemula20
New Contributor III
  • 1 kudos

Sub: My exam Databricks Data Engineer Associate got suspended, need immediate help please (10/09/2023). I had a pathetic experience while attempting my Databricks Data Engineer certification. Abruptly, the proctor asked me to show my desk; after showin...

2 More Replies
Policepatil
by New Contributor III
  • 2270 Views
  • 2 replies
  • 1 kudos

Resolved! Records are missing while filtering the dataframe in multithreading

Hi, I need to process nearly 30 files from different locations and insert records to RDS. I am using multi-threading to process these files in parallel, like below. [Test data shown as an attached image.] I have a configuration like below based on column 4: If colum...

Latest Reply
sean_owen
Honored Contributor II
  • 1 kudos

It looks like you are comparing to strings like "1", not values like 1, in your filter condition. It's hard to say; some details are missing, like the rest of the code, the DataFrame schema, and what output you are observing.
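A minimal sketch of the type mismatch described above, using plain Python dicts as a stand-in for DataFrame rows (the original code and schema aren't shown, so column names here are hypothetical):

```python
# Rows whose type column actually holds integers.
rows = [
    {"id": 1, "record_type": 1},
    {"id": 2, "record_type": 2},
]

# Comparing against the string "1" silently matches nothing,
# because "1" != 1 in a typed comparison.
wrong = [r for r in rows if r["record_type"] == "1"]

# Comparing against the integer 1 matches as expected.
right = [r for r in rows if r["record_type"] == 1]

print(len(wrong), len(right))  # 0 1
```

The same silent mismatch happens in a Spark filter condition: the rows aren't lost, they just never match.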

1 More Reply
VMeghraj
by New Contributor II
  • 1160 Views
  • 2 replies
  • 0 kudos

Increase cores for Spark History Server

By default, SHS uses spark.history.fs.numReplayThreads = 25% of available cores (the number of threads the history server will use to process event logs). How can we increase the number of threads for the Spark History Server?

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @VMeghraj, to increase the number of threads used by the Spark History Server for replay, you can modify the spark.history.fs.numReplayThreads configuration parameter. You can set the desired number by modifying the value of spark.history.fs.numReplayThreads...
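One way to set this, sketched below; the value 16 is illustrative, not a recommendation:

```shell
# In $SPARK_HOME/conf/spark-defaults.conf (illustrative value):
#   spark.history.fs.numReplayThreads  16
#
# Or pass it via the history server's JVM options before (re)starting it:
export SPARK_HISTORY_OPTS="-Dspark.history.fs.numReplayThreads=16"
$SPARK_HOME/sbin/start-history-server.sh
```

Note this controls replay threads only; it does not add CPU cores to the host running the history server.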

1 More Reply
meystingray
by New Contributor II
  • 1151 Views
  • 1 reply
  • 0 kudos

Databricks Rstudio Init Script Deprecated

OK, so I'm trying to use open-source RStudio on Azure Databricks. I'm following the instructions here: https://learn.microsoft.com/en-us/azure/databricks/sparkr/rstudio#install-rstudio-server-open-source-edition. I've installed the necessary init script ...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @meystingray, the error message you're encountering indicates that the init script path is not absolute. According to the Databricks documentation, init scripts should be stored as workspace files. Here's how you can do it: 1. Store your ini...
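For illustration, a cluster definition referencing a workspace-file init script looks roughly like the fragment below; the script path is hypothetical, and the key point is that it is an absolute workspace path:

```json
{
  "init_scripts": [
    {
      "workspace": {
        "destination": "/Users/someone@example.com/init/rstudio-install.sh"
      }
    }
  ]
}
```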

Policepatil
by New Contributor III
  • 5633 Views
  • 1 reply
  • 0 kudos

Is it good to process files in multithreading?

Hi, I need to process nearly 30 files from different locations and insert records to RDS. I am using multi-threading to process these files in parallel, like below. def process_files(file_path): <process files here> 1. Find bad records based on fie...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @Policepatil,
  • Processing files in parallel can increase the overall speed of the operation.
  • Multi-threading can optimize CPU usage, but it does not necessarily make I/O operations faster.
  • I/O operations like reading and writing files are...
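A minimal sketch of the thread-pool pattern discussed above; process_file is a hypothetical stand-in for the poster's per-file logic (read, validate, insert to RDS), and the paths are placeholders:

```python
from concurrent.futures import ThreadPoolExecutor

def process_file(file_path):
    # Placeholder: read the file, find bad records, insert good ones.
    return f"processed {file_path}"

file_paths = [f"/data/file_{i}.csv" for i in range(30)]

# Threads help when each task spends most of its time waiting on I/O
# (network, disk); for CPU-bound work, prefer a process pool.
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(process_file, file_paths))

print(len(results))  # 30
```

`pool.map` preserves input order, which makes it easy to pair each result back with its source file.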

bachan
by New Contributor II
  • 1309 Views
  • 2 replies
  • 0 kudos

Data Insertion

Scenario: data moves from blob storage to a SQL DB once a week. I have 15 days of data (from the current date to the next 15 days) in blob storage, stored date-wise in Parquet format, and after seven days the next 15 days of data will be inserted. That means till the 7th day t...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @bachan, Based on your scenario, you might consider using Azure Data Factory (ADF) for your data pipeline. Azure Data Factory is a cloud-based data integration service that orchestrates and automates the movement and transformation of data.  Here ...

1 More Reply
Gilg
by Contributor II
  • 2819 Views
  • 2 replies
  • 0 kudos

Server error: OK - Notebook

Hi, I am currently seeing some weird notebook behavior. Every time I write, I get the following error. My gut feeling is that it is caused by the auto-save feature. Cheers, Gil

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @Gilg , Based on the given information, it seems that the error you are experiencing is related to notebook autosaving. The error message "Failed to save revision: Notebook size exceeds limit" indicates that the notebook size is too large to be a...

1 More Reply
Simon_T
by New Contributor III
  • 4903 Views
  • 1 reply
  • 0 kudos

Databricks Terraform Cluster Issue.

Error: default auth: cannot configure default credentials. Config: token=***. Env: DATABRICKS_TOKEN

on cluster.tf line 27, in data "databricks_spark_version" "latest_lts":
27: data "databricks_spark_version" "latest_lts" {

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @Simon_T , Based on the given error message and the provided information, it seems that the default authentication credentials are not properly configured for Databricks. To resolve this issue, you need to set up the authentication using a person...
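One common fix, sketched below: the Databricks Terraform provider can pick up credentials from environment variables. The host URL and token here are placeholders:

```shell
# Placeholders; use your workspace URL and a valid personal access token.
export DATABRICKS_HOST="https://adb-1234567890123456.7.azuredatabricks.net"
export DATABRICKS_TOKEN="dapiXXXXXXXXXXXXXXXX"

# Then re-run:
#   terraform plan
```

Alternatively, the same values can be set on the provider block itself, but keeping the token out of .tf files is generally preferable.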

Sivaji
by New Contributor
  • 670 Views
  • 1 reply
  • 0 kudos

Databricks data engineer associate Exam got suspended.

Hello Team, I had a pathetic experience while attempting my 1st Databricks certification. Abruptly, the proctor asked me to show my desk; after I showed it, he/she asked multiple times, wasted my time, and then suspended my exam. I want to file a compla...

Latest Reply
Cert-Team
Esteemed Contributor
  • 0 kudos

Hi @Sivaji, sorry to hear you had a bad experience, and that you got a slow response here in the community. I see that you have taken and passed the exam. Congratulations! For the future, our support team handles cases from here first, so it tends to be...

Policepatil
by New Contributor III
  • 1298 Views
  • 1 reply
  • 0 kudos

Records are missing while creating new dataframe from one big dataframe using filter

Hi, I have data in a file like below. I have different types of rows in my input file; column number 8 defines the type of the record. In the above file we have 4 types of records, 00 to 03. My requirement is: there will be multiple files in the source path, ea...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @Policepatil, the issue you're experiencing with missing records could be due to a variety of reasons. It could be related to how Spark handles data partitioning, or it might be due to data quality issues in your input files. One possible ex...
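One failure mode worth checking first, sketched below with plain Python rows (column names are hypothetical): if the per-type filters don't cover every value that actually occurs in the data, unmatched records vanish silently rather than being lost by Spark.

```python
rows = [
    {"line": 1, "record_type": "00"},
    {"line": 2, "record_type": "01"},
    {"line": 3, "record_type": "99"},  # unexpected type, matched by no filter
]

known_types = {"00", "01", "02", "03"}
by_type = {t: [r for r in rows if r["record_type"] == t] for t in known_types}

# Reconcile counts: anything the filters kept versus what came in.
# A shortfall means records were filtered out, not dropped by Spark.
kept = sum(len(v) for v in by_type.values())
dropped = [r for r in rows if r["record_type"] not in known_types]
print(kept, len(dropped))  # 2 1
```

Adding a catch-all filter for unexpected type values makes this kind of gap visible immediately.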

JRL
by New Contributor II
  • 959 Views
  • 1 reply
  • 0 kudos

Github "Danger Zone"

There is a "Danger Zone" appearing in GitHub indicating that the repositories I share on Databricks can be suspended, and possibly that Databricks can be uninstalled. This may be something standard in GitHub. Has anyone run across it?

Latest Reply
sean_owen
Honored Contributor II
  • 0 kudos

It's not telling you that you should do these things; it's telling you that you may break stuff by doing them. Yes, the "Danger Zone" is a thing on GitHub: it tries to warn you before you do things like clicking to delete a repo.

jermaineharsh
by New Contributor II
  • 425 Views
  • 0 replies
  • 0 kudos

How to switch from free trial to Community Edition of Databricks in my Azure workspace?

Hello, I am trying to switch to the Databricks Community Edition after a 14-day trial. I was able to register, but when I try to start my new cluster, I get the error message "Cluster start feature is currently disabled, and the cluster does not run". In...

Picci
by New Contributor III
  • 2759 Views
  • 3 replies
  • 3 kudos

Resolved! Databricks community edition still available?

Is the Databricks platform still available in its Community Edition (outside Azure, AWS, or GCP)? Can someone share the updated link? Thanks, Elisa

Latest Reply
jamescw
New Contributor II
  • 3 kudos

Look: it is still available, but AFAIK it is always linked to Azure/GCP/AWS.

2 More Replies

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group