Get Started Discussions
Start your journey with Databricks by joining discussions on getting started guides, tutorials, and introductory topics. Connect with beginners and experts alike to kickstart your Databricks experience.

Forum Posts

Policepatil
by New Contributor III
  • 1223 Views
  • 0 replies
  • 0 kudos

Missing records while using limit in multithreading

Hi, I need to process nearly 30 files from different locations and insert the records into RDS. I am using multithreading to process these files in parallel, as shown below. Test data: I have a configuration like the one below, based on column 4: If column 4=0:...

Kratik
by New Contributor III
  • 2865 Views
  • 1 replies
  • 0 kudos

--files in spark submit task

Regarding the --files option in a spark-submit task of Databricks jobs, I would like to understand how it works and what the syntax is for passing multiple files to --files. I tried using --files and --py-files, and my understanding is, it should make available t...

Latest Reply
Debayan
Databricks Employee
  • 0 kudos

Hi, could you please check whether this helps: https://docs.databricks.com/en/files/index.html Also, please tag @Debayan in your next response so that I am notified. Thank you!

Phani1
by Valued Contributor II
  • 1458 Views
  • 1 replies
  • 0 kudos

Hive Migration best practices

Hi, could you please share the approach and best practices for migrating from Hadoop Hive to Databricks? Regards, Phanindra

Latest Reply
Debayan
Databricks Employee
  • 0 kudos

Hi, you can try checking the resources below:
https://www.databricks.com/resources/ebook/migration-guide-hadoop-to-databricks
https://www.databricks.com/solutions/migration/hadoop
https://www.databricks.com/blog/2021/08/06/5-key-steps-to-successfull...

Phani1
by Valued Contributor II
  • 1865 Views
  • 1 replies
  • 0 kudos

Sqoop Migration best practices

Hi, could you please share the approach and best practices for migrating from Hadoop Sqoop to Databricks?

Latest Reply
Debayan
Databricks Employee
  • 0 kudos

Hi, you can try checking the resources below:
https://www.databricks.com/resources/ebook/migration-guide-hadoop-to-databricks
https://www.databricks.com/solutions/migration/hadoop
https://www.databricks.com/blog/2021/08/06/5-key-steps-to-successfull...

Phani1
by Valued Contributor II
  • 1223 Views
  • 1 replies
  • 0 kudos

Oozie jobs migration to Databricks

Hi, could you please share the approach and best practices for migrating Hadoop Oozie jobs to Databricks?

Latest Reply
Debayan
Databricks Employee
  • 0 kudos

Hi, you can try checking the resources below on Hadoop migration:
https://www.databricks.com/resources/ebook/migration-guide-hadoop-to-databricks
https://www.databricks.com/solutions/migration/hadoop
https://www.databricks.com/blog/2021/08/06/5-key-...

Phani1
by Valued Contributor II
  • 2494 Views
  • 1 replies
  • 0 kudos

HDFS to Databricks

Hi, could you please share the approach and best practices for migrating from Hadoop HDFS to Databricks?

Latest Reply
Debayan
Databricks Employee
  • 0 kudos

Hi, you can try checking the resources below:
https://www.databricks.com/resources/ebook/migration-guide-hadoop-to-databricks
https://www.databricks.com/solutions/migration/hadoop
https://www.databricks.com/blog/2021/08/06/5-key-steps-to-successfull...

jlmontie
by New Contributor II
  • 1508 Views
  • 1 replies
  • 0 kudos

Notebook runs with error when run as a job

I am using a notebook to copy over my database on a schedule (I had no success connecting through the Data Explorer UI). When I run the notebook on its own, it works. When I run it as a scheduled job, I get this error. org.apache.spark.SparkSQLExcept...

Latest Reply
Debayan
Databricks Employee
  • 0 kudos

Hi, the error shown is minimal; could you please post the whole error if possible? Also, please tag @Debayan in your next response so that I am notified. Thank you!

priyakant1
by New Contributor II
  • 1281 Views
  • 1 replies
  • 0 kudos

Suspension of Data Engineer Professional exam

Hi Databricks Team, I had scheduled my exam for 6 Sep 2023. During the exam, a pop-up came up stating that I was looking in some other direction. I told them that my laptop mouse was not working properly, so I was looking at it. But still they suspended ...

Latest Reply
sirishavemula20
New Contributor III
  • 0 kudos

Hi @priyakant1, have you received any response from the team? Did they reschedule your exam?

sirishavemula20
by New Contributor III
  • 2867 Views
  • 1 replies
  • 0 kudos

My exam was suspended, need help urgently (21/08/2023)

Hello Team, I had a pathetic experience while attempting my first Databricks certification. Abruptly, the proctor asked me to show my desk; after I showed it, he/she asked again multiple times, wasted my time, and then suspended my exam. I want to file a complain...

Latest Reply
sirishavemula20
New Contributor III
  • 0 kudos

Sub: My Databricks Data Engineer Associate exam got suspended, need immediate help please (10/09/2023). I had a pathetic experience while attempting my Databricks Data Engineer certification. Abruptly, the proctor asked me to show my desk, after showin...

Policepatil
by New Contributor III
  • 3389 Views
  • 1 replies
  • 1 kudos

Resolved! Records are missing while filtering the dataframe in multithreading

Hi, I need to process nearly 30 files from different locations and insert the records into RDS. I am using multithreading to process these files in parallel, as shown below. Test data: I have a configuration like the one below, based on column 4: If colum...

Latest Reply
sean_owen
Databricks Employee
  • 1 kudos

Looks like you are comparing to strings like "1", not values like 1 in your filter condition. It's hard to say, there are some details missing like the rest of the code and the DF schema, and what output you are observing.
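
As an illustration of the point above (not code from the original thread), here is a minimal plain-Python sketch of how a filter that compares against the string "1" can silently drop every record when the stored values are the integer 1. The sample rows, the `col4` key, and the `filter_rows` helper are all hypothetical; Spark may apply implicit casts, so its behaviour can differ from plain Python.

```python
# Hypothetical sample data: column 4 is stored as an integer, not a string.
rows = [
    {"id": 1, "col4": 0},
    {"id": 2, "col4": 1},
    {"id": 3, "col4": 1},
]


def filter_rows(records, value):
    """Return the records whose col4 equals `value` (strict Python equality)."""
    return [r for r in records if r["col4"] == value]


# Comparing against the string "1" matches nothing, because 1 != "1" in Python.
matched_as_string = filter_rows(rows, "1")

# Comparing against the integer 1 matches the two expected records.
matched_as_int = filter_rows(rows, 1)

print(len(matched_as_string), len(matched_as_int))
```

Checking the DataFrame schema before writing the filter condition is one way to confirm which literal type to compare against.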

Rex1
by New Contributor III
  • 9694 Views
  • 1 replies
  • 0 kudos

Resolved! Recover Account Owner

Need help recovering the account owner. Problem: The account owner cannot sign in with its password after SSO was configured. The account owner is a DL for team ownership, so it doesn't have an AWS account and can't be configured in an AD group since it has "+" in the em...

Latest Reply
Rex1
New Contributor III
  • 0 kudos

Resolved by temporarily disabling SSO with Active Directory, which wasn't allowing an email to be created with "+".

Juroy_Lim
by New Contributor III
  • 3135 Views
  • 1 replies
  • 0 kudos

Will the cells in the notebook keep running even if the browser is closed?

The Execute ML Model Pipeline cell has finished running; it took 2.27 days. However, the code in the following cell, called Process JSON Output, will again take a very long time to run. Can I simply close the browser and shut dow...

azure
Databricks
Notebook
Latest Reply
Debayan
Databricks Employee
  • 0 kudos

Hi, I have just tested it internally: even if the browser is closed, the notebook keeps running. You can start with a quick job to test it. Also, please tag @Debayan in your next response so that I get notified.

Policepatil
by New Contributor III
  • 6339 Views
  • 0 replies
  • 0 kudos

Is it good to process files in multithreading?

Hi, I need to process nearly 30 files from different locations and insert the records into RDS. I am using multithreading to process these files in parallel, as shown below. def process_files(file_path):    <process files here>    1. Find bad records based on fie...
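
For readers following the question, the pattern it describes could be sketched roughly as below. This is not the poster's code: only the name `process_files(file_path)` comes from the post, while its body, the `run_all` helper, the `worker` and `max_workers` parameters, and the file paths are hypothetical placeholders. The sketch assumes each file can be processed independently and collects per-file errors so that a failure in one thread is not silently lost.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def process_files(file_path):
    """Placeholder for the per-file work (validation, insert, and so on)."""
    # Hypothetical body: return something small so the result can be inspected.
    return len(file_path)


def run_all(file_paths, worker=process_files, max_workers=8):
    """Run `worker` on every path in a thread pool; return (results, errors)."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its path so failures can be attributed.
        futures = {pool.submit(worker, path): path for path in file_paths}
        for future in as_completed(futures):
            path = futures[future]
            try:
                results[path] = future.result()
            except Exception as exc:  # keep going; record which file failed
                errors[path] = exc
    return results, errors


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    done, failed = run_all(["/tmp/a.csv", "/tmp/bb.csv"])
    print(done, failed)
```

Calling `future.result()` inside a `try` block matters here: an exception raised in a worker thread only surfaces when the result is requested, so skipping that step can make files appear to have been processed when they were not.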

bachan
by New Contributor II
  • 1937 Views
  • 1 replies
  • 0 kudos

Data Insertion

Scenario: Load data from blob storage into a SQL DB once a week. I have 15 days of data (from the current date to the next 15 days) in blob storage, stored date-wise in Parquet format, and after seven days the next 15 days of data will be inserted. Means till 7th day t...

