Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Dicer
by Valued Contributor
  • 17115 Views
  • 13 replies
  • 13 kudos

Resolved! Failed to convert Spark.sql to Pandas Dataframe using .toPandas()

I wrote the following code: data = spark.sql(" SELECT A_adjClose, AA_adjClose, AAL_adjClose, AAP_adjClose, AAPL_adjClose FROM deltabase.a_30min_delta, deltabase.aa_30min_delta, deltabase.aal_30min_delta, deltabase.aap_30min_delta, deltabase.aapl_30m...

Latest Reply
Dicer
Valued Contributor
  • 13 kudos

I just discovered a solution. Today, I opened Azure Databricks. When I imported Python libraries, Databricks told me that toPandas() was deprecated and suggested that I use toPandas. The following solution works: Use toPandas instead of toPandas() da...

  • 13 kudos
12 More Replies
Taha_Hussain
by Valued Contributor II
  • 813 Views
  • 0 replies
  • 5 kudos

Databricks Office Hours: Register for Office Hours to participate in a live Q&A session and receive technical support directly from Databricks experts! Our next events are scheduled for July 13th & July 27th from 8:00am - 9:00am PT | 3:00pm - 4:00pm GM...

RS1
by New Contributor III
  • 504 Views
  • 0 replies
  • 1 kudos

I attended the Advanced Machine Learning with Databricks training virtually last week. I am still unable to get the day 2 session videos of any of the Instructor-led Paid Trainings. They are supposed to be available for replay within 24 hours but I ...

RiyazAli
by Valued Contributor
  • 2366 Views
  • 2 replies
  • 3 kudos

Errors in notebooks of Scalable Machine Learning with Apache Spark course in Databricks academy.

Hi there, I'm following the course mentioned from Databricks Academy. I downloaded the .dbc archive and am working alongside the videos from the academy. In the ML-08 - Hyperopt notebook, I see the following error in cmd 13. best_hyperparam = fmin(fn=objectiv...

Latest Reply
RiyazAli
Valued Contributor
  • 3 kudos

Tagging @Kaniz Fatma as there was no response whatsoever! By any chance, do you know how to resolve these errors in the notebook? Thanks!

  • 3 kudos
1 More Replies
SusuTheSeeker
by New Contributor III
  • 3723 Views
  • 8 replies
  • 3 kudos

Kernel switches to unknown using pyspark

I am working in JupyterHub in a notebook. I am using a PySpark DataFrame for analyzing text. More precisely, I am doing sentiment analysis of newspaper articles. The code works until I get to some point where the kernel is busy and after approximately...

Latest Reply
Kaniz_Fatma
Community Manager
  • 3 kudos

Hi @Suad Hidbani, We haven’t heard from you on the last responses from us, and I was checking back to see if you have a resolution yet. If you have any solution, please share it with the community as it can be helpful to others. Otherwise, we will...

  • 3 kudos
7 More Replies
yitao
by New Contributor III
  • 2836 Views
  • 6 replies
  • 11 kudos

Resolved! How to make sparklyr extension work with Databricks runtime?

Hello. I'm the current maintainer of sparklyr (an R interface for Apache Spark) and a few sparklyr extensions such as sparklyr.flint. Sparklyr was fortunate to receive some contributions from Databricks folks, which enabled R users to run `spark_connect...

Latest Reply
Kaniz_Fatma
Community Manager
  • 11 kudos

Hi @yitao, Just a friendly follow-up. Do you still need help, or does the above response help you to find the solution? Please let us know.

  • 11 kudos
5 More Replies
cmotla
by New Contributor III
  • 2008 Views
  • 3 replies
  • 8 kudos

Issue with complex json based data frame select

We are getting the below error when trying to select the nested columns (string type in a struct) even though we don't have more than 1000 records in the data frame. The schema is very complex and has a few columns as struct type and a few as array typ...

Latest Reply
Kaniz_Fatma
Community Manager
  • 8 kudos

Hi @Chaitanya Motla, Just a friendly follow-up. Do you still need help, or did you find the solution? Please let us know.

  • 8 kudos
2 More Replies
Surendra
by New Contributor III
  • 7731 Views
  • 5 replies
  • 8 kudos

Resolved! Databricks notebook is taking 2 hours to write to /dbfs/mnt (blob storage). Same job is taking 8 minutes to write to /dbfs/FileStore. I would like to understand why write performance is different in both cases.

Problem statement:
Source file format: .tar.gz
Avg size: 10 MB
Number of tar.gz files: 1000
Each tar.gz file contains around 20000 CSV files.
Requirement: Untar the tar.gz file and write CSV files to blob storage / intermediate storage layer for further...
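
The untar-and-write step described above could be sketched roughly as follows. This is only an illustration and not the poster's code or the accepted answer, neither of which appears in this excerpt. It assumes that writing many small files directly through a mounted path is the slow part, so it extracts to local scratch space first and copies afterwards. The function name `extract_then_copy` and both path arguments are hypothetical.

```python
import shutil
import tarfile
import tempfile
from pathlib import Path


def extract_then_copy(archive: Path, destination: Path) -> int:
    """Extract the CSV members of a .tar.gz locally, then copy them out.

    Returns the number of CSV files copied. File names are flattened, so
    two members with the same base name would overwrite each other.
    Only use this on archives you trust: member paths are not sanitized.
    """
    destination.mkdir(parents=True, exist_ok=True)
    copied = 0
    with tempfile.TemporaryDirectory() as scratch:
        with tarfile.open(archive, mode="r:gz") as tar:
            # Keep regular files ending in .csv; skip directories and links.
            members = [m for m in tar.getmembers()
                       if m.isfile() and m.name.endswith(".csv")]
            tar.extractall(path=scratch, members=members)
        # Copy from fast local disk to the (possibly slow) destination.
        for csv_path in Path(scratch).rglob("*.csv"):
            shutil.copy(csv_path, destination / csv_path.name)
            copied += 1
    return copied
```

Whether this helps depends on where the time actually goes, which the excerpt does not establish. Timing the extraction and the copy separately would show which step dominates.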

Latest Reply
Kaniz_Fatma
Community Manager
  • 8 kudos

Hi @Hubert Dudek, I just wanted to thank you. We’re so lucky to have customers like you! The way you are helping our community is incredible.

  • 8 kudos
4 More Replies
HashMan
by New Contributor III
  • 2883 Views
  • 7 replies
  • 4 kudos

Resolved! Learn Apache Spark

I want to learn Apache Spark as a developer. Where do I start, and what materials are recommended?

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 4 kudos

If you are a Databricks customer (any paid subscription like Azure Databricks), please register for the Academy through https://databricks.com/learn/training/home using the email from your subscription. The courses there are the best on the internet. If you will not see ...

  • 4 kudos
6 More Replies
Constantine
by Contributor III
  • 2009 Views
  • 1 replies
  • 4 kudos

Resolved! How to process a large delta table with UDF ?

I have a Delta table with about 300 billion rows. Now I am performing some operations on a column using a UDF and creating another column. My code is something like this: def my_udf(data): return pass   udf_func = udf(my_udf, StringType()) data...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 4 kudos

That UDF code will run on the driver, so it is better not to use it for such a big dataset. What you need is a vectorized pandas UDF: https://docs.databricks.com/spark/latest/spark-sql/udf-python-pandas.html
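
A vectorized pandas UDF along the lines suggested in this reply might look like the sketch below. The original `my_udf` body is not shown in the post, so `clean_batch` (trim and upper-case) is a purely hypothetical stand-in, and `add_cleaned_column` and its column arguments are illustrative names. The point is only that the function receives a whole pandas Series per batch instead of one value per call, which reduces per-row serialization overhead between Spark and Python.

```python
import pandas as pd


def clean_batch(values: pd.Series) -> pd.Series:
    """Hypothetical transformation: trim whitespace and upper-case strings.

    Works on a whole batch (pandas Series) at once, not row by row.
    """
    return values.str.strip().str.upper()


def add_cleaned_column(df, source_col: str, target_col: str):
    """Apply clean_batch to a Spark DataFrame column as a pandas UDF.

    pyspark is imported lazily so clean_batch can be tested with pandas
    alone. Running this requires pyspark and pyarrow to be installed.
    """
    from pyspark.sql.functions import col, pandas_udf
    from pyspark.sql.types import StringType

    # The Series -> Series type hints mark this as a scalar pandas UDF.
    clean_udf = pandas_udf(clean_batch, returnType=StringType())
    return df.withColumn(target_col, clean_udf(col(source_col)))
```

If the transformation can be expressed with built-in Spark SQL functions instead, that is usually worth trying before any Python UDF, since built-ins avoid the Python round trip entirely.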

  • 4 kudos
USHAK
by New Contributor II
  • 844 Views
  • 1 replies
  • 0 kudos

Hi, I am trying to schedule the exam: Databricks Certified Associate Developer for Apache Spark 3.0 - Python. In the cart, I couldn't proceed without entering a voucher. I do not have a voucher. Please help.

Latest Reply
USHAK
New Contributor II
  • 0 kudos

Can someone please respond to my question above? Can I take the certification test without a voucher?

  • 0 kudos
Personal1
by New Contributor II
  • 1712 Views
  • 2 replies
  • 2 kudos
Latest Reply
Kaniz_Fatma
Community Manager
  • 2 kudos

Hi @Abhishek Pradhan! My name is Kaniz, and I'm the technical moderator here. Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question first. Or else I will get back to you soon. Than...

  • 2 kudos
1 More Replies
All_users_grou1
by New Contributor II
  • 1334 Views
  • 3 replies
  • 2 kudos

Resolved! Haven't received Databricks Certified Associate Developer for Apache Spark 3.0 certification yet

I took the exam on 04-01-2022 and passed with 80% though I haven't received my certification yet. I had also raised a query regarding this, is there an update on request #00129864?

Latest Reply
Anonymous
Not applicable
  • 2 kudos

@Ayush Kumar Singh - We have an answer for you. Please check out these announcements.
https://community.databricks.com/s/question/0D53f00001ebiUOCAY/databricks-courses
https://community.databricks.com/s/feed/0D53f00001dq6W6CAI

  • 2 kudos
2 More Replies
Itachi_Naruto
by New Contributor II
  • 7987 Views
  • 7 replies
  • 2 kudos

Resolved! hdbscan package error

I try to import **hdbscan** but it throws the following error:
/databricks/python_shell/dbruntime/PythonPackageImportsInstrumentation/__init__.py in import_patch(name, globals, locals, fromlist, level) 156 # Import the desired module. ...

Latest Reply
Kaniz_Fatma
Community Manager
  • 2 kudos

Hi @Rajamannar Aanjaram, It looks like there's a compatibility issue with the hdbscan library. You may check out the GitHub issue which addresses a similar issue. In case the above GitHub issue doesn't solve your issue, I would request to open a new ...

  • 2 kudos
6 More Replies
Mateo
by New Contributor II
  • 1042 Views
  • 2 replies
  • 0 kudos

Hi all, I'm having some trouble with my Certification Transcript in the Academy Portal. I've passed "Databricks Certified Associate Developer for Apache Spark 3.0" last year and everything seemed fine (apart from the fact that I've been issued two sep...

Latest Reply
Mateo
New Contributor II
  • 0 kudos

Hey @Piper Wilson! Thank you for your response. Unfortunately, I already created a support ticket through the address provided in this post you mentioned. And I got a 'case closed' e-mail after over two weeks with no response and no fix (certificat...

  • 0 kudos
1 More Replies