Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

irfanaziz
by Contributor II
  • 3842 Views
  • 3 replies
  • 1 kudos

Resolved! What is the difference between passing the schema in the options and using the .schema() function in PySpark for a CSV file?

I have observed some very strange behavior with some of our integration pipelines. This week one of the CSV files was getting broken when read with the read function given below: def ReadCSV(files, schema_struct, header, delimiter, timestampformat, encode="utf8...

Latest Reply
jose_gonzalez
Databricks Employee
  • 1 kudos

Hi @nafri A, what is the error you are getting? Can you share it, please? As @Hubert Dudek mentioned, both will call the same APIs.

2 More Replies
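For reference, a minimal sketch of the two ways to supply a schema discussed in this thread; the file path and schema are hypothetical, and `spark` is the SparkSession a Databricks notebook provides. Both forms feed the same DataFrameReader API.

```python
from pyspark.sql.types import StructType, StructField, StringType, IntegerType

# Hypothetical schema and path, for illustration only.
schema = StructType([
    StructField("id", IntegerType(), True),
    StructField("name", StringType(), True),
])
path = "/mnt/data/example.csv"

# 1) Explicit builder method:
df1 = spark.read.schema(schema).option("header", True).csv(path)

# 2) Keyword arguments on csv(), which set the same reader state:
df2 = spark.read.csv(path, schema=schema, header=True)
```

Either way, supplying an explicit schema skips inference, and malformed rows become easier to diagnose with reader options such as mode (PERMISSIVE, DROPMALFORMED, FAILFAST).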
boomerangairpla
by New Contributor
  • 610 Views
  • 0 replies
  • 0 kudos

Liftndrift is a paper airplane blog that helps the world learn paper airplanes through easy and simple illustrated instructions; we are specialize...

Liftndrift is a paper airplane blog that helps the world learn paper airplanes through easy and simple illustrated instructions. We specialize in teaching how to make a paper airplane, including the world record paper airplane.

wyzer
by Contributor II
  • 2614 Views
  • 2 replies
  • 1 kudos

Resolved! Are we taking advantage of "Map & Reduce"?

Hello, we are new to Databricks and we would like to know if our working method is good. Currently, we are working like this: spark.sql("CREATE TABLE Temp (SELECT avg(***), sum(***) FROM aaa LEFT JOIN bbb WHERE *** >= ***)") With this method, are we us...

Latest Reply
-werners-
Esteemed Contributor III
  • 1 kudos

Spark will handle the map/reduce for you. So as long as you use Spark-provided functions, be it in Scala, Python, or SQL (or even R), you will be using distributed processing. You just care about what you want as a result. And afterwards, when you are more...

1 More Replies
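To illustrate the reply, a hedged sketch (table and column names are hypothetical): the same aggregation written in SQL and with the DataFrame API compiles to the same distributed plan, so Spark handles the map/shuffle/reduce stages either way.

```python
from pyspark.sql import functions as F

# SQL form
sql_df = spark.sql("""
    SELECT category, avg(amount) AS avg_amount, sum(amount) AS total_amount
    FROM aaa
    GROUP BY category
""")

# Equivalent DataFrame form
api_df = (
    spark.table("aaa")
         .groupBy("category")
         .agg(F.avg("amount").alias("avg_amount"),
              F.sum("amount").alias("total_amount"))
)

# Both plans show the same Exchange (shuffle) stages:
sql_df.explain()
api_df.explain()
```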
Databricks_7045
by New Contributor III
  • 3431 Views
  • 3 replies
  • 0 kudos

Resolved! Encapsulate Databricks Pyspark/SparkSql code

Hi all, I have custom code (PySpark & Spark SQL notebooks) which I want to deploy at a customer location and encapsulate so that end customers don't see the actual code. Currently we have all the code in notebooks (PySpark/Spark SQL). Could you please l...

Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

With notebooks that is not possible. You can write your code in Scala/Java and build a jar, which you then run with spark-submit (example). Or use Python and deploy a wheel (example). This can become quite complex when you have dependencies. Also: a jar et...

2 More Replies
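A minimal sketch of the wheel route suggested in the reply; the package name my_pipeline and its layout are hypothetical.

```python
# setup.py at the root of the project, next to a my_pipeline/ package
# that holds the PySpark code.
from setuptools import setup, find_packages

setup(
    name="my_pipeline",
    version="0.1.0",
    packages=find_packages(),  # discovers my_pipeline/ and its subpackages
    install_requires=[],       # pyspark is provided by the Databricks runtime
)
```

Build with `python -m build` (or `python setup.py bdist_wheel`), attach the resulting .whl to the cluster as a library, and keep only a thin notebook that imports the packaged entry points. Note that a wheel still contains the Python source and can be unpacked, so this hides code from casual view rather than truly protecting it.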
ghiet
by New Contributor II
  • 4539 Views
  • 7 replies
  • 6 kudos

Resolved! Cannot sign up to Databricks community edition - CAPTCHA error

Hello. I cannot sign up for access to the Community Edition. I always get an error message: "CAPTCHA error... contact our sales team". I do not have this issue if I try to create a trial account for Databricks hosted on AWS. However, I do not have...

Latest Reply
joao_hoffmam
New Contributor III
  • 6 kudos

Hi @Guillaume Hiet, I was facing the same issue. Try signing up using your mobile phone; it worked for me!

6 More Replies
philm
by New Contributor
  • 2008 Views
  • 0 replies
  • 0 kudos

set_experiment_description / set_runid_description

Is there any way at all to update both experiment and run descriptions via the API? The only option I see is note.content for the experiment, but that is a limited view and doesn't allow for hyperlinks. I'd like to automate exp/run descriptions externally t...

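No replies yet; for what it's worth, a hedged sketch of one known mechanism: recent MLflow versions store the UI description in the reserved tag mlflow.note.content, so setting that tag through the client API updates the description, and the UI renders markdown (including hyperlinks) there. The IDs below are placeholders.

```python
from mlflow.tracking import MlflowClient

client = MlflowClient()
note = "Automated description with a [hyperlink](https://example.com)."

# Experiment description
client.set_experiment_tag("<experiment_id>", "mlflow.note.content", note)

# Run description
client.set_tag("<run_id>", "mlflow.note.content", note)
```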
tusworten
by New Contributor II
  • 7995 Views
  • 3 replies
  • 3 kudos

Spark SQL: group by duplicates, collect_list into an array of structs, and evaluate the rows in each group.

I'm a beginner working with Spark SQL through the Java API. I have a dataset with duplicate clients grouped by ENTITY and DOCUMENT_ID like this: .withColumn("ROWNUMBER", row_number().over(Window.partitionBy("ENTITY", "ENTITY_DOC").orderBy("ID"))) I added a ROWN...

Latest Reply
tusworten
New Contributor II
  • 3 kudos

Hi @Kaniz Fatma, her answer didn't solve my problem, but it was useful for learning more about UDFs, which I did not know about.

2 More Replies
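A minimal PySpark sketch of the pattern in the title (the thread uses the Java API; `df` and its data are hypothetical, column names come from the post): collect each duplicate group into an array of structs so its rows can be evaluated together.

```python
from pyspark.sql import functions as F

grouped = (
    df.groupBy("ENTITY", "ENTITY_DOC")
      .agg(F.collect_list(F.struct("ID", "ROWNUMBER")).alias("rows"))
)

# Each group is now one row holding an array<struct<ID,ROWNUMBER>>; it can
# be evaluated in place, e.g. keep only groups with more than one member:
dupes = grouped.where(F.size("rows") > 1)
```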
Christina_Marx
by New Contributor
  • 1872 Views
  • 0 replies
  • 0 kudos

New LMS: What happened to the Developer Badges and Capstones which were in the old portal?

I was still busy with the Developer Foundations capstone. I received the Developer Essentials badge and was going to do the ILT and capstone for the Solutions Architect badge in Jan 2022. Then we got the notice about the Academy migration, and I dec...

Smart_City_Laho
by New Contributor
  • 635 Views
  • 0 replies
  • 0 kudos

sigmaproperties.com.pk

smart city lahore, lahore smart city location, lahore smart city payment plan, smart city lahore location, capital smart city lahore

gbrueckl
by Contributor II
  • 6527 Views
  • 6 replies
  • 4 kudos

Resolved! CREATE FUNCTION from Python file

Is it somehow possible to create a SQL external function using Python code? The examples only show how to use JARs: https://docs.databricks.com/spark/latest/spark-sql/language-manual/sql-ref-syntax-ddl-create-function.html Something like: CREATE TEMPORAR...

Latest Reply
pts
New Contributor II
  • 4 kudos

As a user of your code, I'd find it a less pleasant API because I'd have to call some_module.some_func.some_func() rather than just some_module.some_func(). There's no reason to have "some_func" exist twice in the hierarchy; it's redundant. If some_func is ...

5 More Replies
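A hedged sketch of the nearest pure-Python route (the function name and logic are illustrative): register a Python UDF on the session so SQL can call it, instead of CREATE FUNCTION with a JAR. Note that a session-registered UDF is temporary, not a permanent catalog function.

```python
from pyspark.sql.types import StringType

def clean_text(value):
    # Illustrative logic only.
    return value.strip().lower() if value is not None else None

spark.udf.register("clean_text", clean_text, StringType())

spark.sql("SELECT clean_text('  Hello World  ') AS cleaned").show()
```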
GC-James
by Contributor II
  • 2872 Views
  • 2 replies
  • 3 kudos

Resolved! Working locally then moving to databricks

Hello Databricks, I'm struggling with a workflow issue and wondering if anyone can help. I am developing my project in R, and sometimes Python, locally on my laptop, and committing the files to a git repo. I can then clone that repo in Databricks and *see*...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 3 kudos

This is a separate script which then needs to be run from a notebook (or job). I am not using R, but in Python and Scala it works the same way. In Python I just import it in the notebook ("from folder_structure import myClass"); in R it is probably similar. There ...

1 More Replies
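A minimal sketch of the import pattern from the reply (the repo path is a placeholder; depending on the workspace setup the sys.path line may be unnecessary, since notebooks inside a Databricks Repo can often import repo files directly):

```python
import sys

# Make the cloned repo importable when running outside its folder.
sys.path.append("/Workspace/Repos/<user>/<repo>")

from folder_structure import myClass  # module committed in the git repo

obj = myClass()
```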
Idm_Crack
by New Contributor II
  • 1299 Views
  • 1 replies
  • 0 kudos

goharpc.com

IDM Crack with Internet Download Manager (IDM) is a tool to increase download speeds and to resume and schedule downloads.

Latest Reply
Idm_Crack
New Contributor II
  • 0 kudos

IDM Crack with Internet Download Manager (IDM) is a tool to increase download speeds and to resume and schedule downloads.

al_joe
by Contributor
  • 2884 Views
  • 2 replies
  • 2 kudos

Resolved! Execute a notebook cell with a SINGLE mouse-click?

Currently it takes two mouse-clicks to execute each cell in a DB notebook. I know there is a keyboard shortcut (Ctrl+Enter) to execute the current cell, but is there a way to execute a cell with a single mouse-click? I could use a Greasemonkey script or ...

Latest Reply
-werners-
Esteemed Contributor III
  • 2 kudos

Simple answer: no.

1 More Replies
