cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Forum Posts

THIAM_HUATTAN
by Valued Contributor
  • 1627 Views
  • 2 replies
  • 0 kudos

Resolved! Save data from Spark DataFrames to TFRecords

https://docs.microsoft.com/en-us/azure/databricks/_static/notebooks/deep-learning/tfrecords-save-load.htmlI could not run the Cell # 2java.lang.ClassNotFoundException: --------------------------------------------------------------------------- Py4JJ...

  • 1627 Views
  • 2 replies
  • 0 kudos
Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

Hi @THIAM HUAT TAN​,Which DBR version are you using? are you using the ML runtime?

  • 0 kudos
1 More Replies
Raie
by New Contributor III
  • 7542 Views
  • 3 replies
  • 4 kudos

Resolved! How do I specify column's data type with spark dataframes?

What I am doing:spark_df = spark.createDataFrame(dfnew)spark_df.write.saveAsTable("default.test_table", index=False, header=True)This automatically detects the datatypes and is working right now. BUT, what if the datatype cannot be detected or detect...

  • 7542 Views
  • 3 replies
  • 4 kudos
Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 4 kudos

just create table earlier and set column types (CREATE TABLE ... LOCATION ( path path)in dataframe you need to have corresponding data types which you can make using cast syntax, just your syntax is incorrect, here is example of correct syntax:from p...

  • 4 kudos
2 More Replies
Jeff1
by Contributor II
  • 2287 Views
  • 3 replies
  • 5 kudos

Resolved! Understand Spark DataFrames verse R DataFrames

CommunityI’ve been struggling with utilizing R language in databricks and after reading “Mastering Spark with R,” I believe my initial problems stemmed from not understating the difference between Spark DataFrames and R DataFrames within the databric...

  • 2287 Views
  • 3 replies
  • 5 kudos
Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 5 kudos

As Spark dataframes are handled in distributed way on workers it is better just to use Spark dataframes. Additionally collect is executed on driver and takes whole dataset into memory so it is shouldn't be used in production.

  • 5 kudos
2 More Replies
Labels