
Recommended database when using R in Databricks

Jeff1
Contributor II

I'm new to integrating the sparklyr / R interface in Databricks. In particular, it appears that sparklyr and R commands and functions depend on the type of dataframe one is working with (Hive, SparkR, etc.). Is there a recommended best practice as to which dataframe I should start with while working with R in Databricks?

Jeff

1 ACCEPTED SOLUTION


Hubert-Dudek
Esteemed Contributor III

The recommended option is the Delta format in a data lake. Here is a code example: https://docs.databricks.com/delta/quick-start.html#language-r


6 REPLIES


OK then, as I'm reading through the reference material, I'm not finding how to convert a Hive table to the Delta format. I'm assuming my initial data is a Hive table, as I've had to use tbl() to read in the data. Would I simply use a SQL statement to read in the data as a Delta table and then write it back out?

Hubert-Dudek
Esteemed Contributor III

Hi, if your Hive table is registered in the metastore, then yes, you can use SQL syntax.

Then it is enough to use COPY INTO.

If your table is not registered, please map it in the metastore first:

CREATE TABLE IF NOT EXISTS tableName (fields) USING data_format LOCATION 'path'

Then you can create another table USING DELTA and copy the data between the tables.
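
A minimal sketch of the SQL route from R, assuming the source table is already registered in the metastore and the cluster supports Delta. It uses a single CREATE TABLE ... AS SELECT statement instead of the create-then-copy sequence described above, and it assumes the statement can be sent through sparklyr's DBI interface with DBI::dbGetQuery. The table names and the helpers delta_ctas_sql and copy_to_delta are hypothetical and not part of sparklyr.

```r
# Build a CREATE TABLE ... AS SELECT statement that copies a registered
# table into a new Delta table. Both table names are supplied by the caller.
delta_ctas_sql <- function(source_table, target_table) {
  paste0(
    "CREATE TABLE IF NOT EXISTS ", target_table,
    " USING DELTA AS SELECT * FROM ", source_table
  )
}

# Send the statement to Spark through sparklyr's DBI interface.
# `sc` is a sparklyr connection (see the usage lines below).
copy_to_delta <- function(sc, source_table, target_table) {
  invisible(DBI::dbGetQuery(sc, delta_ctas_sql(source_table, target_table)))
}

# Usage on a Databricks cluster (hypothetical table names, not run here):
# sc <- sparklyr::spark_connect(method = "databricks")
# copy_to_delta(sc, "my_hive_table", "my_delta_table")
```

Keeping the SQL text in its own function makes it easy to print and check the statement before running it against the metastore.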

Jeff1
Contributor II

@Hubert Dudek, OK, that's helpful. As I'm reading the Databricks documentation, it appears that when I read in my file using the sparklyr tbl() function in Databricks, it returns a sparklyr object ("tbl_spark" "tbl_sql" "tbl_lazy" "tbl"). So does your previous reply still hold true? Either way, based on your original reply, it would be to my benefit to convert the sparklyr object into a Delta table, yes? If that's true, that's what I'm looking for in the documentation: how to do that.
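
For the question of turning a sparklyr object into a Delta table, one possible approach is sparklyr's spark_write_delta(), which writes a Spark DataFrame reference (the "tbl_spark" object that tbl() returns) to a storage path in Delta format. The sketch below is an illustration under assumptions, not a confirmed answer from the thread: the wrapper write_tbl_as_delta, the table name, and the path are all hypothetical.

```r
# Write a sparklyr table reference (class "tbl_spark") to a path in Delta
# format. `delta_path` is a storage location chosen by the caller.
write_tbl_as_delta <- function(sdf, delta_path, mode = "overwrite") {
  sparklyr::spark_write_delta(sdf, path = delta_path, mode = mode)
}

# Usage on a Databricks cluster (hypothetical names and path, not run here):
# library(sparklyr)
# library(dplyr)
# sc  <- spark_connect(method = "databricks")
# sdf <- tbl(sc, "my_hive_table")             # lazy reference to the Hive table
# write_tbl_as_delta(sdf, "/tmp/delta/my_table")
# delta_tbl <- spark_read_delta(sc, path = "/tmp/delta/my_table")
```

Because the object returned by tbl() is lazy, the data stays in Spark throughout; nothing is collected into the R session before the write.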

Hubert-Dudek
Esteemed Contributor III

Hi, have you found how to convert it?

Kaniz
Community Manager

Hi @Jeff Reichman, just a friendly follow-up. Do you still need help, or did @Hubert Dudek's response help you find the solution? Please let us know.
