- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
03-09-2022 07:57 AM
I'm new to integrating the sparklyr / R interface in databricks. In particular it appears that sparklyr and R commands and functions are dependent upon the type of dataframe one is working with (hive, Spark R etc). Is there a recommend best practice as to which dataframe I should start with while working with R in databricks?
Jeff
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
03-09-2022 08:00 AM
Recommended is delta format in data lake. Here is code example https://docs.databricks.com/delta/quick-start.html#language-r
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
03-09-2022 08:00 AM
Recommended is delta format in data lake. Here is code example https://docs.databricks.com/delta/quick-start.html#language-r
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
03-09-2022 09:26 AM
Ok then as I'm reading through the reference material I'm not finding how to convert a Hive table to the delta format. I'm assuming my initial data is a Hive table as I've had to use tbl() to read in the data. Would I simply us a SQL statement to read in the data as a delta table then write it back out?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
03-09-2022 10:17 AM
Hi, if your hive table is registered in metastore yes you can use SQL syntax.
Than is enough to use COPY INTO..
if your table is not registered please map it in metastore
CREATE TABLE IF NOT EXISTS tableName (fields) USING data_format LOCATION (path=)
then you can create another table USING delta format and than copy between tables.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
03-09-2022 10:55 AM
@Hubert Dudek , Ok - that's helpful. As I'm reading the databricks documentation it appears when I'm reading in my file using the sparklyr tbl() function in databrick it returns a sparklyr
object ("tbl_spark" "tbl_sql" "tbl_lazy" "tbl ''). So does your previous reply still hold true. Either way based upon you oridginal reploy it woudl be to my benefir to convert the sparklyr object into a delta table - yes. If that's true that's what I'm seeking in the documentation or how to do that.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
04-18-2022 02:30 AM
Hi, have you found how to convert it?

