SparkR or sparklyr not showing history

Sagas
New Contributor II

Hi,

For some reason, Azure Databricks doesn't show History if the data is saved with SparkR (2 in the figure below) or sparklyr (3), but it does show it with Data Ingestion (0) or with PySpark (1). Is this a known bug, or am I doing something wrong? Is it possible to save data with R and still get the UserId and Username?

Databricks_history.PNG

PySpark code:

 

test_data = [[4,"d",30],[5,"e",70]]
df = spark.createDataFrame(test_data, ['a', 'b', 'c'])
df.write.mode('append').format("delta").saveAsTable("catalog.schema.table")

 

SparkR df and save to table:

SparkR.PNG

Sparklyr df and save to table:

Sparklyr.PNG

2 REPLIES

Kaniz
Community Manager

Hi @Sagas, let’s address your questions regarding Azure Databricks, SparkR, and sparklyr.

  1. History in Azure Databricks:

  2. SparkR and Sparklyr:

    • SparkR:
      • SparkR provides functions for SparkSQL tables and Spark DataFrames. It doesn’t translate dplyr functions into SQL query plans like sparklyr does.
      • You can use SparkR to save data to Delta tables, but it won’t automatically capture the userId and userName.
    • Sparklyr:
  3. Saving Data with R and Capturing UserId/Username:

    • To save data with R while capturing the userId and userName, consider the following approach:
      • Use SparkR or sparklyr to save data to a Delta table.
      • Before saving, manually add a column for userId and userName (e.g., df$userId <- "your_user_id").
      • Ensure that you populate these columns with appropriate values.
      • Save the modified DataFrame to the Delta table.
    • This way, you can associate the data with the user who performed the operation.
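The steps above could be sketched roughly as follows, in PySpark to match the snippet in the original post. This is only an illustration under a few assumptions: the helper names `append_with_user` and `recent_history_users` are made up for this example, the `current_user()` SQL function is assumed to be available on the cluster (it exists in Spark 3.2+), and `mergeSchema` is used on the assumption that the target table does not yet have the extra column. The second helper simply reads the table history so you can check what `userId` and `userName` actually contain after a write.

```python
def append_with_user(df, table_name, user_col="userName"):
    """Append df to a Delta table with an extra column naming the writer.

    Hypothetical helper for illustration; not part of any Databricks API.
    """
    # current_user() is evaluated by Spark SQL, so no value is hard-coded here.
    # Assumption: the cluster runs Spark 3.2+ where this function exists.
    tagged = df.selectExpr("*", f"current_user() AS {user_col}")

    (
        tagged.write
        .mode("append")
        .format("delta")
        # Assumption: the table may lack user_col, so allow schema evolution.
        .option("mergeSchema", "true")
        .saveAsTable(table_name)
    )
    return tagged


def recent_history_users(spark, table_name, limit=5):
    """Return version, operation, userId and userName for the latest commits."""
    history = spark.sql(f"DESCRIBE HISTORY {table_name} LIMIT {limit}")
    return history.select("version", "operation", "userId", "userName")
```

In a notebook this might be used as `append_with_user(df, "catalog.schema.table")` followed by `recent_history_users(spark, "catalog.schema.table").show()`. Note that the extra column is stored per row in the table itself, which is separate from whatever the Delta history records for the commit.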

Remember to adjust your approach based on your specific requirements and security considerations. If you encounter any issues, feel free to seek further assistance or explore additional resources.

Sagas
New Contributor II

Thank you for your response! I'll consider capturing the UserId for each row. But do you mean that sparklyr also can't automatically capture the userId for the Delta table history?