Hello All!
My team is previewing Databricks and is working out the steps for one-time migrations of datasets from Redshift to Delta. Based on our understanding of the tool, here are our initial thoughts:
- Export data from Redshift to S3 (UNLOAD) as compressed CSV
- Map S3 bucket to Databricks environment
- For each table, run a CREATE TABLE statement that uses the CSV data as the source and stores the result in Delta format
- Validate and then remove the temporary CSV data files
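
To make the steps above concrete, here is a rough sketch of the statements we have in mind for each table, written as a small Python helper that only builds the SQL strings. The bucket prefix, IAM role ARN, table name, and the `_csv` view suffix are placeholders we made up for illustration, and the CSV options (header row, gzip, schema inference) are our assumptions rather than settled choices.

```python
def unload_sql(table: str, s3_prefix: str, iam_role: str) -> str:
    """Redshift side: export one table to S3 as gzip-compressed CSV with a header row."""
    return (
        f"UNLOAD ('SELECT * FROM {table}') "
        f"TO '{s3_prefix}/{table}/' "
        f"IAM_ROLE '{iam_role}' "
        "FORMAT AS CSV HEADER GZIP"
    )


def delta_sql(table: str, s3_prefix: str) -> list[str]:
    """Databricks side: expose the CSV files as a temp view, then copy them into a Delta table."""
    view = f"{table}_csv"  # hypothetical naming convention for the staging view
    return [
        # inferSchema asks Spark to guess column types from the CSV data (assumption)
        f"CREATE OR REPLACE TEMPORARY VIEW {view} USING CSV "
        f"OPTIONS (path '{s3_prefix}/{table}/', header 'true', inferSchema 'true')",
        f"CREATE TABLE {table} USING DELTA AS SELECT * FROM {view}",
    ]


def count_sql(table: str) -> str:
    """Validation: run on both Redshift and Databricks and compare the results."""
    return f"SELECT COUNT(*) FROM {table}"


if __name__ == "__main__":
    # Placeholder values, not real resources
    prefix = "s3://example-bucket/redshift-export"
    role = "arn:aws:iam::123456789012:role/ExampleUnloadRole"
    for stmt in [unload_sql("orders", prefix, role), *delta_sql("orders", prefix), count_sql("orders")]:
        print(stmt + ";")
```

The idea would be to run the UNLOAD statement against Redshift, then run the two Databricks statements from a notebook (for example with `spark.sql(stmt)`), and compare row counts before deleting the CSV files. Since CSV does not carry column types, we would probably need to check the inferred schema or declare the columns explicitly.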
Has anyone else undertaken this effort (or similar)? Thanks in advance for your thoughts and feedback!