Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Data preparation in Databricks

Priyag1
Honored Contributor II


Good data is essential for accurate and useful results. To get good data, the following tasks must be done:

  • Cleaning and formatting data - Handling missing values or outliers, ensuring data is in the correct format, and removing unneeded columns.
  • Preprocessing data - Numerical transformations, aggregating data, encoding text or image data, and creating new features.
  • Combining data - Joining tables or merging datasets.
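
The three tasks above can be sketched roughly as follows. This is a minimal illustration in plain Python so that it runs anywhere; on Databricks the same steps would more likely be written against DataFrames. The `orders` and `customers` data, the column names, and the helper functions are all hypothetical, and filling gaps with the median is just one possible choice for handling missing values.

```python
from statistics import median

# Hypothetical sample data; table and column names are illustrative only.
orders = [
    {"order_id": 1, "customer_id": 10, "amount": 20.0, "note": "a"},
    {"order_id": 2, "customer_id": 11, "amount": None, "note": "b"},
    {"order_id": 3, "customer_id": 10, "amount": 40.0, "note": "c"},
]
customers = [
    {"customer_id": 10, "region": "EU"},
    {"customer_id": 11, "region": "US"},
]


def clean(rows):
    """Cleaning: fill missing amounts with the median, drop the unneeded 'note' column."""
    known = [r["amount"] for r in rows if r["amount"] is not None]
    fill = median(known)
    return [
        {k: (fill if k == "amount" and v is None else v)
         for k, v in r.items() if k != "note"}
        for r in rows
    ]


def add_scaled_amount(rows):
    """Preprocessing: add a new feature, amount rescaled to [0, 1] (min-max scaling)."""
    amounts = [r["amount"] for r in rows]
    lo, hi = min(amounts), max(amounts)
    span = (hi - lo) or 1.0  # avoid division by zero when all values are equal
    return [{**r, "amount_scaled": (r["amount"] - lo) / span} for r in rows]


def join_customers(rows, customer_rows):
    """Combining: inner join of orders with customers on customer_id."""
    by_id = {c["customer_id"]: c for c in customer_rows}
    return [{**r, **by_id[r["customer_id"]]}
            for r in rows if r["customer_id"] in by_id]


prepared = join_customers(add_scaled_amount(clean(orders)), customers)
for row in prepared:
    print(row)
```

Keeping each task in its own function makes it easy to swap in a different strategy, for example dropping rows with missing values instead of imputing them.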

Data preparation resources

  1. Medallion lakehouse architecture - https://docs.databricks.com/lakehouse/medallion.html
  2. Delta Live Tables - https://docs.databricks.com/delta-live-tables/index.html
  3. Databricks Partner Connect - https://docs.databricks.com/partner-connect/prep.html
  4. Release notes - https://docs.databricks.com/release-notes/runtime/releases.html


4 REPLIES

Anonymous
Not applicable

Hi @Priyadarshini G​ 

Great to meet you, and thanks for your question!

Let's see if your peers in the community have an answer to your question. Thanks.

bharats
New Contributor III

Useful information. Hope you do more summarized posts on these concepts.

Sandro
New Contributor II

Great introduction. For some cases, I would add other dimensions of data quality, such as completeness of data and referential integrity validation.
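
As a rough illustration of those two checks, here is a minimal sketch in plain Python. The `orders` and `customers` rows and the helper names `completeness` and `orphaned_keys` are hypothetical and only meant to show the idea, not a specific Databricks feature.

```python
def completeness(rows, column):
    """Fraction of rows where `column` is present and not None."""
    if not rows:
        return 1.0
    filled = sum(1 for r in rows if r.get(column) is not None)
    return filled / len(rows)


def orphaned_keys(child_rows, parent_rows, key):
    """Values of `key` in child_rows that have no match in parent_rows."""
    parent_keys = {r[key] for r in parent_rows}
    return sorted({r[key] for r in child_rows if r[key] not in parent_keys})


# Hypothetical sample data; names are illustrative only.
orders = [
    {"order_id": 1, "customer_id": 10, "amount": 20.0},
    {"order_id": 2, "customer_id": 11, "amount": None},
    {"order_id": 3, "customer_id": 12, "amount": 40.0},
    {"order_id": 4, "customer_id": 10, "amount": 5.0},
]
customers = [{"customer_id": 10}, {"customer_id": 11}]

# Completeness: 3 of the 4 orders have an amount.
amount_completeness = completeness(orders, "amount")

# Referential integrity: customer 12 appears in orders but not in customers.
missing_customers = orphaned_keys(orders, customers, "customer_id")

print(amount_completeness, missing_customers)
```

Checks like these are typically run before joining tables, so that gaps and orphaned keys are reported rather than silently dropped by the join.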

dplante
Contributor II

Data governance and data lineage are other things to call out.

Here's a cheat sheet that is also useful: Data Preparation Cheatsheet
