
Source to Bronze Organization + Partition

ChristianRRL
Valued Contributor

Hi there, I hope this is effectively a simple question. I'd like a bit of guidance on whether I am structuring my source-to-bronze Auto Loader data properly. Here's what I have currently:

/adls_storage/<data_source_name>/<category>/autoloader/<oem_shortname>/<linted_database_shortname>/<linted_table_name>/source/<project_id_partition>/<full_file_name>.csv

For this example, I'm trying to set up a source-to-*bronze* pipeline. Some examples I've seen online are a bit closer to the following:

.../autoloader/<oem>/<database>/<table>/source/project_id={x}/*.csv (aka: original raw data)
.../autoloader/<oem>/<database>/<table>/bronze (aka: the ingested bronze data)
.../autoloader/<oem>/<database>/<table>/checkpoint (still a bit unfamiliar with this one)
.../autoloader/<oem>/<database>/<table>/schema (i.e. keep track of current or evolving schema)
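
To make the layout above concrete, here is a rough sketch of how I imagine the four folders being wired into Auto Loader. The helper names (`TablePaths`, `table_paths`, `start_bronze_stream`) are just made up for illustration, the CSV header option is an assumption about my files, and `spark` is assumed to be the session a Databricks notebook provides. My understanding is that the checkpoint folder is where the stream records its progress so that files are not ingested twice, and the schema folder is what `cloudFiles.schemaLocation` points at.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TablePaths:
    """Folders for one <oem>/<database>/<table> under the autoloader root."""
    source: str      # original raw CSVs, laid out as project_id={x}/*.csv
    bronze: str      # ingested bronze data
    checkpoint: str  # streaming progress, so files are not re-ingested
    schema: str      # current / evolving schema tracked by Auto Loader


def table_paths(root: str, oem: str, database: str, table: str) -> TablePaths:
    # root is assumed to be /adls_storage/<data_source_name>/<category>
    base = f"{root.rstrip('/')}/autoloader/{oem}/{database}/{table}"
    return TablePaths(
        source=f"{base}/source",
        bronze=f"{base}/bronze",
        checkpoint=f"{base}/checkpoint",
        schema=f"{base}/schema",
    )


def start_bronze_stream(spark, paths: TablePaths):
    """Read CSVs from source/ with Auto Loader and write them to bronze/."""
    raw = (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "csv")
        .option("cloudFiles.schemaLocation", paths.schema)
        .option("header", "true")  # assumption: the CSVs have a header row
        .load(paths.source)
    )
    return (
        raw.writeStream.format("delta")
        .option("checkpointLocation", paths.checkpoint)
        .trigger(availableNow=True)  # process what is there, then stop
        .start(paths.bronze)
    )
```

In a notebook this would be called as something like `start_bronze_stream(spark, table_paths("/adls_storage/<data_source_name>/<category>", "<oem>", "<database>", "<table>"))`, with the placeholders filled in. Only `table_paths` runs outside Databricks; the stream function needs a real session.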

At the end of the day, what I really want is a clean database.table naming format like the following:

{data_source_name}_{category}.bronze_{oem}_{database}_{table}
...
{data_source_name}_{category}.silver_{database}_{table}
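
As a small sketch of that naming convention, this is roughly how I would build the names in code. The function names and the sample values in the usage lines are hypothetical, only there to show the pattern.

```python
def bronze_table(data_source_name: str, category: str,
                 oem: str, database: str, table: str) -> str:
    """Bronze keeps the OEM in the table name."""
    return f"{data_source_name}_{category}.bronze_{oem}_{database}_{table}"


def silver_table(data_source_name: str, category: str,
                 database: str, table: str) -> str:
    """Silver drops the OEM and keeps only database and table."""
    return f"{data_source_name}_{category}.silver_{database}_{table}"


# Hypothetical sample values, just to show the resulting names.
print(bronze_table("src", "cat", "oem1", "db1", "tbl1"))  # src_cat.bronze_oem1_db1_tbl1
print(silver_table("src", "cat", "db1", "tbl1"))          # src_cat.silver_db1_tbl1
```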

Does this seem right or am I missing anything?
