Databricks Community

ConSpooky · ‎10-14-2022

My apologies in advance for sounding like a newbie. This is really just a curiosity question I have as an outsider observing my team clash with our client. Please ask any questions you have, and I will try my best to answer it.

Currently, we are storing our transformation queries in a DynamoDB table. When needed, we pull into Databricks and execute the query. Simple as that. Our client has called this out as “hard coding” (more on that soon)

Our client has come up with an alternative that involves creating JSON config files containing the transformation rules (all tables/attributes required, target table names, Alias names, join keys, etc. etc.). From here, the SQL query is dynamically created. This approach is still “hard coding” since these config files would need to be manually edited anytime there is a change in the rules.

The way I see this: I think storing the transform rules in JSON is more business user friendly, but that’s about where I see the pros end. It brings in much more complexity to the code and likely will need to be continuously developed to support new queries. Also, I don’t see anyway to prevent “hard coding”. The client business leads seem to think there is some magical tool to convert plain English text to complex SQL queries

I just wanted to get some experts thoughts on this. Which solution is better, or is there another approach that should be taken?

Hubert-Dudek · ‎10-20-2022

I agree with you, and I had similar experiences in the past. The business user would like to have everything magically automated and invite JSON, which will be edited by non-developer users. Additionally, it is still hard coded as you need to maintain that JSON and then read them by some script which will make ineffective queries that can easily break after the change in JSON

My blog: https://databrickster.medium.com/

raymond021 · ‎10-21-2022

These modifications might be as straightforward as filtering the rows or eliminating a column, or they can be as widespread as using the first row as the table header. Along with basic transformation choices, there are also more complex ones like merge, append, group by, pivot, and unpivot.

Anonymous · ‎11-25-2022

Hi @Nick Connors

Hope all is well!

Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help.

We'd love to hear from you.

Thanks!