10-07-2022 02:46 AM
10-09-2022 07:14 PM
Depends on the environment, I guess. On Azure I've used Azure Data Factory and Synapse/Databricks;
on AWS, more Snowflake + Matillion.
10-10-2022 01:45 AM
Interesting, thanks for sharing. Have you ever used Matillion on Azure? I would like to hear some experiences with that, as I hear of more Azure customers going for Matillion as an end-to-end platform.
10-10-2022 10:53 AM
Matillion does have connectors for Azure, and I am pretty sure it supports Azure well. I think the choice of tool varies from organisation to organisation and depends on the skill set of the team and how much they like to write code (a lot of ETL/ELT is low-code).
10-11-2022 02:53 AM
Matillion does work very well with Azure, without a doubt. Here is a good comparative analysis of the tools: https://www.trustradius.com/compare-products/azure-data-factory-vs-matillion
Of course, if you're cloud agnostic then Matillion makes more sense, as you can serve your customers, or your org's multi-cloud strategy, with a single end-to-end data transformation platform.
10-10-2022 01:28 AM
Azure here: Data Factory + Databricks + Synapse
10-10-2022 01:47 AM
What about on AWS or GCP? Even on Azure I hear great feedback on Matillion. If you have any experience, please share.
10-10-2022 01:50 AM
Easy: for transformations, a code-first approach is superior to low/no-code tools.
So this leaves extract and orchestration. Keep those as cheap as possible; no need for Matillion.
Can't speak for people who do not like code though.
On AWS I'd try Glue.
10-10-2022 02:10 AM
Code is superior to low/no-code tools? That is a debate in itself. Many tech people have a very different view on this, especially when you consider all the time you free up with low/no-code, particularly from a CIO $$ perspective 🙂
Data Factory is a good tool, though it only has Azure sources and targets, and compared to other tools like Matillion it covers limited transformation needs. If you need to work with different cloud warehouse providers then you will need to learn all the other tools...
Glue is suitable for any developer looking to make use of their data within their cloud data platform. Here Glue works side by side with Matillion (going by the feedback I have heard), because people say Matillion's differentiators are auto-documentation, being cloud agnostic, custom connectors, data ingestion, etc.
Thanks for your opinion @Werner Stinckens always valuable 🙂
10-10-2022 04:50 AM
"code is superior to low/no code tools? that is a debate itself "
The debate is mainly driven by low/no code vendors, but that is fine of course. To each his own.
"On Data Factory, is a good tool though only has Azure sources and targets or destinations and comparing to other tools like matillion limited transformation needs."
That is incorrect. Even though I am not a fan of Data Factory at all, it is able to read/write non-Azure sources.
I agree that the transformations are limited, but that is not the main purpose of Data Factory. Transformations were added afterwards (and run on Spark).
"If you need to work with different cloud warehouse providers then you will need to learn all other tools..."
You are 100% right on this one. One could argue that it is good to learn multiple tools.
I don't wanna diss Matillion though, it is certainly a fine product; and credit given where credit is due, they were cloud based way before the big fellas!!!
10-10-2022 06:53 AM
Really appreciate your comments...
I thought you used DF as a transformation tool, given "Easy: for transformations, code first approach is superior to low/no code tools"?
It's a good discussion to have, as there's no right or wrong here: preference drives what to use, but at the same time you should look for what is best, most efficient, and everyone-ready, stack-ready and future-ready.
11-17-2022 02:16 PM
It all boils down to the ability to maintain code, find talent, and control costs.
Most approaches have overlapping features; you can't really go wrong with any one of them.
I personally prefer a code-first approach.
11-21-2022 11:22 AM
Pretty much the same: Azure Data Factory for orchestration, Databricks for cleaning/modeling/serving data. Basically, Data Factory is the no-code solution when it makes sense, and all of the ingestion processes start in Data Factory to keep a centralized strategy. If it's an API or something that requires cleaning/modifying, Data Factory will launch a notebook to do the heavy lifting.
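A minimal sketch of that routing rule in plain Python, for illustration only. The `Source` class, the `kind` labels and the step names `copy_activity` / `databricks_notebook` are hypothetical and are not Data Factory identifiers; the real decision would live in the pipeline definition itself.

```python
from dataclasses import dataclass


@dataclass
class Source:
    name: str
    kind: str             # hypothetical label, e.g. "api", "database", "file"
    needs_cleaning: bool  # True if the data must be cleaned/modified on the way in


def choose_step(source: Source) -> str:
    """Pick the kind of orchestrated step for a source (illustrative names only)."""
    # APIs and anything needing cleaning go to a notebook; the rest stays no-code.
    if source.kind == "api" or source.needs_cleaning:
        return "databricks_notebook"
    return "copy_activity"


sources = [
    Source("crm_export", "file", needs_cleaning=False),
    Source("weather_feed", "api", needs_cleaning=False),
    Source("legacy_orders", "database", needs_cleaning=True),
]

plan = {s.name: choose_step(s) for s in sources}
print(plan)
```

Every source still enters through the same orchestrator, which is what keeps the strategy centralized; only the step type changes.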
10-10-2022 11:32 AM
I have been using Azure: Databricks for compute, ADF for orchestrating Databricks notebooks plus executing stored procs and some copy activities, and Azure Synapse as the final destination. But I am thinking of using pipelines within Databricks itself rather than in ADF. Let me know if anybody has any suggestions around this.
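For anyone weighing the same move, here is a rough sketch of what chaining notebooks inside Databricks could look like as a multi-task job definition. It only builds a request body modeled on the Databricks Jobs API 2.1 "create job" shape (`name`, `tasks`, `task_key`, `notebook_task`, `depends_on`). The job name and notebook paths are hypothetical, cluster settings are left out, and it does not cover the stored procs or copy activities that ADF handles today.

```python
import json


def build_job_payload(job_name, notebook_paths):
    """Build a job definition that runs the given notebooks one after another."""
    tasks = []
    previous_key = None
    for index, path in enumerate(notebook_paths):
        task_key = f"step_{index}"
        task = {
            "task_key": task_key,
            "notebook_task": {"notebook_path": path},
        }
        # Each task waits for the previous one, giving a simple linear pipeline.
        if previous_key is not None:
            task["depends_on"] = [{"task_key": previous_key}]
        tasks.append(task)
        previous_key = task_key
    return {"name": job_name, "tasks": tasks}


# Hypothetical notebook paths, for illustration only.
payload = build_job_payload(
    "nightly_load",
    ["/Shared/ingest", "/Shared/clean", "/Shared/publish"],
)
print(json.dumps(payload, indent=2))
```

A real job would also need compute configured per task (an existing cluster or a job cluster) before this could be submitted.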
11-16-2022 07:37 PM
Hi @Douglas Carvalho-Ribeiro
Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help.
We'd love to hear from you.
Thanks!