Rjdudley
Honored Contributor

@cdn_yyz_yul wrote:

But I had received a proposal which suggested a scenario of not "cataloging" Raw, instead, using another tool to achieve the need of searching files in Raw. 
I would like to understand if there are benefits in doing so,  from the community. 


This is an "it depends" question, since the answer hinges on your company's setup, but maybe I can provide some food for thought.  Do you already have this other tool, or is it a new purchase?  Did this proposal come from a tool vendor or a consultant, or from your CTO?  Do you have an ongoing need to search the raw files which would require an additional tool?  What business capabilities is this tool going to fulfill--just cataloging and searching, or is it a governance tool like Atlan/Alation/Collibra?

For most use cases you would not need an additional tool to search raw files.  You'd either transform the data, or create table metadata in Unity Catalog from the files and work with the files directly.  The advantage of Unity Catalog is that you get all of the same security settings and data classifications, a familiar UI, and only one thing to administer.
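
To make the "work with the files directly" option a bit more concrete, here is a minimal sketch, assuming the raw files are registered as a Unity Catalog external volume (one possible way to do it, not necessarily the right fit for your setup). The helper name `raw_zone_statements` and the catalog, schema, volume, and bucket names are all hypothetical. The function only builds SQL strings, so it runs anywhere; in a notebook you would pass each statement to `spark.sql`.

```python
def raw_zone_statements(catalog, schema, volume, location, file_format="json"):
    """Build Databricks SQL statements for browsing raw files via Unity Catalog.

    Hypothetical helper for illustration only: all names and the storage
    location are placeholders, and it assumes an external location covering
    `location` has already been configured by an admin.
    """
    volume_name = f"{catalog}.{schema}.{volume}"
    volume_path = f"/Volumes/{catalog}/{schema}/{volume}/"
    return [
        # Register the raw landing zone so it is governed by Unity Catalog.
        f"CREATE EXTERNAL VOLUME IF NOT EXISTS {volume_name} LOCATION '{location}'",
        # Browse the files under the volume path.
        f"LIST '{volume_path}'",
        # Peek at the file contents without creating a table first.
        f"SELECT * FROM read_files('{volume_path}', format => '{file_format}') LIMIT 10",
    ]


statements = raw_zone_statements(
    "main", "raw", "landing", "s3://example-bucket/landing"  # placeholder values
)
for statement in statements:
    print(statement)
```

Because the volume is a Unity Catalog object, access to it is granted and audited the same way as for tables, which is the "only one thing to administer" point above.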

We're in the second year of our Databricks implementation.  My approach has been to wait and see whether an actual need arises, and whether Databricks comes out with a feature that solves it first.  We saw some really nice tools at Data+AI Summit, and shiny new things are easy to get excited about, but I always assess the actual business need and any lack of features before I expand.

If it's not abundantly clear why you need this extra tooling, ask back: "What business capabilities are we realizing, or what features are we lacking?"  If the features of the tool overlap with features in Databricks, keep in mind that my experience with Databricks has been positive largely because of the integration and simple management.  Hope that helps.
