Hi, I have an external table which is created from an S3 bucket. The first time I create the table I use the following command: query = """CREATE TABLE IF NOT EXISTS catalog.schema.external_table_s3 USING PARQUET LOCAT...
Hi @Kaniz, thank you for the reply. How can we handle schema changes in the external location? If there are additions or deletions to the schema, will REFRESH TABLE still work?
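To make the question concrete, here is a hedged sketch (table name and S3 path are placeholders). REFRESH TABLE only invalidates cached metadata and file listings; if columns are added or removed in the underlying Parquet files, one option is to drop and recreate the external table definition, which does not delete the underlying data:

```sql
-- Pick up new files at the existing location.
REFRESH TABLE catalog.schema.external_table_s3;

-- If the schema itself changed, recreate the definition
-- (for an external table, the data at the location is kept).
DROP TABLE IF EXISTS catalog.schema.external_table_s3;
CREATE TABLE IF NOT EXISTS catalog.schema.external_table_s3
USING PARQUET LOCATION 's3://my-bucket/path/';
```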
Hi, I want to access the Databricks audit logs to check table usage information. I created a Databricks workspace on the premium pricing tier and enabled it for Unity Catalog. I configured audit logs to be sent to Azure Diagnostic log delivery...
Execute the Python code below in your Databricks workspace to enable lineage system tables:

import requests
ctx = dbutils.notebook.entry_point.getDbutils().notebook().getContext()
api_url = ctx.tags().get("browserHostName").get()
api_token = ctx.apiTo...
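Building on the snippet above, the system schemas are enabled per metastore via the Unity Catalog SystemSchemas REST API. A minimal sketch follows; the endpoint path and API version are assumptions to verify against the SystemSchemas API reference, and the host and metastore ID are placeholders:

```python
# Hypothetical helper: build the SystemSchemas endpoint URL.
# Assumption: PUT on this path enables the named system schema
# (verify the exact path/version against the API reference).
def enable_system_schema_url(host: str, metastore_id: str, schema: str) -> str:
    return (f"https://{host}/api/2.1/unity-catalog/"
            f"metastores/{metastore_id}/systemschemas/{schema}")

# Placeholder host and metastore ID; lineage tables live in system.access.
url = enable_system_schema_url(
    "adb-1234567890123456.7.azuredatabricks.net",
    "11111111-2222-3333-4444-555555555555",
    "access",
)
# In the notebook you would then issue the request (not run here):
# requests.put(url, headers={"Authorization": f"Bearer {api_token}"})
```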
I have created an Azure Data Factory pipeline with a copy data activity to copy data from an ADLS path to a Delta table. In the Delta table drop-downs I am able to see only the hive metastore database and tables; the Unity Catalog tables are not...
Hi, I have the same issue. Additional information: the linked service created in Azure Data Factory using the Azure Databricks Delta Lake connector is using system-managed identity rather than a token. Could we have an update? Thank you in advance.
Has anyone tried UC DLT? We are trying to create multiple catalogs for UC DLT. Note: as per the limitations, it won't support external locations, only managed locations. Our use case: we want to create 2 catalogs, one for dev and the other for production. Dev catalog ...
Hi @karthik_p, It’s important to note that DLT pipelines can read from anywhere but cannot create DLT tables in any place outside of the catalog/schema specified in the pipeline settings. If you’re trying to write to both locations in the same pipel...
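As a minimal sketch of the constraint above, a UC-enabled DLT pipeline declares its single target catalog and schema in the pipeline settings (all names and the notebook path below are placeholders); writing to a prod catalog would require a second pipeline with a different `catalog` value:

```json
{
  "name": "dev_dlt_pipeline",
  "catalog": "dev_catalog",
  "target": "bronze",
  "libraries": [
    { "notebook": { "path": "/Repos/project/dlt_pipeline_notebook" } }
  ]
}
```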
Hi all, we are in the process of rolling out a new Unity-enabled Databricks environment with 2 tiers: dev and prod. Initially we had the plan to completely decouple dev and prod, each with their own data lake as storage. While this is the safest option, it does...
That is the way I am working right now: assign the workspace to the catalog and set it to read-only if necessary. It would be easier, though, if it were possible to define a 2nd external location as read-only, as this cannot break anything (of course in rea...
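Complementing the workspace-level approach above, access can also be narrowed with grants on the external location itself. A minimal sketch, assuming an external location named `prod_raw` and a group `dev_readers` (both placeholders):

```sql
-- Grant only READ FILES so the dev side can read but not write.
GRANT READ FILES ON EXTERNAL LOCATION `prod_raw` TO `dev_readers`;
-- Deliberately omit WRITE FILES and CREATE EXTERNAL TABLE grants
-- to keep the location effectively read-only for that group.
```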
Dear all, we have been working on the column masking topic recently, using the column mask feature of Unity Catalog. We recently faced the problem of masking a nested column (a sub-column within a STRUCT-type column). We just wonder if this is even possible ...
I have the same concern. I tried to mask a DECIMAL datatype, but it doesn't work either. The Databricks examples for column masks may work well with simple datatypes like STRING, but somehow this doesn't meet our requirements for data governance.
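For comparison, here is a minimal column-mask sketch on a simple top-level STRING column, following the documented pattern (table, column, and group names are placeholders); nested STRUCT sub-columns cannot be targeted this way, which is the limitation discussed above:

```sql
-- Mask function: members of 'admins' see the value, everyone else a literal.
CREATE OR REPLACE FUNCTION mask_email(email STRING)
RETURN CASE WHEN is_account_group_member('admins') THEN email ELSE '***' END;

-- Attach the mask to a top-level column.
ALTER TABLE catalog.schema.users ALTER COLUMN email SET MASK mask_email;
```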
I'm the account admin for the subscription where the Databricks workspace is created. However, if I open the account console, it's prompting me to select one of the Databricks workspaces. Please note that it's my own subscription via Visual Studio benef...
If there is an existing metastore for the region, does that mean my organization stores all the metadata across workspaces in a single metastore or data lake? Databricks suggests having one metastore per region. How is the project-specific metadat...
When managing and safeguarding your internal data, data governance is crucial. It works like insurance, ensuring that all the data you gather is appropriately disseminated and kept safe within your company. Let's discuss data governance and how to imp...
We have set up the metastore with Managed Identity, and when trying to create a managed table in the default location I am hitting the error below. The storage is ADLS Gen2. AbfsRestOperationException: Operation failed: "This request is not authorized to pe...
Have you followed all the steps here to create the metastore? https://app.getreprise.com/launch/wy1Y2ly/ Do you have all the necessary permissions granted to create a managed table? https://docs.databricks.com/en/data-governance/unity-catalog/manag...
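A common cause of that 403 is the Access Connector's managed identity lacking a data-plane role on the storage account. A hedged sketch with the Azure CLI (all identifiers are placeholders; the role typically needed for managed table writes is Storage Blob Data Contributor, so verify against your setup):

```shell
# Placeholder IDs throughout; grants the connector's identity write access
# to the ADLS Gen2 account backing the metastore's default location.
az role assignment create \
  --assignee "<access-connector-principal-id>" \
  --role "Storage Blob Data Contributor" \
  --scope "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.Storage/storageAccounts/<storage-account>"
```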
Hello, I am currently working with the system tables in Unity Catalog. I have loaded all the schemas in the catalog and I am using Power BI to directly access these tables. But while connecting Power BI to Databricks, I am not able to see the system ta...
I had/have the same issue with Tableau Desktop. I'm not able to select the "billing" schema because I don't see the "system" catalog. However, I found a workaround to resolve this issue. In the Databricks console, go to the Catalog Explorer and sele...
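Once the system schemas are enabled and visible, they can be queried like any other table. A minimal sketch against the billing usage table (column names are per the system.billing.usage table; verify against your workspace):

```sql
SELECT usage_date, sku_name, SUM(usage_quantity) AS total_dbus
FROM system.billing.usage
GROUP BY usage_date, sku_name
ORDER BY usage_date;
```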
Hi @Gilg, In an Azure Databricks workspace enabled for Unity Catalog, you can leverage the power of Unity Catalog to manage data access and identity federation.
Here are the key points:
Unity Catalog Enablement:
When you enable a workspace for Un...
Looking at Databricks’ suggested use of catalogs, my instincts now lead me to the conclusion that a separate metastore for each SDLC environment (dev, test, prod) is preferable. I think if this pattern were followed, this means due to current ...
You can create multiple metastores for each region within an account. This is not a hard constraint; reach out to your account team and they can make an exception. Before doing that, consider what kind of securable sharing you will need between dev, test ...
Hi, is it possible to make Delta Shares work for a Matlab client (just the way we share Delta tables to a Power BI workbench)? We have Unity Catalog enabled and I wanted to explore more about data governance with the aforesaid integration. Wanted to discuss ...
Hi @marc88, Certainly! Let’s explore the integration of Delta Sharing and Unity Catalog with a focus on sharing data with a Matlab client.
Here are some insights:
Unity Catalog and Data Governance:
Unity Catalog is a powerful solution for data g...
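Since Delta Sharing is an open REST protocol, any HTTP-capable client (including Matlab, via its web request functions) can talk to a share. A minimal Python sketch of one protocol call, assuming a placeholder endpoint; the real endpoint and bearer token come from the share profile file the provider sends:

```python
# Hypothetical helper: per the Delta Sharing protocol, GET {endpoint}/shares
# lists the shares available to the recipient.
def list_shares_url(endpoint: str) -> str:
    return endpoint.rstrip("/") + "/shares"

# Placeholder endpoint from a share profile file.
print(list_shares_url("https://sharing.example.com/delta-sharing"))
# A client would then issue:
#   GET <url>  with header  Authorization: Bearer <token-from-profile>
```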
I'm trying to store MLlib instances in Unity Catalog Volumes. I think volumes are a great way to keep things organized. I can save to a volume without any issues, and I can access the data using spark.read and with plain Python open(). However, when I ...
Just to supplement that if the ML model is saved and then loaded within the same execution, calling load() will not cause the mentioned exception. Copying the model directory from UC volume to ephemeral storage attached to the driver node is also a w...
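The copy-to-ephemeral-storage workaround mentioned above can be sketched as follows; the helper, volume path, and staging root are all placeholders, and the Databricks-specific calls are shown as comments since they only run inside a workspace:

```python
import posixpath

# Hypothetical helper: map a UC Volume path to a driver-local staging path.
def staging_path(volume_path: str, staging_root: str = "/tmp/models") -> str:
    return posixpath.join(staging_root,
                          posixpath.basename(volume_path.rstrip("/")))

src = "/Volumes/main/default/models/my_pipeline"  # placeholder volume path
dst = staging_path(src)  # "/tmp/models/my_pipeline"

# On Databricks (not runnable locally): copy the saved model directory
# to driver-local storage, then load it from there.
# dbutils.fs.cp(src, f"file:{dst}", recurse=True)
# from pyspark.ml import PipelineModel
# model = PipelineModel.load(f"file:{dst}")
```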