end-to-end lineage using Unity Catalog
Hi Team, what are the possibilities of getting end-to-end lineage using Unity Catalog? Regards, Phanindra
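No replies yet, but for anyone landing here: Unity Catalog captures table-level lineage automatically, and one way to inspect it end to end is the lineage system tables, assuming the `access` system schema is enabled on the metastore. A minimal sketch from a notebook (the table name `main.sales.orders` is hypothetical):

```python
# Minimal sketch: query Unity Catalog lineage system tables from a notebook,
# where `spark` is the session predefined by Databricks.
# Assumes the `access` system schema is enabled and you can SELECT from it.
lineage = spark.sql("""
    SELECT source_table_full_name,
           target_table_full_name,
           entity_type,
           event_time
    FROM system.access.table_lineage
    WHERE target_table_full_name = 'main.sales.orders'  -- hypothetical table
    ORDER BY event_time DESC
""")
lineage.show(truncate=False)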
I'm trying to figure out the best way to "de-duplicate" data via DLT. Currently, my only leads are: Manage data quality with Delta Live Tables | Databricks on AWS (via "Drop invalid records"), and Constraints on Databricks | Databricks on AWS, via "pre-de...
Hey @ChristianRRL, based on my understanding you want to de-duplicate your data during your DLT pipeline processing. Unfortunately, I was not able to find a solution when I ran into this problem, due to native feature limitations. Limitations...
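For context, one common workaround (rather than a native DLT "deduplicate" expectation) is to drop duplicates inside the table definition itself. A minimal sketch, where `bronze_events`, the key column `id`, and the ordering column `ingest_ts` are hypothetical names:

```python
import dlt
from pyspark.sql import functions as F
from pyspark.sql.window import Window

@dlt.table(
    name="silver_events_deduped",
    comment="One row per business key, keeping the latest record.")
def silver_events_deduped():
    df = dlt.read("bronze_events")
    # Keep the latest record per id; dropDuplicates(["id"]) would also work
    # if any row per key is acceptable.
    w = Window.partitionBy("id").orderBy(F.col("ingest_ts").desc())
    return (df.withColumn("_rn", F.row_number().over(w))
              .filter("_rn = 1")
              .drop("_rn"))
```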
Hi Team, I would appreciate your suggestion on which scenarios call for ADF (Azure Data Factory) versus Databricks for orchestration, as well as any significant differences between them. Regards, Phanindra
Hi, I work with both, so it depends on the use case. ADF is easy to set up and good for data integration, e.g. a "copy data" job to transfer files from storage 1 to storage 2. ADF data flows (data transformations) can be used to some level, but when the tr...
Dear Community Members, this question is about debugging a performance issue of a DLT pipeline with Unity Catalog. I had a DLT pipeline in Azure Databricks running on the local store, i.e. hive_metastore, and the process took about 2 hours with the auto scalin...
Hey Harvey, I'm getting around the same performance problems as you: from around 25 minutes in a normal workspace to 1 hour and 20 minutes in a UC workspace, which is roughly 3x slower. Did you manage to solve this? I've also noticed dbutils.fs.ls() is much ...
Not sure if this has come up before, but I'm wondering if Databricks has any kind of functionality to "watch" an API call for changes? E.g. currently I have a frequently running job that pulls data via an API call and overwrites the old data. This see...
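As far as I know there is no built-in "watch an API" trigger; a common pattern is to keep the scheduled job but compare a hash of the payload before overwriting anything. A minimal sketch (the endpoint URL and state path are hypothetical, and `dbutils` is the notebook-provided utility object):

```python
import hashlib
import requests

URL = "https://example.com/api/data"                            # hypothetical
STATE_PATH = "/Volumes/main/default/state/api_payload.sha256"   # hypothetical

payload = requests.get(URL, timeout=30).content
digest = hashlib.sha256(payload).hexdigest()

try:
    previous = dbutils.fs.head(STATE_PATH)  # last stored hash, if any
except Exception:
    previous = None

if digest != previous:
    # Payload changed: persist the new hash, then overwrite the target data.
    dbutils.fs.put(STATE_PATH, digest, overwrite=True)
    # ... overwrite the target table here ...
```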
Hello, when I check the system tables' status, it seems that they are in an UNAVAILABLE state. I would like to know if anyone has faced this issue, because of it I can't enable the system tables. {"schemas":[{"schema":"storage","state":"UNAVAILABLE"},...
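For reference, the state shown above matches what the system schemas REST API returns, and the same API is used to enable a schema per metastore. A rough sketch, assuming those endpoints and with host, token, and metastore ID as placeholders:

```python
import requests

HOST = "https://<workspace-url>"    # placeholder
TOKEN = "<pat>"                     # placeholder
METASTORE_ID = "<metastore-id>"     # placeholder
HEADERS = {"Authorization": f"Bearer {TOKEN}"}

# List system schemas and their states (e.g. AVAILABLE, UNAVAILABLE).
r = requests.get(
    f"{HOST}/api/2.0/unity-catalog/metastores/{METASTORE_ID}/systemschemas",
    headers=HEADERS)
print(r.json())

# Try to enable one schema, e.g. `access`. Schemas reported as UNAVAILABLE
# cannot be enabled, which is likely the error being hit here.
requests.put(
    f"{HOST}/api/2.0/unity-catalog/metastores/{METASTORE_ID}/systemschemas/access",
    headers=HEADERS).raise_for_status()
```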
Hi Team, I am unable to connect (SSH connection) from an Azure Databricks notebook to an Azure Kafka server. The Kafka server and Databricks are both in the same resource group and region, and the port is added to the Kafka server's inbound rules. Please help me to r...
Hi, this looks like an issue with the networking config. Could you please check the routing configs, firewall rules, etc., to make sure the destination IP with port 9092 is allowed in the Azure console?
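One quick way to narrow this down from the Databricks side is a raw TCP check from a notebook to the broker host and port (the host name below is a placeholder):

```python
import socket

KAFKA_HOST = "my-kafka-server.internal"  # placeholder
KAFKA_PORT = 9092

try:
    with socket.create_connection((KAFKA_HOST, KAFKA_PORT), timeout=5):
        print(f"TCP connection to {KAFKA_HOST}:{KAFKA_PORT} succeeded")
except OSError as e:
    # A timeout or 'connection refused' here points at NSG/firewall/routing
    # configuration rather than at Kafka or Spark settings.
    print(f"TCP connection failed: {e}")
```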
Hello, since December 2023 I can no longer invite users to connect to my workspace as I used to. For no reason, the users I add through my admin dashboard do not receive the invitation email, and thus the link to connect to the workspace. I tried my...
Hi, could you also please try to add users through the account console if Identity Federation is enabled? Refer: https://docs.databricks.com/en/administration-guide/users-groups/users.html#assign-a-user-to-a-workspace-using-the-account-console
Hi, quite excited to see the new release of databricks-connect, I started writing unit tests that run pyspark on a Databricks cluster using databricks-connect. After some successful basic unit tests, I tested some more chained transformations on a dataf...
I doubled the `spark.connect.grpc.maxInboundMessageSize` parameter to 256 MB, but that didn't appear to resolve anything.
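For anyone trying to reproduce this, a minimal pytest-style unit test against a cluster via databricks-connect looks roughly like this; it assumes databricks-connect 13.x+ and a configured default profile or DATABRICKS_* environment variables. Long transformation chains inflate the logical plan sent over gRPC, which is where message-size limits tend to surface:

```python
# test_transforms.py -- minimal sketch, not the poster's actual test.
from databricks.connect import DatabricksSession
from pyspark.sql import functions as F

def test_chained_transformations():
    spark = DatabricksSession.builder.getOrCreate()
    df = spark.range(1000)
    # Chain several transformations to grow the plan sent to the cluster.
    for i in range(10):
        df = df.withColumn(f"c{i}", F.col("id") + i)
    assert df.count() == 1000
```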
Hi, when I run the below simple code over my Unity Catalog on a shared cluster, it works very well. But on a single-user cluster I am getting: Failed to acquire a SAS token for list on /__unitystorage/schemas/1bb5b053-ac96-471b-8077-8288c56c8a20/tab...
Hi, could you please refer to the limitations here: https://docs.databricks.com/en/compute/access-mode-limitations.html. Please let us know if this helps.
I want to access data in another Databricks workspace from my Databricks workspace. How can I do that?
Hello, many thanks for your question. To be able to provide you with a more precise response, we require some additional information: 1. When you say "databricks in my databricks", are you referring to accessing data that is in one workspace from another wor...
If I want to move multiple (hundreds of) notebooks at the same time from one folder to another, what is the best way to do that, other than going to each individual notebook and clicking "Move"? Is there a way to programmatically move notebooks? Like ...
You should be redirected to the KB page, but this is the information contained:
Problem: How to migrate Shared folders and the notebooks.
Cause: Shared notebooks are not migrated into the new workspace by default.
Solution: Please find the script to migrate t...
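More generally, bulk moves can be scripted against the Workspace REST API. To my knowledge there is no single "move" endpoint, so the usual pattern is export, re-import, then delete. A minimal sketch (host, token, and paths are placeholders; DBC format is used so no language field is needed on re-import):

```python
import requests

HOST = "https://<workspace-url>"  # placeholder
TOKEN = "<pat>"                   # placeholder
HEADERS = {"Authorization": f"Bearer {TOKEN}"}

def move_notebook(src: str, dst: str) -> None:
    # Export the notebook as a DBC archive (base64 in the response).
    exp = requests.get(f"{HOST}/api/2.0/workspace/export",
                       headers=HEADERS, params={"path": src, "format": "DBC"})
    exp.raise_for_status()
    # Re-import it at the new path.
    requests.post(f"{HOST}/api/2.0/workspace/import",
                  headers=HEADERS,
                  json={"path": dst, "format": "DBC",
                        "content": exp.json()["content"]}).raise_for_status()
    # Only delete the original once the import has succeeded.
    requests.post(f"{HOST}/api/2.0/workspace/delete",
                  headers=HEADERS, json={"path": src}).raise_for_status()
```

For whole folders, the Databricks CLI's `workspace export_dir` / `import_dir` commands wrap the same endpoints.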
We can access the Azure Databricks API using a personal access token that we create manually. The objective is that the client doesn't want to store the personal access token, which may not be secure. Do we have an option to generate the token during ...
Hi @Phani1, yes, you can now use the Databricks "Create a user token" API to create an access token via an automated API call. Please refer to the doc below: Create a user token | Token API | REST API reference | Azure Databricks
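A minimal sketch of that call. The host is a placeholder, and note that the call itself must be authenticated; on Azure that bootstrap credential would typically be an Azure AD token obtained at runtime (e.g. for a service principal) rather than a stored PAT:

```python
import requests

HOST = "https://<workspace-url>"         # placeholder
AAD_TOKEN = "<azure-ad-access-token>"    # placeholder, obtained at runtime

resp = requests.post(
    f"{HOST}/api/2.0/token/create",
    headers={"Authorization": f"Bearer {AAD_TOKEN}"},
    json={"comment": "short-lived automation token",
          "lifetime_seconds": 3600})     # token expires after one hour
resp.raise_for_status()
pat = resp.json()["token_value"]         # use immediately; avoid persisting
```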
I used to use DBFS with mounted directories, and now I want to switch to Volumes for storing my jars and application.conf for pipelines. I see my application.conf file in Data Explorer > Catalog > Volumes, and I also see the file with dbutils.fs.ls("/...
Volume mounts are accessible using Scala code only on a shared cluster. In single-user mode this feature is not supported yet. We use init scripts to move contents from Volumes to the cluster's local drive when we need to access files from native Scala ...
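Besides init scripts, a simple in-notebook workaround is to copy the file from the Volume to the driver's local disk first, so JVM/Scala code can read it via a plain local path (the Volume path below is hypothetical):

```python
# Copy a config file from a UC Volume to the driver's local filesystem;
# native Scala/Java code can then read it from /tmp/application.conf.
dbutils.fs.cp(
    "/Volumes/main/default/configs/application.conf",  # hypothetical path
    "file:/tmp/application.conf")
```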
Super basic question. For DLT pipelines I see there's an option to add multiple "Paths". Is it generally best practice to completely separate `bronze` from `silver` notebooks? Or is it more recommended to bundle both raw `bronze` and clean `silver` d...
This is great! I completely missed the list view before.