Get Started Discussions
Start your journey with Databricks by joining discussions on getting started guides, tutorials, and introductory topics. Connect with beginners and experts alike to kickstart your Databricks experience.

Forum Posts

ChristianRRL
by Contributor II
  • 2469 Views
  • 2 replies
  • 1 kudos

DLT Primary Key Deduplication: Expectations vs. Constraints vs. Other?

I'm trying to figure out the best way to "de-duplicate" data via DLT. Currently, my only leads are: Manage data quality with Delta Live Tables | Databricks on AWS, via "Drop invalid records"; Constraints on Databricks | Databricks on AWS, via "pre-de...

Get Started Discussions
Auto Loader
autoloader
Delta Live Table
Delta Live Table Pipeline
dlt
Latest Reply
Palash01
Contributor III
  • 1 kudos

Hey @ChristianRRL, based on my understanding, you want to de-duplicate your data during DLT pipeline processing. Unfortunately, I was not able to find a solution when I ran into this problem, due to native feature limitations. Limitations...

1 More Replies
Phani1
by Valued Contributor
  • 6612 Views
  • 3 replies
  • 1 kudos

ADF vs Databricks

Hi Team, I would appreciate your suggestions on when to choose ADF (Azure Data Factory) versus Databricks for orchestration, as well as any significant differences between them. Regards, Phanindra

Latest Reply
Michael_Galli
Contributor II
  • 1 kudos

Hi, I work with both, so it depends on the use case. ADF is easy to set up and good for data integration, e.g. a "copy data" job to transfer files from storage 1 to storage 2. ADF data flows (data transformations) can be used to some level, but when the tr...

2 More Replies
harvey-c
by New Contributor III
  • 1559 Views
  • 5 replies
  • 0 kudos

DLT Performance question with Unity Catalog

Dear Community Members, This question is about debugging a performance issue in a DLT pipeline with Unity Catalog. I had a DLT pipeline in Azure Databricks running on the local store, i.e. hive_metastore. The process took about 2 hours with the auto scalain...

Latest Reply
Mystagon
New Contributor II
  • 0 kudos

Hey Harvey, I'm getting around the same performance problems as you: from around 25 minutes in a normal workspace to 1 hour and 20 minutes in a UC workspace, which is roughly 3x slower. Did you manage to solve this? I've also noticed dbutils.fs.ls() is much ...

4 More Replies
ChristianRRL
by Contributor II
  • 558 Views
  • 1 replies
  • 0 kudos

DQ Expectations Best Practice

Hi there, I hope this is a fairly simple and straightforward question. I'm wondering if there's a "general" consensus on where along the DLT data ingestion + transformation process data quality expectations should be applied. For example, two very si...

Latest Reply
ilarsen
Contributor
  • 0 kudos

I'll offer my opinion.  I see both of those checks (and treatments, if you're converting types for example) as something for the clean/silver/staging/whatever-you-call-it layer.  For us, our bronze layer represents the source data as-is, with SCD typ...

SamyA
by New Contributor II
  • 5794 Views
  • 7 replies
  • 3 kudos

System table with state UNAVAILABLE

Hello, When I check the system tables' status, it seems that they are in the UNAVAILABLE state. I would like to know if anyone has faced this issue. Because of that, I can't enable the system tables. {"schemas":[{"schema":"storage","state":"UNAVAILABLE"},...

Latest Reply
D365
New Contributor II
  • 3 kudos

Maybe an internal issue.

6 More Replies
abhijit007
by New Contributor
  • 1507 Views
  • 1 replies
  • 0 kudos

Unable to connect to Azure Kafka server with public IP from Databricks notebook

Hi Team, I am unable to connect (SSH connection) from an Azure Databricks notebook to an Azure Kafka server. The Kafka server and Databricks are both in the same resource group and region. Also, the port has been added to the inbound rules on the Kafka server. Please help me to r...

Latest Reply
Debayan
Esteemed Contributor III
  • 0 kudos

Hi, This looks like an issue with the networking config. Could you please check the routing configs, firewall routes, etc. to make sure the destination IP on port 9092 is added in the Azure console?

311102
by New Contributor
  • 817 Views
  • 1 replies
  • 0 kudos

user email invitation to workspace not received

Hello, Since December 2023, I can no longer invite users to connect to my workspace as I used to. For no apparent reason, the users I add through my admin dashboard do not receive the invitation email and thus the link to connect to the workspace. I tried my...

Latest Reply
Debayan
Esteemed Contributor III
  • 0 kudos

Hi, Could you also please try adding users through the account console if Identity Federation is enabled? Refer: https://docs.databricks.com/en/administration-guide/users-groups/users.html#assign-a-user-to-a-workspace-using-the-account-console

thibault
by Contributor II
  • 8873 Views
  • 11 replies
  • 6 kudos

databricks-connect 13.1.0 limitations

Hi, Quite excited to see the new release of databricks-connect, I started writing unit tests running pyspark on a Databricks cluster using databricks-connect. After some successful basic unit tests, I tested more chained transformations on a dataf...

Latest Reply
jackson-nline
New Contributor III
  • 6 kudos

I doubled the `spark.connect.grpc.maxInboundMessageSize` parameter to 256mb but that didn't appear to resolve anything.

10 More Replies
Khushisi
by New Contributor II
  • 396 Views
  • 1 replies
  • 0 kudos

Databricks to make a machine learning model

Hey all, I've been using a voice cloning AI and it's working well. I'm thinking of using Databricks to make a machine learning model for speech tech. I want to start with personal content creation. Any tips or advice would be great!

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @Khushisi, That sounds like an exciting project! Databricks is a great platform for developing machine learning models, including those for speech technology.   Here are some steps you can follow to build your model on Databricks:   Prepare your d...

margutie
by New Contributor
  • 578 Views
  • 2 replies
  • 0 kudos

Error from KNIME through proxy

I want to connect to Databricks from Knime on a company computer that uses a proxy. The error I'm encountering is as follows: ERROR Create Databricks Environment 3:1 Execute failed: Could not open the client transport with JDBC URI: jdbc:hive2://adb-...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your question? This...

1 More Replies
Phani1
by Valued Contributor
  • 887 Views
  • 2 replies
  • 1 kudos

Billing usage per user

Hi Team, Unity Catalog is not enabled in our workspace. We would like to know the billing usage information per user. Could you please help us get these details (using a notebook-level script)? Regards, Phanindra

Latest Reply
Kaniz_Fatma
Community Manager
  • 1 kudos

Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your question? This...

1 More Replies
udi_azulay
by New Contributor II
  • 492 Views
  • 2 replies
  • 0 kudos

Running sql command on Single User cluster vs Shared.

Hi, when I run the simple code below over my Unity Catalog on a Shared cluster, it works very well. But on a Single User cluster, I get: Failed to acquire a SAS token for list on /__unitystorage/schemas/1bb5b053-ac96-471b-8077-8288c56c8a20/tab...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your question? This...

1 More Replies
Databricks_Work
by New Contributor II
  • 2071 Views
  • 1 replies
  • 0 kudos

How to access data in one Databricks from another Databricks

I want to access data in another Databricks from my Databricks. How do I do that?

Latest Reply
Walter_C
Honored Contributor
  • 0 kudos

Hello, many thanks for your question. To provide you with a more precise response, we require some additional information: 1. When you refer to "databricks in my databricks", are you referring to accessing data that is in one workspace from another wor...

hbs59
by New Contributor III
  • 4507 Views
  • 4 replies
  • 2 kudos

Resolved! Move multiple notebooks at the same time (programmatically)

If I want to move multiple (hundreds of) notebooks at the same time from one folder to another, what is the best way to do that, other than going to each individual notebook and clicking "Move"? Is there a way to programmatically move notebooks? Like ...

Latest Reply
Walter_C
Honored Contributor
  • 2 kudos

You should be redirected to the KB page, but this is the information contained: Problem: How to migrate Shared folders and the notebooks. Cause: Shared notebooks are not migrated into new workspace by default. Solution: Please find the script to migrate t...

3 More Replies
Phani1
by Valued Contributor
  • 1992 Views
  • 2 replies
  • 2 kudos

Databricks API using the personal access token

We can access the Azure Databricks API using a personal access token that we create manually. The objective is that the client doesn't want to store the personal access token, which may not be secure. Do we have an option to generate the token during ...

Latest Reply
Kaniz_Fatma
Community Manager
  • 2 kudos

Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your question? This...

1 More Replies