Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

dvmentalmadess
by Valued Contributor
  • 6687 Views
  • 10 replies
  • 2 kudos

Resolved! Data Explorer minimum permissions

What minimum permissions are required to search and view objects in Data Explorer? For example, does a user have to have `USE [SCHEMA|CATALOG]` to search or browse in the Data Explorer? Or can anyone with workspace access browse objects and, ...

Latest Reply
bearded_data
New Contributor III
  • 2 kudos

Circling back to this. With one of the recent releases you can now GRANT BROWSE at the catalog level! Hopefully they will be rolling this feature out to every object level (schemas and tables specifically).

9 More Replies
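The catalog-level grant described in the reply can be sketched as follows (catalog and group names are hypothetical; BROWSE requires Unity Catalog):

```sql
-- Hypothetical names: members of `data-consumers` can see object metadata
-- in Catalog Explorer without being granted read access to the data itself.
GRANT BROWSE ON CATALOG main TO `data-consumers`;
```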
dollyb
by Contributor
  • 2031 Views
  • 2 replies
  • 0 kudos

Resolved! Differences between Spark SQL and Databricks

Hello, I'm using a local Docker Spark 3.5 runtime to test my Databricks Connect code. However, I've come across a couple of cases where my code would work in one environment but not the other. Concrete example: I'm reading data from BigQuery via spark....

Latest Reply
daniel_sahal
Esteemed Contributor
  • 0 kudos

@dollyb That's because when you've added another dependency on Databricks, it doesn't really know which one it should use. By default it's using the built-in com.google.cloud.spark.bigquery.BigQueryRelationProvider. What you can do is provide the whole packag...

1 More Replies
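As a sketch of the workaround the reply hints at, the read can name the provider class explicitly instead of relying on an ambiguous short name (PySpark sketch; requires an attached `spark` session, and the table name is a hypothetical placeholder):

```
# Pinning the data source implementation disambiguates which BigQuery
# provider Spark should use when more than one is on the classpath.
df = (
    spark.read
    .format("com.google.cloud.spark.bigquery.BigQueryRelationProvider")
    .option("table", "my_project.my_dataset.my_table")  # hypothetical table
    .load()
)
```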
thiagoawstest
by Contributor
  • 945 Views
  • 1 reply
  • 0 kudos

Azure Devops - Entra ID - AWS Databricks

Hi, I need to integrate Azure DevOps repos with AWS Databricks, but not via a personal token. I need it via a service principal, integrated with Azure Entra ID. In Azure Databricks, when I go to create a service principal, "Entra ID application ID" appears, but in ...

christian_chong
by New Contributor III
  • 1141 Views
  • 1 reply
  • 0 kudos

Resolved! unity catalog with external table and column masking

Hi everybody, I am facing an issue with Spark Structured Streaming. Here is a sample of my code:   df = spark.readStream.load(f"{bronze_table_path}") df.writeStream \ .format("delta") \ .option("checkpointLocation", f"{silver_checkpoint}") \ .option("me...

Latest Reply
christian_chong
New Contributor III
  • 0 kudos

My first message was not well formatted. I wrote:  df = spark.readStream.load(f"{bronze_table_path}") df.writeStream \ .format("delta") \ .option("checkpointLocation", f"{silver_checkpoint}") \ .option("mergeSchema", "true") \ .trigger(availabl...

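Reflowed, the streaming snippet quoted in the thread corresponds to a pattern like this (PySpark sketch; the paths are the poster's own placeholders, and the trigger and target are assumptions reconstructed from the truncated text):

```
# Read the bronze Delta path as a stream and write it to silver.
df = spark.readStream.load(f"{bronze_table_path}")

(df.writeStream
   .format("delta")
   .option("checkpointLocation", f"{silver_checkpoint}")
   .option("mergeSchema", "true")
   .trigger(availableNow=True)       # assumption: the truncated .trigger(availabl...
   .start(f"{silver_table_path}"))   # hypothetical target path
```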
philipkd
by New Contributor III
  • 2104 Views
  • 1 reply
  • 0 kudos

Cannot get past Query Data tutorial for Azure Databricks

I created a new workspace on Azure Databricks, and I can't get past this first step in the tutorial: DROP TABLE IF EXISTS diamonds; CREATE TABLE diamonds USING CSV OPTIONS (path "/databricks-datasets/Rdatasets/data-001/csv/ggplot2/diamonds.csv", hea...

Latest Reply
dollyb
Contributor
  • 0 kudos

Struggling with this as well. So using dbfs:/ with a CREATE TABLE statement works on AWS, but not on Azure?

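The dbfs:/ variant the reply mentions looks like this (a sketch of the tutorial statement with an explicit scheme; whether it works depends on workspace configuration, e.g. Unity Catalog restrictions on DBFS access):

```sql
DROP TABLE IF EXISTS diamonds;
CREATE TABLE diamonds
USING CSV
OPTIONS (
  path "dbfs:/databricks-datasets/Rdatasets/data-001/csv/ggplot2/diamonds.csv",
  header "true"
);
```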
Devsql
by New Contributor III
  • 3575 Views
  • 1 reply
  • 0 kudos

Measure size of all tables in Azure databricks

Hi Team, currently I am trying to find the size of all tables in my Azure Databricks workspace, as I am trying to get an idea of current data loading trends so I can plan a data forecast (i.e., in the last 2 months approx. 100 GB of data came in, so in the next 2-3 months there ...

Latest Reply
Devsql
New Contributor III
  • 0 kudos

Hi @Retired_mod, 1- Regarding this issue I had found the link below: https://kb.databricks.com/sql/find-size-of-table#:~:text=You%20can%20determine%20the%20size,stats%20to%20return%20the%20size Now, to try the above link, I need to decide: Delta-Table vs Non-De...

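For Delta tables, one way to get a per-table size, in line with the KB article linked above, is DESCRIBE DETAIL (the table name is hypothetical; non-Delta tables need a different approach, such as analyzing storage directly):

```sql
-- The sizeInBytes column in the result is the current size of the
-- Delta table's data files; run per table and sum for a total.
DESCRIBE DETAIL my_catalog.my_schema.my_table;
```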
yvuignie
by Contributor
  • 1038 Views
  • 1 reply
  • 0 kudos

Asset Bundles webhook not working

Hello, the webhook notifications defined in asset bundles for Databricks jobs are not taken into account and therefore not created. For instance, this is not working: resources: jobs: job1: name: my_job webhook_notifications: on...

Latest Reply
yvuignie
Contributor
  • 0 kudos

Hello @Retired_mod, thank you for your help. However, we did check the job configuration multiple times. If we substitute 'webhook_notifications' with 'email_notifications' it works, so the syntax is correct. Here is a sample of our configuration: For the...

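For reference, the flattened config quoted in the thread would be shaped roughly like this in a Databricks Asset Bundle (the destination ID is a placeholder; the webhook notification destination must already exist in the workspace):

```yaml
resources:
  jobs:
    job1:
      name: my_job
      webhook_notifications:
        on_failure:
          - id: 0a1b2c3d-0000-0000-0000-000000000000  # placeholder destination ID
```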
N_M
by Contributor
  • 1736 Views
  • 1 reply
  • 0 kudos

Access historical injected data of COPY INTO command

Dear Community, I'm using the COPY INTO command to automate the staging of files that I get in an S3 bucket into specific Delta tables (with some transformation on the fly). The command works smoothly, and files are indeed inserted only once (writing i...

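One way to inspect what COPY INTO has already ingested is the target table's Delta history, where each COPY INTO run appears as a commit with operation metrics (table name hypothetical; the exact metric names can vary by runtime version):

```sql
-- Each COPY INTO run shows up as a commit; operationMetrics includes
-- counts such as numFiles / numOutputRows for that load.
DESCRIBE HISTORY my_schema.my_staging_table;
```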
ChingizK
by New Contributor III
  • 3285 Views
  • 2 replies
  • 1 kudos

Resolved! Workflow Failure Alert Webhooks for OpsGenie

I'm trying to set up a Workflow Job webhook notification to send an alert to the OpsGenie REST API on job failure. We've set up Teams & email successfully. We've created the webhook, and when I configure "On Failure" I can see it in the JSON/YAML view. How...

Data Engineering
jobs
opsgenie
webhooks
Workflows
Latest Reply
portoedu
New Contributor III
  • 1 kudos

Hi guys, I found a workaround: create an email integration in OpsGenie and then create a Databricks notification destination with that email.

1 More Replies
AdventureAce
by New Contributor III
  • 637 Views
  • 0 replies
  • 0 kudos

Short-lived token from Unity Catalog

What is this short-lived token shared by Unity Catalog in steps 4 and 5 here? And how does the cloud storage authenticate the token generated by Unity Catalog?

Pálmi
by New Contributor II
  • 1119 Views
  • 2 replies
  • 1 kudos

IoT Hub with Kafka connector - how to decode the enqueued timestamp and device id

I'm reading data from the default endpoint of an IoT Hub in Azure using the Kafka connector in Databricks. Most data items are straightforward, but I haven't been able to properly decode the device id and the timestamp. For example, the key-value map...

Latest Reply
Erik
Valued Contributor III
  • 1 kudos

https://github.com/Azure/azure-event-hubs-for-kafka/issues/56#issuecomment-1432006831

1 More Replies
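Per the linked GitHub issue, the Event Hubs Kafka endpoint delivers IoT Hub system properties (such as the enqueued time and device id) as record headers in raw AMQP encoding, so they need a small decoder rather than a plain UTF-8 cast. A minimal sketch, assuming the headers arrive as an AMQP timestamp (type code 0x83, big-endian milliseconds since epoch) and a short AMQP string (type code 0xa1, one length byte):

```python
import struct
from datetime import datetime, timezone

def decode_amqp_header(raw: bytes):
    """Decode the simple AMQP-encoded values the Event Hubs Kafka endpoint
    puts in record headers; covers only timestamps and short UTF-8 strings."""
    code = raw[0]
    if code == 0x83:  # AMQP timestamp: 8-byte big-endian millis since epoch
        (millis,) = struct.unpack(">q", raw[1:9])
        return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)
    if code == 0xA1:  # AMQP str8-utf8: one length byte + UTF-8 payload
        length = raw[1]
        return raw[2:2 + length].decode("utf-8")
    raise ValueError(f"unsupported AMQP type code {code:#04x}")

# Example header values as the Kafka connector would deliver them:
device_id = decode_amqp_header(b"\xa1\x09device-01")
enqueued = decode_amqp_header(b"\x83" + struct.pack(">q", 1700000000000))
```

In a stream, this would be applied to the `x-opt-enqueued-time` and `iothub-connection-device-id` header values, e.g. via a UDF over the headers column.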
aozero
by New Contributor II
  • 1330 Views
  • 3 replies
  • 0 kudos

Deleting data programmatically from databricks live delta tables

Hello all, I am relatively new to data engineering and working on a project requiring me to programmatically delete data from Delta Live Tables. However, I found that simply stopping the streaming job and deleting rows from the Delta tables caused th...

Latest Reply
aozero
New Contributor II
  • 0 kudos

Hi @shan_chandra, a full refresh brings back the deleted data since it still exists in the Pub/Sub source.

2 More Replies
Eiki
by New Contributor
  • 403 Views
  • 1 reply
  • 0 kudos

How to use the same job cluster for different job runs inside one workflow

I created a workflow with notebooks and some job runs, but I would like to use only one job cluster for every job run, without creating a new job cluster for each one, because I didn't want to increase the execution time with each new job cluster ...

Latest Reply
brockb
Databricks Employee
  • 0 kudos

Hi, if I understand correctly, you are hoping to reduce overall job execution time by reducing the Cloud Service Provider instance provisioning time. Is that correct? If so, you may want to consider using a pool of instances: https://docs.databricks.c...

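The usual way to reuse one cluster across every task in a single workflow is a shared job cluster, declared once and referenced by each task via job_cluster_key (values below are illustrative placeholders):

```yaml
resources:
  jobs:
    my_workflow:
      name: my_workflow
      job_clusters:
        - job_cluster_key: shared_cluster
          new_cluster:
            spark_version: 14.3.x-scala2.12   # illustrative runtime
            node_type_id: i3.xlarge           # illustrative node type
            num_workers: 2
      tasks:
        - task_key: task_a
          job_cluster_key: shared_cluster     # both tasks run on the same cluster
          notebook_task:
            notebook_path: /Workspace/notebooks/task_a
        - task_key: task_b
          depends_on:
            - task_key: task_a
          job_cluster_key: shared_cluster
          notebook_task:
            notebook_path: /Workspace/notebooks/task_b
```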

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.
