Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

berk
by New Contributor II
  • 1284 Views
  • 2 replies
  • 1 kudos

Delete Managed Table from S3 Bucket

Hello, I am encountering an issue with our managed tables in Databricks. The tables are stored in an S3 bucket. When I drop a managed table (either through the UI or by running DROP TABLE in a notebook), the associated data is not being deleted f...

Latest Reply
berk
New Contributor II
  • 1 kudos

@kenkoshaw, thank you for your reply. It is indeed interesting that the data isn't immediately deleted after the table is dropped, and that there's no way to force this process. I suppose I'll have to manually delete the files from the S3 Bucket if I...

1 More Replies
Ian_Neft
by New Contributor
  • 10795 Views
  • 3 replies
  • 0 kudos

Data Lineage in Unity Catalog not Populating

I have been trying to get the data lineage to populate with the simplest of queries on a Unity Catalog-enabled catalog with a Unity Catalog-enabled cluster. I am essentially running the example provided with more data to see how it works with various aggregates dow...

Latest Reply
AlexYu
New Contributor III
  • 0 kudos

You might need to update your outbound firewall rules to allow for connectivity to the Amazon Kinesis / Event Hubs endpoint. https://docs.databricks.com/en/data-governance/unity-catalog/data-lineage.html#:~:text=To%20view%20lineage%20for%20a,Runtime%2...

2 More Replies
Roxio
by New Contributor II
  • 942 Views
  • 1 reply
  • 1 kudos

Resolved! Materialized view much slower than table, with most time spent on "Optimizing query & pruning files"

I have a query that calls several materialized views, but most of the query time is spent in "Optimizing query & pruning files" rather than in execution. The difference is about 2-3 s for the optimization versus 300-400 ms for the execution. Similar i...

Latest Reply
Brahmareddy
Honored Contributor
  • 1 kudos

Hi Roxio, how are you doing today? The difference in query times between materialized views and tables likely comes from the complexity of the views, as they often involve more steps in the background. To reduce the optimization time, you can try simp...

basit2point0
by New Contributor II
  • 401 Views
  • 1 reply
  • 0 kudos

Auto Loader error when assuming a role

Hi @Retired_mod, I have seen numerous posts by you. Thanks for continuously providing support. Can you or your colleagues help with this? We have a basic user which assumes a role with an S3 policy for a specific bucket. When we try to read the bucket from D...

Latest Reply
basit2point0
New Contributor II
  • 0 kudos

Py4JJavaError: An error occurred while calling o503.json. : java.nio.file.AccessDeniedException: s3a://xxxxxx.json: shaded.databricks.org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials provided by AwsCredentialContextTokenProvid...

manoj_
by New Contributor II
  • 663 Views
  • 1 reply
  • 0 kudos

Databricks view error

Data source error: DataSource.Error: ODBC: ERROR [42000] [Microsoft][Hardy] (80) Syntax or semantic analysis error thrown in server while executing query. Error message from server: org.apache.hive.service.cli.HiveSQLException: Error running query: [DEL...

Latest Reply
manoj_
New Contributor II
  • 0 kudos

This view ran without problems until last week and then suddenly started giving this error, so I need to find out what is causing it.

ksilva
by New Contributor
  • 3865 Views
  • 4 replies
  • 1 kudos

Incorrect secret value when loaded as environment variable

I recently faced an issue that took a good few hours to identify. I'm loading an environment variable with a secret (ENVVAR: {{secrets/scope/key}}). The secret is loaded in my application, and I could verify it's there, but its value is not correct. I realised tha...

Latest Reply
danmlopsmaz
New Contributor II
  • 1 kudos

Hi team, is there an update or fix for this?

3 More Replies
marcuskw
by Contributor II
  • 2069 Views
  • 5 replies
  • 5 kudos

Resolved! IDENTIFIER not working in UPDATE

The following code works perfectly fine: df = spark.createDataFrame([('A', 1), ('B', 2)]) df.createOrReplaceTempView('temp') spark.sql(""" SELECT IDENTIFIER(:col) FROM temp """, args={ "col": "_1" } ).display(...

Latest Reply
marcuskw
Contributor II
  • 5 kudos

If it helps anyone else, I found this article that describes a few limitations: https://community.databricks.com/t5/technical-blog/how-not-to-build-an-execute-immediate-demo/ba-p/82167

4 More Replies
leireroman
by New Contributor III
  • 1539 Views
  • 3 replies
  • 0 kudos

Resolved! RESOURCE_EXHAUSTED dbutils.jobs.taskValues.get

I have a job in Databricks running multiple tasks in parallel. Those tasks read the job's parameters using dbutils. I'm getting the following error when trying to read parameters in my different tasks: com.databricks.common.client.Databr...

Latest Reply
leireroman
New Contributor III
  • 0 kudos

Hi all, Our solution has been to use job parameters and dynamic value references. These are read using dbutils.widgets.get() instead of dbutils.jobs.taskValues.get(). Now, our ETL is working well again. Pass context about job runs into job tasks - Azur...

2 More Replies
4kb_nick
by New Contributor III
  • 1478 Views
  • 3 replies
  • 0 kudos

Unity Catalog Lineage Not Working on GCP

Hello, We have set up a lakehouse in Databricks for one of our clients. One of the features our client would like to use is the Unity Catalog data lineage view. This is a handy feature that we have used with other clients (in both AWS and Azure) witho...

Latest Reply
4kb_nick
New Contributor III
  • 0 kudos

Hello, It's been a few months since this exchange. The feature limitation is not documented anywhere. The documentation implies that this should be working in GCP: https://docs.gcp.databricks.com/en/data-governance/unity-catalog/data-lineage.html Is this feature...

2 More Replies
Valentin14
by New Contributor II
  • 7549 Views
  • 5 replies
  • 4 kudos

Import module never ends on random branches

Hello, For the past week, our notebooks have been getting stuck in the running state on the first cells, which import Python modules from our GitHub repository cloned in Databricks. The cells stay in the running state, and when we try to manually cancel the jobs in databric...

Latest Reply
timo199
New Contributor II
  • 4 kudos

@Retired_mod 

4 More Replies
SebastianCar28
by New Contributor
  • 281 Views
  • 0 replies
  • 0 kudos

How to implement a data lifecycle when using ADLS

Hello everyone, nice to greet you. I have a question about the data lifecycle in ADLS. I know ADLS has its own rules, but they aren't working properly because I have two ADLS accounts: one for hot data and another for cool storage where the informati...

weldermartins
by Honored Contributor
  • 7679 Views
  • 6 replies
  • 10 kudos

Resolved! Spark - API Jira

Hello guys. I use PySpark in my daily work. A requirement has come up to collect information from Jira. I was able to do this via Talend ESB, but I would rather not use different tools to get the job done. Do you have any example of how to extract data from ...

Latest Reply
Marty73
New Contributor II
  • 10 kudos

Hi, There is also a new Databricks for Jira add-on on the Atlassian Marketplace. It is easy to set up, and exports are created directly within Jira. They can be one-time, scheduled, or real-time. It can also export additional Jira data such as Assets, C...
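
For readers who want to stay in PySpark as the original question asked, here is a minimal sketch of one possible approach (not the add-on's method and not taken from the thread). It assumes you have already fetched an issue-search response from Jira's REST API as a Python dict with an "issues" list whose items carry "key" and "fields". The function name flatten_issues and the sample payload are hypothetical, and the fields picked here are only an illustration.

```python
from typing import Any


def flatten_issues(payload: dict[str, Any]) -> list[dict[str, Any]]:
    """Turn a Jira-style search response into flat rows (one dict per issue)."""
    rows = []
    for issue in payload.get("issues", []):
        fields = issue.get("fields") or {}
        status = fields.get("status") or {}
        rows.append(
            {
                "key": issue.get("key"),
                "summary": fields.get("summary"),
                "status": status.get("name"),
            }
        )
    return rows


# Hypothetical sample payload, shaped like an issue-search response.
sample = {
    "issues": [
        {"key": "DATA-1", "fields": {"summary": "Load table", "status": {"name": "Done"}}},
        {"key": "DATA-2", "fields": {"summary": "Fix job", "status": None}},
    ]
}

rows = flatten_issues(sample)
print(rows)
```

In a notebook, the resulting list of dicts could then be handed to spark.createDataFrame(rows) with an explicit schema, since the None values make schema inference unreliable.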

5 More Replies
lurban
by New Contributor
  • 5051 Views
  • 1 reply
  • 0 kudos

[INTERNAL_ERROR] The Spark SQL phase analysis failed with an internal error

Hello, I am currently working through an issue I am seeing when querying a Lakehouse Federation UC table in a workflow. I am using PySpark to query a table through Lakehouse Federation which returns a result based on the query. When running it in a ce...

Latest Reply
LindasonUk
New Contributor II
  • 0 kudos

I hit a similar error when trying to view FC data using a cluster with a lower Databricks Runtime. It needs to be DBR 13.1 or higher.

pgrandjean
by New Contributor III
  • 13491 Views
  • 6 replies
  • 2 kudos

How to transfer ownership of a database and/or table?

We created a new Service Principal (SP) on Azure and would like to transfer the ownership of the databases and tables created with the old SP. The issue is that these databases and tables are not visible to the users using the new SP. I am using a Hiv...

Latest Reply
VivekChandran
New Contributor II
  • 2 kudos

Regarding the [PARSE_SYNTAX_ERROR] Syntax error at or near 'OWNER': remember to wrap the new owner name in the SQL statement in backticks (`), as in this sample: ALTER SCHEMA schema_name OWNER TO `new_owner_name`;
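
The backtick quoting described in this reply can be sketched as a small Python helper that builds the statement as a string. This is only an illustration: the function names quote_identifier and alter_schema_owner_sql are hypothetical, the example schema and owner values are made up, and it assumes the usual Spark SQL convention that a backtick inside a quoted identifier is escaped by doubling it.

```python
def quote_identifier(name: str) -> str:
    """Wrap a name in backticks, doubling any backtick it already contains."""
    return "`" + name.replace("`", "``") + "`"


def alter_schema_owner_sql(schema_name: str, new_owner: str) -> str:
    """Build an ALTER SCHEMA ... OWNER TO statement with quoted identifiers."""
    return (
        f"ALTER SCHEMA {quote_identifier(schema_name)} "
        f"OWNER TO {quote_identifier(new_owner)};"
    )


# Hypothetical values: owner names containing '-' or '@' are a common
# reason an unquoted OWNER TO clause fails to parse.
statement = alter_schema_owner_sql("sales", "new-sp@example.com")
print(statement)
```

In a notebook, the resulting string could be passed to spark.sql(statement); quoting both names keeps the statement valid when either contains special characters.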

5 More Replies