Data Engineering

Forum Posts

by stevenayers-bge • New Contributor III

05-02-2024 2:24:53 AM

847 Views
4 replies
2 kudos

Bug: Shallow Clone `create or replace` causing [TABLE_OR_VIEW_NOT_FOUND]

I am having an issue where, when I do a shallow clone using: create or replace table `catalog_a_test`.`schema_a`.`table_a` shallow clone `catalog_a`.`schema_a`.`table_a`, I get: [TABLE_OR_VIEW_NOT_FOUND] The table or view catalog_a_test.schema_a.table_a...

Latest Reply

Omar_hamdan
Community Manager

05-02-2024 7:19:45 AM

2 kudos

Hi Steven, this is really a strange issue. First, let's exclude some possible causes. We need to check the following: - The permissions on table A and Catalog B. Take a look at the following link to check what permission is needed: https://docs.d...

by gauravchaturved • New Contributor II

05-22-2024 7:08:20 AM

619 Views
1 reply
1 kudos

Resolved! Can I delete specific partition from a Delta Live Table?

If I have created a Delta Live Table with a partition on a column (let's say a date column) from a stream source, can I delete the partition for specific date values later to save on cost and to keep the table lean? If I can, then: 1- how to do it? 2- do I...

Latest Reply

raphaelblg
Honored Contributor II

05-22-2024 3:52:30 PM

1 kudos

Hello @gauravchaturved, you can remove the partition by filtering it out in your source code and triggering a full refresh of your pipeline. There is no need to run vacuum, as DLT has maintenance clusters that perform OPTIMIZE and VACUUM operations on y...

by StephanKnox • New Contributor II

05-20-2024 3:55:53 AM

1234 Views
3 replies
2 kudos

Unit Testing with PyTest in Databricks - ModuleNotFoundError

Dear all, I am following the guide in this article: https://docs.databricks.com/en/notebooks/testing.html however I am unable to run pytest due to the following error: ImportError while importing test module '/Workspace/Users/deadmanhide@gmail.com/test...

Latest Reply

StephanKnox
New Contributor II

05-22-2024 11:01:12 AM

2 kudos

PS: I have restarted the cluster and run my run_tests notebook again, and now I am getting a different error: E File "/Workspace/Repos/SBIT/SBIT/test_trans.py", line 36 E from transform_functions import * E ^ E SyntaxError: import * only allowed at mod...
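
The SyntaxError quoted above comes from Python itself rather than from pytest or Databricks: a wildcard import is only legal at the top level of a module. One possible cause, and this is an assumption because test_trans.py is not shown, is that the `from transform_functions import *` on line 36 sits inside a function or class body. A minimal sketch that reproduces the rule with `compile`, without needing the real `transform_functions` module:

```python
# Wildcard imports are rejected by the compiler when they appear inside a
# function body; the same statement at module level compiles fine.
# `transform_functions` is only a name inside the source strings below:
# compile() parses the code but never executes the import.

INSIDE_FUNCTION = (
    "def test_something():\n"
    "    from transform_functions import *\n"
)

AT_MODULE_LEVEL = (
    "from transform_functions import *\n"
    "\n"
    "def test_something():\n"
    "    pass\n"
)


def syntax_error_message(source):
    """Return the SyntaxError message for `source`, or None if it compiles."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as exc:
        return exc.msg
    return None


print(syntax_error_message(INSIDE_FUNCTION))
print(syntax_error_message(AT_MODULE_LEVEL))
```

If that is what is happening here, moving the import to the top of the test module, or importing the needed names explicitly, should avoid the error.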

by Paul92S • New Contributor III

02-20-2024 9:29:32 AM

3406 Views
3 replies
3 kudos

Resolved! DELTA_EXCEED_CHAR_VARCHAR_LIMIT

Hi, I am having an issue loading source data into a Delta table / Unity Catalog. The error we are receiving is the following: grpc_message:"[DELTA_EXCEED_CHAR_VARCHAR_LIMIT] Exceeds char/varchar type length limitation. Failed check: (isnull(\'metric_...

Latest Reply

willflwrs
New Contributor II

05-22-2024 10:29:20 AM

3 kudos

Setting this config change before making the write command solved it for us: spark.conf.set("spark.sql.legacy.charVarcharAsString", True)
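
For context on why this setting helps: with `spark.sql.legacy.charVarcharAsString` enabled, Spark treats CHAR/VARCHAR columns as plain STRING, so the length check behind this error is no longer applied. If you would rather keep the check and find the offending values first, the same comparison can be sketched in plain Python. The column name `metric_name`, the limit of 10, and the sample rows are all hypothetical, since the error message in the original post is truncated, and Python's `len` is only an approximation of how Spark counts characters.

```python
# Hypothetical pre-write check: list values that would not fit a VARCHAR(n)
# column. Column name, limit, and rows are made up for illustration.

def find_overlong(rows, column, max_length):
    """Return (row_index, value) pairs whose string length exceeds max_length.

    None values are skipped, on the assumption (suggested by the isnull(...)
    in the error message) that nulls pass the length check.
    """
    offenders = []
    for index, row in enumerate(rows):
        value = row.get(column)
        if value is not None and len(value) > max_length:
            offenders.append((index, value))
    return offenders


sample_rows = [
    {"metric_name": "cpu"},
    {"metric_name": None},
    {"metric_name": "memory_utilisation"},
]

print(find_overlong(sample_rows, "metric_name", 10))
```

Anything this reports would have to be shortened, or the column widened, before the write succeeds with the check left on.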

by NarenderKumar • New Contributor III

03-22-2024 4:27:29 AM

1502 Views
3 replies
2 kudos

Unable to connect with Databricks Serverless SQL using DBeaver

I am trying to connect to Databricks serverless SQL pool using DBeaver as mentioned in the documentation below: https://learn.microsoft.com/en-us/azure/databricks/dev-tools/dbeaver I am trying to use the browser-based authentication, i.e. (OAuth user-to-...

Latest Reply

binsel
New Contributor III

05-22-2024 9:01:49 AM

2 kudos

I'm having the same problem. Any update?

by youcanlearn • New Contributor III

05-14-2024 7:27:58 AM

1059 Views
3 replies
2 kudos

Resolved! Databricks Expectations

In the example in https://docs.databricks.com/en/delta-live-tables/expectations.html#fail-on-invalid-records, it says that one can query the DLT event log for such expectation violations. In Databricks, I can use an expectation to fail or drop r...

Latest Reply

brockb
Valued Contributor

05-22-2024 6:57:59 AM

2 kudos

That's right, the "reason" would be "x1 is negative" in your example and "valid_max_length" in the example JSON payload that I shared. If you are looking for a descriptive reason, you would name the expectation accordingly, such as: @dlt.expect_or_fail...

by guizsantos • New Contributor II

05-21-2024 10:32:58 AM

645 Views
2 replies
3 kudos

Resolved! How to obtain a query profile programmatically?

Hi everyone! Does anyone know if there is a way to obtain the data used to create the graph shown in the "Query profile" section? Particularly, I am interested in the rows produced by the intermediate query operations. I can see there is "Download" ...

Latest Reply

guizsantos
New Contributor II

05-22-2024 6:15:08 AM

3 kudos

Hey @raphaelblg, thanks for your input! I understand that some info may be obtained by the `EXPLAIN` command; however, the output is not very clear on its meaning and definitely does not provide what is most interesting to us, which is the rows proces...

by Sambit_S • New Contributor III

05-20-2024 3:57:13 AM

1352 Views
9 replies
0 kudos

Databricks Autoloader File Notification Not Working As Expected

Hello everyone, In my project I am using Databricks Autoloader to incrementally and efficiently process new data files as they arrive in cloud storage. I am using file notification mode with Event Grid and queue service set up in an Azure storage account...

Latest Reply

matthew_m
New Contributor III

05-22-2024 5:23:11 AM

0 kudos

Hi @Sambit_S, I misread inputRows as inputFiles, which aren't the same thing. Considering the limitation on Azure queue, if you are already at the limit then you may need to consider switching to an event source such as Kafka or Event Hub to get b...

by asingamaneni • New Contributor II

06-29-2023 1:29:48 PM

578 Views
1 reply
0 kudos

Databricks Summit 2023

Databricks Summit 2023 has been fantastic, and I got a chance to meet many authors and industry leaders whom I admire in the data engineering community! #DataAISummit

Latest Reply

Kaniz_Fatma
Community Manager

05-22-2024 4:42:47 AM

0 kudos

Hi @asingamaneni, We're thrilled to hear that you had a great experience at DAIS 2023! Your feedback is valuable to us, and we appreciate you taking the time to share it on the community platform. We wanted to let you know that the Databricks Communi...

by Tidaldata • New Contributor

06-28-2023 2:09:54 PM

520 Views
1 reply
0 kudos

Loving Databricks Summit

Loving the summit so far, awesome keynote speakers, great trainers and paid courses. Finished certification #databrickslearning

Latest Reply

Kaniz_Fatma
Community Manager

05-22-2024 4:31:15 AM

0 kudos

Hi @Tidaldata, We're thrilled to hear that you had a great experience at DAIS 2023! Your feedback is valuable to us, and we appreciate you taking the time to share it on the community platform. We wanted to let you know that the Databricks Community ...

by ws4100e • New Contributor III

03-21-2024 8:47:56 AM

2954 Views
8 replies
0 kudos

DLT pipelines with UC

I am trying to run a (very simple) DLT pipeline in which the resulting materialized table is published to a UC schema with a managed storage location defined (within an existing EXTERNAL LOCATION). According to the documentation: Publishing to schemas that speci...

Latest Reply

DataGeek_JT
New Contributor II

04-12-2024 1:19:21 AM

0 kudos

Did this get resolved? I am getting the same issue.

by Phani1 • Valued Contributor

05-21-2024 11:59:35 PM

321 Views
1 reply
0 kudos

Databricks Platform Cleanup and baseline activities.

Hi Team, Kindly share the best practices for managing Databricks Platform Cleanup and baseline activities.

delta

Latest Reply

Kaniz_Fatma
Community Manager

05-22-2024 3:18:50 AM

0 kudos

Hi @Phani1, Here are some best practices for managing Databricks Platform Cleanup and baseline activities: Platform Administration: Regularly monitor and manage your Databricks platform to ensure optimal performance. Compute Creation: Choose the ri...

by dataslicer • Contributor

05-19-2024 3:07:29 PM

701 Views
2 replies
0 kudos

How to export/clone Databricks Notebook without results via web UI?

When a Databricks Notebook exceeds the size limit, it suggests to `clone/export without results`. This is exactly what I want to do, but the current web UI does not provide the ability to bypass/skip the results in either the `clone` or `export` context...

Latest Reply

dataslicer
Contributor

05-21-2024 3:40:16 PM

0 kudos

Thank you @Yeshwanth for the response. I am looking for a way to do this without clearing the current outputs. This is necessary because I want to preserve the existing outputs and fork off another notebook instance to run with a few parameter changes and come...

by Ramana • Contributor

11-08-2023 1:51:36 PM

1309 Views
3 replies
0 kudos

SHOW GROUPS is not giving groups available at the account level

I am trying to capture all the Databricks groups and their mapping to user/AD group(s). I tried to do this by using show groups, show users, and show grants, following the examples mentioned in the article below, but the show groups command only fetc...

Latest Reply

Ramana
Contributor

11-10-2023 12:31:50 PM

0 kudos

Yes, I can use the REST API, but I am looking for a SQL or programmatic way to do this rather than making the API calls, building a complex-datatype DataFrame, and then saving it as a table. Thanks, Ramana

by kseyser • New Contributor II

05-19-2024 1:25:10 AM

702 Views
2 replies
1 kudos

Predicting compute required to run Spark jobs

I'm working on a project to predict the compute (cores) required to run Spark jobs. Has anyone worked on this or something similar before? How did you get started?

Latest Reply

Yeshwanth
Honored Contributor

05-20-2024 12:47:43 AM

1 kudos

@kseyser good day. This documentation might help with your use case: https://docs.databricks.com/en/compute/cluster-config-best-practices.html#compute-sizing-considerations Kind regards, Yesh

Databricks Community
