Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

halsgbs
by New Contributor III
  • 1499 Views
  • 5 replies
  • 3 kudos

Resolved! Warehouse ID specified in job yaml file for sql tasks

My goal is to trigger an alert I have through a job, and it seems I have to specify the warehouse id within the job yaml file itself. We have different environments with different warehouse ids, and the issue is that if I specify the warehouse id in ...

Latest Reply
halsgbs
New Contributor III
  • 3 kudos

Thank you! It looks like the alert_id also needs to be parameterised, and I was wondering if it's possible to use a job parameter to do so? If I could use the alert name, that would be great, but I believe it has to be the alert ID, which will be differen...

4 More Replies
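One possible workaround, sketched here as an assumption rather than something from the thread: keep the warehouse name identical across environments and resolve the ID at runtime via the SQL Warehouses REST API. The endpoint and response fields follow the public API reference; host and token handling are placeholders.

```python
import os
import requests

def warehouse_id_by_name(name: str) -> str:
    """Resolve a SQL warehouse ID from its name via /api/2.0/sql/warehouses."""
    host = os.environ["DATABRICKS_HOST"]      # e.g. https://adb-1234.5.azuredatabricks.net
    token = os.environ["DATABRICKS_TOKEN"]    # placeholder auth; use your preferred mechanism
    resp = requests.get(
        f"{host}/api/2.0/sql/warehouses",
        headers={"Authorization": f"Bearer {token}"},
        timeout=30,
    )
    resp.raise_for_status()
    for wh in resp.json().get("warehouses", []):
        if wh["name"] == name:
            return wh["id"]
    raise ValueError(f"No SQL warehouse named {name!r}")

# The same logical name exists in dev/test/prod, so no per-environment ID is hard-coded.
# print(warehouse_id_by_name("shared-etl-warehouse"))
```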
ShivangiB1
by New Contributor III
  • 838 Views
  • 6 replies
  • 0 kudos

SQL Server setup for Lakeflow SQL Server connector to create ingestion

When I execute the command below, the change capture instance is created, but without lakeflow as a prefix. I read the documentation and it mentioned that to track schema evolution we need to have the prefix. Can I please get some assistance? Command used: EXEC d...

Latest Reply
ShivangiB1
New Contributor III
  • 0 kudos

And when I altered the table, I got the warning below: WARNING: Table [dbo].[test_table] has a pre-existing capture instance named 'dbo_test_table' that was not created by lakeflow. Lakeflow will preserve this instance and create its own instance alongside...

5 More Replies
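For reference, SQL Server creates the capture instance through sys.sp_cdc_enable_table, and its name can be set explicitly with @capture_instance. A minimal sketch run from Python via pyodbc follows; the connection string is a placeholder, and the lakeflow_-prefixed name is only an assumption based on this thread, so confirm the exact naming the Lakeflow connector expects against its documentation.

```python
import pyodbc

# Placeholder connection string; replace server, database and credentials.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};SERVER=myserver;DATABASE=mydb;"
    "UID=myuser;PWD=secret;TrustServerCertificate=yes",
    autocommit=True,
)

# Enable CDC on dbo.test_table with an explicit capture instance name.
# 'lakeflow_dbo_test_table' is illustrative only -- check the Lakeflow
# SQL Server connector docs for the naming it actually requires.
conn.execute(
    """
    EXEC sys.sp_cdc_enable_table
        @source_schema = N'dbo',
        @source_name = N'test_table',
        @role_name = NULL,
        @capture_instance = N'lakeflow_dbo_test_table',
        @supports_net_changes = 0
    """
)
conn.close()
```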
ChrisRose
by Databricks Partner
  • 883 Views
  • 6 replies
  • 3 kudos

Resolved! Result Difference Between View and Manually Run View Query

I am experiencing an issue where a view does not display the correct results, but running the view query manually in either a new notebook or the SQL Editor displays different, correct results. I have tried switching the compute resource in the noteb...

Latest Reply
bianca_unifeye
Databricks MVP
  • 3 kudos

There are two fixes that I can think of. Option A: make first_value deterministic: first_value(Customer_ID, true) OVER (PARTITION BY customer_name ORDER BY submitted ASC, event_id ASC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) U...

5 More Replies
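Spelled out as a runnable snippet, Option A from the reply makes first_value deterministic by ignoring NULLs, adding a unique tie-breaker column to the ORDER BY, and pinning the window frame. A minimal sketch; the table name events and the surrounding query are placeholders.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()   # the ambient session in a Databricks notebook

# first_value(expr, true) ignores NULLs; the event_id tie-breaker plus the
# explicit UNBOUNDED frame make the picked value independent of row order.
df = spark.sql("""
    SELECT
        customer_name,
        first_value(Customer_ID, true) OVER (
            PARTITION BY customer_name
            ORDER BY submitted ASC, event_id ASC
            ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
        ) AS first_customer_id
    FROM events   -- placeholder table
""")
df.show()
```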
Malthe
by Valued Contributor II
  • 539 Views
  • 1 reply
  • 1 kudos

Resolved! Unable to update DLT-based materialized view if clustering key is missing

If we set up a materialized view with a clustering key, and then update the definition such that this key is no longer part of the table, Databricks complains: "Run ALTER TABLE ... CLUSTER BY ... to repair Delta clustering metadata." But this is not poss...

Latest Reply
K_Anudeep
Databricks Employee
  • 1 kudos

Hello @Malthe , Currently, there is no supported way to repair broken clustering metadata in Delta materialised views if you remove the clustering key from the definition, other than dropping and recreating the materialised view. Additionally, a full...

manjeetgahlawat
by New Contributor II
  • 392 Views
  • 1 replies
  • 3 kudos

Resolved! DLT Pipeline issue

Hello everyone, I have set up a DLT pipeline and while running it for the first time, I am getting the issue below: NoSuchElementException: key not found: test_bronze_dlt. test_bronze_dlt is my DLT table name that is expected to...

Latest Reply
K_Anudeep
Databricks Employee
  • 3 kudos

Hello @manjeetgahlawat , NoSuchElementException: key not found: test_bronze_dlt occurs when the table/view in the pipeline references a LIVE dataset named test_bronze_dlt, but DLT cannot find a dataset with that exact name in the pipeline graph. (So ...
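To make the naming requirement concrete, here is a minimal sketch of a pipeline where the downstream table references the bronze dataset by the exact name it was defined with; the source table and the added column are placeholders.

```python
import dlt
from pyspark.sql import functions as F

# The dataset name registered in the pipeline graph is "test_bronze_dlt".
@dlt.table(name="test_bronze_dlt", comment="Bronze ingest (placeholder source)")
def test_bronze_dlt():
    return spark.read.table("samples.nyctaxi.trips")   # `spark` is provided by the pipeline

# Downstream references must use that exact name; anything else raises
# NoSuchElementException: key not found.
@dlt.table(name="test_silver_dlt")
def test_silver_dlt():
    return dlt.read("test_bronze_dlt").withColumn("ingested_at", F.current_timestamp())
```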

Lon_Fortes
by Databricks Partner
  • 9794 Views
  • 4 replies
  • 2 kudos

Resolved! How can I check that column on a delta table has a "NOT NULL" constraint or not?

Title pretty much says it all - I'm trying to determine whether a column on my existing Delta table was defined as NOT NULL or not. It does not show up in any of the metadata (describe detail, describe history, show tblproperties). Thanks in...

Latest Reply
iyashk-DB
Databricks Employee
  • 2 kudos

@muki, you can run SHOW CREATE TABLE <catalog>.<schema>.<table>, and in the output you can also see the constraints applied.

3 More Replies
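Two quick ways to check this from a notebook, as a hedged sketch with a placeholder table name: the SHOW CREATE TABLE output from the reply includes NOT NULL in the column definitions, and the DataFrame schema exposes a per-column nullable flag.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
table = "main.default.my_table"   # placeholder

# 1) The generated DDL shows NOT NULL on constrained columns.
spark.sql(f"SHOW CREATE TABLE {table}").show(truncate=False)

# 2) The schema's nullable flag reflects the same constraint.
for field in spark.table(table).schema.fields:
    print(field.name, "nullable =", field.nullable)
```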
yzhang
by Contributor
  • 3554 Views
  • 10 replies
  • 3 kudos

iceberg with partitionedBy option

I am able to create a Unity Catalog Iceberg-format table: df.writeTo(full_table_name).using("iceberg").create() However, if I add the partitionedBy option, I get an error: df.writeTo(full_table_name).using("iceberg").partitionedBy("ingest_dat...

Latest Reply
LazyGenius
New Contributor III
  • 3 kudos

@Sanjeeb2024 If your question is for me, then I will say it depends on the use case! If you have very big data to be ingested into the table, then you would prefer creating the table first and then ingesting data into it using simultaneous jobs.

9 More Replies
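The pattern the last reply describes (define the partitioned table first, then ingest into it) could look roughly like the sketch below; whether the DDL is accepted for a Unity Catalog managed Iceberg table depends on your runtime, and the table name, columns and sample row are placeholders.

```python
from datetime import date
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
full_table_name = "main.default.events_iceberg"   # placeholder

# Create the partitioned Iceberg table up front...
spark.sql(f"""
    CREATE TABLE IF NOT EXISTS {full_table_name} (
        event_id    STRING,
        payload     STRING,
        ingest_date DATE
    )
    USING iceberg
    PARTITIONED BY (ingest_date)
""")

# ...then append into it, possibly from several concurrent jobs.
df = spark.createDataFrame(
    [("e1", "hello", date(2024, 1, 1))],
    "event_id STRING, payload STRING, ingest_date DATE",
)
df.writeTo(full_table_name).append()
```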
Ajay-Pandey
by Databricks MVP
  • 4466 Views
  • 9 replies
  • 2 kudos

Databricks Job cluster for continuous run

Hi all, I have a situation where I want to run a job with a continuous trigger using a job cluster, but the cluster is terminating and re-creating in every run within the continuous trigger. I just wanted to know if we have any option where I can use the same job cluster...

Latest Reply
mukul1409
Contributor II
  • 2 kudos

Hi @Ajay-Pandey, the only solution for you: 1. Create an all-purpose cluster called, for example, continuous-job-cluster and disable auto-termination or set it to a large value. 2. Configure the job to use existing_cluster_id. In the Jobs UI or DAB YAML: exi...

8 More Replies
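The two steps from the reply, expressed as a hedged sketch against the Jobs 2.1 REST API; the cluster ID, notebook path, job name and auth handling are placeholders, and the payload shape mirrors my reading of the public API reference.

```python
import os
import requests

host = os.environ["DATABRICKS_HOST"]
token = os.environ["DATABRICKS_TOKEN"]

payload = {
    "name": "continuous-ingest-job",
    "tasks": [
        {
            "task_key": "main",
            # Step 2 from the reply: run on the long-lived all-purpose cluster
            # instead of an ephemeral job cluster.
            "existing_cluster_id": "0101-123456-abcdefgh",                     # placeholder
            "notebook_task": {"notebook_path": "/Workspace/ingest/main"},      # placeholder
        }
    ],
    # Continuous trigger: a new run starts as soon as the previous one finishes.
    "continuous": {"pause_status": "UNPAUSED"},
}

resp = requests.post(
    f"{host}/api/2.1/jobs/create",
    headers={"Authorization": f"Bearer {token}"},
    json=payload,
    timeout=30,
)
resp.raise_for_status()
print("Created job", resp.json()["job_id"])
```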
parth_db
by New Contributor III
  • 1316 Views
  • 5 replies
  • 7 kudos

Resolved! AutoLoader Type Widening

I have a few doubts regarding AutoLoader behavior and capabilities. Please check and correct wherever my assumptions or understanding are incorrect, much appreciated. Below is my specific code Example scenario:Target Managed Delta Table (Type Widenin...

Latest Reply
Sanjeeb2024
Valued Contributor
  • 7 kudos

Thank you @nayan_wylde for the details. This is really useful.

4 More Replies
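For context on the pieces that usually go together here, a hedged sketch: type widening enabled on the target Delta table plus an Auto Loader stream that evolves the schema. Property and option names reflect the public docs as I understand them; paths and table names are placeholders.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
target = "main.default.autoloader_target"                        # placeholder
landing = "abfss://landing@acct.dfs.core.windows.net/events"     # placeholder

# Allow widening type changes (e.g. INT -> BIGINT) on the target table.
spark.sql(f"ALTER TABLE {target} SET TBLPROPERTIES ('delta.enableTypeWidening' = 'true')")

stream = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", f"{landing}/_schema")
    .option("cloudFiles.schemaEvolutionMode", "addNewColumns")
    .load(landing)
)

(
    stream.writeStream
    .option("checkpointLocation", f"{landing}/_checkpoint")
    .option("mergeSchema", "true")        # let the Delta sink evolve its schema
    .trigger(availableNow=True)
    .toTable(target)
)
```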
vamsi_simbus
by Databricks Partner
  • 1232 Views
  • 2 replies
  • 3 kudos

Resolved! Databricks Apps - Auto Terminate Option

Hi everyone, I’m exploring Databricks Apps and have two questions: Is there a way to automatically terminate an app after a certain period of inactivity? Does Databricks provide any scheduling mechanism for apps, similar to how Jobs can be scheduled? Any...

Latest Reply
Sanjeeb2024
Valued Contributor
  • 3 kudos

Hi @vamsi_simbus - One option you can explore is to start and stop apps using the Databricks API. Have a look at the document link below: https://docs.databricks.com/api/workspace/apps/stop

1 More Replies
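Stopping an app on a schedule could then be done from a small scheduled job; a hedged sketch follows. The endpoint path is my reading of the linked Apps API reference and should be verified there, and the app name and auth handling are placeholders. Running this from a nightly job gives a rough stand-in for both the auto-terminate and the scheduling question, since Apps themselves expose neither.

```python
import os
import requests

host = os.environ["DATABRICKS_HOST"]
token = os.environ["DATABRICKS_TOKEN"]
app_name = "my-databricks-app"    # placeholder

# Assumed endpoint, per https://docs.databricks.com/api/workspace/apps/stop
resp = requests.post(
    f"{host}/api/2.0/apps/{app_name}/stop",
    headers={"Authorization": f"Bearer {token}"},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```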
slangenborg
by Databricks Partner
  • 683 Views
  • 3 replies
  • 1 kudos

Resolved! DAB Job - Serverless Cluster using configured base environment

I have configured a base serverless environment for my workspace that includes libraries from a private repository. This base environment has been set to default, and behaves as expected when running notebooks manually in the workspace with Serverless ...

Latest Reply
mukul1409
Contributor II
  • 1 kudos

Hi @slangenborg  According to the official Databricks Jobs REST API documentation, notebook tasks use the notebook environment only implicitly when no environment_key is provided. The API lets you explicitly configure environments only via an environ...

2 More Replies
tonkol
by New Contributor II
  • 480 Views
  • 1 reply
  • 0 kudos

Migrate on-premise delta tables to Databricks (Azure)

Hi there, I have the situation that we've decided to migrate our on-premise delta lake to Azure Databricks. Because of networking, I can only "push" the data from on-prem to the cloud. What would be the best way to replicate all tables: schema + partitioning i...

Latest Reply
mukul1409
Contributor II
  • 0 kudos

The correct solution is not SQL based. Delta tables are defined by the contents of the delta log directory, not by CREATE TABLE statements. That is why SHOW CREATE TABLE cannot reconstruct partitions, properties or constraints. The only reliable migrat...
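As a concrete illustration of that point: once a table directory (data files plus _delta_log) has been pushed to ADLS, it can be registered without re-deriving any DDL by pointing an external table at the copied location. A hedged sketch with placeholder catalog, schema and path:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
copied_path = "abfss://lake@acct.dfs.core.windows.net/migrated/sales_orders"   # placeholder

# The copied _delta_log already carries the schema, partitioning, properties
# and constraints, so registration is just a pointer to the location.
spark.sql(f"""
    CREATE TABLE IF NOT EXISTS main.migrated.sales_orders
    USING DELTA
    LOCATION '{copied_path}'
""")

spark.table("main.migrated.sales_orders").printSchema()
```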

dikla
by New Contributor II
  • 1109 Views
  • 4 replies
  • 1 kudos

Resolved! Issues Creating Genie Space via API Join Specs Are Not Persisted

Hi, I’m experimenting with the new API to create a Genie Space. I’m able to successfully create the space, but the join definitions are not created, even though I’m passing a join_specs object in the same format returned by GET /spaces/{id} for an exis...

Latest Reply
mtaran
Databricks Employee
  • 1 kudos

The serialized space JSON is incorrect. It has `join_specs` and `sql_snippets` nested under `data_sources`, but they should be nested under `instructions` instead. There they apply as expected.

3 More Replies
Maxrb
by New Contributor III
  • 629 Views
  • 1 reply
  • 1 kudos

Resolved! Import functions in databricks asset bundles using source: WORKSPACE

Hi, we are using Databricks Asset Bundles, and we create functions which we import in notebooks, for instance: from utils import helpers, where utils is just a folder in our root. When running this with source: WORKSPACE, it will fail to resolve the impo...

Latest Reply
iyashk-DB
Databricks Employee
  • 1 kudos

In Git folders, the repo root is auto-added to the Python path, so imports like from utils import helpers work, while in workspace folders, only the notebook’s directory is on the path, which is why it breaks. The quick fix is a tiny bootstrap that a...
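A minimal version of such a bootstrap, assuming utils/ sits at the bundle root one level above the notebook's folder (adjust the relative path to your layout):

```python
import os
import sys

# With workspace files, the notebook's working directory is its own folder,
# so walk up to the bundle root and put it on sys.path before importing.
bundle_root = os.path.abspath(os.path.join(os.getcwd(), ".."))
if bundle_root not in sys.path:
    sys.path.insert(0, bundle_root)

from utils import helpers   # now resolves under source: WORKSPACE as well
```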

ramsai
by New Contributor II
  • 800 Views
  • 3 replies
  • 3 kudos

Resolved! Serverless Compute Access Restriction Not Supported at User Level

The requirement is to disable serverless compute access for specific users while allowing them to use only their assigned clusters, without restricting serverless compute at the workspace level. After reviewing the available configuration options, th...

Latest Reply
Masood_Joukar
Contributor
  • 3 kudos

Hi @ramsai, how about a workaround? Setting budget policies at the account level: Attribute usage with serverless budget policies | Databricks on AWS

2 More Replies