Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

kfoster
by Contributor
  • 4643 Views
  • 7 replies
  • 7 kudos

Azure DevOps Repo - Invalid Git Credentials

I have a Repo in Databricks connected to Azure DevOps Repositories. The repo had been working fine for almost a month, until last week. Now when I try to open the Git settings in Databricks, I get "Invalid Git Credentials". Nothing has change...

Latest Reply
tbMark
New Contributor II
  • 7 kudos

Same symptoms, same issue. Azure support hasn't figured it out

  • 7 kudos
6 More Replies
sowj02
by New Contributor
  • 325 Views
  • 1 replies
  • 0 kudos

Stream-stream join using MongoDB sink

I am performing a stream-to-stream join in Databricks using MongoDB as a source (readStream()). Both source collections receive data at the same time. Initially I tried using watermarks: orderWithWatermark = order \  .selectExpr("order_id AS orderId",...

Latest Reply
cgrant
Databricks Employee
  • 0 kudos

There is not enough information in this high-level error message. Please expand the full stack trace and feel free to post it here.

  • 0 kudos
jeremy98
by Contributor III
  • 856 Views
  • 9 replies
  • 0 kudos

restarting the cluster always running doesn't free the memory?

Hello community, I was working on optimising the driver memory, since there is code that is not optimised for Spark, and I was planning to temporarily restart the cluster to free up the memory. That could be a potential solution, since if the cluster i...

Latest Reply
jeremy98
Contributor III
  • 0 kudos

Any suggestions, Mr. @Alberto_Umana?

  • 0 kudos
8 More Replies
jkb7
by New Contributor III
  • 307 Views
  • 1 replies
  • 0 kudos

How can we import the exception "MetadataChangedException"?

I regularly get MetadataChangedException: [DELTA_METADATA_CHANGED] MetadataChangedException: The metadata of the Delta table has been changed by a concurrent update. Please try the operation again. What is the recommended way to import this specific ty...

Latest Reply
Nik_Vanderhoof
Contributor
  • 0 kudos

Hi! It depends on whether you're using Scala or Python. If you're using Scala, you should be able to import `io.delta.exceptions.MetadataChangedException`, which you can see defined here: https://github.com/delta-io/delta/blob/master/spark/src/main/sc...
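
For Python, a minimal sketch of the "try the operation again" advice from the error message. It assumes the open-source `delta-spark` package is installed, in which case the exception should be importable from `delta.exceptions`; the helper name `retry_on` and its parameters are hypothetical and not part of any Delta API. The import is left as a comment so the helper itself runs anywhere.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")

# Assumption: on a cluster with delta-spark available, the exception type
# would be imported like this and passed to retry_on below:
#   from delta.exceptions import MetadataChangedException


def retry_on(
    exc_types: Tuple[Type[BaseException], ...],
    operation: Callable[[], T],
    attempts: int = 3,
    delay_s: float = 0.0,
) -> T:
    """Run `operation`, retrying only when one of `exc_types` is raised."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except exc_types:
            # Give up and re-raise once the last attempt has failed.
            if attempt == attempts:
                raise
            time.sleep(delay_s)
    raise AssertionError("unreachable")
```

In a notebook this might be called as `retry_on((MetadataChangedException,), lambda: df.write.format("delta").mode("append").saveAsTable("my_table"))`, where `df` and `my_table` are placeholders.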

  • 0 kudos
Akshay_Petkar
by Contributor III
  • 401 Views
  • 1 replies
  • 0 kudos

Issue with Liquid Clustering on Partitioned Table in Databricks

I recently tried applying Liquid Clustering to a partitioned table in Databricks and encountered the following error: [DELTA_ALTER_TABLE_CLUSTER_BY_ON_PARTITIONED_TABLE_NOT_ALLOWED] ALTER TABLE CLUSTER BY cannot be applied to a partitioned table. I u...

Latest Reply
koji_kawamura
Databricks Employee
  • 0 kudos

Hi @Akshay_Petkar, since we cannot use Liquid Clustering with a partitioned table, the only approach I can think of is to migrate from partitioning to Liquid Clustering. The same partitioning key columns and the additional columns you wanted to add can be ...
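
One way such a migration could look, as a sketch only: copy the partitioned table into a new table declared with `CLUSTER BY` via a CTAS statement. The helper `build_clustered_ctas` and the table and column names in the usage note are hypothetical; the truncated reply may describe a different procedure.

```python
from typing import Sequence


def build_clustered_ctas(
    source_table: str,
    target_table: str,
    cluster_columns: Sequence[str],
) -> str:
    """Build a CTAS statement that copies `source_table` into a new,
    unpartitioned table that uses Liquid Clustering on `cluster_columns`."""
    if not cluster_columns:
        raise ValueError("at least one clustering column is required")
    columns = ", ".join(cluster_columns)
    return (
        f"CREATE TABLE {target_table} "
        f"CLUSTER BY ({columns}) "
        f"AS SELECT * FROM {source_table}"
    )
```

In a notebook the statement would be run with something like `spark.sql(build_clustered_ctas("sales_partitioned", "sales_clustered", ["region", "order_date"]))`, after which readers and writers can be pointed at the new table.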

  • 0 kudos
joseph_sf
by New Contributor
  • 662 Views
  • 1 replies
  • 0 kudos

Implement Delta tables optimized for Databricks SQL service

This question is on the Databricks Certified Data Engineer Professional exam in section 1: "Implement Delta tables optimized for Databricks SQL service". I do not understand what is being asked by this question. I would assume that there are different way...

Latest Reply
koji_kawamura
Databricks Employee
  • 0 kudos

Hi @joseph_sf, I assume you are referring to the exam guide PDF file. As you assumed, there are different techniques to optimize a Delta table. Some of them are already mentioned in the other bullet points in the same section 1, such as partitioning...

  • 0 kudos
drollason
by New Contributor II
  • 301 Views
  • 1 replies
  • 1 kudos

Resolved! Issue with UDF's and DLT where UDF is multi layered and externalized

Having an issue getting UDFs to work within a DLT pipeline where the UDF is externalized outside of the notebook and attempts to call other functions. The end goal is to put unit test coverage around the various functions, hence the pattern. For test purposes I cre...

Latest Reply
bgiesbrecht
New Contributor III
  • 1 kudos

Hi @drollason. In DLT pipelines, I would try packaging up your code as a wheel and then installing it via pip. I had the same scenario as you and was able to bring in my custom code this way.

  • 1 kudos
nolanreilly
by New Contributor II
  • 1020 Views
  • 1 replies
  • 1 kudos

Impossible to read a custom pipeline? (Scala)

I have created a custom transformer to be used in an ML pipeline. I was able to write the pipeline to storage by extending the transformer class with DefaultParamsWritable. Reading the pipeline back in, however, does not seem possible in Scala. I have...

Latest Reply
WarrenO
New Contributor III
  • 1 kudos

Hi, did you ever find a solution for this?

  • 1 kudos
NaeemS
by New Contributor III
  • 1247 Views
  • 2 replies
  • 4 kudos

Custom transformers with mlflow

Hi Everyone, I have created a Spark pipeline in which one stage is a custom Transformer. I am using feature stores to log my model, but the issue is that the custom Transformer stage is not serialized properly and is not logged along wi...

Latest Reply
WarrenO
New Contributor III
  • 4 kudos

Hi @NaeemS, did you ever get a solution to this problem? I've now encountered this myself. When I save the pipeline using MLflow log_model, I am able to load the model fine. When I log it with the Databricks Feature Engineering package, it throws an erro...

  • 4 kudos
1 More Replies
Unimog
by New Contributor III
  • 701 Views
  • 5 replies
  • 0 kudos

Resolved! Insert Into SQLServer Table

I'm trying to insert and update data in a SQL Server table from a Python script. No matter what I try, it seems to give me this error: The input query contains unsupported data source(s). Only csv, json, avro, delta, kafka, parquet, orc, text, unity_...

Latest Reply
Nivethan_Venkat
Contributor II
  • 0 kudos

Hi @Unimog, support for data sources is currently limited, as described in the general limitations for serverless compute (General Serverless Limitations): Support for data sources is limited to AVRO, BINARYFILE, CSV, DELTA, JSON, KAFKA...

  • 0 kudos
4 More Replies
tommyhmt
by New Contributor II
  • 1537 Views
  • 2 replies
  • 0 kudos

Delta Live Table missing data

Got a very simple DLT pipeline which runs fine, but the final table "a" is missing data. I've found that after it goes through a full refresh, if I rerun just the final table, I get more records (from 1.2m to 1.4m) and the missing data comes back. When I...

Latest Reply
NSonam
New Contributor II
  • 0 kudos

To me it seems like a timing or dependency issue. The missing data could be due to intermediate tables not being properly refreshed or triggered during the full refresh. Please check whether the intermediate tables are being loaded properly before it start ...

  • 0 kudos
1 More Replies
nwong
by New Contributor II
  • 895 Views
  • 5 replies
  • 1 kudos

Error creating Unity Catalog external table

I tried creating an external table from a partitioned parquet folder in Unity Catalog. Initially, I created the table from the Data Ingestion UI. It worked but only a tiny portion of the table was actually loaded. Next, I tried running a SQL DDL CREA...

Latest Reply
royvansanten
New Contributor II
  • 1 kudos

You can use recursiveFileLookup in OPTIONS, as shown in this topic: https://community.databricks.com/t5/data-engineering/external-table-from-external-location/td-p/69246

  • 1 kudos
4 More Replies
kolangareth
by New Contributor III
  • 6929 Views
  • 11 replies
  • 3 kudos

Resolved! to_date not functioning as expected after introduction of arbitrary replaceWhere in Databricks 9.1 LTS

I am trying to do a dynamic partition overwrite on a Delta table using the replaceWhere option. This was working fine until I upgraded the DB runtime to 9.1 LTS from 8.3.x. I am concatenating the 'year', 'month' and 'day' columns and then using the to_date functio...

Latest Reply
ltreweek
New Contributor II
  • 3 kudos

SELECT TO_DATE('20250217','YYYYMMDD'); gives the error: PARSE_SYNTAX_ERROR: syntax error at or near 'select'. SQLSTATE: 42601. In DataGrip, it works with no problem and displays the date.
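
A side note on the format string, offered as a sketch rather than a diagnosis, since the error quoted above is a syntax error and may have a different cause. Spark's datetime patterns are case-sensitive, and the pattern for a compact year-month-day string is 'yyyyMMdd' rather than the Oracle/PostgreSQL-style 'YYYYMMDD'. The same parse can be cross-checked in plain Python; the function name `parse_compact_date` is hypothetical.

```python
from datetime import date, datetime


def parse_compact_date(value: str) -> date:
    """Parse a compact year-month-day string such as '20250217'."""
    # %Y%m%d in Python corresponds to the Spark pattern 'yyyyMMdd'.
    return datetime.strptime(value, "%Y%m%d").date()


# Spark SQL equivalent, for comparison:
#   SELECT to_date('20250217', 'yyyyMMdd')
```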

  • 3 kudos
10 More Replies
kertsman_nm
by New Contributor
  • 992 Views
  • 0 replies
  • 0 kudos

Trying to use Broadcast to run Presidio distributed

Hello, I am currently evaluating Microsoft's Presidio de-identification libraries for my organization and would like to see if we can take advantage of Spark's broadcast capabilities, but I am getting an error message: "[BROADCAST_VARIABLE_NOT_LOA...

SamGreene
by Contributor II
  • 2825 Views
  • 10 replies
  • 0 kudos

Use Azure Service Principal to Access Azure DevOps

There is another thread marked as answered, but it is not a working solution: Solved: How to use Databricks Repos with a service princip... - Page 2 - Databricks Community - 11789. In Azure DevOps, there doesn't seem to be a way to generate a PAT for a...

Latest Reply
KrunalG
New Contributor II
  • 0 kudos

What exactly is the "databricks_token" that you are using? If it's a personal access token generated from a user account again, I don't think you are solving the problem Sam is facing.

  • 0 kudos
9 More Replies
