Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

jonathan-dufaul
by Valued Contributor
  • 583 Views
  • 1 reply
  • 0 kudos

Where to report a bug in the SQL formatter?

I was wondering where I go to report a bug in the SQL formatter. I tried sending an email to the helpdesk, but they think I'm asking for support. I'm not. I just want to report a bug in the application because I think they should know about it. I don'...

Latest Reply
Takuya-Omi
Valued Contributor III
  • 0 kudos

Hi @jonathan-dufaul, for such reports, I think it would be appropriate to click your profile icon in the top-right corner of the workspace and use the "Send Feedback" option.

jonathan-dufaul
by Valued Contributor
  • 3173 Views
  • 2 replies
  • 1 kudos

How do I specify column types when writing to an MSSQL server using the JDBC driver (

I have a PySpark dataframe that I'm writing to an on-prem MSSQL server--it's a stopgap while we convert data warehousing jobs over to Databricks. The processes that use those tables in the on-prem server rely on the tables maintaining the identical s...

Latest Reply
dasanro
New Contributor II
  • 1 kudos

It's happening to me too! Did you find any solution, @jonathan-dufaul? Thanks!!
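One option that often covers this is Spark's JDBC `createTableColumnTypes` write option, which overrides the default DDL types Spark emits when it creates the target table. A minimal sketch, assuming Spark is allowed to (re)create the table; the host, database, table, column types, and the `df`/credential variables are placeholders, not details from this thread:

```python
# Minimal sketch: pin the column types Spark uses when creating the MSSQL table.
# Host, database, table name, and column types below are illustrative only.
(df.write
    .format("jdbc")
    .option("url", "jdbc:sqlserver://onprem-host:1433;databaseName=dw")
    .option("dbtable", "dbo.target_table")
    .option("user", jdbc_user)
    .option("password", jdbc_password)
    # Used in Spark's CREATE TABLE DDL instead of the default type mapping
    .option("createTableColumnTypes",
            "id BIGINT, name VARCHAR(100), amount DECIMAL(18,2)")
    .mode("overwrite")
    .save())
```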

1 More Replies
Yaadhudbe
by New Contributor II
  • 681 Views
  • 1 reply
  • 0 kudos

AWS Databricks - Out of Memory issue in Delta Live Tables

I have been using Delta Live Tables for more than a year and have implemented a good number of DLT pipelines ingesting data from an S3 bucket using SQS. One of my pipelines processes a large volume of data. The DLT pipeline reads the data using CloudFiles...

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @Yaadhudbe, we would need to review your DLT setup, cluster settings, and Spark processing to better understand the OOM errors and possible suggestions to mitigate the issue. I suggest filing a case with us to conduct a proper investigation. http...

Maatari
by New Contributor III
  • 2450 Views
  • 3 replies
  • 0 kudos

AvailableNow Trigger and failure

Hi, I wonder what the behavior of Spark Structured Streaming is supposed to be when using the AvailableNow trigger and there is a query failure during the run. More specifically, what happens to the initially set end offset? Does it change? Wh...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

The AvailableNow trigger processes all available data as a single batch and then stops. This is different from continuous or micro-batch processing where the system continuously checks for new data. When a query starts with the AvailableNow trigger, ...
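To make the restart behavior concrete, here is a minimal sketch of an AvailableNow query with a checkpoint (source, target, and checkpoint paths are placeholders); the offsets committed to the checkpoint are what a restarted run resumes from:

```python
# Minimal sketch; all paths are placeholders.
stream = (spark.readStream
    .format("delta")
    .load("/mnt/source/events"))

(stream.writeStream
    .format("delta")
    # Offsets and commits land here; after a failure, a restart
    # resumes from this committed state rather than starting over.
    .option("checkpointLocation", "/mnt/checkpoints/events_availablenow")
    .trigger(availableNow=True)  # drain all currently available data, then stop
    .start("/mnt/target/events"))
```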

2 More Replies
Balram-snaplogi
by New Contributor II
  • 1325 Views
  • 1 reply
  • 1 kudos

How can we customize the access token expiry duration?

Hi, I am using OAuth machine-to-machine (M2M) authentication. I created a service principal and wrote a Java application that allows me to connect to the Databricks warehouse. My question is regarding the code below: String url = "jdbc:databricks://<se...

  • 1325 Views
  • 1 replies
  • 1 kudos
Latest Reply
Walter_C
Databricks Employee
  • 1 kudos

I would say that your token should be manually refreshed as mentioned in the following statement in docs: Databricks tools and SDKs that implement the Databricks client unified authentication standard will automatically generate, refresh, and use Dat...
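For comparison, in Python the Databricks SQL Connector can delegate token handling to a credentials provider, which generates and refreshes the M2M OAuth token automatically. A minimal sketch assuming the databricks-sql-connector and databricks-sdk packages; the hostname, warehouse ID, and service-principal credentials are placeholders:

```python
from databricks import sql
from databricks.sdk.core import Config, oauth_service_principal

SERVER_HOSTNAME = "<server-hostname>"  # placeholder

def credential_provider():
    # Service principal client ID/secret are placeholders
    config = Config(host=f"https://{SERVER_HOSTNAME}",
                    client_id="<sp-client-id>",
                    client_secret="<sp-client-secret>")
    return oauth_service_principal(config)

# The connector obtains and refreshes the OAuth token via the provider,
# so no manual token-refresh logic is needed in application code.
with sql.connect(server_hostname=SERVER_HOSTNAME,
                 http_path="/sql/1.0/warehouses/<sql-warehouse-id>",
                 credentials_provider=credential_provider) as connection:
    with connection.cursor() as cursor:
        cursor.execute("SELECT 1")
        print(cursor.fetchall())
```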

RateVan
by New Contributor II
  • 4585 Views
  • 4 replies
  • 0 kudos

Spark last window doesn't flush in append mode

The problem is very simple: when you use a TUMBLING window with append mode, the window is closed only when the next message arrives (+ watermark logic). In the current implementation, if you stop incoming streaming data, the last window will NEVER...
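A minimal sketch of the setup being described, assuming a Kafka source (broker, topic, and checkpoint path are placeholders); in append mode, the aggregate for a window is emitted only after the watermark passes the window's end, which requires a later event to arrive:

```python
from pyspark.sql.functions import col, count, window

# Placeholder Kafka source
events = (spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "events")
    .load()
    .select(col("timestamp"), col("value").cast("string")))

counts = (events
    .withWatermark("timestamp", "10 minutes")        # watermark logic
    .groupBy(window(col("timestamp"), "5 minutes"))  # tumbling window
    .agg(count("*").alias("n")))

# Append mode: a window's row is emitted only once the watermark moves
# past its end, so with no further input the last window never closes.
(counts.writeStream
    .outputMode("append")
    .format("console")
    .option("checkpointLocation", "/tmp/checkpoints/window_demo")  # placeholder
    .start())
```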

Latest Reply
Dtank
New Contributor II
  • 0 kudos

Do you have any solution for this?

3 More Replies
varunjaincse
by New Contributor III
  • 1524 Views
  • 2 replies
  • 1 kudos

Resolved! Databricks JDBC Driver making "List Column SQL" Query Every Time

I am trying to use the Databricks JDBC Spark Driver to run SQL queries on the SQL Warehouse. Sample connection string: String TOKEN = "<token>"; String HTTP_PATH = "/sql/1.0/warehouses/<sql-warehouse-id>"; final String connStr = "jdbc:spark://discover.clou...

Latest Reply
VZLA
Databricks Employee
  • 1 kudos

Hello, thank you for your question. The initial metadata query (the "Listing Columns" query) is tied to the SparkGetColumnsOperation class, which is part of the Apache Hive ThriftServer and Spark's handling of JDBC metadata operations. Can you please...

1 More Replies
roshanjoebenny
by New Contributor III
  • 1774 Views
  • 7 replies
  • 1 kudos

Unity Catalog

When I try to connect my local Postgres instance to Databricks Unity Catalog, I run into issues. Could you please explain the steps for doing that?

Latest Reply
-werners-
Esteemed Contributor III
  • 1 kudos

Databricks does not have connectivity to your local network out of the box. You should set up a VNet and VNet peering (and also firewall rules). Connect your Azure Databricks workspace to your on-premises network - Azure Databricks | Microsoft Learn. Netw...

6 More Replies
bmhardy
by New Contributor III
  • 3044 Views
  • 4 replies
  • 0 kudos

Hierarchy roll-up aggregation

I have just learnt that recursive CTEs are not supported in Databricks SQL; however, we are looking to shift the complex aggregations into Databricks instead of relying on Azure SQL DB. We are using Azure SQL DB with CDC enabled in combination with A...

Latest Reply
bmhardy
New Contributor III
  • 0 kudos

We have gone down a different route where we are using SQL for our calculated layer and then a Python notebook for our aggregated layer. It is much easier to roll up data using a Python UDF than it is trying to work out how to do it in SQL.
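The thread doesn't include the notebook itself, so as an illustration only, here is one hedged sketch of rolling a parent-child hierarchy up in PySpark with iterative joins instead of a recursive CTE; the table name, columns, and depth limit are all hypothetical:

```python
from pyspark.sql import functions as F

# Hypothetical schema: one row per node with (node_id, parent_id, amount);
# parent_id is NULL at the roots. Table name and depth limit are made up.
nodes = spark.table("calc.hierarchy")

# Each node contributes its own amount to itself...
rollup = nodes.select(F.col("node_id").alias("target_id"), "amount")
# ...and the same amount to each of its ancestors, found level by level.
frontier = nodes.select(F.col("parent_id").alias("target_id"), "amount")

for _ in range(10):  # assumed maximum hierarchy depth
    frontier = frontier.where(F.col("target_id").isNotNull())
    if frontier.isEmpty():
        break
    rollup = rollup.unionByName(frontier)
    # Move every remaining contribution one level up the tree
    frontier = (frontier.alias("f")
        .join(nodes.alias("n"), F.col("f.target_id") == F.col("n.node_id"))
        .select(F.col("n.parent_id").alias("target_id"), F.col("f.amount")))

# Each node's total now includes itself plus all of its descendants
totals = rollup.groupBy("target_id").agg(F.sum("amount").alias("rolled_up_amount"))
```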

3 More Replies
srinivas_001
by New Contributor III
  • 3393 Views
  • 3 replies
  • 1 kudos

Autoloader configuration with data type casting

Hi,
1. I am reading a Parquet file from AWS S3 storage using spark.read.parquet(<s3 path>).
2. An autoloader job has been configured to load this data into an external Delta table.
3. But before loading into this autoloader I need to do some typecasting o...

Latest Reply
cgrant
Databricks Employee
  • 1 kudos

Data can be added to the rescued data column when types do not match and when implicit casting does not work. However, check to see if the typecasting you're trying to do is supported by Delta Lake's type widening feature, which gives more flexibilit...
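As a hedged sketch of combining the two approaches (schema hints to steer inference, explicit casts for the rest); the paths, column names, and target table below are placeholders:

```python
from pyspark.sql import functions as F

# Placeholder paths, columns, and table names throughout.
raw = (spark.readStream
    .format("cloudFiles")
    .option("cloudFiles.format", "parquet")
    .option("cloudFiles.schemaLocation", "/mnt/schemas/orders")
    # Pin the types you care about during schema inference
    .option("cloudFiles.schemaHints", "order_id BIGINT, amount DECIMAL(18,2)")
    .load("s3://my-bucket/orders/"))

# Explicit casts before the write, for anything the hints don't cover
typed = raw.withColumn("event_ts", F.col("event_ts").cast("timestamp"))

(typed.writeStream
    .option("checkpointLocation", "/mnt/checkpoints/orders")
    .trigger(availableNow=True)
    .toTable("bronze.orders"))  # the external Delta table
```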

2 More Replies
oakhill
by New Contributor III
  • 6807 Views
  • 9 replies
  • 1 kudos

Is Delta Live Tables not supported anymore? How do I use it in Python?

Hi! Any time I try to import "dlt" in a notebook session to develop pipelines, I get an error message saying DLT is not supported on Spark Connect clusters. These are very generic clusters; I've tried runtimes 14, 15, and the latest 16, using shared clu...
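For context, `import dlt` is meant to execute inside a DLT pipeline run, not on an interactive cluster (including Spark Connect/shared clusters), which matches the error described. A minimal pipeline-side definition looks roughly like this, with the source path as a placeholder:

```python
import dlt  # resolves when the notebook runs inside a DLT pipeline

@dlt.table(comment="Hypothetical bronze table for illustration")
def bronze_events():
    # Placeholder source; Auto Loader is the common choice in DLT
    return (spark.readStream
        .format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/mnt/raw/events/"))
```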

Latest Reply
Louis_Frolio
Databricks Employee
  • 1 kudos

Oakhill, we do provide free onboarding training. You might be interested in the "Get Started with Data Engineering on Databricks" session. You can register here: https://www.databricks.com/training/catalog. When you are searching the catalog of traini...

8 More Replies
kurokaj
by New Contributor
  • 1646 Views
  • 1 reply
  • 0 kudos

DLT Autoloader stuck reading Avro files from Azure blob storage

I have a DLT pipeline joining data from streaming tables to metadata of Avro files located in Azure blob storage. The Avro files are loaded using autoloader. Up until 25.3 (about 20:00 UTC) the pipeline worked fine, but then suddenly got stuck in ini...

Labels: Data Engineering, autoloader, AVRO, dlt, LTS
Latest Reply
cgrant
Databricks Employee
  • 0 kudos

Based on your screenshot, a Spark job has started and 33/34 tasks have completed. This usually indicates some kind of skewed processing. Please refer to this documentation for help identifying and resolving skew.
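One standard mitigation worth checking alongside that documentation is adaptive query execution's skew-join handling; a minimal sketch (the thresholds shown are the stock Spark defaults, tune them for your data):

```python
# Let AQE detect and split skewed shuffle partitions during joins
spark.conf.set("spark.sql.adaptive.enabled", "true")
spark.conf.set("spark.sql.adaptive.skewJoin.enabled", "true")
# A partition counts as skewed when it exceeds both this multiple of the
# median partition size and the byte threshold below (Spark defaults shown)
spark.conf.set("spark.sql.adaptive.skewJoin.skewedPartitionFactor", "5")
spark.conf.set("spark.sql.adaptive.skewJoin.skewedPartitionThresholdInBytes", "256MB")
```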

Nathant93
by New Contributor III
  • 43992 Views
  • 1 reply
  • 1 kudos

Autoloader exclude one directory

Hi, I have a bunch of CSV files in directories within an Azure blob container and I am using autoloader to ingest them into a raw (bronze) table; all CSVs apart from one have the same schema. Is there a way to get autoloader to ignore the directory wi...

Latest Reply
cgrant
Databricks Employee
  • 1 kudos

Auto Loader accepts globs as input, including negative globs. You can use this to exclude a directory as long as the path is known ahead of time.
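A hedged sketch of what that can look like (the container, directory names, and schema path are placeholders): brace alternation lists only the conforming directories, and negated character classes like `[^x]` can exclude by a known prefix:

```python
# Placeholder container and directory names: ingest orders/ and returns/,
# leaving the odd-schema directory out of the glob entirely.
df = (spark.readStream
    .format("cloudFiles")
    .option("cloudFiles.format", "csv")
    .option("cloudFiles.schemaLocation", "/mnt/schemas/bronze_raw")
    .option("header", "true")
    .load("abfss://raw@myaccount.dfs.core.windows.net/{orders,returns}/*.csv"))
```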

Menegat
by New Contributor
  • 1511 Views
  • 1 reply
  • 0 kudos

VACUUM seems to be deleting Autoloader's log files.

Hello everyone, I have a workflow setup that updates a few Delta tables incrementally with autoloader three times a day. Additionally, I run a separate workflow that performs VACUUM and OPTIMIZE on these tables once a week. The issue I'm facing is that...

Latest Reply
cgrant
Databricks Employee
  • 0 kudos

The error message suggests that autoloader's state is being improperly deleted, most likely by a separate process. If your checkpoint exists inside the root of a Delta table, then VACUUM can delete its files. Make sure that you do not store checkp...
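A minimal sketch of the safe layout (paths are placeholders): the checkpoint and schema location live outside the table root, so VACUUM on the table cannot touch them:

```python
TABLE_PATH = "/mnt/delta/sales"                    # table root that gets VACUUMed
CHECKPOINT = "/mnt/checkpoints/sales_autoloader"   # deliberately NOT under TABLE_PATH

(spark.readStream
    .format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", CHECKPOINT)
    .load("/mnt/raw/sales/")                       # placeholder source
    .writeStream
    .option("checkpointLocation", CHECKPOINT)
    .trigger(availableNow=True)
    .start(TABLE_PATH))
```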

Andolina
by New Contributor III
  • 606 Views
  • 1 reply
  • 0 kudos

Connectivity failure to on-prem databases

Hi all, we have more than 100 jobs right now in Databricks which connect to on-prem databases like Oracle. Connections to Oracle are made through notebooks using the JDBC thin client and com.oracle.ojdbc:ojdbc10:19.3.0.0 or com.oracle.ojdbc:ojdbc8:19.3...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

Do you have any custom DNS in your setup? If so, are you aware of any changes being made to the entries pointing to the databases?

