Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

john77
by New Contributor
  • 73 Views
  • 4 replies
  • 1 kudos

Why ETL Pipelines and Jobs

I notice that ETL Pipelines let you use declarative SQL syntax, such as DLT tables, but you can do the same with Jobs if you use SQL as your task type. So why, and when, should you use ETL Pipelines?

Latest Reply
saurabh18cs
Honored Contributor II

Hi @john77. SQL task type: simple, one-off SQL operations or batch jobs, or when you need to orchestrate a mix of notebooks, Python/Scala code, and SQL in a single workflow. Lakeflow Declarative Pipelines: complex, production ETL jobs that require lineage, mon...

  • 1 kudos
3 More Replies
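For context on the declarative style the reply contrasts with a plain SQL task, here is a minimal sketch of a Lakeflow Declarative Pipelines (DLT) table. The `raw_orders` source and column names are hypothetical, and the code only runs inside a pipeline, which is what provides the lineage and monitoring the reply mentions.

```python
import dlt
from pyspark.sql import functions as F

@dlt.table(comment="Cleaned orders, materialised and monitored by the pipeline engine")
@dlt.expect_or_drop("valid_amount", "amount > 0")  # declarative data-quality rule
def orders_clean():
    # hypothetical source table; the pipeline tracks lineage for this dependency
    return (spark.read.table("raw_orders")
                 .withColumn("loaded_at", F.current_timestamp()))
```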
GJ2
by New Contributor II
  • 8608 Views
  • 9 replies
  • 1 kudos

Install the ODBC Driver 17 for SQL Server

Hi, I am not a data engineer. I want to connect to SSAS; it looks like that can be done through pyodbc. However, it seems I need to install "ODBC Driver 17 for SQL Server" using the following command. How do I install the driver on the cluster an...

[attachment: GJ2_1-1739798450883.png]
Latest Reply
ghoriimanki

The format of the `table_name` argument you're supplying to the `jdbc_writer` method appears to be the cause of the issue you're seeing. A string containing exactly one period is expected, so it can be split into two pieces by the line `schema, table = table_n...

  • 1 kudos
8 More Replies
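The usual route for the question above is a cluster-scoped init script that installs Microsoft's driver package. A rough sketch of the same steps, run from a notebook as Python for a one-off test; this assumes an Ubuntu-based cluster and root access on the driver node.

```python
import subprocess

# Steps from Microsoft's published instructions for msodbcsql17 on Ubuntu;
# in production these normally live in a cluster-scoped init script.
cmds = [
    "curl -fsSL https://packages.microsoft.com/keys/microsoft.asc | apt-key add -",
    "curl -fsSL https://packages.microsoft.com/config/ubuntu/20.04/prod.list"
    " > /etc/apt/sources.list.d/mssql-release.list",
    "apt-get update",
    "ACCEPT_EULA=Y apt-get install -y msodbcsql17 unixodbc-dev",
]
for cmd in cmds:
    subprocess.run(["bash", "-c", cmd], check=True)
```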
noorbasha534
by Valued Contributor II
  • 310 Views
  • 5 replies
  • 0 kudos

Figure out stale tables/folders being loaded by auto-loader

Hello all. We have a pipeline which uses Auto Loader to load data from cloud object storage (ADLS) into a Delta table. We use directory listing at the moment, and there are around 20,000 folders to be verified in ADLS every 30 minutes to check for new data...

Latest Reply
noorbasha534
Valued Contributor II

@szymon_dybczak ah sorry, let me rephrase. I tried the command initially on the Delta table directly; that resulted in the error. Then I tried it on the checkpoint. It did give me results, though with null for all the rows. Still, this does not s...

  • 0 kudos
4 More Replies
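One way to approach the stale-folder question above: Auto Loader's `cloud_files_state` table-valued function exposes the files already discovered from a stream's checkpoint, so the latest commit time per folder can be compared against a staleness threshold. The checkpoint path below is hypothetical.

```python
from pyspark.sql import functions as F

state = spark.sql("SELECT * FROM cloud_files_state('/mnt/checkpoints/my_stream')")
(state
 .withColumn("folder", F.regexp_replace("path", "/[^/]+$", ""))  # strip the file name
 .groupBy("folder")
 .agg(F.max("commit_time").alias("last_file_committed"))
 .orderBy("last_file_committed")                                 # stalest folders first
 .show(truncate=False))
```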
Brahmareddy
by Esteemed Contributor
  • 15 Views
  • 0 replies
  • 0 kudos

How Databricks Helped Me See Data Engineering Differently

Over the years working as a data engineer, I’ve started to see my role very differently. In the beginning, most of my focus was on building pipelines—extracting, transforming, and loading data so it could land in the right place. Pipelines were the g...

devagya
by New Contributor
  • 1013 Views
  • 3 replies
  • 1 kudos

Infor Data Lake to Databricks

I'm working on a project which involves moving data from Infor to Databricks. Infor is somewhat of an enterprise solution, and I could not find many resources on this. I could not even find a free trial option on their site. If anyone has experience w...

Latest Reply
Shirlzz
New Contributor II

I specialise in data migration with Infor. What is your question: how to connect Databricks to the Infor Data Lake through the data fabric?

  • 1 kudos
2 More Replies
leireroman
by New Contributor III
  • 2510 Views
  • 2 replies
  • 2 kudos

Resolved! DBR 16.4 LTS - Spark 3.5.2 is not compatible with Delta Lake 3.3.1

I'm migrating to Databricks Runtime 16.4 LTS, which uses Spark 3.5.2 and Delta Lake 3.3.1 according to the documentation (Databricks Runtime 16.4 LTS - Azure Databricks | Microsoft Learn). I've upgraded my conda environment to use those versions, bu...

[attachment: Captura de pantalla 2025-06-09 084355.png]
Latest Reply
SamAdams
Contributor

@leireroman I encountered the same and used an override (like a pip constraints.txt file or a PDM resolution override) to make sure my local development environment matched the runtime.

  • 2 kudos
1 More Replies
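A minimal sketch of the override approach SamAdams describes, assuming the versions listed in the DBR 16.4 LTS release notes: install pyspark first, then delta-spark without dependency resolution, so pip cannot bump pyspark past the runtime's version.

```python
import subprocess
import sys

def pip(*args: str) -> None:
    subprocess.check_call([sys.executable, "-m", "pip", *args])

pip("install", "pyspark==3.5.2")                   # match the runtime's Spark
pip("install", "--no-deps", "delta-spark==3.3.1")  # skip its stricter pyspark pin
```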
adrianhernandez
by New Contributor III
  • 41 Views
  • 2 replies
  • 1 kudos

Resolved! Folder execute permissions

Hello. After reading multiple posts, going through online forums, and even asking AI, I still don't have an answer to my questions. On the latest Databricks with Unity Catalog, what happens if I give users Execute permissions on a folder? Can they view the co...

Latest Reply
adrianhernandez
New Contributor III

Thanks for your response. That's what I imagined, although I could not confirm it, as my current project uses Unity Catalog and we are not allowed to run many commands, including ACL-related PySpark code.

  • 1 kudos
1 More Replies
bbastian
by Visitor
  • 42 Views
  • 1 reply
  • 0 kudos

[VARIANT_SIZE_LIMIT] Cannot build variant bigger than 16.0 MiB in parse_json

I have a table coming from PostgreSQL, with one column containing JSON data in string format. We have been using parse_json to convert that to a variant column, but lately it is failing with the SIZE_LIMIT error. When I isolated the row which gave er...

Latest Reply
szymon_dybczak
Esteemed Contributor III

Hi @bbastian, unfortunately, as of now there is a strict size limitation: a variant column cannot contain a value larger than 16 MiB (see Variant support in Delta Lake | Databricks on AWS). And tbh you cannot compare the size of this JSON string to ...

  • 0 kudos
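Given the 16 MiB limit the reply cites, one way to keep the job running on recent runtimes is `try_parse_json`, which yields NULL instead of raising, so the offending rows can be isolated. Table and column names here are hypothetical, and `length()` counts characters, so it only approximates byte size.

```python
oversized = spark.sql("""
    SELECT id, length(payload) AS payload_chars      -- rough size proxy
    FROM raw_events
    WHERE payload IS NOT NULL
      AND try_parse_json(payload) IS NULL            -- rows that cannot convert,
""")                                                 -- including oversized ones
oversized.show()
```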
sslyle
by New Contributor III
  • 7371 Views
  • 9 replies
  • 5 kudos

Resolved! Combining multiple Academy profiles

I have this profile @gmail.com, my personal professional profile. I also have a @mycompany.com profile. How do I combine both so I can leave my current job for a better life without losing the accolades I've accumulated under my @mycompany.com login, giv...

Latest Reply
jChantoHdz
Visitor

I have the same question: how can this be done?

  • 5 kudos
8 More Replies
QuanSun
by New Contributor II
  • 1005 Views
  • 3 replies
  • 1 kudos

How to select performance mode for Databricks Delta Live Tables

Hi everyone. Based on the official docs: For triggered pipelines, you can select the serverless compute performance mode using the Performance optimized setting in the pipeline scheduler. When this setting is disabled, the pipeline uses standard perfor...

Latest Reply
BF7
Contributor

I would like an answer to this question as well. I need to see how to turn this off, but no checkbox relating to performance optimization shows up in my serverless pipeline.

  • 1 kudos
2 More Replies
EricCournarie
by New Contributor III
  • 89 Views
  • 6 replies
  • 9 kudos

ResultSet metadata does not return correct type for TIMESTAMP_NTZ

Hello. Using the JDBC driver, when I retrieve the metadata of a ResultSet, the type reported for a TIMESTAMP_NTZ column is not correct (it comes back as TIMESTAMP). My SQL is a simple SELECT * on a table that has a TIMESTAMP_NTZ column. This works when retrieving metad...

Latest Reply
EricCournarie
New Contributor III

Hi, thanks for the response! OK, since it was working on table metadata, I thought the doc was not up to date, so it's partially supported. Do you know if there is any chance it will be fully supported in the near future? Thanks

  • 9 kudos
5 More Replies
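A quick cross-check for the metadata question above: `typeof()` reports the engine-side type, which can be compared with whatever the driver's ResultSetMetaData returns for the same column. The table and column names are hypothetical.

```python
row = spark.sql(
    "SELECT typeof(ts_local) AS engine_type FROM events LIMIT 1"
).first()
print(row["engine_type"])  # expected to print 'timestamp_ntz' on the engine side
```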
gzr58l
by New Contributor
  • 47 Views
  • 1 reply
  • 0 kudos

How to set up the Lakeflow HTTP connector with M2M authentication

I am getting the following error about content type, with no option to pick a different content type when configuring the Lakeflow connector: The OAuth token exchange failed with HTTP status code 415 Unsupported Media Type. The returned server response ...

Latest Reply
saurabh18cs
Honored Contributor II

Hi @gzr58l, are you configuring a custom Lakeflow connector or an external connection in Databricks? Also, consider using a service principal or personal access token (PAT) for authentication as a temporary workaround.

  • 0 kudos
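For reference on the 415 itself: token endpoints generally require an application/x-www-form-urlencoded body for the client-credentials exchange. A sketch with `requests`; the endpoint and credentials are placeholders, not the Lakeflow connector's actual configuration.

```python
import requests

resp = requests.post(
    "https://login.example.com/oauth2/token",  # hypothetical token endpoint
    data={                                     # data= sends form-encoded, the media
        "grant_type": "client_credentials",    # type a 415 says the server expected
        "client_id": "<client-id>",
        "client_secret": "<client-secret>",
    },
    headers={"Accept": "application/json"},
    timeout=30,
)
resp.raise_for_status()
access_token = resp.json()["access_token"]
```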
data-grassroots
by New Contributor III
  • 52 Views
  • 2 replies
  • 0 kudos

ExcelWriter and local files

I have a couple of things going on here. First, to explain what I'm doing: I'm passing into a function an array of objects, each containing a dataframe. I want to write those dataframes to an Excel workbook, one dataframe per worksheet. That part ...

Latest Reply
data-grassroots
New Contributor III

Here's a pretty easy way to recreate the issue, simplified to ignore the ExcelWriter part... You can see the file is copied and shows up when listed, but it can't be found from pandas. Same behavior on local_disk0 and /tmp.

  • 0 kudos
1 More Replies
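The symptom described (visible to `dbutils.fs`, invisible to pandas) usually comes down to path schemes: `dbutils.fs` defaults to DBFS, while pandas only sees the driver's local filesystem. A sketch with hypothetical paths; `pd.read_excel` also assumes openpyxl is installed.

```python
import pandas as pd

# Copy from DBFS to the driver's local disk with an explicit file:/ scheme ...
dbutils.fs.cp("dbfs:/tmp/report.xlsx", "file:/tmp/report.xlsx")

# ... then read it via the plain POSIX path that pandas can actually see.
df = pd.read_excel("/tmp/report.xlsx")
```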
JeffSeaman
by New Contributor II
  • 432 Views
  • 8 replies
  • 1 kudos

Resolved! JDBC error when trying a getSchemas call.

Hi Community, I have a free demo version and can create a JDBC connection and get metadata (schema, table, and column structure info). Everything works as described in the docs, but when working with someone who has a paid version of Databricks, the s...

Latest Reply
BigRoux
Databricks Employee

@JeffSeaman , please let us know if any of my suggestions help get you on the right track. If they do, kindly mark the post as "Accepted Solution" so others can benefit as well. Cheers, Louis.

  • 1 kudos
7 More Replies
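For anyone reproducing the getSchemas call outside a vendor tool, a rough sketch via `jaydebeapi` on top of the Databricks JDBC driver; the host, HTTP path, token, and jar location are all placeholders, and behaviour may differ between free and paid workspaces, as the thread suggests.

```python
import jaydebeapi

conn = jaydebeapi.connect(
    "com.databricks.client.jdbc.Driver",
    "jdbc:databricks://<host>:443;httpPath=<http-path>;AuthMech=3;UID=token;PWD=<pat>",
    jars="/path/to/DatabricksJDBC42.jar",
)
rs = conn.jconn.getMetaData().getSchemas()  # raw java.sql.DatabaseMetaData call
while rs.next():
    print(rs.getString("TABLE_SCHEM"), rs.getString("TABLE_CATALOG"))
conn.close()
```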
