Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

aladda
by Databricks Employee
  • 5095 Views
  • 2 replies
  • 0 kudos
Latest Reply
aladda
Databricks Employee
  • 0 kudos

Here's the difference between a View and a Table in the context of a Delta Live Tables pipeline. Views are similar to a temporary view in SQL and are an alias for some computation. A view allows you to break a complicated query into smaller or easier-to-understan...

1 More Replies
BillBishop
by New Contributor III
  • 664 Views
  • 1 reply
  • 0 kudos

Resolved! Using initcap function in materialized view fails

This query works: select order_date, initcap(customer_name), count(*) AS number_of_orders from ... The initcap does as advertised and capitalizes the customer_name column. However, if I wrap the same exact select in a create materialized view I get an...

Latest Reply
BillBishop
New Contributor III
  • 0 kudos

NOTE: I got it to work by aliasing the customer_name column; it's documented here: https://docs.databricks.com/aws/en/sql/language-manual/sql-ref-syntax-ddl-create-materialized-view#limitations However, it wasn't clear that "Non-column reference expre...
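A minimal sketch of the fix described in this reply, assuming a Spark session in a Databricks notebook; table and column names are illustrative, not from the original post:

```python
# Per the materialized-view limitations doc, a non-column expression such as
# initcap(customer_name) must be given an explicit alias in the view definition.

failing_sql = """
CREATE MATERIALIZED VIEW orders_by_customer AS
SELECT order_date, initcap(customer_name), count(*) AS number_of_orders
FROM orders
GROUP BY order_date, customer_name
"""

working_sql = """
CREATE MATERIALIZED VIEW orders_by_customer AS
SELECT order_date, initcap(customer_name) AS customer_name, count(*) AS number_of_orders
FROM orders
GROUP BY order_date, customer_name
"""

# In a notebook you would then run: spark.sql(working_sql)
```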

devpdi
by New Contributor
  • 3010 Views
  • 3 replies
  • 0 kudos

Re-use jobs as tasks with the same cluster.

Hello, I am facing an issue with my workflow. I have a job (call it the main job) that, among others, runs 5 concurrent tasks, which are defined as jobs (not notebooks). Each of these jobs is identical to the others (call them sub-job-1), with the only diff...

Latest Reply
razi9126
New Contributor II
  • 0 kudos

Did you find any solution?

2 More Replies
diguid
by New Contributor III
  • 5188 Views
  • 3 replies
  • 13 kudos

Using foreachBatch within Delta Live Tables framework

Hey there! I was wondering if there's any way of declaring a delta live table where we use foreachBatch to process the output of a streaming query. Here's a simplification of my code: def join_data(df_1, df_2): df_joined = ( df_1 ...

Latest Reply
cgrant
Databricks Employee
  • 13 kudos

foreachBatch support in DLT is coming soon, and you now have the ability to write to non-DLT sinks as well

2 More Replies
shan-databricks
by New Contributor III
  • 4077 Views
  • 1 reply
  • 0 kudos

LEGACY_ERROR_TEMP_DELTA_0007 A schema mismatch detected when writing to the Delta table.

Need help to resolve the issue. Error: com.databricks.sql.transaction.tahoe.DeltaAnalysisException: [_LEGACY_ERROR_TEMP_DELTA_0007] A schema mismatch detected when writing to the Delta table. I am using the below code and my JSON is dynamically changi...

Latest Reply
cgrant
Databricks Employee
  • 0 kudos

For datasets with constantly changing schemas, we recommend using the Variant type.
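In Databricks SQL the Variant approach the reply recommends would use a VARIANT column populated with parse_json. The underlying idea can be sketched in plain Python: keep stable fields as real columns and park the full document in one flexible column, so new keys never break the write path (document contents below are illustrative):

```python
import json

# Two documents whose schemas differ: writing both against one fixed schema
# is what trips the schema-mismatch error above.
doc_v1 = '{"id": 1, "name": "widget"}'
doc_v2 = '{"id": 2, "name": "gadget", "tags": ["new"]}'

# Variant-style layout: a stable "id" column plus one flexible payload column.
rows = [{"id": json.loads(d)["id"], "payload": d} for d in (doc_v1, doc_v2)]

# New keys are extracted on read instead of enforced on write.
tags = json.loads(rows[1]["payload"]).get("tags", [])  # [] for docs without tags
```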

thackman
by New Contributor III
  • 1171 Views
  • 1 reply
  • 0 kudos

Inconsistent handling of null structs vs structs with all null values.

Summary: We have a weird behavior with structs that we have been trying (unsuccessfully) to track down. We have a struct column in a silver table that should only have data for 1 in every 500 records. It's normally null. But for about 1 in every 50 r...

Latest Reply
cgrant
Databricks Employee
  • 0 kudos

Here are some strategies for debugging this:
  • Before you perform each merge, write your source dataframe out as a table, and include the target table's version in the table's name.
  • If possible, enable the change data feed on your table so as to see chan...
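A sketch of the first suggestion; the helper and all table names are hypothetical, not from the original thread:

```python
# Hypothetical helper: snapshot the merge source under a name that embeds
# the target Delta table's current version, so each merge's input is kept.
def debug_source_table_name(target_table: str, target_version: int) -> str:
    return f"{target_table}_merge_source_v{target_version}"

# Illustrative notebook usage (queries and names are assumptions):
# version = spark.sql("DESCRIBE HISTORY silver.events LIMIT 1").first()["version"]
# source_df.write.saveAsTable(debug_source_table_name("silver_events", version))
```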

sachamourier
by Contributor
  • 1859 Views
  • 1 reply
  • 0 kudos

Install Python libraries on Databricks job cluster

Hello, I am trying to install a wheel file and a requirements.txt file from my Unity Catalog Volumes on my Databricks job cluster using an init script, but the results are very inconsistent. Has anyone ever faced this? What's wrong with my approa...

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @sachamourier, could you please clarify what the inconsistency is? Are some packages missing, or was an incorrect library loaded?

LeenB
by New Contributor
  • 555 Views
  • 1 reply
  • 0 kudos

Running a notebook as 'Run all below' when scheduled via Azure Data Factory

We have a notebook with a lot of subsequent cells that can run independently of each other. When we execute the notebook manually via 'Run all', the run stops when an error is thrown. When we execute manually via 'Run all below', the run proceeds ti...

Latest Reply
PiotrMi
Contributor
  • 0 kudos

Hi @LeenB, you can wrap the code of each cell in a try/except block. Example below:
try:
    print("Hello world")
    # your code for each cell
except Exception as e:
    print("Issue with printing hello world")
For sure it is not recommended ...

data_mifflin
by New Contributor III
  • 1369 Views
  • 6 replies
  • 1 kudos

Accessing Job parameters using cluster v15.4

After upgrading the Databricks cluster to version 15.4, is there any way to access job parameters in a notebook other than the following? dbutils.widgets.get("parameter_name") In v15.4, dbutils.notebook.entry_point.getCurrentBindings() has been discontinued...

Latest Reply
Pawan1979
New Contributor II
  • 1 kudos

For me it is working on 15.4 LTS (includes Apache Spark 3.5.0, Scala 2.12).

5 More Replies
JW_99
by New Contributor II
  • 1004 Views
  • 2 replies
  • 2 kudos

PySparkRuntimeError: [CONTEXT_ONLY_VALID_ON_DRIVER]

I've troubleshot this 20+ times. I am aware that the current code is causing the Spark session to be passed to the workers, when it should only be used on the driver. Can someone please help me resolve this (the schema is defined earlier)?--...

Latest Reply
narasimha_reddy
New Contributor II
  • 2 kudos

You cannot use the Spark session explicitly inside executor logic. Here you are using mapPartitions, which makes the custom logic execute inside an executor thread. Either you need to change the whole approach to segregate Spark session usag...
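The restructuring this reply describes can be illustrated without Spark. This is a pure-Python sketch of the principle (everything a mapPartitions function references must be serialized to the executors), not Spark's actual serialization path; the class and names are stand-ins:

```python
import pickle

class DriverOnly:
    """Stand-in for a SparkSession: exists only on the driver, refuses serialization."""
    def __reduce__(self):
        raise TypeError("SparkSession cannot be shipped to executors")

session = DriverOnly()

def shippable(obj) -> bool:
    """True if the object can be serialized for sending to an executor."""
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        return False

# Broken pattern: the partition function would need the session on the executor,
# which is what CONTEXT_ONLY_VALID_ON_DRIVER guards against.
# Fixed pattern: do the session-dependent work on the driver first, then make
# the partition function close over plain, serializable data only.
lookup = {"scale": 10}  # e.g. collected on the driver from a small dimension table

def good_partition_fn(rows):
    return [r * lookup["scale"] for r in rows]
```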

1 More Replies
adhi_databricks
by Contributor
  • 2101 Views
  • 4 replies
  • 0 kudos

Connect snowflake to Databricks

Hey Folks, I just want to know if there is a way to mirror Snowflake tables in Databricks, meaning creating a table using format snowflake and giving the table options (host, user, pwd and dbtable in Snowflake). I just tried it as per this code bel...

Latest Reply
adhi_databricks
Contributor
  • 0 kudos

Hi @Alberto_Umana, just a quick question: would we be able to change table properties like adding column details, column tagging and column-level masking on the Snowflake tables that are under the foreign catalog created?

3 More Replies
nikhilkumawat
by New Contributor III
  • 18939 Views
  • 11 replies
  • 15 kudos

Resolved! Get file information while using "Trigger jobs when new files arrive" https://docs.databricks.com/workflows/jobs/file-arrival-triggers.html

I am currently trying to use the "Trigger jobs when new files arrive" feature in one of my projects. I have an S3 bucket in which files arrive on random days. So I created a job and set the trigger to the "file arrival" type. And within the no...

Latest Reply
Jaison
New Contributor III
  • 15 kudos

Issue with Databricks File Arrival Trigger – Missing File Name Information: The File Arrival Trigger in Databricks is practically useless if it does not provide the file name and path of the triggering file. In Azure Blob Storage triggers (Function App...

10 More Replies
jeremy98
by Honored Contributor
  • 1259 Views
  • 4 replies
  • 0 kudos

Resolved! how to read excel files inside a databricks notebook?

Hi community, is it possible to read Excel files from DBFS using a notebook inside Databricks? If yes, how do I do it?

Latest Reply
jeremy98
Honored Contributor
  • 0 kudos

Amazing, yes, that's totally what I need! Thanks Stefan!

3 More Replies
jakub_adamik
by New Contributor III
  • 2288 Views
  • 2 replies
  • 0 kudos

Resolved! Delta Live Tables - BAD_REQUEST: Pipeline cluster is not reachable.

Hi all, I have a very simple pipeline:
-- Databricks notebook source
CREATE OR REFRESH STREAMING TABLE `catalog-prod`.default.dlt_table AS
SELECT * FROM STREAM read_files('/Volumes/catalog-prod/storage/*', format => 'json')
-- COMMAND ----------
CREATE...

Latest Reply
jakub_adamik
New Contributor III
  • 0 kudos

Hi, thank you for your response. In the meantime I found the bug in the Databricks UI which caused this behaviour. I will raise a ticket with Databricks. Please see the draft of the ticket below for a workaround: We're facing an issue with Delta Live Tables pip...

1 More Replies
wilmorlserios
by New Contributor
  • 834 Views
  • 1 reply
  • 0 kudos

Using databricks-sql-connector in Notebook

I am attempting to use the databricks-sql-connector Python package within a generalised application deployed to run within a Databricks notebook. Upon attempting to import it, I am receiving a module-not-found error. However, the package is visible ...

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @wilmorlserios, the import is incorrect. It should be: from databricks import sql
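A minimal usage sketch for the corrected import, assuming databricks-sql-connector is installed on the cluster; the hostname, HTTP path, and token are placeholders you must supply:

```python
# Guarded import so the sketch degrades gracefully where the package is absent.
try:
    from databricks import sql
except ImportError:  # databricks-sql-connector not installed in this environment
    sql = None

def run_query(server_hostname: str, http_path: str, access_token: str, query: str):
    """Run one query against a SQL warehouse and return all rows."""
    if sql is None:
        raise RuntimeError("pip install databricks-sql-connector first")
    with sql.connect(server_hostname=server_hostname,
                     http_path=http_path,
                     access_token=access_token) as connection:
        with connection.cursor() as cursor:
            cursor.execute(query)
            return cursor.fetchall()
```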

