Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
I have a query whose result I'm trying to INSERT OVERWRITE into a table. In an effort to speed up the query I added a range join hint. After adding it I started getting the error below. I can get around this, though, by creating a temporary view of the ...
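A minimal PySpark sketch of the workaround described above, assuming hypothetical tables events, dim_ranges, and target_tbl: the range join hint lives inside a temporary view, and the INSERT OVERWRITE then selects from that view.

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Put the hinted range join inside a temporary view.
spark.sql("""
    CREATE OR REPLACE TEMPORARY VIEW hinted_result AS
    SELECT /*+ RANGE_JOIN(r, 10) */ e.*, r.label
    FROM events e
    JOIN dim_ranges r
      ON e.ts BETWEEN r.start_ts AND r.end_ts
""")

# Overwrite the target from the view instead of embedding the hint in the INSERT statement.
spark.sql("INSERT OVERWRITE TABLE target_tbl SELECT * FROM hinted_result")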
We want to use the INSERT INTO command with specific columns as specified in the official documentation. The only requirements for this are: Databricks SQL warehouse version 2022.35 or higher, or Databricks Runtime 11.2 and above, and the behaviour shou...
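For reference, a hedged sketch of the column-list INSERT the post refers to, run through PySpark against a hypothetical table named sales; columns omitted from the list are filled automatically (NULL, or their DEFAULT where supported).

# `spark` is the SparkSession provided by the Databricks notebook.
spark.sql("CREATE TABLE IF NOT EXISTS sales (id INT, amount DOUBLE, note STRING)")

# Insert by column name; the omitted column (note) is filled automatically.
spark.sql("INSERT INTO sales (id, amount) VALUES (1, 9.99), (2, 4.50)")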
Hi @Fusselmanwog, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers ...
Hello. While running some example cases to see how Databricks handles potentially incorrect actions on our part, we noticed that MERGE does not fail when inserting values whose data types differ from those of the corresponding table columns....
def upsertToDelta(microBatchOutputDF, batchId):
    # Expose the micro-batch as a temp view so it can be referenced from SQL.
    microBatchOutputDF.createOrReplaceTempView("updates")
    # Run the MERGE through the SparkSession that owns the temp view.
    microBatchOutputDF._jdf.sparkSession().sql("""
        MERGE INTO old o
        USING updates u
        ON u.id = o.id
        WHEN MATCHED THEN UPDATE SET *
        WHEN NOT MATCHED THEN INSERT *
    """)
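A hedged sketch of the behaviour described above, with made-up table and column names: the target column is INT, the incoming updates carry the value as STRING, and the MERGE completes by implicitly casting rather than raising a type error (what happens to values that cannot be cast depends on the store-assignment policy).

from pyspark.sql import Row

# `spark` is the SparkSession provided by the Databricks notebook.
spark.sql("CREATE TABLE IF NOT EXISTS old (id INT, qty INT) USING DELTA")
spark.sql("INSERT INTO old VALUES (1, 10)")

# qty arrives as a STRING in the micro-batch, not an INT.
updates = spark.createDataFrame([Row(id=1, qty="25"), Row(id=2, qty="7")])
updates.createOrReplaceTempView("updates")

spark.sql("""
    MERGE INTO old o
    USING updates u
    ON u.id = o.id
    WHEN MATCHED THEN UPDATE SET o.qty = u.qty
    WHEN NOT MATCHED THEN INSERT (id, qty) VALUES (u.id, u.qty)
""")
# The merge succeeds and the castable string values are stored as INTs.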
Hello,
I am attempting to append new JSON files to an existing Parquet table defined in Databricks.
The dataset is defined by this command (the dataframe is initially added to a temp table):
val output = sql("select headers.event_name, to_date(from_unix...
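The snippet above is Scala; a hedged PySpark sketch of the same append pattern follows, with hypothetical paths, columns, and table name (the original column expressions are only partially visible, so these are illustrative).

# `spark` is the SparkSession provided by the Databricks notebook.
new_events = (
    spark.read.json("/mnt/raw/events/*.json")
    .selectExpr(
        "headers.event_name",
        "to_date(from_unixtime(headers.event_ts)) AS event_date",
    )
)

# Appending by table name lets Spark resolve the existing table's format and schema;
# the incoming columns must line up with the target table's schema.
new_events.write.mode("append").saveAsTable("events_parquet")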
We came across a similar situation. We are using Spark 1.6.1 and have a daily load process that pulls data from Oracle and writes it out as Parquet files. This works fine for 18 days of data (until the 18th run); the problem comes after the 19th run, where the data frame l...
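For context, a hedged sketch of the kind of daily Oracle-to-Parquet load described above, written with the current DataFrame API rather than Spark 1.6 syntax; the JDBC URL, credentials, query, and output path are all placeholders.

from pyspark.sql.functions import lit

load_date = "2024-01-19"  # hypothetical run date

# Pull one day of data from Oracle over JDBC.
daily = (
    spark.read.format("jdbc")
    .option("url", "jdbc:oracle:thin:@//dbhost:1521/ORCLPDB")
    .option("dbtable", f"(SELECT * FROM sales WHERE trunc(load_ts) = DATE '{load_date}')")
    .option("user", "etl_user")
    .option("password", "<password>")
    .load()
)

# Write the day's data into a date-partitioned Parquet dataset.
(
    daily.withColumn("load_date", lit(load_date))
    .write.mode("append")
    .partitionBy("load_date")
    .parquet("/mnt/warehouse/sales_parquet")
)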