- 6215 Views
- 11 replies
- 3 kudos
I am trying to do a dynamic partition overwrite on a Delta table using the replaceWhere option. This was working fine until I upgraded the DB runtime to 9.1 LTS from 8.3.x. I am concatenating 'year', 'month' and 'day' columns and then using the to_date functio...
Latest Reply
SELECT TO_DATE('20250217','YYYYMMDD'); gives the error: PARSE_SYNTAX_ERROR syntax error at or near 'select'. SQLSTATE: 42601. In DataGrip it works with no problem and displays the date.
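A likely cause: Databricks Runtime 9.1 uses Spark 3's Java-style datetime patterns, where year and day-of-month are lowercase ('yyyyMMdd'); 'YYYY' means week-year and the newer parser rejects it. A minimal PySpark sketch of the setup the question describes, assuming df already holds the source data (column and path names are hypothetical):

from pyspark.sql import functions as F

# Build a date column from the year/month/day parts; 'yyyy-M-d' tolerates
# unpadded months and days.
df = df.withColumn(
    "date",
    F.to_date(F.concat_ws("-", "year", "month", "day"), "yyyy-M-d")
)

# Overwrite only the matching partitions via replaceWhere.
(df.write.format("delta")
   .mode("overwrite")
   .option("replaceWhere", "date >= '2025-02-01' AND date < '2025-03-01'")
   .save("/mnt/delta/events"))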
10 More Replies
by diguid • New Contributor III
- 3565 Views
- 3 replies
- 13 kudos
Hey there! I was wondering if there's any way of declaring a Delta Live Table where we use foreachBatch to process the output of a streaming query. Here's a simplification of my code:
def join_data(df_1, df_2):
df_joined = (
df_1
...
Latest Reply
foreachBatch support in DLT is coming soon, and you now have the ability to write to non-DLT sinks as well.
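Until that lands, the usual workaround is a plain Structured Streaming job outside DLT. A sketch of that pattern, with hypothetical table names and a simplified join standing in for the real transformation:

def join_data(df_1, df_2):
    # Simplified stand-in for the question's transformation.
    return df_1.join(df_2, "id")

def process_batch(micro_batch_df, batch_id):
    dim_df = spark.read.table("dim_table")  # hypothetical static side of the join
    joined = join_data(micro_batch_df, dim_df)
    joined.write.format("delta").mode("append").saveAsTable("target_table")

(spark.readStream.table("source_table")
     .writeStream
     .foreachBatch(process_batch)
     .option("checkpointLocation", "/tmp/checkpoints/join_stream")
     .start())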
2 More Replies
- 29865 Views
- 6 replies
- 10 kudos
Latest Reply
Even if I vacuum and optimize, it keeps getting stuck. Cluster type is r6gd.xlarge (min: 4, max: 6); driver type is r6gd.2xlarge.
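For reference, the maintenance commands mentioned in the reply, as they are commonly run (table name is hypothetical):

# Compact small files and co-locate data, then clean up unreferenced files.
spark.sql("OPTIMIZE my_db.my_table ZORDER BY (event_date)")
spark.sql("VACUUM my_db.my_table RETAIN 168 HOURS")  # 7 days, the default retention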
5 More Replies
- 10427 Views
- 12 replies
- 6 kudos
Suppose I have a Delta Live Tables framework with 2 tables: Table 1 ingests from a JSON source, Table 2 reads from Table 1 and runs some transformation. In other words, the data flow is JSON source -> Table 1 -> Table 2. Now if I find some bugs in the...
Latest Reply
Answering my own question: nowadays (February 2024) this can all be done via the UI. When viewing your DLT pipeline there is a "Select tables for refresh" button in the header. If you click this, you can select individual tables, and then in the botto...
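The same selective refresh can be triggered through the REST API; a hedged sketch using the pipeline update endpoint's full_refresh_selection field (host, token, pipeline ID, and table name are hypothetical):

import requests

host = "https://<workspace>.cloud.databricks.com"
resp = requests.post(
    f"{host}/api/2.0/pipelines/<pipeline-id>/updates",
    headers={"Authorization": "Bearer <token>"},
    json={"full_refresh_selection": ["table_2"]},  # fully refresh Table 2 only
)
resp.raise_for_status()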
11 More Replies
- 24308 Views
- 9 replies
- 5 kudos
I have generated a result using SQL. But whenever I try to download the full result (1 million rows), it throws a SparkException. I can download the preview result but not the full result. Why? What happens under the hood when I try to download ...
Latest Reply
ac567 • New Contributor III
Job aborted due to stage failure: Task 6506 in stage 46.0 failed 4 times, most recent failure: Lost task 6506.3 in stage 46.0 (TID 12896) (10.**.***.*** executor 12): java.lang.OutOfMemoryError: Cannot reserve 4194304 bytes of direct buffer memory (a...
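A common workaround for the download limit is to write the result set to storage and fetch the files from there instead of going through the UI; a sketch with hypothetical table and volume names:

result_df = spark.sql("SELECT * FROM my_db.big_result")
(result_df.coalesce(1)  # single CSV part; drop this for very large results
          .write.mode("overwrite")
          .option("header", "true")
          .csv("/Volumes/main/default/exports/big_result"))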
8 More Replies
- 4389 Views
- 6 replies
- 2 kudos
If I were to stop a rather large job run, say halfway through execution, will any actions performed on our Delta tables persist or will they be rolled back? Are there any other risks that I need to be aware of in terms of cancelling a job run half way t...
Latest Reply
Hi, is there any way to ensure transaction control in the Delta protocol in 2024 across tables for failing jobs?
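Delta commits are atomic per table, but the protocol has no cross-table transactions. One mitigation for failing or cancelled jobs is Delta's idempotent-write options, which let a re-run skip a batch that already committed; a sketch with hypothetical names:

(df.write.format("delta")
   .option("txnAppId", "nightly_job")  # stable id for this writer
   .option("txnVersion", 42)           # monotonically increasing per run
   .mode("append")
   .save("/mnt/delta/table_a"))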
5 More Replies
- 13271 Views
- 10 replies
- 4 kudos
Hi Community, I have successfully run a job through the API but would need to be able to pass parameters (configuration) to the DLT workflow via the API. I have tried passing JSON in this format:{
"full_refresh": "true",
"configuration": [
...
Latest Reply
You cannot pass parameters from a Databricks job to a DLT pipeline. At least not yet. You can see from the DLT REST API that there is no option for it to accept any parameters. But there is a workaround. With the assumption tha...
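The workaround typically looks like this: read the pipeline spec, merge the desired parameters into its configuration, write it back, then start an update. A hedged sketch (host, token, pipeline ID, and the run_date key are hypothetical, and the PUT body must carry the full spec, not a partial patch):

import requests

host = "https://<workspace>.cloud.databricks.com"
headers = {"Authorization": "Bearer <token>"}
pipeline_url = f"{host}/api/2.0/pipelines/<pipeline-id>"

spec = requests.get(pipeline_url, headers=headers).json()["spec"]
spec["configuration"] = {**spec.get("configuration", {}), "run_date": "2024-01-01"}
requests.put(pipeline_url, headers=headers, json=spec)

requests.post(f"{pipeline_url}/updates", headers=headers, json={"full_refresh": True})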
9 More Replies
- 3469 Views
- 5 replies
- 0 kudos
Working with Delta files in Spark Structured Streaming, what is the maximum default chunk size in each batch? How do I identify this type of Spark configuration in Databricks? #[Databricks SQL] #[Spark streaming] #[Spark structured streaming] #Spark
Latest Reply
doc - https://docs.databricks.com/en/structured-streaming/delta-lake.html
Also, what is the challenge while using foreachBatch?
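Per the linked doc, a Delta streaming source sizes its micro-batches with maxFilesPerTrigger (default 1000 files) and maxBytesPerTrigger; a sketch of setting them, with hypothetical paths:

(spark.readStream.format("delta")
     .option("maxFilesPerTrigger", 500)   # cap files per micro-batch
     .option("maxBytesPerTrigger", "1g")  # soft cap on bytes per micro-batch
     .load("/mnt/delta/source")
     .writeStream
     .option("checkpointLocation", "/tmp/checkpoints/delta_stream")
     .start("/mnt/delta/target"))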
4 More Replies
- 3776 Views
- 5 replies
- 7 kudos
Following are the details of the requirement:
1. I am using a Databricks notebook to read data from a Kafka topic and write into an ADLS Gen2 container, i.e., my landing layer.
2. I am using Spark code to read data from Kafka and write into landing...
Latest Reply
Just to clarify, are you reading from Kafka and writing into ADLS as JSON files? As in, is each message from Kafka one JSON file in ADLS?
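For context, a minimal sketch of the Kafka-to-ADLS landing flow the question describes (broker, topic, and storage names are hypothetical):

(spark.readStream.format("kafka")
     .option("kafka.bootstrap.servers", "broker:9092")
     .option("subscribe", "events")
     .load()
     .selectExpr("CAST(value AS STRING) AS json_payload")
     .writeStream
     .format("json")
     .option("path", "abfss://landing@myaccount.dfs.core.windows.net/events/")
     .option("checkpointLocation", "abfss://landing@myaccount.dfs.core.windows.net/_checkpoints/events/")
     .start())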
4 More Replies
- 17671 Views
- 7 replies
- 6 kudos
Hello, I am trying to write Delta files for some CSV data. When I do csv_dataframe.write.format("delta").save("/path/to/table.delta") I get: AnalysisException: Found invalid character(s) among " ,;{}()\n\t=" in the column names of your schema. Having look...
Latest Reply
I still get the error when I try any method. The column names with spaces throw the error [DELTA_INVALID_CHARACTERS_IN_COLUMN_NAMES] Found invalid character(s) among ' ,;{}()\n\t=' in the column names of your schema.
df1.write.format("delta") \
    .mo...
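Two commonly suggested fixes, as a sketch (DataFrame and table names are hypothetical): either sanitize the column names before writing, or enable Delta column mapping so the original names are allowed.

# Option 1: replace the offending characters in the column names.
clean_df = df1.toDF(*[c.strip().replace(" ", "_") for c in df1.columns])
clean_df.write.format("delta").mode("overwrite").save("/mnt/delta/clean_table")

# Option 2: enable column mapping on an existing table (raises protocol versions).
spark.sql("""
  ALTER TABLE my_table SET TBLPROPERTIES (
    'delta.columnMapping.mode' = 'name',
    'delta.minReaderVersion' = '2',
    'delta.minWriterVersion' = '5'
  )
""")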
6 More Replies
by DJey • New Contributor III
- 15266 Views
- 6 replies
- 2 kudos
Latest Reply
In Databricks Runtime 15.2 and above, you can specify schema evolution in a merge statement using SQL or Delta table APIs:
MERGE WITH SCHEMA EVOLUTION INTO target
USING source
ON source.key = target.key
WHEN MATCHED THEN
  UPDATE SET *
WHEN NOT MATCHED THEN
  I...
5 More Replies
- 8542 Views
- 6 replies
- 3 kudos
Hello Databricks community, I'm working on a pipeline and would like to implement a common use case using Delta Live Tables. The pipeline should include the following steps: Incrementally load data from Table A as a batch. If the pipeline has previously...
Latest Reply
I totally agree that this is a gap in the Databricks solution. This gap exists between a static read and real-time streaming. My problem (and I suspect there are many use cases) is that I have slowly changing data coming into structured folders via ...
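One pattern that covers this middle ground is a streaming read executed as an incremental batch with Trigger.AvailableNow: it picks up only what is new since the last checkpoint, then stops. A sketch with hypothetical table names:

(spark.readStream.table("table_a")
     .writeStream
     .trigger(availableNow=True)  # process everything new, then shut down
     .option("checkpointLocation", "/tmp/checkpoints/table_a_incremental")
     .toTable("table_b"))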
5 More Replies
- 11126 Views
- 12 replies
- 13 kudos
We are considering moving to Delta Live Tables from a traditional SQL-based data warehouse. Worrying me is this FAQ on identity columns (Delta Live Tables frequently asked questions | Databricks on AWS); it seems to suggest that we basically can't cre...
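For context, the identity-column syntax the FAQ concerns, as a sketch (table and columns are hypothetical); the FAQ's caveat is about where such columns are supported in DLT, not about the syntax itself:

spark.sql("""
  CREATE TABLE dim_customer (
    customer_sk BIGINT GENERATED ALWAYS AS IDENTITY,
    customer_id STRING,
    customer_name STRING
  ) USING DELTA
""")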
by YFL • New Contributor III
- 7326 Views
- 11 replies
- 6 kudos
Hi, I want to keep track of the streaming lag from the source table, which is a Delta table. I see that in query progress logs there is some information about the last version and the last file in the version for the end offset, but this doesn't give ...
Latest Reply
Hey @Yerachmiel Feltzman, I hope all is well. Just wanted to check in: were you able to resolve your issue, or do you need more help? We'd love to hear from you. Thanks!
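One way to estimate the lag the question asks about: compare the Delta source version in the stream's lastProgress against the table's latest version from DESCRIBE HISTORY. A hedged sketch (table names are hypothetical, and the exact shape of endOffset may vary by runtime):

import json

query = (spark.readStream.table("source_table")
              .writeStream
              .option("checkpointLocation", "/tmp/checkpoints/lag_demo")
              .toTable("sink_table"))

# lastProgress is None until at least one micro-batch has completed.
progress = query.lastProgress
end_offset = progress["sources"][0]["endOffset"]
if isinstance(end_offset, str):  # may arrive as a JSON string
    end_offset = json.loads(end_offset)
stream_version = end_offset["reservoirVersion"]

latest_version = spark.sql("DESCRIBE HISTORY source_table LIMIT 1").collect()[0]["version"]
print(f"versions behind: {latest_version - stream_version}")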
10 More Replies