- 500 Views
- 1 replies
- 2 kudos
Hi, I am facing an issue where one of my jobs has been taking much longer since a certain point in time. Previously it needed less than an hour to run a batch job that loads JSON data and does a truncate-and-load into a Delta table, but since June 2nd it has become so slow that...
Latest Reply
Hi @krisna math​ Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.
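The pattern described above (load JSON, then truncate-and-load into Delta) can be sketched as follows. This is a minimal illustration, not the poster's actual job: the paths and table name are hypothetical placeholders, and it assumes a running Spark session with Delta Lake available.

```python
# Hypothetical truncate-and-load of JSON data into a Delta table.
# Source path and table name are placeholders, not from the original post.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

df = spark.read.json("/mnt/raw/events/")  # hypothetical source path

# "Truncate and load": overwrite mode atomically replaces the table
# contents, so readers never see a half-empty table.
(df.write
   .format("delta")
   .mode("overwrite")
   .saveAsTable("analytics.events"))  # hypothetical table name
```

For a sudden slowdown like this, the usual suspects are growth in input volume, an accumulation of small files in the source or target Delta table, or a change in cluster size or runtime version around the date the regression started.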
- 3945 Views
- 5 replies
- 2 kudos
Hello Databricks community, I'm working on a pipeline and would like to implement a common use case using Delta Live Tables. The pipeline should include the following steps: incrementally load data from Table A as a batch; if the pipeline has previously...
Latest Reply
Hi @Valentin Rosca​ Hope all is well! Just wanted to check in if you were able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Tha...
4 More Replies
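The first step of the use case above, incrementally loading Table A as a batch, can be sketched with Delta Live Tables. This is an assumption-laden sketch, not a confirmed solution from the thread: the table names are hypothetical, and it relies on the fact that a streaming read of a Delta source inside a DLT pipeline only picks up data that arrived since the last pipeline update.

```python
# Hypothetical DLT pipeline: incremental batch ingestion from Table A.
# Runs inside a Delta Live Tables pipeline, where `dlt` and `spark`
# are provided by the runtime; table names are placeholders.
import dlt

@dlt.table(name="table_a_incremental")
def table_a_incremental():
    # A streaming read against a Delta table is checkpointed by DLT,
    # so each pipeline update processes only the new rows: incremental
    # semantics even when the pipeline is triggered as a batch.
    return spark.readStream.table("catalog.schema.table_a")
```

Running the pipeline in triggered mode gives batch-style scheduling while the streaming source keeps track of what has already been processed.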
by
sanjay
• Valued Contributor II
- 8185 Views
- 20 replies
- 17 kudos
Hi, I am running a batch job which processes incoming files. I am trying to limit the number of files in each batch, so I added the maxFilesPerTrigger option, but it's not working: it processes all incoming files at once. (spark.readStream.format("delta").lo...
Latest Reply
Hi @Sanjay Jain​ Hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so w...
19 More Replies
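One common cause of the behavior described above is worth noting: with `trigger(once=True)`, Spark ignores rate-limit options such as `maxFilesPerTrigger` and processes everything in a single micro-batch, whereas `trigger(availableNow=True)` (Spark 3.3+ / DBR 10.4+) drains all pending data while honoring the limit. A minimal sketch, with hypothetical paths, assuming a Spark cluster with Delta Lake:

```python
# Hypothetical sketch: capping files per micro-batch on a Delta stream.
# Paths are placeholders, not from the original post.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

df = (spark.readStream
      .format("delta")
      .option("maxFilesPerTrigger", 100)  # at most 100 files per micro-batch
      .load("/mnt/delta/incoming"))       # hypothetical source path

# trigger(once=True) would ignore maxFilesPerTrigger and process all
# backlog in one batch; availableNow respects the rate limit while
# still consuming everything that is currently available.
(df.writeStream
   .format("delta")
   .option("checkpointLocation", "/mnt/chk/incoming")  # hypothetical
   .trigger(availableNow=True)
   .start("/mnt/delta/processed"))  # hypothetical target path
```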
- 4602 Views
- 2 replies
- 0 kudos
I am trying to read messages from a Kafka topic using spark.readStream. I am using the following code to read it. My code: df = spark.readStream .format("kafka") .option("kafka.bootstrap.servers", "192.1xx.1.1xx:9xx") .option("subscr...
Latest Reply
You can try this approach: https://stackoverflow.com/questions/57568038/how-to-see-the-dataframe-in-the-console-equivalent-of-show-for-structured-st/62161733#62161733. readStream runs a thread in the background, so there's no easy equivalent of df.show().
1 More Replies
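The console-sink approach the reply points to can be sketched as below. This is a hedged illustration with placeholder broker and topic names (the original post's values are truncated); it assumes a Spark cluster with the Kafka connector on the classpath and a reachable broker.

```python
# Hypothetical sketch: inspecting a Kafka stream in the console.
# Broker address and topic name are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

df = (spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "host:9092")  # placeholder broker
      .option("subscribe", "my_topic")                 # placeholder topic
      .load())

# Kafka delivers key/value as binary; cast them before inspecting.
parsed = df.selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")

# A streaming DataFrame has no usable df.show(); instead, write each
# micro-batch to the console sink.
query = (parsed.writeStream
         .format("console")
         .outputMode("append")
         .start())
query.awaitTermination()
```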
by
huyd
• New Contributor III
- 770 Views
- 0 replies
- 4 kudos
I am doing a batch load from a database table using the JDBC driver. I am noticing in the Spark UI that there is both memory and disk spill, but only on one executor. I am also noticing that when trying to use the JDBC parallel read, it seems to run sl...
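Spill on a single executor is the classic symptom of an unpartitioned JDBC read: without partitioning options, Spark pulls the whole table through one connection into one partition. A minimal sketch of the parallel-read options, with hypothetical URL, table, and column names, assuming a Spark cluster and a reachable database:

```python
# Hypothetical sketch: splitting a JDBC read across executors.
# URL, credentials, table, and bounds are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

df = (spark.read
      .format("jdbc")
      .option("url", "jdbc:postgresql://host:5432/db")  # placeholder URL
      .option("dbtable", "public.orders")               # placeholder table
      .option("user", "user")
      .option("password", "password")
      # The four options below turn one query into numPartitions range
      # queries over partitionColumn, read in parallel by the executors:
      .option("partitionColumn", "order_id")  # numeric/date/timestamp column
      .option("lowerBound", "1")
      .option("upperBound", "10000000")
      .option("numPartitions", "16")
      .load())
```

If partitioning makes the read slower, the usual causes are a skewed `partitionColumn` (uneven ranges) or too many concurrent connections for the source database to serve.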