Data Engineering

Forum Posts

iptkrisna
by New Contributor III
  • 469 Views
  • 1 replies
  • 2 kudos

Jobs Data Pipeline Runtime Increase Significantly

Hi, I am facing an issue where one of my jobs has been taking much longer than it used to. Previously it needed less than 1 hour to run a batch job that loads JSON data and does a truncate and load into a Delta table, but since June 2nd it has become so slow that...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @krisna math, great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.

Hubert-Dudek
by Esteemed Contributor III
  • 661 Views
  • 1 replies
  • 4 kudos

Spark 3.4 and Databricks 13 introduce two new types of timestamps for handling time zone information: TIMESTAMP WITH LOCAL TIME ZONE: This type assumes that the input data is in the session's local time zone and converts it to UTC before processing....
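The conversion described in the post above (read a zone-less value as being in the session's local time zone, then convert it to UTC) can be illustrated in plain Python. This is only a sketch of the idea using the standard library, not the Spark or Databricks API; the function name `local_wall_clock_to_utc`, the sample timestamp, and the `America/New_York` session zone are assumptions chosen for illustration.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def local_wall_clock_to_utc(wall_clock: str, session_tz: str) -> datetime:
    """Interpret a zone-less timestamp string in session_tz and return it in UTC.

    Hypothetical helper for illustration; it mimics the behaviour described
    in the post, not any Spark function.
    """
    naive = datetime.fromisoformat(wall_clock)               # no tzinfo attached yet
    localized = naive.replace(tzinfo=ZoneInfo(session_tz))   # assume the session's zone
    return localized.astimezone(timezone.utc)                # normalize to UTC


# 09:30 in New York on 2 June is daylight time (UTC-4).
utc_value = local_wall_clock_to_utc("2023-06-02 09:30:00", "America/New_York")
print(utc_value.isoformat())  # 2023-06-02T13:30:00+00:00
```

The same wall-clock string maps to a different UTC instant when the assumed session zone changes, which is one reason timestamp handling so easily affects business logic.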

timezone
Latest Reply
Anonymous
Not applicable
  • 4 kudos

This is helpful! As we know, timestamps are often what messes up the business logic.

nolanlavender00
by New Contributor
  • 1896 Views
  • 2 replies
  • 0 kudos

How to control garbage collection while using Autoloader File Notification?

I am using Autoloader to load files from a directory. I have set up File Notification with the Event Subscription. I have a backfill interval set to 1 day and have not run the stream for a week. There should only be about 100 new files to pick up an...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @nolanlavender008 Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answ...

1 More Replies
alejandrofm
by Valued Contributor
  • 1358 Views
  • 4 replies
  • 0 kudos

AppendDataExecV1 Taking a lot of time

Hi, I have a PySpark job that takes about an hour to complete. When looking at the SQL tab in the Spark UI, I see this: Those processes run for more than 1 minute on a 60-minute process. This is Ganglia for that period (the last snapshot, will look into a l...

image image
Latest Reply
Vartika
Moderator
  • 0 kudos

Hi @Alejandro Martinez​ Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you...

3 More Replies
srDataEngineer
by New Contributor II
  • 1930 Views
  • 4 replies
  • 2 kudos

Resolved! How does Databricks time travel work?

Hi, since it is not very well explained, I want to know whether the table history is a snapshot of the whole table at that point in time, containing all the data, or whether it tracks only some metadata of the table changes. To be more precise: if I have a table in...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @data engineer Hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so...

3 More Replies
tinendra
by New Contributor III
  • 1984 Views
  • 5 replies
  • 5 kudos

How to reduce time while loading data into the azure synapse table?

Hi all, I just wanted to know if there is any option to reduce the time taken to load a PySpark DataFrame into an Azure Synapse table using Databricks. For example, I have a PySpark DataFrame that has around 40k records and I am trying to load data into the azure ...

Latest Reply
Anonymous
Not applicable
  • 5 kudos

Hi @Tinendra Kumar Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Tha...

4 More Replies
KVNARK
by Honored Contributor II
  • 662 Views
  • 2 replies
  • 4 kudos

How much time does it take for the Databricks partner account to get created

How much time does it take for the Databricks partner account to get created after we submit the application to Databricks?

Latest Reply
Harshjot
Contributor III
  • 4 kudos

Hi @KVNARK .​ On training academy? It was instant for me.

1 More Replies
stinodego
by New Contributor III
  • 1615 Views
  • 8 replies
  • 19 kudos

Python job run error messages are unreadable

This has been going on for some time now; all errors look like this (note the weird `[0;34m` marks everywhere). How can we fix this? We're not doing anything crazy, this is just the latest runtime with pretty much the simplest possible hello world pro...

image
Latest Reply
VaibB
Contributor
  • 19 kudos

Have you tried detaching and reattaching the notebook, or restarting the cluster? Also check that you are not importing any specific library: someone else with the right access might have installed a library with the install-to-all-clusters option checked.

7 More Replies
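The `[0;34m` marks in the post above look like ANSI color escape sequences that are printed literally instead of being rendered. As a possible workaround on the reading side, here is a minimal sketch that strips such sequences from captured error text. It assumes the raw text still contains the ESC character (`\x1b`) in front of each `[`; the pattern name `ANSI_CSI`, the helper `strip_ansi`, and the sample message are illustrative and are not part of any Databricks API.

```python
import re

# Matches ANSI CSI sequences such as ESC[0;34m (set color) and ESC[0m (reset):
# ESC, "[", optional parameter bytes, optional intermediate bytes, one final byte.
ANSI_CSI = re.compile(r"\x1b\[[0-9;?]*[ -/]*[@-~]")


def strip_ansi(text: str) -> str:
    """Return text with ANSI CSI escape sequences removed (hypothetical helper)."""
    return ANSI_CSI.sub("", text)


# Illustrative error line with color codes around the exception name.
raw = "\x1b[0;31mNameError\x1b[0m: name 'foo' is not defined"
print(strip_ansi(raw))  # NameError: name 'foo' is not defined
```

This only cleans text after the fact; it does not change how the job run page renders errors.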
PaulP
by New Contributor II
  • 1112 Views
  • 4 replies
  • 6 kudos

What is the best expected starting time for a cluster when using a pool?

Hi! I'm doing some tests to get an idea of how much time could be saved starting a cluster by using a pool and was wondering if the results I get are what should be expected. We're using AWS Databricks and used i3.xlarge as the instance type (if that matt...

Latest Reply
Anonymous
Not applicable
  • 6 kudos

Hi @Paul Pelletier Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Tha...

3 More Replies
Dineshkumar_Raj
by New Contributor
  • 1673 Views
  • 2 replies
  • 1 kudos

Why do the job running time and command execution time not match in Databricks?

I have an Azure Databricks job that is triggered via ADF using an API call. I want to see why the job has been taking n minutes to complete the tasks. In the job execution results, the job execution time says 15 mins and the individual cells/commands d...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hey there @DineshKumar​ Does @Prabakar Ammeappin​'s response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? Else please let us know if you need more help. Cheers!

1 More Replies
dududu
by New Contributor II
  • 539 Views
  • 1 replies
  • 0 kudos

How to explain the huge time latency between two jobs? How to optimize the job to reduce the latency?

I have run into a problem, which you can see in the picture below: there are long delays between some jobs. I don't understand what happened or how to optimize the job. Can anybody help me? Thanks a lot.

image
Latest Reply
jose_gonzalez
Moderator
  • 0 kudos

Hi @jieping zhang, did you check the driver's logs? Do you see any error messages? Please provide more details.

CHANDY
by New Contributor
  • 451 Views
  • 1 replies
  • 0 kudos

Real-time data processing

Say I am getting a customer record from a website. I want to read the message and then insert/update it into a Snowflake table. Depending on whether the insert/update is successful, I need to respond with a success/failure message within, say, 1 sec. ...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @CHANDAN NANDY​, This article explains how Snowflake uses Kafka to deliver real-time data capture, with results available on Tableau dashboards within minutes.

kenldk
by New Contributor III
  • 1646 Views
  • 7 replies
  • 4 kudos

Resolved! When will the bills from Databricks arrive?

I am using Databricks for the first time and after 3 months I didn't see a single bill from Databricks. However, the accumulated usage has reached $180. Currently my workspace status is still running. Do I need to terminate my workspace to get billed...

Latest Reply
Kaniz
Community Manager
  • 4 kudos

Hi @Ken Lei​, Just a friendly follow-up. Do you still need help? Please let us know.

6 More Replies