I have a date column in month-year format and I am trying to convert it into dd-mm-yyyy format in PySpark. For example, the column has values like Jan-2019, Feb-2020, Mar-2020, and the output I am expecting is 01/01/2019, 01/02/2020, 01/03/2020. Here...
Hi @vikram sinhha, we haven't heard from you since the last response from @Suteja Kanuri. Kindly share the requested information with us so that we can provide you with a solution. Thanks and Regards
I am deleting data from the curated path based on a date column and then appending the staged data to it on each run, using the script below. My fear is that if a network issue occurs just after the delete operation, the job could stop before it appends the staged da...
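One common way to close that gap is to replace the separate delete-then-append with a single atomic overwrite using Delta Lake's `replaceWhere` option, so either the whole replacement commits or nothing changes. A sketch under assumed names (the path `/mnt/curated/events`, the column `load_date`, and the DataFrame `staged_df` are all hypothetical); it is a fragment, not a self-contained script, since it needs an existing Spark + Delta environment:

```python
# Sketch: atomically replace one date slice of a Delta table.
# Unlike DELETE followed by append, this is a single transaction,
# so a mid-job failure cannot leave the slice deleted but empty.
(staged_df.write
    .format("delta")
    .mode("overwrite")
    .option("replaceWhere", "load_date = '2023-01-15'")  # hypothetical predicate
    .save("/mnt/curated/events"))                        # hypothetical path
```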
I have a table whose full scan takes ~20 minutes on my cluster. The table has a "Time" TIMESTAMP column and a "day" DATE column; the latter is computed (manually) as "Time" truncated to the day and is used for partitioning. I query the table using a predicate ...
Hi @Vladimir Ryabtsev, because you are creating a Delta table, I think you are seeing a performance improvement because of dynamic partition pruning. According to the documentation, "Partition pruning can take place at query compilation time wh...
I'm having some issues with creating a dataframe with a date column. Could I know what is wrong?

```python
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType
from pyspark.sql.types import DateType, FloatType

spark = SparkSession.bui...
```
Hi @Kaniz Fatma, I actually changed the date format to 'M/d/Y' and it didn't throw any errors. I found that my CSV file had dates like '3/1/2022'. Could that be the issue? But some dates were also like '12/1/2022', so I'm kind of confused.