cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Forum Posts

karolinalbinsso
by New Contributor II
  • 3097 Views
  • 2 replies
  • 3 kudos

Resolved! How to access the job-Scheduling Date from within the notebook?

I have created a job that contains a notebook that reads a file from Azure Storage. The file-name contains the date of when the file was transferred to the storage. A new file arrives every Monday, and the read-job is scheduled to run every Monday. I...

  • 3097 Views
  • 2 replies
  • 3 kudos
Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 3 kudos

Hi, I guess the files are in the same directory structure so that you can use cloud files autoloader. It will incrementally read only new files https://docs.microsoft.com/en-us/azure/databricks/spark/latest/structured-streaming/auto-loaderSo it will ...

  • 3 kudos
1 More Replies
sp1
by New Contributor II
  • 13911 Views
  • 5 replies
  • 4 kudos

Resolved! Pass date value as parameter in Databricks SQL notebook

I want to pass yesterday date (In the example 20230115*.csv) in the csv file. Don't know how to create parameter and use it here.CREATE OR REPLACE TEMPORARY VIEW abc_delivery_logUSING CSVOPTIONS ( header="true", delimiter=",", inferSchema="true", pat...

  • 13911 Views
  • 5 replies
  • 4 kudos
Latest Reply
Asifpanjwani
New Contributor II
  • 4 kudos

@Retired_mod @sp1 @Chaitanya_Raju @daniel_sahal Hi Everyone,I need the same scenario on SQL code, because my DBR cluster not allowed me to run python codeError: Unsupported cell during execution. SQL warehouses only support executing SQL cells.I appr...

  • 4 kudos
4 More Replies
BeginnerBob
by New Contributor III
  • 34679 Views
  • 6 replies
  • 3 kudos

Convert Date to YYYYMMDD in databricks sql

Hi,I have a date column in a delta table called ADate. I need this in the format YYYYMMDD.In TSQL this is easy. However, I can't seem to be able to do this without splitting the YEAR, MONTH and Day and concatenating them together.Any ideas?

  • 34679 Views
  • 6 replies
  • 3 kudos
Latest Reply
JayDoubleYou42
New Contributor II
  • 3 kudos

I'll share I'm having a variant of the same issue. I have a varchar field in the form YYYYMMDD which I'm trying to join to another varchar field from another table in the form of MM/DD/YYYY. Does anyone know of a way to do this in SPARK SQL without s...

  • 3 kudos
5 More Replies
vicks
by New Contributor III
  • 9718 Views
  • 5 replies
  • 8 kudos

Resolved! Converting the mon-yy format to date, but showing null for output

I have a date column that comes with month-year format and I am trying to convert that into dd-mm-yyyy format in pyspark for example I have date column with value Jan-2019Feb-2020Mar-2020the output I am expecting is 01/01/201901/02/202001/03/2020here...

  • 9718 Views
  • 5 replies
  • 8 kudos
Latest Reply
Anonymous
Not applicable
  • 8 kudos

Hi @vikram sinhha​ We haven't heard from you since the last response from @Suteja Kanuri​  . Kindly share the information with us, and in return, we will provide you with the necessary solution. Thanks and Regards

  • 8 kudos
4 More Replies
Chinu
by New Contributor III
  • 856 Views
  • 0 replies
  • 0 kudos

Pulling query history only for the last 5 mins using "/api/2.0/sql/history/queries" api

I know query history api provides filter_by option with start and end time in ms but I was wondering if I can get only the last 5 mins of query data every time I run the api call (using telegraf to call the api). Is it possible I can use relative dat...

  • 856 Views
  • 0 replies
  • 0 kudos
elgeo
by Valued Contributor II
  • 8773 Views
  • 4 replies
  • 0 kudos

Function returns UNSUPPORTED_CORRELATED_SCALAR_SUBQUERY

Hello experts. The below function in Databricks gives UNSUPPORTED_CORRELATED_SCALAR_SUBQUERY error. We didn't have this issue though in Oracle. Is this a limitation of Databricks? Just to note the final result returns only one row. Thank you in advan...

  • 8773 Views
  • 4 replies
  • 0 kudos
Latest Reply
TheofilosSt
New Contributor II
  • 0 kudos

Hello @Suteja Kanuri​  can we have any respond on the above?Thank you.

  • 0 kudos
3 More Replies
Pien
by New Contributor II
  • 10949 Views
  • 5 replies
  • 0 kudos

Resolved! Getting date out of year and week

Hi all,I'm trying to get a date out of the columns year and week. The week format is not recognized.  df_loaded = df_loaded.withColumn("week_year", F.concat(F.lit("3"),F.col('Week'), F.col('Jaar')))df_loaded = df_loaded.withColumn("date", F.to_date(F...

  • 10949 Views
  • 5 replies
  • 0 kudos
Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Pien Derkx​ Thank you for posting your question in our community! We are happy to assist you.To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers yo...

  • 0 kudos
4 More Replies
Dataengineer_mm
by New Contributor
  • 5250 Views
  • 2 replies
  • 0 kudos

Passing a date parameter through workflow

Hi , when we pass the parameter through workflows in DB, should we need to manually provide the parameter all the time? or any dynamic way of passing the parameter?

  • 5250 Views
  • 2 replies
  • 0 kudos
Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Menaka Murugesan​ Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you.T...

  • 0 kudos
1 More Replies
Mado
by Valued Contributor II
  • 9235 Views
  • 1 replies
  • 1 kudos

Resolved! How to get today's date in the local time zone?

I am trying to get today's date in the local time zone:from pyspark.sql.functions import * date = to_date(from_utc_timestamp(current_timestamp(), 'Australia/Melbourne'))What I get using the above code is a column object. How can I get its value in a...

image
  • 9235 Views
  • 1 replies
  • 1 kudos
Latest Reply
Hemant
Valued Contributor II
  • 1 kudos

Hi @Mohammad Saber​ , you can use pytz and datetime python package for your usecase,, attaching code snippet in below screen shot. 

  • 1 kudos
Naveen_KumarMad
by New Contributor III
  • 11599 Views
  • 13 replies
  • 14 kudos

Resolved! How to find the last modified date of a notebook?

I would like to find the notebooks that are not required and not being used and then I can review and delete them. If there is a way to find last modified date of a notebook programmatically then I can get a list of notebooks, which I can review and ...

  • 11599 Views
  • 13 replies
  • 14 kudos
Latest Reply
Amit_352107
New Contributor III
  • 14 kudos

Hi @Naveen Kumar Madas​ you can go through below code block%shls -lt /dbfs/

  • 14 kudos
12 More Replies
elgeo
by Valued Contributor II
  • 7361 Views
  • 2 replies
  • 0 kudos

Resolved! Convert date to integer

Hello. Is there a way in Databricks sql to convert a date to integer? In Db2, there is days function DAYS - IBM Documentation .For example '2023-03-01' is converted to 738580 value.Thank you in advance

  • 7361 Views
  • 2 replies
  • 0 kudos
Latest Reply
SergeRielau
Databricks Employee
  • 0 kudos

TRy this:CREATE OR REPLACE FUNCTION days(dt DATE) RETURN unix_date(dt) - unix_date(DATE'0001-01-01') + 1;SELECT current_date, days(current_date); 2023-03-09 738588I verified on Db2 for LUW and it matches up.

  • 0 kudos
1 More Replies
sanjay
by Valued Contributor II
  • 2978 Views
  • 4 replies
  • 1 kudos

Resolved! How can I get date when autoloader processes the file

Hi,I am running autoloader which is running continuously and checks for new file every 1 minute. I need to store when file was received/processed but its giving me date when autoloader started. Here is my code.df = (spark   .readStream   .format("clo...

  • 2978 Views
  • 4 replies
  • 1 kudos
Latest Reply
Lakshay
Databricks Employee
  • 1 kudos

Hi @Sanjay Jain​ , You can use the File Metadata column functionality to collect that information.Ref doc:- https://docs.databricks.com/ingestion/file-metadata-column.html

  • 1 kudos
3 More Replies
BF
by New Contributor II
  • 6072 Views
  • 3 replies
  • 2 kudos

Resolved! Pyspark - How do I convert date/timestamp of format like /Date(1593786688000+0200)/ in pyspark?

Hi all, I've a dataframe with CreateDate column with this format:CreateDate/Date(1593786688000+0200)//Date(1446032157000+0100)//Date(1533904635000+0200)//Date(1447839805000+0100)//Date(1589451249000+0200)/and I want to convert that format to date/tim...

  • 6072 Views
  • 3 replies
  • 2 kudos
Latest Reply
Chaitanya_Raju
Honored Contributor
  • 2 kudos

Hi @Bruno Franco​ ,Can you please try the below code, hope it might for you.from pyspark.sql.functions import from_unixtime from pyspark.sql import functions as F final_df = df_src.withColumn("Final_Timestamp", from_unixtime((F.regexp_extract(col("Cr...

  • 2 kudos
2 More Replies
Neli
by New Contributor III
  • 4466 Views
  • 2 replies
  • 0 kudos

How to add Current date as one of the column in Databricks

I am trying to create new column "Ingest_date" in table which should contain current date. I am getting error "Current date cannot be used in a generated column". Can you please review and suggest alternative to get the current date in delta table.

image image
  • 4466 Views
  • 2 replies
  • 0 kudos
Latest Reply
daniel_sahal
Esteemed Contributor
  • 0 kudos

A generation expression can use any SQL functions in Spark that always return the same result when given the same argument valuesSource: https://docs.delta.io/latest/delta-batch.html#use-generated-columnsIt means that it's intended to not work.You ca...

  • 0 kudos
1 More Replies
KVNARK
by Honored Contributor II
  • 2735 Views
  • 3 replies
  • 9 kudos

one of the date datatype format issue in pysaprk

if anyone has encountered this date type format - 6/15/25 12:00 AM could you mention the right formatting to be used in Pyspark.Thanks in advance!

  • 2735 Views
  • 3 replies
  • 9 kudos
Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 9 kudos

Without legacy, it will also work.SELECT to_timestamp('6/15/23 12:00 AM', 'M/dd/yy h:mm a')

  • 9 kudos
2 More Replies
Labels