Data Engineering

Forum Posts

Sorted by:

by Merchiv • New Contributor III

02-03-2023 7:34:30 AM

7577 Views
4 replies
3 kudos

Resolved! How can I add a duration in milliseconds to a timestamp?

Let's say I have a DataFrame with a timestamp and an offset column in milliseconds respectively in the timestamp and long format. E.g.from datetime import datetime df = spark.createDataFrame( [ (datetime(2021, 1, 1), 1500, ), (dat...

Data Engineering

7577 Views
4 replies
3 kudos

02-03-2023 7:34:30 AM

View Replies

Latest Reply

Merchiv
New Contributor III

03-01-2023 11:41:35 PM

3 kudos

Although @Lakshay Goel's solution works, we've been using an alternative approach, that we found to be a bit more readable:from pyspark.sql import Column, functions as f def make_dt_interval_sec(col: Column): return f.expr(f"make_dt_interval...

3 kudos

03-01-2023 11:41:35 PM

3 More Replies

by User16869510359 • Esteemed Contributor

06-25-2021 3:13:30 PM

7908 Views
2 replies
0 kudos

How to find Databricks runtime version of the cluster in an init script

Data Engineering

7908 Views
2 replies
0 kudos

06-25-2021 3:13:30 PM

View Replies

Latest Reply

Mooune_DBU
Valued Contributor

06-25-2021 3:37:39 PM

0 kudos

It's set as an environment variable called `DATABRICKS_RUNTIME_VERSION`In your init scripts, you just need to add a line to display or save the info (see python example below):import os print("DATABRICKS_RUNTIME_VERSION:",os.environ.get('DATABRICKS_R...

0 kudos

06-25-2021 3:37:39 PM

1 More Replies

by bchaubey • Contributor II

01-26-2023 9:54:41 PM

930 Views
2 replies
0 kudos

voucher

Did you receive your voucher?

Data Engineering

930 Views
2 replies
0 kudos

01-26-2023 9:54:41 PM

View Replies

Latest Reply

Anonymous
Not applicable

03-01-2023 9:15:55 PM

0 kudos

Hi @Kashish Khetarpaul Thank you for reaching out! Please submit a ticket to our Training Team here: https://help.databricks.com/s/contact-us?ReqType=training and our team will get back to you shortly.

0 kudos

03-01-2023 9:15:55 PM

1 More Replies

by Databrick_begin • New Contributor

12-14-2022 9:07:10 AM

1228 Views
1 replies
0 kudos

Databrick notebook to Azure SQL server connection using private ip because Public access is Denied in Azure SQL database, and Databrick and Azure SQL both in same subscription but different Virtual Network.

We have created private endpoint for Azure SQL database which has private ip. and by making host file entry in my system i am able to resolve Ip for Azure sql server from my system and connect to Server. but unable to connect from Azure Databrick not...

Data Engineering

1228 Views
1 replies
0 kudos

12-14-2022 9:07:10 AM

View Replies

Latest Reply

Ryoma
New Contributor II

03-01-2023 9:05:32 PM

0 kudos

If vnet injection is not used, the connection could be established by setting up an init script with azure private resolver as nameserver.#!/bin/bashmv /etc/resolv.conf /etc/resolv.conf.origecho nameserver <your dns server ip> | sudo tee --append /e...

0 kudos

03-01-2023 9:05:32 PM

by THIAM_HUATTAN • Valued Contributor

08-16-2022 5:19:46 AM

7533 Views
6 replies
7 kudos

Is catalog a feature in the community version?

%sql create catalog if not exists catalog1I tried above, but it gives me error as below:com.databricks.backend.common.rpc.DatabricksExceptions$SQLExecutionException: org.apache.spark.sql.AnalysisException: Catalog namespace is not supported. at com.d...

Data Engineering

7533 Views
6 replies
7 kudos

08-16-2022 5:19:46 AM

View Replies

Latest Reply

Kaniz
Community Manager

11-14-2022 12:39:00 AM

7 kudos

Hi @THIAM HUAT TAN, It would mean a lot if you could select the "Best Answer" to help others find the correct answer faster.This makes that answer appear right after the question, so it's easier to find within a thread.It also helps us mark the que...

7 kudos

11-14-2022 12:39:00 AM

5 More Replies

by Dataengineer_mm • New Contributor

02-09-2023 9:23:32 AM

765 Views
2 replies
0 kudos

Databricks workflow migration to higher environments

How do we migrate the databricks workflows to higher environment ? I do see an option for calling the tasks (notebooks,python) from the github repositories. But as such how do we migrate the entire workflow jobs to other environment ?

Data Engineering

765 Views
2 replies
0 kudos

02-09-2023 9:23:32 AM

View Replies

Latest Reply

jose_gonzalez
Moderator

03-01-2023 10:55:39 AM

0 kudos

Hi @Menaka Murugesan,Just a friendly follow-up. Did any of the responses help you to resolve your question? if it did, please mark it as best. Otherwise, please let us know if you still need help.

0 kudos

03-01-2023 10:55:39 AM

1 More Replies

by rbricks • New Contributor

02-08-2023 2:51:54 AM

684 Views
2 replies
0 kudos

numSourceRows greater than expected

HeyI am doing an upsert of a source DataFrame into a target table. Before said upsert, I print out the source DataFrame's row count, which is a bit smaller than what `numSourceRows` says after the operation completes and I check the operationMetrics....

Data Engineering

684 Views
2 replies
0 kudos

02-08-2023 2:51:54 AM

View Replies

Latest Reply

jose_gonzalez
Moderator

03-01-2023 10:54:22 AM

0 kudos

could you share your code snippet please? also share the expected output.

0 kudos

03-01-2023 10:54:22 AM

1 More Replies

by 771407 • New Contributor II

02-07-2023 6:20:14 AM

878 Views
3 replies
3 kudos

Resolved! R code that works perfectly on Rstudio does not run here

Hi,I have a "simple" R script that I need to import into Databricks and am running into errors.For example:TipoB <- Techtb %>% dplyr::filter(grepl('being evaluated', Comentarios))#TipoB$yearsSpec <- NATipoB$yearsSpec <- str_replace(TipoB$Comentarios,...

Data Engineering

878 Views
3 replies
3 kudos

02-07-2023 6:20:14 AM

View Replies

Latest Reply

771407
New Contributor II

02-08-2023 4:09:01 AM

3 kudos

R studio version 2022.12.0.R latest version available on 08/FEB/2023. I don't know where to find the DBR version and configuration. Can you direct me?

3 kudos

02-08-2023 4:09:01 AM

2 More Replies

by MikeJohnsonZa • New Contributor

02-02-2023 1:05:49 AM

1187 Views
3 replies
0 kudos

Resolved! Importing irregularly formatted json files

HiI'm importing a large collection of json files, the problem is that they are not what I would expect a well-formatted json file to be (although probably still valid), each file consists of only a single record that looks something like this (this i...

Data Engineering

1187 Views
3 replies
0 kudos

02-02-2023 1:05:49 AM

View Replies

Latest Reply

jose_gonzalez
Moderator

03-01-2023 10:37:07 AM

0 kudos

Hi @Michael Johnson,I would like to share the following notebook which contains examples on how to process complex data types, like JSON. Please check the following link and let us know if you still need help https://docs.databricks.com/optimization...

0 kudos

03-01-2023 10:37:07 AM

2 More Replies

by youssefmrini • Honored Contributor III

03-01-2023 6:26:39 AM

976 Views
1 replies
4 kudos

Resolved! Can I limit the max number of clusters per user ?

Data Engineering

976 Views
1 replies
4 kudos

03-01-2023 6:26:39 AM

View Replies

Latest Reply

youssefmrini
Honored Contributor III

03-01-2023 6:27:29 AM

4 kudos

You can now use cluster policies to restrict the number of clusters a user can create. For more information https://docs.databricks.com/administration-guide/clusters/policies.html#cluster-limit

4 kudos

03-01-2023 6:27:29 AM

by youssefmrini • Honored Contributor III

03-01-2023 6:23:57 AM

1115 Views
1 replies
2 kudos

Resolved! Can I clone Apache Iceberg Tables ?

Data Engineering

1115 Views
1 replies
2 kudos

03-01-2023 6:23:57 AM

View Replies

Latest Reply

youssefmrini
Honored Contributor III

03-01-2023 6:24:41 AM

2 kudos

Clone can now be used to create and incrementally update Delta tables that mirror Apache Parquet and Apache Iceberg tables. You can update your source Parquet table and incrementally apply the changes to their cloned Delta table with the clone comman...

2 kudos

03-01-2023 6:24:41 AM

by youssefmrini • Honored Contributor III

03-01-2023 6:19:39 AM

480 Views
1 replies
2 kudos

Resolved! Can I Authenticate to Power BI or Tableau using OAuth ?

Data Engineering

480 Views
1 replies
2 kudos

03-01-2023 6:19:39 AM

View Replies

Latest Reply

youssefmrini
Honored Contributor III

03-01-2023 6:20:12 AM

2 kudos

You can now use OAuth to authenticate to Power BI and Tableau. For more information, see Configure OAuth (Public Preview) for Power BI and Configure OAuth (Public Preview) for Tableau.https://docs.databricks.com/integrations/configure-oauth-powerbi.h...

2 kudos

03-01-2023 6:20:12 AM

by 156190 • New Contributor III

02-20-2023 12:49:42 PM

1799 Views
6 replies
3 kudos

Resolved! Is 'run_as' user available from jobs api 2.1?

I know that the run_as user generally defaults to the creator_user, but I would like to find the defined run_as user for each of our jobs. Unfortunately, I'm unable to locate that field in the api.

Data Engineering

1799 Views
6 replies
3 kudos

02-20-2023 12:49:42 PM

View Replies

Latest Reply

Anonymous
Not applicable

02-21-2023 10:48:06 PM

3 kudos

Hi @Keller, Michael Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you.Th...

3 kudos

02-21-2023 10:48:06 PM

5 More Replies

by SagarK1 • New Contributor

10-21-2021 2:41:08 AM

2114 Views
5 replies
2 kudos

Managing the permissions using MLFlow APIs

Hello All,I am trying to manage the permissions on the experiments using the MLFLow API. Do we have any MLFlow API which helps to manage the permissions of Can Read ,Can Edit , Can Manage.Example :I create the model using MLFlow APIs and through my c...

Data Engineering

2114 Views
5 replies
2 kudos

10-21-2021 2:41:08 AM

View Replies

Latest Reply

jsan
New Contributor II

03-01-2023 4:35:00 AM

2 kudos

Hey folks, did we get any workaround for this or what @Sean Owen said is true ?

2 kudos

03-01-2023 4:35:00 AM

4 More Replies

by zeta_load • New Contributor II

02-28-2023 6:47:42 AM

929 Views
1 replies
1 kudos

Resolved! Is it possible to restart a cluster from a Notebook without using the UI

I have some code that occasionally wrong executed, meaning that every n-th time a calculation in a table is wrong. If that happens, I want to be able to restart the cluster from the Notebook.- I'm therefore lookong for a piece of code that can accomp...

Data Engineering

929 Views
1 replies
1 kudos

02-28-2023 6:47:42 AM

View Replies

Latest Reply

daniel_sahal
Esteemed Contributor

02-28-2023 10:38:44 PM

1 kudos

@Lukas Goldschmied It is. You'll need to use Databricks API.Here you can find an example:https://learn.microsoft.com/en-us/azure/databricks/_extras/notebooks/source/clusters-long-running-optional-restart.html

1 kudos

02-28-2023 10:38:44 PM

User

Count

1601

736

343

284

247

Databricks

Forum Posts

Resolved! How can I add a duration in milliseconds to a timestamp?

How to find Databricks runtime version of the cluster in an init script

voucher

Databrick notebook to Azure SQL server connection using private ip because Public access is Denied in Azure SQL database, and Databrick and Azure SQL both in same subscription but different Virtual Network.

Is catalog a feature in the community version?

Databricks workflow migration to higher environments

numSourceRows greater than expected

Resolved! R code that works perfectly on Rstudio does not run here

Resolved! Importing irregularly formatted json files

Resolved! Can I limit the max number of clusters per user ?

Resolved! Can I clone Apache Iceberg Tables ?

Resolved! Can I Authenticate to Power BI or Tableau using OAuth ?

Resolved! Is 'run_as' user available from jobs api 2.1?

Managing the permissions using MLFlow APIs

Resolved! Is it possible to restart a cluster from a Notebook without using the UI

DELTA_EXCEED_CHAR_VARCHAR_LIMIT

Not able to set run_as service_principal_name

Pyspark operations slowness in CLuster 14.3LTS as ...

[Databricks Assets Bundles] Workflow trigger on fi...

Addressing Pipeline Error Handling in Databricks b...