Data Engineering

Forum Posts

by ranged_coop • Valued Contributor II

06-07-2023 3:52:10 AM

8465 Views
23 replies
22 kudos

Resolved! How to access a jar file stored in Databricks Workspace?

Hi All, We have a couple of jars stored in a workspace folder. We are using init scripts to copy the jars in the workspace to the /databricks/jars path. The init scripts do not seem to be able to find the files. The scripts are failing saying the files ...

Latest Reply

Anonymous
Not applicable

06-14-2023 11:27:49 PM

22 kudos

Hi @Bharath Kumar Ramachandran, you're welcome! I'm glad you found the link useful. I empathize with your hope that Databricks would consider adding this option. It's possible that Databricks will take user feedback into account when planning future ...

22 More Replies

by Eelke • New Contributor II

04-12-2023 5:36:50 AM

1460 Views
3 replies
0 kudos

I want to perform interpolation on a streaming table in delta live tables.

I have the following code:
    from pyspark.sql.functions import *
    !pip install dbl-tempo
    from tempo import TSDF
    from pyspark.sql.functions import *
    # interpolate target_cols column linearly for tsdf dataframe
    def interpolate_tsdf(tsdf_data, target_c...

Latest Reply

Eelke
New Contributor II

06-15-2023 1:30:57 AM

0 kudos

The issue was not resolved because we were trying to use a streaming table within TSDF, which does not work.

2 More Replies

by HariharaSam • Contributor

01-12-2022 11:45:58 PM

14568 Views
10 replies
4 kudos

Resolved! To get Number of rows inserted after performing an Insert operation into a table

Consider we have two tables, A and B.
    qry = """INSERT INTO Table A Select * from Table B where Id is null """
    spark.sql(qry)
I need to get the number of records inserted after running this in Databricks.

Latest Reply

GRCL
New Contributor III

06-15-2023 1:27:28 AM

4 kudos

Almost the same advice as Hubert's. I use the history of the Delta table:
    df_history.select(F.col('operationMetrics')).collect()[0].operationMetrics['numOutputRows']
You can also find other 'operationMetrics' values, such as 'numTargetRowsDeleted'.
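The approach in the reply above can be sketched as follows. This is a minimal sketch rather than the poster's exact code: the helper names `rows_written_by_last_operation` and `last_insert_row_count` are hypothetical, the target is assumed to be a Delta table with the `delta-spark` package available, and the most recent history entry is assumed to be the insert that was just run (with concurrent writers on the table, that may not hold). Delta reports `operationMetrics` as a map of strings, so the value is converted to an integer.

```python
def rows_written_by_last_operation(history_rows):
    """Return numOutputRows from the newest history entry, or None if absent.

    history_rows: list of dict-like rows, newest first, each carrying an
    'operationMetrics' mapping of metric name -> string value.
    """
    if not history_rows:
        return None
    metrics = history_rows[0]["operationMetrics"] or {}
    value = metrics.get("numOutputRows")
    return int(value) if value is not None else None


def last_insert_row_count(spark, table_name):
    """Read the latest Delta history entry for table_name (needs a Spark session)."""
    # Imported lazily so the pure-Python helper above works without Spark installed.
    from delta.tables import DeltaTable

    history = DeltaTable.forName(spark, table_name).history(1)
    rows = [row.asDict() for row in history.select("operationMetrics").collect()]
    return rows_written_by_last_operation(rows)
```

On a cluster, `last_insert_row_count(spark, "A")` would be called immediately after `spark.sql(qry)`. Which metrics appear depends on the operation type, so a missing `numOutputRows` returns `None` instead of raising.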

9 More Replies

by Merchiv • New Contributor III

03-28-2023 7:02:29 AM

5804 Views
8 replies
2 kudos

Resolved! AnalysisException when running SQL queries

When running some SQL queries using spark.sql(...), we sometimes get a variant of the following error: AnalysisException: Undefined function: current_timestamp. This function is neither a built-in/temporary function, nor a persistent function that is ...

Latest Reply

ashish1
New Contributor III

04-24-2023 1:53:03 AM

2 kudos

This is most likely a conflict in the library code. You can uninstall some libraries on your cluster and try to narrow it down to the problematic one.

7 More Replies

by siddharthk • New Contributor II

06-08-2023 12:45:31 PM

743 Views
2 replies
2 kudos

Resolved! Reduce downtime of Postgres table - JDBC overwrite job

I want to overwrite a PostgreSQL table, transactionStats, which is used by the customer-facing dashboards. This table needs to be updated every 30 mins. I am writing an AWS Glue Spark job via JDBC connection to perform this operation. Spark dataframe writ...

Latest Reply

Anonymous
Not applicable

06-15-2023 12:09:32 AM

2 kudos

Hi @Siddharth Kanojiya, We haven't heard from you since the last response from @werners (Customer). Kindly share the information with us, and in return, we will provide you with the necessary solution. Thanks and Regards

1 More Replies

by Pras1 • New Contributor II

06-08-2023 10:34:03 AM

3592 Views
2 replies
2 kudos

Resolved! AZURE_QUOTA_EXCEEDED_EXCEPTION - even with more vCPUs than Databricks recommends

I am running this Delta Live Tables PoC from databricks-industry-solutions/industry-solutions-blueprints https://github.com/databricks-industry-solutions/pos-dlt I have Standard_DS4_v2 with 28 GB and 8 cores x 2 workers - so a total of 16 cores. This is...

Latest Reply

Anonymous
Not applicable

06-15-2023 12:05:36 AM

2 kudos

Hi @Prasenjit Biswas, We haven't heard from you since the last response from @Jose Gonzalez. Kindly share the information with us, and in return, we will provide you with the necessary solution. Thanks and Regards

1 More Replies

by Woody • New Contributor II

06-12-2023 8:45:50 PM

496 Views
2 replies
3 kudos

This site has a lot of issues....

I need to log in and I can't, to try Databricks... so you have an OAuth issue... I can't try Databricks at all because the country icon doesn't work and it sends a URI issue from your front end to the back end... "Request_URI=&Geo_country_code=&Geo_country_i...

Latest Reply

Anonymous
Not applicable

06-14-2023 11:50:03 PM

3 kudos

Hi @Jessica Woods, Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers...

1 More Replies

by Westinghouse • New Contributor II

06-04-2023 6:30:29 PM

3621 Views
7 replies
3 kudos

Detectron2 install

I have been struggling to install Detectron2. I think it is an issue with CUDA. Any advice?
Install:
    !pip install -q "detectron2@git+https://github.com/facebookresearch/detectron2.git@e2ce8dc#egg=detectron2"
Error: qm8vsdal/detectron2_85faeed5ce7945dbad...

Latest Reply

Anonymous
Not applicable

06-14-2023 11:16:32 PM

3 kudos

Hi @Joshua Roberge, Sorry for the inconvenience! Kindly review the solution offered by @Suteja Kanuri.

6 More Replies

by sriram_kumar • New Contributor II

06-06-2023 12:01:55 AM

1160 Views
4 replies
5 kudos

To do Optimization on the real time delta table

Hi Team, We have a few prod tables, created in an S3 bucket, that have now grown very large. These tables receive real-time data continuously from round-the-clock Databricks workflows; we would like to run the optimization commands (optimize, zord...

Latest Reply

Anonymous
Not applicable

06-14-2023 11:03:54 PM

5 kudos

Hi @Sriram Kumar, We haven't heard from you since the last response from @Suteja Kanuri. Kindly share the information with us, and in return, we will provide you with the necessary solution. Thanks and Regards

3 More Replies

by jole3112 • New Contributor III

06-01-2023 9:04:30 AM

3620 Views
7 replies
8 kudos

virtual environment on azure databricks compute cluster

I'm using Azure Databricks and I'd like to create a project virtual environment, persisted on a shared compute cluster. As the cluster is shared for many projects, it is necessary to have virtual environments if I want to execute code runs from withi...

Latest Reply

Anonymous
Not applicable

06-12-2023 8:53:01 PM

8 kudos

Hi @Joshua L, We haven't heard from you since the last response from @Debayan Mukherjee, and I was checking back to see if her suggestions helped you. Or else, if you have any solution, please share it with the community, as it can be helpful to ot...

6 More Replies

by Matt1209 • New Contributor II

06-03-2023 10:08:44 PM

519 Views
1 reply
3 kudos

How to execute requests later for a number of times that exceeds the Maximum concurrent runs?

I am trying to start the same job multiple times using the Python SDK's "run_now" command. If the number of requests exceeds the Maximum concurrent runs, the status of the run will be Skipped and the run will not be executed. Is there any way to queue...

Latest Reply

Debayan
Esteemed Contributor III

06-14-2023 10:42:34 PM

3 kudos

Hi, we do have a private preview feature for queueing, which will be enabled shortly. Please tag me (@Debayan Mukherjee) with your next update so that I will get notified.
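Until a built-in queue is available, one possible workaround is to hold the extra requests on the client side and submit them only when a slot is free. The sketch below is an illustration under stated assumptions, not a Databricks feature: `drain_queue`, `count_active`, and `trigger` are hypothetical names, where `trigger` would wrap the SDK's "run_now" call and `count_active` would wrap whatever call reports how many runs of the job are currently active.

```python
import time
from collections import deque


def drain_queue(pending, max_concurrent, count_active, trigger,
                poll_seconds=30.0, sleep=time.sleep):
    """Submit queued requests without exceeding max_concurrent active runs.

    pending: iterable of request payloads (e.g. parameter dicts).
    count_active: callable returning the number of currently active runs (assumed).
    trigger: callable that starts one run for a payload and returns its result (assumed).
    """
    queue = deque(pending)
    started = []
    while queue:
        free_slots = max_concurrent - count_active()
        if free_slots <= 0:
            # No capacity right now: wait, then check again.
            sleep(poll_seconds)
            continue
        # Start as many runs as there are free slots, in FIFO order.
        for _ in range(min(free_slots, len(queue))):
            started.append(trigger(queue.popleft()))
    return started
```

The loop runs in a single process, so the pending requests are lost if that process stops; a durable queue would be needed for anything that must survive restarts. The count is also read before submission, so other clients triggering the same job at the same time could still cause a run to be skipped.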

by sevvalmehder • New Contributor II

06-14-2023 5:09:04 AM

1185 Views
3 replies
3 kudos

Databricks Runtime 12.2 LTS drop function problem

I am getting an error about the `drop function of pyspark` on a cluster using 12.2 LTS. When I checked the error, I saw that Spark fixed that bug; see SPARK-42444. Also, when I checked the maintenance updates page, I saw this solved issue included the Databricks R...

Latest Reply

Anonymous
Not applicable

06-14-2023 10:27:21 PM

3 kudos

Hi @Sevval Mehder, Elevate our community by acknowledging exceptional contributions. Your participation in marking the best answers is a testament to our collective pursuit of knowledge.

2 More Replies

by RamdasP • New Contributor

06-13-2023 2:20:10 PM

891 Views
2 replies
3 kudos

Resolved! Implement & Test DR Plan

Hi, Can you direct me to any documentation on how to implement and test disaster recovery for Databricks (PaaS) on Azure? Thx & Rgds, Ramdas

Latest Reply

Anonymous
Not applicable

06-14-2023 9:40:02 PM

3 kudos

Hi @Ramdas Panicher, Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answe...

1 More Replies

by ayush1900 • New Contributor II

06-13-2023 4:52:13 AM

640 Views
2 replies
2 kudos

Resolved! I have successfully passed the test with 225.5 marks, but I haven't received any badge from your side as promised. I have been provided with a certificate. Please provide me with the badge. My certificate ID is E-03DK31

Latest Reply

Anonymous
Not applicable

06-14-2023 8:34:22 PM

2 kudos

Hi @Ayush Raj, Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers you...

1 More Replies

by reachbharathan • New Contributor III

06-10-2023 7:31:59 PM

1545 Views
3 replies
4 kudos

Resolved! How to check out a specific commit version via the Databricks UI

I have integrated GitLab with my Azure Databricks repo. I am able to push and pull commits from the Databricks UI. I want to check out a specific commit version via the Databricks UI. Note: I am aware that via GitLab I have checkout to specific vers...

Latest Reply

reachbharathan
New Contributor III

06-14-2023 7:52:01 PM

4 kudos

After getting more context on Databricks Repos in detail: currently, Databricks doesn't support checking out a repo at a specific commit. Databricks provides only the limited functionality mentioned below:
Add a repo and connect remotely later
Clone a repo connecte...

2 More Replies
