Data Engineering

Forum Posts

by VikasSinha • New Contributor

07-13-2022 11:45:55 PM

2858 Views
2 replies
0 kudos

Which is better - Azure Databricks or GCP Databricks?

Which cloud hosting environment is best to use for Databricks? My question boils down to the fact that there must be some difference in latency, throughput, result consistency, and reproducibility between the different cloud hosting environments of ...

Latest Reply

Vidula
Honored Contributor

09-03-2022 11:59:30 PM

0 kudos

Hi @Vikas Sinha, does @Prabakar Ammeappin's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!

1 More Replies

by Vidyasankar • New Contributor

07-13-2022 3:39:50 PM

931 Views
3 replies
1 kudos

I am not able to see the data inside the notebook I imported from my local system. I am trying to open it, but it doesn't display anything.

Latest Reply

Vidula
Honored Contributor

09-03-2022 11:56:34 PM

1 kudos

Hi @Vidya sankar, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thank...

2 More Replies

by Vignesh2806 • New Contributor II

07-13-2022 4:46:23 AM

6356 Views
2 replies
3 kudos

I would like to get the below error solved: Cluster scoped init script dbfs:/FileStore/tables/***.sh failed: Script exit status is non-zero

I am trying to run the Databricks cluster, but at times the cluster takes a long time to set up, and after some time it throws the error below: Cluster scoped init script dbfs:/FileStore/tables/***.sh failed: Script exit status is non-zero. The init scr...

Latest Reply

Vidula
Honored Contributor

09-03-2022 11:45:40 PM

3 kudos

Hi @Vignesh Ravichandran, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from y...

1 More Replies

by chandan_a_v • Valued Contributor

07-13-2022 12:38:20 AM

7703 Views
2 replies
4 kudos

Best way to run the Databricks notebook in a parallel way

Hi All, I need to run a Databricks notebook in parallel for different arguments. I tried a threading approach, but only the first 2 threads successfully execute the notebook and the rest fail. Please let me know if there is any best way to...
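
A minimal sketch of one way to fan a notebook out over several argument sets from a single driver, using a bounded thread pool. The helper name `run_in_parallel` and the `run_one` callable are hypothetical and not taken from the thread; on Databricks, `run_one` would typically wrap `dbutils.notebook.run(path, timeout_seconds, arguments)`. Capping `max_workers` and recording each failure separately may make it easier to see why some runs fail, but this is not a confirmed fix for the problem described in the post.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, List, Tuple


def run_in_parallel(
    run_one: Callable[[Dict[str, str]], str],
    argument_sets: List[Dict[str, str]],
    max_workers: int = 4,
) -> List[Tuple[str, str]]:
    """Call run_one once per argument set, at most max_workers at a time.

    Returns one (status, detail) tuple per argument set, in input order.
    A failing run is recorded as ("failed", <error>) instead of aborting
    the remaining runs.
    """
    results: List[Tuple[str, str]] = [("pending", "")] * len(argument_sets)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to the position of its argument set.
        futures = {
            pool.submit(run_one, args): index
            for index, args in enumerate(argument_sets)
        }
        for future in as_completed(futures):
            index = futures[future]
            try:
                results[index] = ("ok", future.result())
            except Exception as exc:  # keep going when one run fails
                results[index] = ("failed", repr(exc))
    return results


# Assumption: inside a Databricks notebook, run_one could be defined as
#   lambda args: dbutils.notebook.run("/path/to/child_notebook", 3600, args)
# where the notebook path and the timeout are placeholders.
```

The returned list keeps the order of `argument_sets`, so failed runs can be retried individually by index.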

Latest Reply

Vidula
Honored Contributor

09-03-2022 10:46:25 PM

4 kudos

Hey there @Chandan Angadi, does @Hubert Dudek's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!

1 More Replies

by lizou • Contributor II

07-12-2022 4:47:19 PM

1498 Views
4 replies
2 kudos

call saved query in sql warehouse

In Python, with cursor.execute, can you call a saved query with a parameter, like calling a stored procedure in a relational DB? https://docs.microsoft.com/en-us/azure/databricks/dev-tools/python-sql-connector#cursor-method

Latest Reply

Vidula
Honored Contributor

09-03-2022 10:32:00 PM

2 kudos

Hi @lizou, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks!

3 More Replies

by BeginnerBob • New Contributor III

07-12-2022 10:27:54 AM

8645 Views
4 replies
2 kudos

Flatten a complex JSON file and load into a delta table

Hi, I am loading a JSON file into Databricks by simply doing the following:
from pyspark.sql.functions import *
from pyspark.sql.types import *
bronze_path="wasbs://....../140477.json"
df_incremental = spark.read.option("multiline","true").json(bronze_pat...
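
The post is cut off before the JSON structure is shown, so the following is only an illustration of the flattening idea in plain Python, not the poster's data and not a PySpark solution. The function name `flatten_record` and the sample record are hypothetical. In Spark the same idea is usually expressed by selecting nested struct fields through dotted column paths and expanding arrays with `explode` from `pyspark.sql.functions`.

```python
from typing import Any, Dict


def flatten_record(
    record: Dict[str, Any], parent_key: str = "", sep: str = "_"
) -> Dict[str, Any]:
    """Flatten nested dictionaries into a single level of keys.

    Nested keys are joined with sep, for example {"a": {"b": 1}}
    becomes {"a_b": 1}. Lists are left untouched; in Spark they would
    need a separate explode step.
    """
    flat: Dict[str, Any] = {}
    for key, value in record.items():
        name = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested objects and merge their flattened keys.
            flat.update(flatten_record(value, name, sep))
        else:
            flat[name] = value
    return flat


# Hypothetical sample record, for illustration only.
sample = {"id": 1, "device": {"name": "sensor", "geo": {"lat": 1.5}}, "tags": ["x"]}
flattened = flatten_record(sample)
```

Each key of the flattened dictionary then corresponds to one column of a flat table.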

Latest Reply

Vidula
Honored Contributor

09-03-2022 10:26:33 PM

2 kudos

Hi @Lloyd Vickery, does @Werner Stinckens's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!

3 More Replies

by Dan-K • New Contributor III

08-31-2022 10:48:00 AM

10286 Views
6 replies
6 kudos

Resolved! How to display markdown output in databricks notebook from a python cell

With IPython/Jupyter it's possible to output markdown using the IPython display module and its `Markdown` class. Question: How can I accomplish this with Azure Databricks? What I tried: Databricks `display`. Tried using Databricks' display with the IPython M...

Latest Reply

Debayan
Esteemed Contributor III

09-02-2022 3:01:34 PM

6 kudos

Hi, thanks for reaching out to community.databricks.com. In a notebook cell, type "%md" followed by some markdown and it will render. Please refer to: https://community.databricks.com/s/question/0D53f00001HKHhNCAX/markup-in-databricks-notebook

5 More Replies

by ArjunS310 • New Contributor III

08-28-2022 8:55:31 AM

551 Views
2 replies
2 kudos

Resolved! Did not receive a badge upon completing databricks fundamentals assessment

Team, I completed the Databricks training and assessment, passed with 80%, and received a certificate of completion, but did not receive a badge as mentioned in the description of the course. Could you please help?

Latest Reply

Kaniz
Community Manager

09-03-2022 2:00:26 PM

2 kudos

Hi @Arjun Shaji, thank you for reaching out! Let us look into this for you, and we'll follow up with an update.

1 More Replies

by parthibsg • New Contributor II

08-29-2022 7:56:29 PM

709 Views
2 replies
2 kudos

When to use Dataframes API over Spark SQL

Hello Experts, I am new to Databricks. I am building data pipelines and have both batch and streaming data. Should I use the DataFrame API to read CSV files, convert them to Parquet format, and then do the transformation? Or write to a table using CSV and then use Spark SQL...

Latest Reply

Kaniz
Community Manager

09-03-2022 2:02:00 PM

2 kudos

Hi @Parthib Rathnam, thank you for reaching out! Let us look into this for you, and we'll follow up with an update.

1 More Replies

by BananaHotSauce • New Contributor III

08-31-2022 12:17:58 AM

364 Views
2 replies
3 kudos

Can I use PrivateLink and Customer Managed Policy for Cross Account Role

Hello, I'm trying to enable PrivateLink on my AWS Databricks quickstart. I use the customer managed VPC policy for the cross account role and supply it on the template. I'm getting an error that it cannot create a VPC Endpoint. Do I need to change the...

Latest Reply

Kaniz
Community Manager

09-03-2022 1:51:39 PM

3 kudos

Hi @Chris Joshua Manuel, we haven’t heard from you on the last response from @Debayan Mukherjee, and I was checking back to see if his suggestions helped you. Otherwise, if you have any solution, please do share it with the community, as it can be...

1 More Replies

by c038644 • New Contributor II

08-30-2022 4:58:07 AM

976 Views
4 replies
3 kudos

Use of venv pack

Hi, I'm very new, so this probably sounds stupid... I'm following the blog on How to Manage Python Dependencies in PySpark: https://www.databricks.com/blog/2020/12/22/how-to-manage-python-dependencies-in-pyspark.html ... but when I try, the packing works fin...

Latest Reply

Debayan
Esteemed Contributor III

08-30-2022 9:54:24 PM

3 kudos

Can you try using an absolute path instead of a relative path for the same? For example: https://stackoverflow.com/questions/38661464/filenotfounderror-winerror-3

3 More Replies

by Giorgi • New Contributor III

08-30-2022 8:16:05 AM

2033 Views
2 replies
3 kudos

GitLab integration

I've followed the instructions and set up the GitLab integration:
1. Generated Personal Access Token from GitLab
2. Add token (from step 1) to User settings (GitLab, email, token)
3. In Admin console -> Repos Git URL Allow List permissions: Disabled (no restrictions)
4. In Adm...

Latest Reply

Kaniz
Community Manager

09-03-2022 1:38:03 PM

3 kudos

Hi @Giorgi ARABIDZE, we haven't heard from you on the last response from @Hubert Dudek, and I was checking back to see if his suggestions helped you. Otherwise, if you have any solution, please share it with the community, as it can be helpful to oth...

1 More Replies

by him • New Contributor III

08-29-2022 9:34:17 PM

667 Views
2 replies
3 kudos

There is a .py file in my local VS Code. How can I upload that file to a Databricks cluster using the REST API and run it?

Latest Reply

Kaniz
Community Manager

09-03-2022 1:34:41 PM

3 kudos

Hi @Himanshu yadav, we haven't heard from you on the last response from @Debayan Mukherjee, and I was checking back to see if his suggestions helped you. Otherwise, if you have any solution, please share it with the community, as it can be helpful t...

1 More Replies

by al_joe • Contributor

07-11-2022 7:15:28 AM

767 Views
2 replies
0 kudos

Can we have a better UI for navigating Workspace and Repos?

Navigating through multiple vertical panes of information as we go deeper into a folder structure is not very convenient: we lose the context of the parent folder and sibling folders very soon. Can we not have a simple tree view (similar to VS Cod...

Latest Reply

Vidula
Honored Contributor

09-03-2022 1:37:57 AM

0 kudos

Hey there @Al Jo, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so ...

1 More Replies

by dbrick • New Contributor II

07-11-2022 5:57:39 AM

670 Views
2 replies
1 kudos

Multiple Jobs with different resource requirements on the same cluster

I have a big cluster with the auto-scaling (min: 1, max: 25) feature enabled. I want to run multiple jobs on that cluster with different values of Spark properties (`--executor-cores` and `--executor-memory`), but I don't see any option to specify the sam...

Latest Reply

Vidula
Honored Contributor

09-03-2022 1:30:17 AM

1 kudos

Hi @Neelesh databricks, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell ...

1 More Replies
