Data Engineering

Forum Posts

by afshinR • New Contributor III

10-07-2021 9:09:42 AM

1311 Views
6 replies
3 kudos

Hi, I would like to create a web form with displayHTML in a notebook cell, and when the user presses the post button, I would like to write the content of my text...

Hi, I would like to create a web form with displayHTML in a notebook cell, and when the user presses the post button, I would like to write the content of the text area of my form back into the code cell of the notebook. Example: displayHTML ("""<form><textarea> u...

Latest Reply

jose_gonzalez
Moderator

10-29-2021 3:52:13 PM

3 kudos

Hi @afshin riahi, did Dan's response help you solve your question? If it did, can you mark it as the best answer? I will help move the post to the top so others can quickly find the solution.

by cig0 • New Contributor II

09-13-2021 8:01:16 AM

2537 Views
6 replies
2 kudos

Resolved! AWS VPC peering connection: can't make Databricks VPC reach our services on the accepter VPC

Hi, we followed this document (https://docs.databricks.com/administration-guide/cloud-configurations/aws/vpc-peering.html) describing how to establish a connection between two (or more) VPCs in AWS, but so far we haven't been able to communicate with t...

Latest Reply

jose_gonzalez
Moderator

10-29-2021 3:49:25 PM

2 kudos

Hi @Martin Cigorraga, if Huaming's response fully answered your question, would you be happy to mark their answer as best so that others can quickly find the solution?

by Ayman • New Contributor

09-23-2021 9:18:05 AM

3098 Views
4 replies
0 kudos

Resolved! What is the best way to create Tableau Hyper files in Databricks?

Latest Reply

jose_gonzalez
Moderator

10-29-2021 3:46:24 PM

0 kudos

Hi @Ayman Alneser, did Huaming.lu's response work for you? If it did, could you mark it as the best solution so that others can quickly find it in the future?

by TJS • New Contributor II

10-08-2021 8:24:48 AM

12903 Views
6 replies
5 kudos

Resolved! Can you help with this error please? Issue when using a new high concurrency cluster

Hello, I am trying to use MLflow on a new high concurrency cluster but I get the error below. Does anyone have any suggestions? It was working before on a standard cluster. Thanks. py4j.security.Py4JSecurityException: Method public int org.apache.spar...

Latest Reply

User16753724828
New Contributor III

10-19-2021 6:05:56 AM

5 kudos

@Tom Soto We have a workaround for this. The following cluster Spark configuration setting disables Py4J security while still enabling passthrough: spark.databricks.pyspark.enablePy4JSecurity false

by William_Scardua • Valued Contributor

10-06-2021 8:18:06 AM

5455 Views
9 replies
2 kudos

Resolved! How many hours should I estimate for training on the Databricks Academy Self-Paced Training platform?

I completed the Data Engineering Professional course and other training on the Self-Paced Training platform (https://www.linkedin.com/posts/wscardua_data-engineering-professional-activity-6851487238774108160-IsTE). How many hours can I estimate for this training (and o...

Latest Reply

William_Scardua
Valued Contributor

10-26-2021 11:43:27 AM

2 kudos

Can anyone help?

by Anonymous • Not applicable

10-20-2021 8:19:15 AM

1099 Views
2 replies
4 kudos

Multi-task Job Run starting point

Hi community! I would like to know if it is possible to start a Multi-task Job Run from a specific task. The use case is as follows: I have a 17-task Job. A task in the middle, let's say a task after 2 dependencies, fails. I found the error and now it i...

Latest Reply

BilalAslamDbrx
Honored Contributor II

10-29-2021 6:41:54 AM

4 kudos

+1 to what @Dan Zafar said. We're working hard on this. Looking forward to bringing this to you in the near future.

by alexraj84 • New Contributor

08-04-2016 10:52:24 AM

8379 Views
2 replies
0 kudos

How to read a fixed length file in Spark using DataFrame API and SCALA

I have a fixed-length file (a sample is shown below) and I want to read this file using the DataFrames API in Spark using Scala (not Python or Java). With the DataFrames API there are ways to read a text file, a JSON file and so on, but I am not sure if there is a wa...

Latest Reply

Nagendra
New Contributor II

10-29-2021 4:50:15 AM

0 kudos

The solution below can be used. Suppose this is the data in the file: EMP ID First Name Last Name 1Chris M 2John ...

by aditya_raj_data • New Contributor II

10-28-2021 10:56:12 AM

4248 Views
4 replies
2 kudos

Hosting a Python application on Azure Databricks and exposing its REST APIs

Hello, I am trying to host my application on Databricks and I want to expose the REST APIs of my application so they can be accessed from Postman, but I am unable to find any documentation on how to do this. I tried to write simple Flask "hello world" code to try ...

Latest Reply

Manoj
Contributor II

10-28-2021 3:29:39 PM

2 kudos

I did this using an Azure Web App and exposed the APIs; I was able to access them from Postman and Databricks. I have not hosted a Python app on Databricks.

by User16753725182 • Contributor III

05-07-2021 7:43:49 AM

1279 Views
1 reply
0 kudos

How to set up a private Git repository in my workspace?

Latest Reply

atulsahu
New Contributor II

10-29-2021 1:25:39 AM

0 kudos

As a platform engineer, I would go to the admin console, click on "workspace settings" and start by looking into the settings below. Repos: true, so that Repos integration is possible. The next two settings are important to make the overall experi...

by Kaniz • Community Manager

10-28-2021 11:01:15 PM

444 Views
0 replies
0 kudos

How can I load my table from Data Lake (ADLS) to Azure Synapse via Databricks?

by Rnmj • New Contributor III

10-25-2021 5:25:36 AM

8749 Views
5 replies
7 kudos

ConnectException: Connection refused (Connection refused) This is often caused by an OOM error

I am trying to run Python code that flattens a JSON file into a pipe-separated file. The code works with smaller files, but for huge files of 2.4 GB I get the error below: ConnectException: Connection refused (Connection refused) Error while obtaining a...

Latest Reply

Rnmj
New Contributor III

10-28-2021 8:58:14 PM

7 kudos

Hi @Jose Gonzalez, @Werner Stinckens, @Kaniz Fatma, thanks for your responses, I appreciate them a lot. The issue was in the code: it was Python/pandas code running on Spark. Because of this, only the driver node was being used. I validated this by increasin...

by Braxx • Contributor II

10-28-2021 1:10:53 PM

1436 Views
1 reply
3 kudos

Retry API request if it fails

I have a simple API request that queries a table and retrieves data, which is then loaded into a dataframe. It may happen that it fails for various reasons. How can I retry it, let's say 5 times, when any kind of error takes place? Here is the API request: d...

Latest Reply

Manoj
Contributor II

10-28-2021 3:25:09 PM

3 kudos

@Bartosz Wachocki, use a timeout, a retry interval, recursion and exception handling. Pseudo code below:

timeout = 300
def exec_query(query, timeout):
    try:
        df = spark.createDataFrame(sf.bulk.MyTable.query(query))
    except:
        if timeout > 0:
            sleep(60)
            exec_que...
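
For the "retry it 5 times on any error" case in the question, a plain loop with a fixed attempt count may be easier to reason about than recursion with a timeout budget. The following is only a minimal sketch of that idea, not the approach from the reply above: `retry`, `max_attempts`, `delay_seconds` and the injectable `sleep` parameter are hypothetical names chosen for illustration and are not part of any Databricks or Salesforce API.

```python
import time


def retry(func, max_attempts=5, delay_seconds=1.0, sleep=time.sleep):
    """Call func() until it succeeds or max_attempts calls have failed.

    All names here are illustrative assumptions, not a library API.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception as error:  # any ordinary error triggers a retry
            last_error = error
            if attempt < max_attempts:
                sleep(delay_seconds)  # wait before the next attempt
    # Every attempt failed: surface the last error as the cause.
    raise RuntimeError(f"gave up after {max_attempts} attempts") from last_error


# Hypothetical usage with the call from the pseudo code above
# (assumes `spark`, `sf` and `query` already exist in the notebook):
# df = retry(
#     lambda: spark.createDataFrame(sf.bulk.MyTable.query(query)),
#     max_attempts=5,
#     delay_seconds=60,
# )
```

Passing `sleep` as a parameter is just a convenience so the wait can be skipped when trying the helper out; in a notebook the default `time.sleep` would normally be left in place.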

by trm • New Contributor II

10-27-2021 9:41:42 AM

1061 Views
2 replies
2 kudos

Resolved! Mail configuration for an Azure Databricks PySpark notebook

Hi all, I am new to Azure Databricks and I am using PySpark. We need to configure mail alerts for when a notebook fails or succeeds. Can someone please help me with mail configuration in Azure Databricks? Thanks

Latest Reply

-werners-
Esteemed Contributor III

10-28-2021 2:15:28 AM

2 kudos

The easiest way to schedule notebooks in Azure is to use Data Factory. In Data Factory you can schedule the notebooks and define the alerts you want to send. The other option is the one Hubert mentioned.

by dimoobraznii • New Contributor III

10-27-2021 5:38:28 PM

4925 Views
3 replies
9 kudos

'databricks-connect' is not recognized as an internal or external command, operable program or batch file on Windows

Hello, I've installed databricks-connect on Windows 10:
C:\Users\danoshin>pip install -U "databricks-connect==9.1.*"
Collecting databricks-connect==9.1.*
Downloading databricks-connect-9.1.2.tar.gz (254.6 MB)
|████████████████████████████████| 2...

Latest Reply

-werners-
Esteemed Contributor III

10-28-2021 1:57:27 AM

9 kudos

@Dmitry Anoshin, that seems messed up. The best you can do is to remove databricks-connect and also uninstall any PySpark installation, and then follow the installation guide. It should work after following the procedure. I use a Linux VM for this p...

by Greg_Galloway • New Contributor III

10-18-2021 11:57:53 AM

3934 Views
5 replies
3 kudos

Resolved! Use of private endpoints for storage in workspace with EnableNoPublicIP=Yes and VnetInjection=No

We know that Databricks with VNET injection (our own VNET) allows us to connect to ADLS Gen2 over private endpoints. This is what we typically do. We have a customer who created Databricks with EnableNoPublicIP=Yes (secure cluster connectivity) and Vn...

Latest Reply

User16871418122
Contributor III

10-26-2021 9:12:44 PM

3 kudos

A managed VNET is locked and allows only very limited configuration, such as VNET peering, and even that is facilitated by and has to be done from the Databricks UI. If they want more control over the VNET, they need to migrate to a VNET-injected workspace.
