Data Engineering

Forum Posts

by Confused • New Contributor III

11-16-2021 1:43:12 AM

4404 Views
6 replies
1 kudos

Hi guys, is there any documentation on where the /databricks-datasets/ mount is actually served from? We are looking at locking down where our workspace...

Hi guys, is there any documentation on where the /databricks-datasets/ mount is actually served from? We are looking at locking down where our workspace can reach out to via the internet, and as it currently stands we are unable to reach this. I did look ...

Latest Reply

Anonymous
Not applicable

11-18-2021 11:50:40 AM

1 kudos

Hello Mat, Thanks for letting us know. Would you be happy to mark your answer as best if that will solve the problem for others? That way, members will be able to find the solution more easily.

by MadelynM • Databricks Employee

11-08-2021 10:31:35 AM

2734 Views
2 replies
1 kudos

Thanks to everyone who joined the Best Practices for Your Data Architecture session on Getting Workloads to Production using CI/CD. You can access the on-demand session recording here, and the code in the Databricks Labs CI/CD Templates Repo. Posted ...

Latest Reply

MadelynM
Databricks Employee

11-18-2021 1:00:21 PM

1 kudos

Here's the embedded links list!
Jobs scheduling and orchestration
Built-in job scheduling: https://docs.databricks.com/jobs.html#schedule-a-job
Periodic scheduling of the jobs
Execute notebook / jar / Python script / Spark-submit
Multitask Jobs
Execute no...

by raymund • New Contributor III

11-03-2021 10:31:22 AM

4060 Views
7 replies
5 kudos

Resolved! Why does adding the package 'org.apache.spark:spark-sql-kafka-0-10_2.12:3.0.1' fail on runtime 9.1.x-scala2.12 but succeed on runtime 8.2.x-scala2.12?

Using a Databricks spark-submit job with a new cluster:
1] "spark_version": "8.2.x-scala2.12" => OK, works fine
2] "spark_version": "9.1.x-scala2.12" => FAIL, with errors
Exception in thread "main" java.lang.ExceptionInInitializerError at com.databricks...

Latest Reply

raymund
New Contributor III

11-10-2021 2:14:42 PM

5 kudos

This has been resolved by adding the following spark_conf (not through --conf): "spark.hadoop.fs.file.impl": "org.apache.hadoop.fs.LocalFileSystem"
Example:
"new_cluster": { "spark_version": "9.1.x-scala2.12", ... "spark_conf": { "spar...
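A minimal sketch of the fix described in this reply, written as the Python dictionary one might send as a job's cluster specification. Only the spark_version value and the spark.hadoop.fs.file.impl entry come from the post; the helper name with_local_fs_conf is hypothetical, and the fields elided in the post are left out rather than guessed.

```python
import json

# Key and value quoted from the reply above.
LOCAL_FS_CONF = {"spark.hadoop.fs.file.impl": "org.apache.hadoop.fs.LocalFileSystem"}


def with_local_fs_conf(new_cluster):
    """Return a copy of a new_cluster spec with the setting merged into spark_conf.

    Hypothetical helper for illustration; it does not mutate its input.
    """
    merged = dict(new_cluster)
    merged["spark_conf"] = {**new_cluster.get("spark_conf", {}), **LOCAL_FS_CONF}
    return merged


# Only spark_version is taken from the post; other cluster fields are omitted here.
new_cluster = with_local_fs_conf({"spark_version": "9.1.x-scala2.12"})
print(json.dumps({"new_cluster": new_cluster}, indent=2))
```

The point of the reply is that the setting goes into the cluster's spark_conf rather than being passed as a --conf argument to spark-submit.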

by antoooks • New Contributor III

10-25-2021 1:10:39 AM

2903 Views
2 replies
4 kudos

Resolved! display() function always returns "connection refused" over an SSH tunnel despite successfully retrieving the schema

Hi everyone, I am using SSH tunnelling with SSHTunnelForwarder to reach a target AWS RDS PostgreSQL database. The connection went through; however, when I tried to display the retrieved data frame, it always throws a "connection refused" error. Please see ...

Latest Reply

jose_gonzalez
Databricks Employee

11-12-2021 4:41:41 PM

4 kudos

Hi @Kurnianto Trilaksono Sutjipto, this seems like a connectivity issue with the URL you are trying to connect to. It fails during the display() command because read is a lazy transformation and is not executed right away. On the other hand,...
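The lazy-evaluation point in this reply can be illustrated with a small pure-Python analogy. This is not Spark code: LazyFrame, make_reader, and the tunnel dictionary are hypothetical stand-ins, assumed only to show why defining a read can succeed while the error surfaces later, at the action.

```python
class LazyFrame:
    """Toy stand-in for a DataFrame: it stores how to fetch rows, not the rows."""

    def __init__(self, fetch_rows):
        self._fetch_rows = fetch_rows  # callable that performs the real I/O

    def display(self):
        # The "action": the first point where rows are actually fetched.
        return list(self._fetch_rows())


def make_reader(tunnel_state):
    """Build a fetch function that needs the (pretend) tunnel to be reachable."""

    def fetch_rows():
        if not tunnel_state["open"]:
            raise ConnectionRefusedError("connection refused")
        return [("row-1",), ("row-2",)]

    return fetch_rows


tunnel = {"open": True}
df = LazyFrame(make_reader(tunnel))  # defining the read raises nothing
tunnel["open"] = False               # the endpoint is not reachable when rows are fetched

try:
    df.display()
    outcome = "ok"
except ConnectionRefusedError as exc:
    outcome = f"failed at display(): {exc}"

print(outcome)
```

In the analogy, nothing is fetched until display() runs, so a connectivity problem only shows up at that step.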

by Leszek • Contributor

11-17-2021 7:00:29 AM

4323 Views
5 replies
11 kudos

Resolved! Runtime SQL Configuration - how to make it simple

Hi, I'm running a couple of notebooks in my pipeline and I would like to set a fixed value of 'spark.sql.shuffle.partitions', the same value for every notebook. Should I do that by adding spark.conf.set.. code in each notebook (Runtime SQL configurations ar...

Latest Reply

Leszek
Contributor

11-17-2021 11:41:57 PM

11 kudos

Hi, thank you all for the tips. I tried setting this option in the Spark config before, but it didn't work for some reason. Today I tried again and it's working :).
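For readers comparing the two options discussed in this thread, here is a hedged sketch. spark.conf.set is the standard PySpark call for a per-session setting; the helper names and the value 64 are hypothetical, and the "key value" line format is an assumption about how an entry in the cluster's Spark config is written.

```python
SHUFFLE_PARTITIONS_KEY = "spark.sql.shuffle.partitions"


def set_shuffle_partitions(spark, partitions):
    """Per-notebook option: set the value on the active SparkSession."""
    spark.conf.set(SHUFFLE_PARTITIONS_KEY, str(partitions))


def cluster_spark_config_line(partitions):
    """Cluster-level option: one 'key value' line for the cluster's Spark config."""
    return f"{SHUFFLE_PARTITIONS_KEY} {partitions}"


# 64 is a placeholder value, not a recommendation.
print(cluster_spark_config_line(64))
```

Setting the value once at cluster level avoids repeating set_shuffle_partitions(spark, 64) at the top of every notebook, which is the simplification the original question asks about.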

by SRS • New Contributor II

11-16-2021 1:40:28 AM

3747 Views
3 replies
5 kudos

Resolved! Delta Tables incremental backup method

Hello, has anyone tried to create an incremental backup of Delta tables? What I mean is to load into the backup storage only the latest Parquet files that are part of the Delta table and to refresh the _delta_log folder, instead of copying the whole files aga...

Latest Reply

jose_gonzalez
Databricks Employee

11-17-2021 11:47:09 AM

5 kudos

Hi @Stefan Stegaru, you can use Delta time travel to query the data that was just added on a specific version. Then, like @Hubert Dudek mentioned, you can copy over this subset of data to a new table or a new location. You will need to do a deep...
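A rough sketch of the time-travel idea from this reply. The versionAsOf reader option is the Delta Lake time-travel option; comparing two versions with exceptAll is an assumption made here for illustration and is not necessarily the method the reply goes on to describe. The function names are hypothetical.

```python
def read_delta_version(spark, table_path, version):
    """Read a Delta table as it was at a given version (time travel)."""
    return (
        spark.read.format("delta")
        .option("versionAsOf", version)
        .load(table_path)
    )


def rows_added_between(spark, table_path, old_version, new_version):
    """Rows present at new_version but not at old_version.

    Assumption: this only captures added rows; rows that were deleted or
    updated between the two versions need separate handling.
    """
    old_df = read_delta_version(spark, table_path, old_version)
    new_df = read_delta_version(spark, table_path, new_version)
    return new_df.exceptAll(old_df)
```

The resulting DataFrame could then be appended to a backup table or location, which corresponds to the "copy over this subset" step mentioned in the reply.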

by Mohit_m • Valued Contributor II

11-16-2021 3:44:31 AM

4252 Views
3 replies
5 kudos

Resolved! Can't find or enable the "Files in Repos" feature

I am not able to find or enable the "Files in Repos" feature in the workspace. What could be the reason?

Latest Reply

Hubert-Dudek
Esteemed Contributor III

11-16-2021 4:17:21 AM

5 kudos

Please check your admin console.

by Anonymous • Not applicable

11-03-2021 2:51:11 PM

3240 Views
4 replies
2 kudos

Resolved! Anyone using RAPIDS and cuGraph on a current runtime?

We're in the process of migrating a large graph computation workload to NVIDIA RAPIDS + cuGraph for GPU acceleration. The package isn't part of the base runtime and is available only through conda package management, so it can't be installed via init sc...

Latest Reply

Anonymous
Not applicable

11-16-2021 11:32:00 AM

2 kudos

Thanks @Prabakar Ammeappin, we're looking at this. Strangely, the last commit removed the RAPIDS libraries from the base cuda-images. We're adding them back in.

by yatharth29 • New Contributor II

11-11-2021 2:45:00 AM

5530 Views
3 replies
0 kudos

How can I extract/get the time, along with the status (Failed or Succeeded) into a table for every time my Databricks job finishes running?

I want to get a mail notification at the end of each day when my Databricks job has finished running, and for that I need to extract the time of its completion and its status. How can I achieve that?

Latest Reply

Prabakar
Databricks Employee

11-17-2021 8:37:02 AM

0 kudos

Hi @Yatharth Kaushik, you can use the JobsRunList API to get all the information about the job run. You can write code to extract the information that you need for the table. There are multiple APIs in the same doc that you can use to get information a...
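As a sketch of the extraction step suggested in this reply, the function below turns a runs-list response into rows of run ID, status, and completion time. It assumes the response shape of the Jobs runs list endpoint (a "runs" array whose items carry "end_time" in epoch milliseconds and a "state" object with "life_cycle_state" and "result_state"); the function name and the sample payload are made up for illustration, and fetching the response over HTTP is left out.

```python
from datetime import datetime, timezone


def summarize_runs(runs_list_response):
    """Flatten a runs-list response into rows suitable for a status table."""
    rows = []
    for run in runs_list_response.get("runs", []):
        state = run.get("state", {})
        end_ms = run.get("end_time") or 0  # 0 or missing means the run has not finished
        finished_at = (
            datetime.fromtimestamp(end_ms / 1000, tz=timezone.utc).isoformat()
            if end_ms
            else None
        )
        rows.append(
            {
                "run_id": run.get("run_id"),
                # Fall back to the life-cycle state while no result is available.
                "status": state.get("result_state", state.get("life_cycle_state")),
                "finished_at_utc": finished_at,
            }
        )
    return rows


# Sample payload, invented for illustration only.
sample_response = {
    "runs": [
        {
            "run_id": 1,
            "end_time": 1636625100000,
            "state": {"life_cycle_state": "TERMINATED", "result_state": "SUCCESS"},
        },
        {"run_id": 2, "end_time": 0, "state": {"life_cycle_state": "RUNNING"}},
    ]
}

for row in summarize_runs(sample_response):
    print(row)
```

The rows could then be written to a table or formatted into the daily mail the question asks about.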

by RantoB • Valued Contributor

11-11-2021 7:34:49 AM

7894 Views
4 replies
0 kudos

Resolved! SSLCertVerificationError: how to disable SSL certificate verification

Hi, how is it possible to disable SSL certificate verification? With the Databricks API I got this error:
SSLCertVerificationError
SSLCertVerificationError: ("hostname 'https' doesn't match either of '*.numericable.fr', 'numericable.fr'",)
MaxRetryError: HTTPS...
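The error text reports the hostname as 'https', which suggests the client may have been handed a malformed host value rather than a certificate problem. The snippet below shows one way such a hostname can arise: a URL with the scheme written twice. This is an assumption about a possible cause, not the confirmed resolution of this thread, and the workspace host used here is a placeholder.

```python
from urllib.parse import urlsplit


def host_of(url):
    """Return the hostname a URL parser extracts from the given URL."""
    return urlsplit(url).hostname


# Placeholder workspace URL, used only for illustration.
good = "https://example.cloud.databricks.com"
doubled = "https://" + good  # scheme accidentally prepended a second time

print(host_of(good))
print(host_of(doubled))
```

If the second form is what reaches the HTTP client, the connection goes to a host literally named "https", and whatever certificate that host presents will not match. Checking the configured host value is a safer first step than turning certificate verification off.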

Latest Reply

Anonymous
Not applicable

11-12-2021 8:29:53 AM

0 kudos

@Bertrand BURCKER - Thanks for letting us know your issue is resolved. If @Prabakar Ammeappin's answer solved the problem, would you be happy to mark his answer as best so others can more easily find an answer for this?

by marsjuli • New Contributor II

10-24-2021 10:39:07 AM

18928 Views
1 replies
1 kudos

How to handle <IPython.core.display.HTML object>

Some libraries return intermediate IPython HTML objects to the notebook cell output. Since this happens while training a machine learning model, the statements are typically buried within the library, so I cannot easily interfere. (e.g. in or...

Latest Reply

marsjuli
New Contributor II

11-17-2021 12:30:57 AM

1 kudos

Hi @Kaniz Fatma, thanks for showing me the link. This helps if you are in control of the generated HTML object. If the HTML content comes from a library, that is where the problems start, because I cannot wrap displayHTML(). (I can of course look for...

by Orianh • Valued Contributor II

11-14-2021 1:00:06 AM

4172 Views
3 replies
1 kudos

Train deep learning model with numpy arrays.

Hey guys, I'm trying to train a deep learning model on Databricks ML with NumPy arrays as input. For now I have organized all the data inside a DataFrame. df contains 4 columns: col1, col2, col3, col4. col1 and col2 hold arrays with shape (1,3,3,3,3), col3 holds an array wit...

Latest Reply

Hubert-Dudek
Esteemed Contributor III

11-15-2021 2:06:47 AM

1 kudos

Maybe you could share some of your code. It will be easier to answer, and we could also learn about deep learning in Databricks from your code.

by Sarvagna_Mahaka • New Contributor III

11-02-2021 6:30:24 AM

17627 Views
6 replies
8 kudos

Resolved! Exporting csv files from Databricks

I'm trying to export a CSV file from my Databricks workspace to my laptop. I have followed the steps below.
1. Installed databricks CLI
2. Generated Token in Azure Databricks
3. databricks configure --token
5. Token:xxxxxxxxxxxxxxxxxxxxxxxxxx
6. databrick...

Latest Reply

User16871418122
Contributor III

11-16-2021 7:14:49 PM

8 kudos

Hi @Sarvagna Mahakali, there is an easier hack:
a) You can save results locally on the disk and create a hyperlink for downloading the CSV. You can copy the file to this location: dbfs:/FileStore/table1_good_2020_12_18_07_07_19.csv
b) Then download with...
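A small sketch of the hyperlink idea in step a). It relies on the documented behaviour that files under /FileStore are served to the browser under the workspace's /files/ path; the helper name and the workspace URL are hypothetical, and some deployments may additionally need a workspace ID query parameter, which is not shown.

```python
def filestore_download_url(workspace_url, dbfs_path):
    """Map a dbfs:/FileStore/... path to its browser download URL."""
    prefix = "dbfs:/FileStore/"
    if not dbfs_path.startswith(prefix):
        raise ValueError("only files under dbfs:/FileStore/ are served via /files/")
    return workspace_url.rstrip("/") + "/files/" + dbfs_path[len(prefix):]


# Placeholder workspace URL; the file path is the one quoted in the reply.
url = filestore_download_url(
    "https://example.cloud.databricks.com",
    "dbfs:/FileStore/table1_good_2020_12_18_07_07_19.csv",
)
print(url)
```

Opening the resulting URL in a browser that is logged in to the workspace downloads the file, with no CLI configuration needed.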

by DB_007 • New Contributor III

11-15-2021 8:37:34 AM

8682 Views
8 replies
4 kudos

Resolved! Databricks SQL not displaying all the databases that I have on my cluster.

I have a cluster running on 7.3 LTS and it has about 35+ databases. When I tried to set up an endpoint on Databricks SQL, I did not see any databases listed.

Latest Reply

User16871418122
Contributor III

11-16-2021 7:04:02 PM

4 kudos

Hi @Arif Ali, you may have to check the data access config to add the params for the external metastore:
spark.hadoop.javax.jdo.option.ConnectionDriverName org.mariadb.jdbc.Driver
spark.hadoop.javax.jdo.option.ConnectionUserName <mysql-username>
spark.had...
