Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Prashanth24
by New Contributor III
  • 3977 Views
  • 5 replies
  • 4 kudos

Databricks Worker node - Would like to know number of memory in each core

Under Databricks Compute and Worker nodes, we find different node types, as below:
Standard_D4ds_v5 => 16 GB Memory, 4 Cores
Standard_D8ds_v5 => 32 GB Memory, 8 Cores
In Databricks, each node will have one executor. I have the questions below: (1) How much ...
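The per-core arithmetic behind the question can be sketched in a few lines. The caveat about overhead is an assumption, not a measured figure:

```python
# Back-of-envelope figures for the two instance types mentioned above.
# Note: the memory actually available to the executor is lower than the VM
# total, because the OS and Databricks services reserve a share; the exact
# overhead varies by runtime, so treat these as upper bounds.
instance_types = {
    "Standard_D4ds_v5": {"memory_gb": 16, "cores": 4},
    "Standard_D8ds_v5": {"memory_gb": 32, "cores": 8},
}

def memory_per_core(spec: dict) -> float:
    """Raw GB of memory per core, before subtracting any overhead."""
    return spec["memory_gb"] / spec["cores"]

for name, spec in instance_types.items():
    print(f"{name}: {memory_per_core(spec):.1f} GB per core")  # 4.0 for both
```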

Latest Reply
Retired_mod
Esteemed Contributor III
  • 4 kudos

Hi @Prashanth24, Thanks for reaching out! Please review the responses and let us know which best addresses your question. Your feedback is valuable to us and the community.   If the response resolves your issue, kindly mark it as the accepted solutio...

koantek_user
by New Contributor
  • 1913 Views
  • 2 replies
  • 0 kudos

geometric functions in databricks

Hi All, We are working on a migration project from Snowflake to Databricks, and some scripts utilize geometric functions like st_makepoint and st_geohash from the Snowflake scripts, which we need to convert to Databricks. Has someone encountered this ...
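One way to bridge the gap until equivalent built-ins are available in your runtime is to reimplement the Snowflake function as a UDF. Below is an illustrative pure-Python geohash encoder (the standard algorithm, not a Databricks API) that could stand in for ST_GEOHASH when wrapped with `pyspark.sql.functions.udf`; a point from st_makepoint can similarly be modeled as a (lon, lat) struct:

```python
# Illustrative sketch: standard geohash encoding, suitable for wrapping in a
# PySpark UDF as a stand-in for Snowflake's ST_GEOHASH.
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash_encode(lat: float, lon: float, precision: int = 9) -> str:
    """Encode a latitude/longitude pair as a geohash string."""
    lat_rng, lon_rng = [-90.0, 90.0], [-180.0, 180.0]
    bits, even = [], True                      # geohash interleaves longitude first
    while len(bits) < precision * 5:
        rng, val = (lon_rng, lon) if even else (lat_rng, lat)
        mid = (rng[0] + rng[1]) / 2
        if val >= mid:
            bits.append(1)
            rng[0] = mid
        else:
            bits.append(0)
            rng[1] = mid
        even = not even
    chars = []
    for i in range(0, len(bits), 5):           # 5 bits per base32 character
        n = 0
        for b in bits[i:i + 5]:
            n = (n << 1) | b
        chars.append(_BASE32[n])
    return "".join(chars)

print(geohash_encode(57.64911, 10.40744, 11))  # u4pruydqqvj (well-known example)
```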

Latest Reply
Retired_mod
Esteemed Contributor III
  • 0 kudos

Hi @koantek_user, Thanks for reaching out! Please review the response and let us know if it answers your question. Your feedback is valuable to us and the community. If the response resolves your issue, kindly mark it as the accepted solution. This w...

rameshybr
by New Contributor II
  • 2127 Views
  • 2 replies
  • 1 kudos

Resolved! Databricks workflow - Executing the workflow concurrently with different input parameters

How can I trigger a workflow concurrently (multiple times) with different input parameters? Please share your thoughts or any related articles.
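One approach is to call the Jobs 2.1 "run now" endpoint once per parameter set from a thread pool. The host, token, and job ID below are placeholders, and the job's `max_concurrent_runs` setting must be raised above 1 for the runs to actually overlap:

```python
# Sketch: trigger one Databricks job several times in parallel, each run with
# its own notebook parameters, via POST /api/2.1/jobs/run-now.
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

HOST = "https://<your-workspace>.azuredatabricks.net"   # placeholder
TOKEN = "<personal-access-token>"                        # placeholder
JOB_ID = 123                                             # placeholder

def build_run_now_payload(job_id: int, params: dict) -> dict:
    """Request body for /api/2.1/jobs/run-now with notebook parameters."""
    return {"job_id": job_id, "notebook_params": params}

def trigger_run(params: dict) -> dict:
    req = urllib.request.Request(
        f"{HOST}/api/2.1/jobs/run-now",
        data=json.dumps(build_run_now_payload(JOB_ID, params)).encode(),
        headers={"Authorization": f"Bearer {TOKEN}",
                 "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:   # response contains the run_id
        return json.load(resp)

param_sets = [{"region": "us"}, {"region": "eu"}, {"region": "apac"}]
# with ThreadPoolExecutor(max_workers=3) as pool:   # uncomment with real creds
#     runs = list(pool.map(trigger_run, param_sets))
```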

Latest Reply
Retired_mod
Esteemed Contributor III
  • 1 kudos

Hi @rameshybr, Thanks for reaching out! Please review the response and let us know if it answers your question. Your feedback is valuable to us and the community. If the response resolves your issue, kindly mark it as the accepted solution. This will...

koantek_user
by New Contributor
  • 5456 Views
  • 2 replies
  • 0 kudos

lateral view explode in databricks - need help

We are working on a Snowflake to Databricks migration and we encountered Snowflake's lateral flatten function, which we tried to convert to lateral view explode in Databricks - but its output is a subset of lateral flatten. http...
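The "subset" observation is accurate: Snowflake's FLATTEN emits metadata columns (index, key, path, etc.) alongside each value, while Spark's `explode()` yields only the element (`posexplode()` adds the index back, and exploding a map yields key/value pairs). A rough pure-Python model of the difference, for illustration only:

```python
# Why EXPLODE looks like a subset of Snowflake's FLATTEN: FLATTEN emits
# index/key/path metadata per row, explode() yields only the elements.
def flatten_like(value, path=""):
    """Rough model of one level of Snowflake FLATTEN output columns."""
    rows = []
    if isinstance(value, list):
        for i, v in enumerate(value):
            rows.append({"index": i, "key": None, "path": f"{path}[{i}]", "value": v})
    elif isinstance(value, dict):
        for k, v in value.items():
            rows.append({"index": None, "key": k, "path": f"{path}.{k}".lstrip("."), "value": v})
    return rows

def explode_like(value):
    """What LATERAL VIEW EXPLODE gives you for the same input."""
    return list(value) if isinstance(value, list) else list(value.items())
```

In Spark, the index can be recovered with `posexplode`, and keys by exploding a map column; the `path` metadata has to be reconstructed manually if the downstream logic depends on it.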

Latest Reply
Retired_mod
Esteemed Contributor III
  • 0 kudos

Hi @koantek_user, Thanks for reaching out! Please review the response and let us know if it answers your question. Your feedback is valuable to us and the community. If the response resolves your issue, kindly mark it as the accepted solution. This w...

brickster_2018
by Databricks Employee
  • 6103 Views
  • 2 replies
  • 0 kudos

Resolved! Is the Spark Driver a synonym for the Spark Master daemon?

If I understand correctly, the Spark driver is a master process. Is it the same as the Spark Master? I get confused between the Spark master and the Spark driver.

Latest Reply
brickster_2018
Databricks Employee
  • 0 kudos

This is a common misconception. Spark Master and Spark driver are two independent and isolated JVMs running on the same instance. Spark Master's responsibilities are to ensure the Spark worker daemons are up and running and to monitor their health. Als...

Rishabh-Pandey
by Databricks MVP
  • 12461 Views
  • 1 replies
  • 3 kudos

Key Advantages of Serverless Compute in Databricks

Serverless compute in Databricks offers several advantages, enhancing efficiency, scalability, and ease of use. Here are some key benefits: 1. Simplified Infrastructure Management - No Server Management: Users don't need to manage or configure servers or...

Latest Reply
Ashu24
Contributor
  • 3 kudos

Thanks for the clear understanding 

Pritam
by New Contributor II
  • 6603 Views
  • 4 replies
  • 1 kudos

Not able to create Job via Jobs API in Databricks

I am not able to create jobs via the Jobs API in Databricks. Error = INVALID_PARAMETER_VALUE: Job settings must be specified. I simply copied the JSON file and saved it, loaded the same JSON file, and tried to create the job via the API, but got the above erro...

Latest Reply
rAlex
New Contributor III
  • 1 kudos

@Pritam Arya I had the same problem today. In order to use, in a request to the Jobs API, the JSON that you can get from the GUI for an existing job, you want to use just the JSON that is the value of the settings key.
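That fix can be sketched in a few lines. The response shape below is a truncated, hypothetical example; the point is that the Get Job response wraps the job spec in a `settings` key, while `POST /api/2.1/jobs/create` expects the spec at the top level:

```python
# Extract the job spec from a Jobs API "get" response so it can be sent
# directly as the body of /api/2.1/jobs/create.
import json

def create_payload_from_get_response(get_job_response: dict) -> dict:
    """Return only the 'settings' value, dropping job_id and metadata."""
    return get_job_response["settings"]

# Hypothetical, truncated shape of a Get Job response:
get_response = {
    "job_id": 42,
    "created_time": 1700000000000,
    "settings": {"name": "my-job", "max_concurrent_runs": 1, "tasks": []},
}
payload = create_payload_from_get_response(get_response)
body = json.dumps(payload)   # what actually goes in the create request body
```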

m997al
by Contributor III
  • 1581 Views
  • 2 replies
  • 0 kudos

Cannot use Databricks REST API for secrets "get" inside bash script (byte format only)

Hi, I am trying to use Databricks-backed secret scopes inside Azure DevOps pipelines. I am almost successful: I can use the REST API to "get" a secret value back inside my bash script, but the value is in byte format, so it is unusable as a local var...
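The byte-looking output is usually base64: the secrets "get" endpoint returns the secret base64-encoded in the `value` field of its JSON response, so a decode step recovers the plain text. A sketch, with the curl call commented out and a hypothetical sample response standing in for it (scope/key names are placeholders):

```shell
# Fetch a secret and decode it for use as a shell variable.
# resp=$(curl -s -H "Authorization: Bearer $DATABRICKS_TOKEN" \
#   "$DATABRICKS_HOST/api/2.0/secrets/get?scope=my-scope&key=my-key")
resp='{"key":"my-key","value":"aGVsbG8="}'   # hypothetical sample response
# Pull out the base64 "value" field, then decode it.
encoded=$(printf '%s' "$resp" | sed -n 's/.*"value":"\([^"]*\)".*/\1/p')
secret=$(printf '%s' "$encoded" | base64 -d)
echo "$secret"   # hello
```

If `jq` is available in the pipeline image, `jq -r '.value'` is a more robust way to extract the field than `sed`.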

Latest Reply
m997al
Contributor III
  • 0 kudos

I wanted to add an addendum to this. In Azure DevOps, when working with YAML files, you can use the Azure DevOps pipelines "Library" to load environment variables. When you look at those environment variables in the Azure DevOps pipeline librar...

Prashanth24
by New Contributor III
  • 3017 Views
  • 4 replies
  • 4 kudos

Min and max node counts for processing 5 TB of data

I need to ingest a full load of 5 TB of data, applying business transformations, and want to process it in 2-3 hours. What criteria need to be considered when selecting the min and max worker node counts for this full-load processing?

Latest Reply
joeharris76
Databricks Employee
  • 4 kudos

Need more details about the workload to fully advise, but generally speaking: use the latest generation of cloud instances, enable Unity Catalog, and enable Photon. If the source data is raw CSV then the load should scale linearly. For example, if 64 nodes comp...
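Under the linear-scaling assumption, the sizing is simple division. The per-node throughput figure below is an illustrative assumption, not a benchmark; calibrate it by timing a small run of the actual workload:

```python
# Back-of-envelope cluster sizing under linear scaling.
import math

def nodes_needed(data_tb: float, hours: float, tb_per_node_per_hour: float = 0.05) -> int:
    """Workers needed = total volume / (assumed per-node rate * time budget)."""
    return math.ceil(data_tb / (tb_per_node_per_hour * hours))

# 5 TB in 2.5 hours at an assumed 0.05 TB/node/hour -> 40 workers
print(nodes_needed(5, 2.5))
```

A reasonable autoscaling setup would then put the max around this estimate and the min well below it, letting the cluster shrink once the heavy stages finish.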

raghunathr
by New Contributor III
  • 2579 Views
  • 3 replies
  • 1 kudos

Service account access granted, but still getting "User does not have USE SCHEMA on Schema"

Hi All, We have run into a scenario where Azure Data Factory connects to Azure Databricks through linked services, trying to connect with a System Assigned Managed Identity (SAMI). The specific SAMI was added to the compute and to Unity Catalog for usage.s...

Data Engineering
azure_data_factory
azure_databricks
grants
permission_issue
unity_catlog
Latest Reply
raghunathr
New Contributor III
  • 1 kudos

We still have trouble with the external storage location now. The specific managed identity added to the Databricks resource now has everything needed for Unity Catalog DEV/Tables. But even though that SPN was added to the external location, we are still getting an error as...
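For reference, the full chain of Unity Catalog grants such an identity typically needs looks like the sketch below (catalog, schema, and location names are placeholders; adapt them to your setup). USE privileges on the catalog and schema are required in addition to SELECT, and external locations carry their own READ FILES / WRITE FILES privileges:

```sql
-- Sketch of the typical grant chain for a managed identity / SPN.
GRANT USE CATALOG ON CATALOG dev TO `my-managed-identity`;
GRANT USE SCHEMA  ON SCHEMA  dev.my_schema TO `my-managed-identity`;
GRANT SELECT      ON SCHEMA  dev.my_schema TO `my-managed-identity`;
GRANT READ FILES  ON EXTERNAL LOCATION my_ext_location TO `my-managed-identity`;
GRANT WRITE FILES ON EXTERNAL LOCATION my_ext_location TO `my-managed-identity`;
```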

mahfooziiitian
by New Contributor II
  • 2341 Views
  • 3 replies
  • 0 kudos

Get saved query by name using REST API or Databricks SDK

Hi All, I want to get a saved query by name using the REST API or the Databricks SDK, but I do not find any direct endpoint or method which can give us the saved query by name. I have one solution, as given below: get the list of all queries, filter my queries...

Data Engineering
python
REST API
Saved Queries
Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @mahfooziiitian, The answer is no; currently you can get a saved query only by ID. If you are afraid of exceeding concurrent calls, then design a process that as a first step will use the list queries endpoint to extract query IDs and names and save...
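The suggested workaround can be sketched as a single pass over the paginated list endpoint that builds a name-to-ID map, reused for all later lookups. `fetch_page` is injected here so the paging logic is self-contained; in real use it would wrap a GET on the list-queries endpoint, passing along the page token:

```python
# Build a name -> id index from a paginated "list queries" endpoint once,
# instead of calling the API on every lookup.
def build_name_index(fetch_page):
    """fetch_page(token) -> {"results": [...], "next_page_token": ...}."""
    index, token = {}, None
    while True:
        page = fetch_page(token)
        for q in page.get("results", []):
            index[q["display_name"]] = q["id"]
        token = page.get("next_page_token")
        if not token:
            return index

# Hypothetical two-page listing used for illustration:
pages = {
    None: {"results": [{"id": "q1", "display_name": "daily_sales"}],
           "next_page_token": "t2"},
    "t2": {"results": [{"id": "q2", "display_name": "weekly_churn"}]},
}
index = build_name_index(lambda tok: pages[tok])
print(index["daily_sales"])   # q1
```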

gb_dbx
by New Contributor II
  • 1701 Views
  • 4 replies
  • 3 kudos

Does Databricks plan to create a Python API for the COPY INTO Spark SQL statement in the future?

Hi, I am wondering if Databricks has planned to create a Python API for Spark SQL's COPY INTO statement. In my company we created some kind of Python wrapper around the SQL COPY INTO statement, but it has lots of design issues and is hard to maintain. I ...

Latest Reply
gb_dbx
New Contributor II
  • 3 kudos

Okay, maybe I should take a look at Auto Loader then. I didn't know Auto Loader could basically do the same as COPY INTO; I originally thought it was only used for streaming and not batch ingestion. And Auto Loader has a dedicated Python API then? And ...
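For the batch-ingestion point: Auto Loader is configured through the Python streaming reader, and `trigger(availableNow=True)` makes it process all pending files and then stop, which behaves like a batch run. A sketch, with placeholder paths (the executable part below only assembles the configuration; the commented portion requires a Databricks SparkSession):

```python
# Auto Loader configuration for batch-style ingestion. The option names are
# Auto Loader's cloudFiles options; the paths are placeholders.
autoloader_options = {
    "cloudFiles.format": "csv",
    "cloudFiles.schemaLocation": "/tmp/schemas/orders",   # placeholder path
}

def build_reader(spark, options):
    """Configure a cloudFiles stream reader (requires a SparkSession)."""
    reader = spark.readStream.format("cloudFiles")
    for k, v in options.items():
        reader = reader.option(k, v)
    return reader

# In a Databricks notebook, a batch-like run would look roughly like:
# (build_reader(spark, autoloader_options).load("/raw/orders")      # placeholder
#     .writeStream.option("checkpointLocation", "/tmp/chk/orders")  # placeholder
#     .trigger(availableNow=True)   # process pending files, then stop
#     .table("bronze.orders"))
```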

biafch
by Contributor
  • 2020 Views
  • 2 replies
  • 2 kudos

How to load a JSON file in PySpark with a colon character in the folder name

Hi, I have a folder that contains subfolders that have JSON files. My subfolders look like this: 2024-08-12T09:34:37:452Z, 2024-08-12T09:25:45:185Z. I attach these subfolder names to a variable called FolderName and then try to read my JSON file like this: d...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 2 kudos

Hi @biafch, I've tried to replicate your example and it worked for me. But it seems that it is a common problem, and some object storage may not support that: [HADOOP-14217] Object Storage: support colon in object path - ASF JIRA (apache.org). Which object...
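When the storage layer is the one rejecting colons, a common workaround is to avoid them in the folder names at write time (this assumes you control the writer; existing folders would need a one-off rename). A sketch of sanitizing the timestamp-style names from the question:

```python
# Replace the colons inside the time portion of a timestamp-based folder name
# with hyphens, so the path is safe for object stores that reject ':'.
def sanitize_folder_name(name: str) -> str:
    date_part, _, time_part = name.partition("T")
    return f"{date_part}T{time_part.replace(':', '-')}"

print(sanitize_folder_name("2024-08-12T09:34:37:452Z"))   # 2024-08-12T09-34-37-452Z
```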

sarguido
by New Contributor II
  • 6076 Views
  • 5 replies
  • 2 kudos

Delta Live Tables: bulk import of historical data?

Hello! I'm very new to working with Delta Live Tables and I'm having some issues. I'm trying to import a large amount of historical data into DLT. However, letting the DLT pipeline run forever doesn't work with the database we're trying to import from...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Sarah Guido Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers y...

bulbur
by New Contributor II
  • 2709 Views
  • 1 replies
  • 0 kudos

Use pandas in DLT pipeline

Hi, I am trying to work with pandas in a Delta Live Table. I have created some example code: import pandas as pd; import pyspark.sql.functions as F; pdf = pd.DataFrame({"A": ["foo", "foo", "foo", "foo", "foo", "bar", "bar", "...

Latest Reply
bulbur
New Contributor II
  • 0 kudos

I have taken the advice given by the documentation ("However, you can include these functions outside of table or view function definitions because this code is run once during the graph initialization phase.") and moved the toPandas call to a function...
