cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
cancel
Showing results for 
Search instead for 
Did you mean: 

Forum Posts

Dave_Nithio
by Contributor
  • 406 Views
  • 0 replies
  • 1 kudos

Natively Query Delta Lake with R

I have a large delta table that I need to analyze in native R. The only option I have currently is to query the delta table then use collect() to bring that spark dataframe into an R dataframe. Is there an alternative method that would allow me to qu...

  • 406 Views
  • 0 replies
  • 1 kudos
lawrence009
by Contributor
  • 1333 Views
  • 4 replies
  • 4 kudos

Cannot CREATE TABLE with 'No Isolation Shared' cluster

Recently I ran into a number issues running with our notebooks in Interactive Mode. For example, we can't create (delta) table. The command would run and then idle for no apparent exception. The path is created on AWS S3 but delta log is never create...

  • 1333 Views
  • 4 replies
  • 4 kudos
Latest Reply
youssefmrini
Honored Contributor III
  • 4 kudos

The Admin can disable the possibility to use the no Isolate Shared cluster. I recommend you to switch to Single user where UC is activated. Don't worry you won't need to change your code. If you encounter this kind of issues, make sure to open a tick...

  • 4 kudos
3 More Replies
Hunter
by New Contributor III
  • 6873 Views
  • 9 replies
  • 8 kudos

Resolved! How to programmatically download png files from matplotlib plots in notebook?

I am creating plots in databricks using python and matplotlib. These look great in notebook and I can save them to the dbfs usingplt.savefig("/dbfs/FileStore/tables/[plot_name].png")I can then download the png files to my computer individually by pas...

  • 6873 Views
  • 9 replies
  • 8 kudos
Latest Reply
Hunter
New Contributor III
  • 8 kudos

Thanks everyone! I am already at a place where I can download a png to FileStore and use a url to download that file locally. What I was wondering was if there is some databricks function I can use to launch the url that references the png file and d...

  • 8 kudos
8 More Replies
successhawk
by New Contributor II
  • 1029 Views
  • 4 replies
  • 3 kudos

Resolved! Is there a way to tell if a created job is not compliant against configured cluster policies before it runs?

As a DevOps engineer, I want to enforce cluster policies at deployment time when the job is deployed/created, well before it is time to actually use it (i.e. before its scheduled/triggered run time without actually running it).

  • 1029 Views
  • 4 replies
  • 3 kudos
Latest Reply
Kaniz
Community Manager
  • 3 kudos

Hi @Nathan Hawk​, We haven’t heard from you since the last response from @nafri A​, and I was checking back to see if his suggestions helped you.Or else, If you have any solution, please share it with the community, as it can be helpful to others.Als...

  • 3 kudos
3 More Replies
labtech
by Valued Contributor II
  • 914 Views
  • 3 replies
  • 20 kudos

Resolved! Create Databricks Workspace with different email address on Azure

Hi team,I wonder if we can create a Databricks Workspace that not releated with Azure email address.Thanks

  • 914 Views
  • 3 replies
  • 20 kudos
Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 20 kudos

yes , i have done this multiple time

  • 20 kudos
2 More Replies
labtech
by Valued Contributor II
  • 920 Views
  • 4 replies
  • 14 kudos

Resolved! Get a new badge or new certified for version 3 of DE exam

I took a certified of DE exam (version 2). Do I receive a new badge or certified when I pass newest version of DE exam?I'm going to take that and review my knowlege.

  • 920 Views
  • 4 replies
  • 14 kudos
Latest Reply
Kaniz
Community Manager
  • 14 kudos

Hi @Gam Nguyen​, Thank you for reaching out!Let us look into this for you, and we'll circle back with an update.

  • 14 kudos
3 More Replies
cmilligan
by Contributor II
  • 444 Views
  • 0 replies
  • 1 kudos

Fail a multi-task job successfully

I have a multi-task job that runs everyday where the first notebook in the job checks if the run should be continued based on the date that the job is run. The majority of the time the answer to that is no and I'm raising an exception for the job to ...

  • 444 Views
  • 0 replies
  • 1 kudos
seberino
by New Contributor III
  • 1039 Views
  • 2 replies
  • 1 kudos

Resolved! Why Revoke button initially greyed out in Data Explorer (of SQL Workspace) in the Permissions tab of a table?

Goal is to try to revoke SELECT permissions from a user for a table in Data Explorer in the SQL Workspace.I've tried navigating to the Permissions tab of the tab in Data Explorer.Initially the Revoke button is greyed out and only the Grant button is ...

  • 1039 Views
  • 2 replies
  • 1 kudos
Latest Reply
Kaniz
Community Manager
  • 1 kudos

Hi @Christian Seberino​, We haven’t heard from you since the last response from @Ajay Pandey​ , and I was checking back to see if his suggestions helped you.Or else, If you have any solution, please share it with the community, as it can be helpful t...

  • 1 kudos
1 More Replies
Harun
by Honored Contributor
  • 594 Views
  • 1 replies
  • 1 kudos

Hi Community members and Databricks Officials, Now a days i am seeing lot of spam post in our groups and discussions. Forum admins and databricks offi...

Hi Community members and Databricks Officials,Now a days i am seeing lot of spam post in our groups and discussions. Forum admins and databricks officials please take action on the users who are spamming the timeline with some promotional contents.As...

  • 594 Views
  • 1 replies
  • 1 kudos
Latest Reply
Ajay-Pandey
Esteemed Contributor III
  • 1 kudos

Yes @Databricks Forum Admin​ please take an action on this

  • 1 kudos
DB_developer
by New Contributor III
  • 1493 Views
  • 2 replies
  • 3 kudos

How to optimize storage for sparse data in data lake?

I have lot of tables with 80% of columns being filled with nulls. I understand SQL sever provides a way to handle these kind of data during the data definition of the tables (with Sparse keyword). Do datalake provide similar kind of thing?

  • 1493 Views
  • 2 replies
  • 3 kudos
Latest Reply
-werners-
Esteemed Contributor III
  • 3 kudos

datalake itself not, but the file format you use to store data does.f.e. parquet uses column compression, so sparse data will compress pretty good.csv on the other hand: total disaster

  • 3 kudos
1 More Replies
grazie
by Contributor
  • 4424 Views
  • 5 replies
  • 3 kudos

Resolved! Can we use "Access Connector for Azure Databricks" to access Azure Key Vault?

We have a scenario where ideally we'd like to use Managed Identities to access storage but also secrets. Per now we have a setup with service principals accessing secrets through secret scopes, but we foresee a situation where we may get many service...

  • 4424 Views
  • 5 replies
  • 3 kudos
Latest Reply
grive
New Contributor III
  • 3 kudos

I have unofficial word that this is not supported, and docs don't mention it. I have the feeling that even if I got it to work it should not be trusted for now.

  • 3 kudos
4 More Replies
andrew0117
by Contributor
  • 3033 Views
  • 4 replies
  • 4 kudos

Resolved! will I be charged by Databricks if leaving the cluster on but not running?

or Databricks only charges you whenever you are actually running the cluster, no matter how long you keep the cluster idle?Thanks!

  • 3033 Views
  • 4 replies
  • 4 kudos
Latest Reply
labtech
Valued Contributor II
  • 4 kudos

If you not congifure your cluster auto terminate after period of idle time, yes you will be charged for that.

  • 4 kudos
3 More Replies
FerArribas
by Contributor
  • 1514 Views
  • 4 replies
  • 3 kudos

How to import a custom CA certificate into the Databricks SQL module?

We need to be able to import a custom certificate (https://learn.microsoft.com/en-us/azure/databricks/kb/python/import-custom-ca-cert) in the same way as in the "data engineering" module but in the Databricks SQL module

  • 1514 Views
  • 4 replies
  • 3 kudos
Latest Reply
VaibB
Contributor
  • 3 kudos

You can try downloading it to DBFS and may be accessing it from there if you use case really needs that.

  • 3 kudos
3 More Replies
ivanychev
by Contributor
  • 4886 Views
  • 16 replies
  • 5 kudos

toPandas() causes IndexOutOfBoundsException in Apache Arrow

Using DBR 10.0When calling toPandas() the worker fails with IndexOutOfBoundsException. It seems like ArrowWriter.sizeInBytes (which looks like a proprietary method since I can't find it in OSS) calls arrow's getBufferSizeFor which fails with this err...

  • 4886 Views
  • 16 replies
  • 5 kudos
Latest Reply
vikas_ahlawat
New Contributor II
  • 5 kudos

I am also facing the same issue, I have applied the config: `spark.sql.execution.arrow.pyspark.enabled` set to `false`, but still facing the same issue. Any Idea, what's going on???. Please help me out....org.apache.spark.SparkException: Job aborted ...

  • 5 kudos
15 More Replies
Soma
by Valued Contributor
  • 1585 Views
  • 5 replies
  • 1 kudos

Resolved! Store data using client side encryption and read data using client side encryption

Hi All,I am looking for some options to add the Client side encryption feature of azure to store data in adls gen2https://learn.microsoft.com/en-us/azure/storage/blobs/client-side-encryption?tabs=javaAny help will be highly appreciatedNote: Fernet si...

  • 1585 Views
  • 5 replies
  • 1 kudos
Latest Reply
Soma
Valued Contributor
  • 1 kudos

@Vidula Khanna​ We are going with fernet encryption as direct method is not available

  • 1 kudos
4 More Replies
Labels
Top Kudoed Authors