Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Hunter
by New Contributor III
  • 23283 Views
  • 7 replies
  • 6 kudos

Resolved! How to programmatically download png files from matplotlib plots in notebook?

I am creating plots in Databricks using Python and matplotlib. They look great in the notebook, and I can save them to DBFS using plt.savefig("/dbfs/FileStore/tables/[plot_name].png"). I can then download the png files to my computer individually by pas...

Latest Reply
Hunter
New Contributor III
  • 6 kudos

Thanks everyone! I am already at a place where I can download a PNG to FileStore and use a URL to download that file locally. What I was wondering was whether there is some Databricks function I can use to launch the URL that references the PNG file and d...
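One hedged option for the follow-up question above, assuming the file was saved under /dbfs/FileStore/ (the plot name and path are examples): files in FileStore are typically served from the workspace's /files/ URL, so displayHTML can render a clickable download link directly in the notebook output.

```python
import matplotlib.pyplot as plt

# Save the figure to FileStore (path and file name are examples)
plt.plot([1, 2, 3], [4, 5, 6])
plt.savefig("/dbfs/FileStore/tables/example_plot.png")

# Files under /dbfs/FileStore/ are usually served at /files/ on the
# workspace URL, so a clickable download link can be rendered with
# displayHTML (a Databricks notebook built-in).
displayHTML('<a href="/files/tables/example_plot.png" download>Download example_plot.png</a>')
```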

6 More Replies
successhawk
by New Contributor II
  • 3294 Views
  • 3 replies
  • 2 kudos

Resolved! Is there a way to tell if a created job is not compliant against configured cluster policies before it runs?

As a DevOps engineer, I want to enforce cluster policies at deployment time when the job is deployed/created, well before it is time to actually use it (i.e. before its scheduled/triggered run time without actually running it).
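There is no built-in pre-flight check shown in this thread, but one possible sketch is to compare a job's cluster spec against the policy's fixed fields via the Cluster Policies REST API (GET /api/2.0/policies/clusters/get). The host, token, policy ID, and cluster spec below are placeholders, and the comparison only covers simple "fixed" rules.

```python
import json
import requests

# Placeholders: fill in a real workspace URL, API token, and policy ID.
HOST = "https://<workspace-url>"
TOKEN = "<api-token>"
POLICY_ID = "<policy-id>"

# Example job cluster spec to validate (illustrative values).
job_cluster = {"spark_version": "11.3.x-scala2.12", "autotermination_minutes": 60}

policy = requests.get(
    f"{HOST}/api/2.0/policies/clusters/get",
    headers={"Authorization": f"Bearer {TOKEN}"},
    params={"policy_id": POLICY_ID},
).json()

# The policy definition is a JSON string of field -> rule mappings;
# this simplified check only looks at "fixed" rules.
definition = json.loads(policy["definition"])
for field, rule in definition.items():
    if rule.get("type") == "fixed" and job_cluster.get(field) != rule.get("value"):
        print(f"Non-compliant: {field} must be {rule['value']!r}")
```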

Latest Reply
irfanaziz
Contributor II
  • 2 kudos

Is it not the linked service that defines the kind of cluster created or used for any job? So I believe you could control the configuration via the linked service settings.

2 More Replies
labtech
by Valued Contributor II
  • 2682 Views
  • 3 replies
  • 20 kudos

Resolved! Create Databricks Workspace with different email address on Azure

Hi team, I wonder if we can create a Databricks Workspace that is not related to the Azure email address. Thanks

Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 20 kudos

Yes, I have done this multiple times.

2 More Replies
labtech
by Valued Contributor II
  • 2568 Views
  • 3 replies
  • 14 kudos

Get a new badge or certificate for version 3 of the DE exam

I took the DE exam certification (version 2). Do I receive a new badge or certificate when I pass the newest version of the DE exam? I'm going to take it and review my knowledge.

Latest Reply
Ajay-Pandey
Databricks MVP
  • 14 kudos

Hi @Gam Nguyen​ I think there is no new badge for this one

2 More Replies
cmilligan
by Contributor II
  • 1319 Views
  • 0 replies
  • 1 kudos

Fail a multi-task job successfully

I have a multi-task job that runs every day, where the first notebook in the job checks whether the run should be continued based on the date the job is run. The majority of the time the answer to that is no, and I'm raising an exception for the job to ...
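One commonly suggested alternative to raising an exception is to end the gating notebook cleanly with dbutils.notebook.exit(), so the task (and therefore the job run) is not marked as failed; a minimal sketch, with the date check purely illustrative. Downstream tasks will still be triggered unless they inspect the exit value or the job uses conditional logic.

```python
from datetime import date

# Illustrative gate: only continue processing on the first day of the month.
should_run = date.today().day == 1

if not should_run:
    # Ends this notebook task with a "succeeded" status instead of raising,
    # so the job run does not show up as failed.
    dbutils.notebook.exit("Skipped: nothing to process for this date")

# ...normal processing continues here when should_run is True...
```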

Harun
by Honored Contributor
  • 2227 Views
  • 1 replies
  • 1 kudos

Hi Community members and Databricks officials, nowadays I am seeing a lot of spam posts in our groups and discussions. Forum admins and Databricks offi...

Hi Community members and Databricks officials, nowadays I am seeing a lot of spam posts in our groups and discussions. Forum admins and Databricks officials, please take action on the users who are spamming the timeline with promotional content. As...

Latest Reply
Ajay-Pandey
Databricks MVP
  • 1 kudos

Yes @Databricks Forum Admin, please take action on this.

DB_developer
by New Contributor III
  • 10542 Views
  • 2 replies
  • 3 kudos

How to optimize storage for sparse data in data lake?

I have a lot of tables with 80% of the columns filled with nulls. I understand SQL Server provides a way to handle this kind of data in the table definition (with the SPARSE keyword). Does the data lake provide something similar?

Latest Reply
-werners-
Esteemed Contributor III
  • 3 kudos

The data lake itself does not, but the file format you use to store the data does. For example, Parquet uses column compression, so sparse data will compress pretty well. CSV, on the other hand, is a total disaster.
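A small sketch of the point above, assuming a Databricks notebook where spark is predefined; the table shape and output paths are made up. Columnar formats like Parquet (and Delta, which stores Parquet files) encode and compress runs of NULLs efficiently, while CSV writes every empty field as text.

```python
from pyspark.sql import functions as F

# Illustrative sparse DataFrame: 20 extra columns that are NULL for ~99% of rows.
df = spark.range(1_000_000).select(
    "id",
    *[F.when(F.col("id") % 100 == 0, F.lit(1)).alias(f"c{i}") for i in range(20)],
)

# Compare the on-disk footprint of a columnar vs. a text format (example paths).
df.write.mode("overwrite").parquet("/tmp/sparse_parquet")
df.write.mode("overwrite").option("header", True).csv("/tmp/sparse_csv")
```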

1 More Replies
grazie
by Contributor
  • 11356 Views
  • 5 replies
  • 4 kudos

Resolved! Can we use "Access Connector for Azure Databricks" to access Azure Key Vault?

We have a scenario where ideally we'd like to use Managed Identities to access storage but also secrets. For now we have a setup with service principals accessing secrets through secret scopes, but we foresee a situation where we may get many service...

Latest Reply
grive
New Contributor III
  • 4 kudos

I have unofficial word that this is not supported, and docs don't mention it. I have the feeling that even if I got it to work it should not be trusted for now.

4 More Replies
andrew0117
by Contributor
  • 8326 Views
  • 4 replies
  • 4 kudos

Resolved! Will I be charged by Databricks if I leave the cluster on but not running?

Or does Databricks only charge you whenever you are actually running the cluster, no matter how long you keep the cluster idle? Thanks!

Latest Reply
labtech
Valued Contributor II
  • 4 kudos

If you do not configure your cluster to auto-terminate after a period of idle time, then yes, you will be charged for that.
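For reference, auto-termination is set per cluster via the autotermination_minutes field (in the cluster UI or the Clusters API); a minimal sketch of a cluster spec with it enabled, all values illustrative. Note that an idle cluster keeps accruing DBU and cloud VM charges until it actually terminates.

```python
# Illustrative cluster spec for the Clusters API (POST /api/2.0/clusters/create);
# node type and Spark version are examples and vary by cloud.
cluster_spec = {
    "cluster_name": "example-cluster",
    "spark_version": "11.3.x-scala2.12",
    "node_type_id": "Standard_DS3_v2",
    "num_workers": 2,
    # Terminate automatically after 30 minutes of inactivity so an
    # idle cluster stops accruing charges.
    "autotermination_minutes": 30,
}
```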

3 More Replies
seberino
by New Contributor III
  • 5229 Views
  • 1 replies
  • 1 kudos

Resolved! Why is the Revoke button initially greyed out in Data Explorer (of the SQL Workspace) in the Permissions tab of a table?

The goal is to revoke SELECT permissions from a user for a table in Data Explorer in the SQL Workspace. I've tried navigating to the Permissions tab of the table in Data Explorer. Initially the Revoke button is greyed out and only the Grant button is ...

Latest Reply
Ajay-Pandey
Databricks MVP
  • 1 kudos

Hi @Christian Seberino, connect with Databricks for the same. You can also raise a request for the same.
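If the UI button stays greyed out, the same revoke can usually be issued as SQL from a notebook or the SQL editor; a minimal sketch, with the table and user names as placeholders and assuming the caller has sufficient privileges on the object.

```python
# Revoke SELECT on a table from a specific user (placeholders throughout).
spark.sql(
    "REVOKE SELECT ON TABLE my_schema.my_table FROM `some.user@example.com`"
)
```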

FerArribas
by Contributor
  • 4394 Views
  • 4 replies
  • 3 kudos

How to import a custom CA certificate into the Databricks SQL module?

We need to be able to import a custom certificate (https://learn.microsoft.com/en-us/azure/databricks/kb/python/import-custom-ca-cert) in the same way as in the "data engineering" module but in the Databricks SQL module

Latest Reply
VaibB
Contributor
  • 3 kudos

You can try downloading it to DBFS and maybe accessing it from there, if your use case really needs that.
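For the notebook/cluster side, one way to use a CA bundle that has been downloaded to DBFS is to point the HTTP client at it explicitly; a minimal sketch with example paths and URL. Whether this helps for the Databricks SQL module itself is a separate question, as the thread notes.

```python
import requests

# Point requests at a custom CA bundle stored on DBFS (paths are examples).
resp = requests.get(
    "https://internal-service.example.com/api/health",
    verify="/dbfs/FileStore/certs/custom_ca.pem",
)
print(resp.status_code)
```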

3 More Replies
ivanychev
by Contributor II
  • 17979 Views
  • 14 replies
  • 5 kudos

toPandas() causes IndexOutOfBoundsException in Apache Arrow

Using DBR 10.0. When calling toPandas() the worker fails with IndexOutOfBoundsException. It seems like ArrowWriter.sizeInBytes (which looks like a proprietary method since I can't find it in OSS) calls Arrow's getBufferSizeFor, which fails with this ...

Latest Reply
vikas_ahlawat
New Contributor II
  • 5 kudos

I am also facing the same issue. I have applied the config `spark.sql.execution.arrow.pyspark.enabled` set to `false`, but I am still facing the same issue. Any idea what's going on? Please help me out. org.apache.spark.SparkException: Job aborted ...
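For readers hitting the same error, the two knobs usually suggested are disabling Arrow for toPandas() or reducing the Arrow batch size; a minimal sketch, noting that the reply above reports disabling Arrow did not resolve their particular case.

```python
# Option 1: fall back to the slower, non-Arrow conversion path.
spark.conf.set("spark.sql.execution.arrow.pyspark.enabled", "false")

# Option 2: keep Arrow but use smaller record batches, which can avoid
# oversized Arrow buffers on wide or large partitions.
spark.conf.set("spark.sql.execution.arrow.maxRecordsPerBatch", "1000")

pdf = df.toPandas()  # df is whichever DataFrame was failing
```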

13 More Replies
Soma
by Valued Contributor
  • 4483 Views
  • 3 replies
  • 1 kudos

Resolved! Store data using client side encryption and read data using client side encryption

Hi all, I am looking for some options to add the client-side encryption feature of Azure to store data in ADLS Gen2 (https://learn.microsoft.com/en-us/azure/storage/blobs/client-side-encryption?tabs=java). Any help will be highly appreciated. Note: Fernet si...

Latest Reply
Soma
Valued Contributor
  • 1 kudos

@Vidula Khanna We are going with Fernet encryption, as a direct method is not available.
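For anyone following the same route, a minimal Fernet round-trip sketch using the cryptography package; in practice the key would come from a Databricks secret scope rather than being generated in the job.

```python
from cryptography.fernet import Fernet

# Generate a key for the example; in real use, load it from a secret scope,
# e.g. dbutils.secrets.get(scope="my-scope", key="fernet-key").
key = Fernet.generate_key()
f = Fernet(key)

ciphertext = f.encrypt(b"sensitive payload")
plaintext = f.decrypt(ciphertext)
assert plaintext == b"sensitive payload"
```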

2 More Replies
Ravikumashi
by Contributor
  • 1837 Views
  • 2 replies
  • 0 kudos

Access Databricks secrets in init script

We are trying to install the Databricks CLI in init scripts, and in order to do this we need to authenticate with a Databricks token, but this is not secure as anyone with access to the cluster can get hold of this Databricks token. We tried to inject the secrets into se...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 0 kudos

I think you don't need to install the CLI. There is a whole API available via the notebook. Below is an example: import requests ctx = dbutils.notebook.entry_point.getDbutils().notebook().getContext() host_name = ctx.tags().get("browserHostName").get() host_toke...
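A fuller sketch of the approach in the reply above: pull the workspace host and an API token from the notebook context and call the REST API with requests, instead of installing the CLI and hard-coding a token. The entry_point/getContext interface is internal and may change between runtimes, and the secret-scopes call is just an example endpoint.

```python
import requests

# Internal notebook context: exposes the workspace host name and an API token
# for the current user/cluster (unofficial API, may change between runtimes).
ctx = dbutils.notebook.entry_point.getDbutils().notebook().getContext()
host_name = ctx.tags().get("browserHostName").get()
host_token = ctx.apiToken().get()

# Example REST call using the retrieved host and token: list secret scopes.
resp = requests.get(
    f"https://{host_name}/api/2.0/secrets/scopes/list",
    headers={"Authorization": f"Bearer {host_token}"},
)
print(resp.json())
```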

1 More Replies
KVNARK
by Honored Contributor II
  • 9083 Views
  • 4 replies
  • 11 kudos

Resolved! Pyspark learning path

Can anyone suggest the best series of courses offered by Databricks to learn PySpark for ETL purposes, either in the Databricks partner learning portal or the Databricks learning portal?

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 11 kudos

To learn Databricks ETL, I highly recommend the videos made by Simon on this channel: https://www.youtube.com/@AdvancingAnalytics

3 More Replies
