Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

antonyj453
by New Contributor II
  • 4841 Views
  • 1 reply
  • 3 kudos

How to extract a JSON object from a PySpark DataFrame. I was able to extract data from another column in array format using the "explode" function, but explode is not working for the object type. It returns a type-mismatch error.

I have tried the code below to extract data from the array column: df2 = df_deidentifieddocuments_tst.select(F.explode('annotationId').alias('annotationId')).select('annotationId.$oid') It worked fine, but it is not working for the JSON object type. Below is colu...

Latest Reply
UmaMahesh1
Honored Contributor III
  • 3 kudos

Did you try extracting that column's data using the from_json function?

gpzz
by New Contributor III
  • 3567 Views
  • 1 reply
  • 3 kudos

PySpark code error

rdd4 = rdd3.reducByKey(lambda x, y: x + y)
AttributeError: 'PipelinedRDD' object has no attribute 'reducByKey'
Please help me out with this.

Latest Reply
UmaMahesh1
Honored Contributor III
  • 3 kudos

Is it a typo, or are you really using reducByKey instead of reduceByKey?

Axserv
by New Contributor II
  • 4584 Views
  • 4 replies
  • 1 kudos

How do I "Earn 100 points to the Databricks Community Rewards Store" ? (As advertised on Databricks Academy)

Hello, how do I join the Databricks Community study group for 100 points, as advertised on the Databricks Academy website?

Latest Reply
Harun
Honored Contributor
  • 1 kudos

@Alex Serlovsky You need to earn the Lakehouse Fundamentals credential; then you can join this community group. Within 24 to 48 hours you will get 100 reward points. But as per Databricks, you need to earn the credential on or before Nov...

3 More Replies
Dave_Nithio
by Contributor II
  • 2367 Views
  • 0 replies
  • 1 kudos

Natively Query Delta Lake with R

I have a large delta table that I need to analyze in native R. The only option I have currently is to query the delta table then use collect() to bring that spark dataframe into an R dataframe. Is there an alternative method that would allow me to qu...

lawrence009
by Contributor
  • 4590 Views
  • 4 replies
  • 4 kudos

Cannot CREATE TABLE with 'No Isolation Shared' cluster

Recently I ran into a number of issues running our notebooks in Interactive Mode. For example, we can't create a (delta) table. The command runs and then idles with no apparent exception. The path is created on AWS S3 but the delta log is never create...

Latest Reply
youssefmrini
Databricks Employee
  • 4 kudos

The admin can disable the option to use the No Isolation Shared cluster. I recommend switching to Single User mode with Unity Catalog (UC) activated; don't worry, you won't need to change your code. If you encounter this kind of issue, make sure to open a tick...

3 More Replies
Hunter
by New Contributor III
  • 24375 Views
  • 7 replies
  • 6 kudos

Resolved! How to programmatically download png files from matplotlib plots in notebook?

I am creating plots in Databricks using Python and matplotlib. These look great in the notebook, and I can save them to DBFS using plt.savefig("/dbfs/FileStore/tables/[plot_name].png"). I can then download the png files to my computer individually by pas...

Latest Reply
Hunter
New Contributor III
  • 6 kudos

Thanks everyone! I am already at a place where I can download a png to FileStore and use a URL to download that file locally. What I was wondering was if there is some Databricks function I can use to launch the URL that references the png file and d...

6 More Replies
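One pattern that skips the FileStore URL round-trip entirely is to embed the PNG bytes in a data-URI download link and render it with the notebook's displayHTML. This is a sketch, not an official Databricks download API; the plot data is made up:

```python
import base64
import io

import matplotlib
matplotlib.use("Agg")  # headless backend so this also runs outside a notebook
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([1, 2, 3], [1, 4, 9])

# Capture the PNG in memory instead of writing it under /dbfs
buf = io.BytesIO()
fig.savefig(buf, format="png")
plt.close(fig)

# Build a clickable download link carrying the image as a base64 data URI
b64 = base64.b64encode(buf.getvalue()).decode()
html = f'<a download="plot.png" href="data:image/png;base64,{b64}">Download plot.png</a>'
# In a Databricks notebook, render it with: displayHTML(html)
```

Clicking the rendered link prompts the browser to save the image, with no intermediate file on DBFS.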
successhawk
by New Contributor II
  • 3508 Views
  • 3 replies
  • 2 kudos

Resolved! Is there a way to tell if a created job is not compliant against configured cluster policies before it runs?

As a DevOps engineer, I want to enforce cluster policies at deployment time when the job is deployed/created, well before it is time to actually use it (i.e. before its scheduled/triggered run time without actually running it).

Latest Reply
irfanaziz
Contributor II
  • 2 kudos

Is it not the linked service that defines the kind of cluster created or used for any job? So I believe you could control the configuration via the linked service settings.

2 More Replies
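Since cluster policies are JSON rule maps, one option is an offline pre-deployment check that compares a job's new_cluster spec against the policy before creating the job. The sketch below only handles "fixed" rules (real policies also support ranges, allowlists, defaults, etc.), and the attribute values are illustrative:

```python
def policy_violations(policy: dict, cluster_spec: dict) -> list:
    """Return a description of every 'fixed' policy rule the spec breaks."""
    problems = []
    for attr, rule in policy.items():
        if rule.get("type") == "fixed" and cluster_spec.get(attr) != rule["value"]:
            problems.append(
                f"{attr}: policy fixes {rule['value']!r}, job has {cluster_spec.get(attr)!r}"
            )
    return problems

# Hypothetical policy and job cluster spec
policy = {
    "spark_version": {"type": "fixed", "value": "11.3.x-scala2.12"},
    "autotermination_minutes": {"type": "fixed", "value": 30},
}
job_cluster = {"spark_version": "11.3.x-scala2.12", "autotermination_minutes": 120}
print(policy_violations(policy, job_cluster))
```

Run in a CI step, a non-empty result can fail the deployment long before the job's scheduled run time.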
labtech
by Valued Contributor II
  • 2825 Views
  • 3 replies
  • 20 kudos

Resolved! Create Databricks Workspace with different email address on Azure

Hi team, I wonder if we can create a Databricks workspace that is not related to an Azure email address. Thanks

Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 20 kudos

Yes, I have done this multiple times.

2 More Replies
labtech
by Valued Contributor II
  • 2717 Views
  • 3 replies
  • 14 kudos

Get a new badge or certificate for version 3 of the DE exam

I took the DE certification exam (version 2). Do I receive a new badge or certificate when I pass the newest version of the DE exam? I'm going to take it and review my knowledge.

Latest Reply
Ajay-Pandey
Databricks MVP
  • 14 kudos

Hi @Gam Nguyen, I think there is no new badge for this one.

2 More Replies
cmilligan
by Contributor II
  • 1441 Views
  • 0 replies
  • 1 kudos

Fail a multi-task job successfully

I have a multi-task job that runs every day, where the first notebook in the job checks whether the run should be continued based on the date the job is run. The majority of the time the answer is no, and I'm raising an exception for the job to ...

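One common workaround is to have the gating notebook exit cleanly via dbutils.notebook.exit, which marks the task (and the run) as succeeded, instead of raising an exception that reports a failure. The date condition below is hypothetical, and only the gate function itself runs outside Databricks:

```python
import datetime

def should_run(run_date: datetime.date) -> bool:
    """Hypothetical gate: only continue the job on Mondays."""
    return run_date.weekday() == 0

run_date = datetime.date(2023, 1, 2)  # a Monday
if not should_run(run_date):
    # In the Databricks notebook, exit cleanly so the task shows as
    # succeeded rather than failed:
    # dbutils.notebook.exit("skipped: nothing to do for this date")
    pass
print(should_run(run_date))
```

Downstream tasks still run in this pattern, so they must also tolerate the "nothing to do" case, or the gate's exit value can be inspected by the next task.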
Harun
by Honored Contributor
  • 2294 Views
  • 1 reply
  • 1 kudos

Hi Community members and Databricks officials, nowadays I am seeing a lot of spam posts in our groups and discussions. Forum admins and Databricks offi...

Hi Community members and Databricks officials, nowadays I am seeing a lot of spam posts in our groups and discussions. Forum admins and Databricks officials, please take action on the users who are spamming the timeline with promotional content. As...

Latest Reply
Ajay-Pandey
Databricks MVP
  • 1 kudos

Yes, @Databricks Forum Admin, please take action on this.

DB_developer
by New Contributor III
  • 10814 Views
  • 2 replies
  • 3 kudos

How to optimize storage for sparse data in data lake?

I have a lot of tables with 80% of the columns filled with nulls. I understand SQL Server provides a way to handle this kind of data in the table definition (with the SPARSE keyword). Do data lakes provide something similar?

Latest Reply
-werners-
Esteemed Contributor III
  • 3 kudos

The data lake itself does not, but the file format you use to store the data does. For example, Parquet uses column compression, so sparse data will compress pretty well. CSV, on the other hand: total disaster.

1 More Replies
grazie
by Contributor
  • 12189 Views
  • 5 replies
  • 4 kudos

Resolved! Can we use "Access Connector for Azure Databricks" to access Azure Key Vault?

We have a scenario where ideally we'd like to use Managed Identities to access storage but also secrets. For now we have a setup with service principals accessing secrets through secret scopes, but we foresee a situation where we may get many service...

Latest Reply
grive
New Contributor III
  • 4 kudos

I have unofficial word that this is not supported, and docs don't mention it. I have the feeling that even if I got it to work it should not be trusted for now.

4 More Replies
andrew0117
by Contributor
  • 9142 Views
  • 4 replies
  • 4 kudos

Resolved! Will I be charged by Databricks if I leave the cluster on but not running?

Or does Databricks only charge you when you are actually running the cluster, no matter how long you keep the cluster idle? Thanks!

Latest Reply
labtech
Valued Contributor II
  • 4 kudos

If you do not configure your cluster to auto-terminate after a period of idle time, then yes, you will be charged for that.

3 More Replies
seberino
by New Contributor III
  • 5388 Views
  • 1 reply
  • 1 kudos

Resolved! Why is the Revoke button initially greyed out in the Permissions tab of a table in Data Explorer (of the SQL Workspace)?

The goal is to revoke SELECT permissions from a user for a table in Data Explorer in the SQL Workspace. I've tried navigating to the Permissions tab of the table in Data Explorer. Initially the Revoke button is greyed out and only the Grant button is ...

Latest Reply
Ajay-Pandey
Databricks MVP
  • 1 kudos

Hi @Christian Seberino, connect with Databricks support for this. You can also raise a support request for the same.
