Community Platform Discussions
Connect with fellow community members to discuss general topics related to the Databricks platform, industry trends, and best practices. Share experiences, ask questions, and foster collaboration within the community.

Forum Posts

DaniP
by New Contributor II
  • 321 Views
  • 0 replies
  • 1 kudos

Wrote an article about implementing real-time fraud detection with Databricks!

Hey y'all! I've just started to dabble with Databricks recently and decided a fraud-detection pipeline would be a cool project to implement. Let me know what y'all think about the article. I'd also love more small-scale project ideas I could work ...

Yogic24
by Contributor III
  • 902 Views
  • 2 replies
  • 1 kudos

Resolved! Regarding Certification renewal process

Hi team, can anyone guide me through the certification renewal process?

Latest Reply
daniel_sahal
Esteemed Contributor
  • 1 kudos

@Yogic24 It's in the certification FAQ: https://www.databricks.com/learn/certification/faq#certifications To recertify, you will need to take the full current live exam.

1 More Replies
mh-hsn
by New Contributor III
  • 1047 Views
  • 0 replies
  • 0 kudos

Inconsistent behavior while loading pickle file

I have a pickle file "vectorizer.pkl" and I am seeing inconsistent behavior when trying to load it. Sometimes it loads successfully and sometimes I get an error. Here is how I am trying to load the file: from joblib import l...

datascience
machine learning
pickle
python
mh-hsn
by New Contributor III
  • 5569 Views
  • 5 replies
  • 8 kudos

python multiprocessing hangs at map on one cluster but works fine on another

I have a simple Python script that has been running fine on my cluster, but recently the same script gets stuck at map. So I created a new cluster with fewer resources, ran the same script there, and it ran just fine. Here are th...

datascience
machine learning
MAP
multiprocessing
python
Latest Reply
jacovangelder
Honored Contributor
  • 8 kudos

I agree with @raphaelblg. Most likely you're running out of memory. Multiprocessing and thread pools unfortunately do not benefit from extra workers, as they only run on your driver node. This is very annoying and not a well-known fact. Spark driver als...

4 More Replies
nileshtiwaari
by New Contributor
  • 1014 Views
  • 1 reply
  • 1 kudos

Streaming Query

How to remove duplicates in streaming query on the basis of some id?

Latest Reply
daniel_sahal
Esteemed Contributor
  • 1 kudos

@nileshtiwaari Are you referring to Structured Streaming or DLT? In case of Structured Streaming: https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#streaming-deduplication About DLT, here's a thread from a couple of months...
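
For readers who want the idea behind id-based streaming deduplication before opening the guide, here is a minimal plain-Python sketch. It is a conceptual model only, not Spark's implementation: in Structured Streaming the corresponding DataFrame calls are `withWatermark` followed by `dropDuplicates`. The function name `dedupe_by_id`, the `(id, event_time)` tuple layout, and the `max_delay` parameter are assumptions made for illustration.

```python
from typing import Dict, Iterable, Iterator, Optional, Tuple

# Hypothetical event shape for this sketch: (id, event time in seconds).
Event = Tuple[str, int]


def dedupe_by_id(events: Iterable[Event], max_delay: int) -> Iterator[Event]:
    """Yield the first event seen for each id, keeping only bounded state.

    The watermark is the largest event time seen so far minus max_delay.
    Ids whose first event is older than the watermark are forgotten, and
    events older than the watermark are dropped as too late.
    """
    seen: Dict[str, int] = {}  # id -> event time of the first occurrence
    max_event_time: Optional[int] = None

    for event_id, event_time in events:
        if max_event_time is None or event_time > max_event_time:
            max_event_time = event_time
        watermark = max_event_time - max_delay

        # Evict state that has fallen behind the watermark.
        for stale_id in [k for k, t in seen.items() if t < watermark]:
            del seen[stale_id]

        if event_time < watermark:
            continue  # too late to be considered at all
        if event_id in seen:
            continue  # duplicate within the retained window
        seen[event_id] = event_time
        yield (event_id, event_time)
```

The design trade-off this illustrates: without a time bound the set of seen ids grows forever, while with a bound a duplicate that arrives after its id has been evicted is emitted again.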

himoshi
by New Contributor II
  • 499 Views
  • 0 replies
  • 0 kudos

Clarification on overwriting in Unity Catalog

Hello, while reviewing Unity Catalog to better understand its limitations, I came across the following statement: Overwrite mode for DataFrame write operations into Unity Catalog is supported only for Delta tables, not for other file formats. The user...

Cloud_Architect
by New Contributor III
  • 411 Views
  • 1 reply
  • 0 kudos

Need help calculating the cost benefits of switching from interactive to job cluster

I need help calculating the cost benefits of switching from interactive to job cluster. Can you help me get some formulas on how to calculate the cost differences in Databricks?

Latest Reply
jacovangelder
Honored Contributor
  • 0 kudos

Assuming you're on Azure (otherwise use the AWS/GCP equivalent), did you try the Azure cost calculator? https://azure.microsoft.com/en-us/pricing/details/databricks/ Questions to ask yourself to get more specific: Do you have an idea how many DBUs you...
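
As a rough starting point for the formula the question asks about, the sketch below models cluster cost as hours multiplied by the sum of DBU charges and infrastructure charges. The function names and every number in the usage note are made-up placeholders, not real Databricks or Azure prices; actual DBU rates depend on cloud, tier, and workload type, so take them from the pricing page or calculator.

```python
def compute_cost(hours: float, dbu_per_hour: float, price_per_dbu: float,
                 infra_per_hour: float = 0.0) -> float:
    """Cost = hours * (DBUs consumed per hour * price per DBU + VM cost per hour)."""
    return hours * (dbu_per_hour * price_per_dbu + infra_per_hour)


def savings(hours_interactive: float, hours_job: float, dbu_per_hour: float,
            interactive_price: float, job_price: float,
            infra_per_hour: float = 0.0) -> float:
    """Difference between running on an interactive cluster and a job cluster.

    Hours are separate inputs because an interactive cluster may sit idle
    before auto-termination, while a job cluster stops when the job ends.
    """
    interactive = compute_cost(hours_interactive, dbu_per_hour,
                               interactive_price, infra_per_hour)
    job = compute_cost(hours_job, dbu_per_hour, job_price, infra_per_hour)
    return interactive - job
```

With placeholder inputs of 100 hours on each cluster type, 4 DBUs per hour, 0.5 per DBU for interactive, 0.25 per DBU for jobs, and 1.0 per hour of infrastructure, `savings(100, 100, 4, 0.5, 0.25, 1.0)` evaluates to 100.0.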

MadCowTM
by New Contributor II
  • 1660 Views
  • 1 reply
  • 2 kudos

Resolved! get_json_object and json path filtering

I have the following string [{"key":"abc","value":{"string_value":"abc123"}},{"key":"def","value":{"int_value":123}},{"key":"ghi","value":{"string_value":"ghi456"}}] and from that string I need to extract key.value.string_value for key with the value equ...

Latest Reply
brickster_2018
Databricks Employee
  • 2 kudos

Can you try the code snippet below? WITH exploded_json AS ( SELECT explode(from_json( '[{"key":"abc","value":{"string_value":"abc123"}},{"key":"def","value":{"int_value":123}},{"key":"ghi","value":{"string_value":"ghi456"}}]', 'array<s...
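
To make the intended result concrete outside of SQL, here is a small plain-Python sketch of the same filter applied to the string from the question. It is not the Spark SQL approach in the reply, only an illustration of the logic: parse the array, find the entry with the wanted key, and read `value.string_value`. The helper name `string_value_for_key` is hypothetical, and since the question's target key is cut off, the key is left as a parameter.

```python
import json
from typing import Optional

# The JSON array quoted in the question.
RAW = (
    '[{"key":"abc","value":{"string_value":"abc123"}},'
    '{"key":"def","value":{"int_value":123}},'
    '{"key":"ghi","value":{"string_value":"ghi456"}}]'
)


def string_value_for_key(raw: str, target_key: str) -> Optional[str]:
    """Return value.string_value of the first entry whose key matches.

    Returns None when the key is absent or the entry has no string_value
    (for example, an entry that only carries int_value).
    """
    for entry in json.loads(raw):
        if entry.get("key") == target_key:
            return entry.get("value", {}).get("string_value")
    return None
```

For example, `string_value_for_key(RAW, "ghi")` returns `"ghi456"`, while `string_value_for_key(RAW, "def")` returns `None` because that entry holds only an `int_value`.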

unity_Catalog
by New Contributor III
  • 613 Views
  • 0 replies
  • 0 kudos

UCX installation error

I am getting the error below while installing UCX, even though the installation completes in the workspace. I have admin privileges on the workspace. The error suggests checking the token or URL of the workspace, but both are provided correctly. So why is the error below sho...

