Community Discussions
Connect with fellow community members to discuss general topics related to the Databricks platform, industry trends, and best practices. Share experiences, ask questions, and foster collaboration within the community.

Forum Posts

rameshkumar610
by New Contributor
  • 262 Views
  • 0 replies
  • 0 kudos

S360 Eliminate SPN secrets - Connect Azure Databricks to ADLS Gen2, Gen1 via custom AD token

Hi Team, in Azure Databricks we currently use a Service Principal when creating mount points to Azure storage (ADLS Gen1, ADLS Gen2, and Azure Blob Storage). As part of an S360 action to eliminate SPN secrets, we were asked to move to SPN+certificate / MS...

derek_s
by New Contributor
  • 171 Views
  • 0 replies
  • 0 kudos

Field mapping

What’s a good way to map different datasets all to a standard set of variables? For example, table 1 has a ‘user_number’ field, and table 2 has the same field but labeled ‘user_id’. They are both the same, and I want to plug both into a...
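One possible approach to the question above, offered as a minimal sketch rather than a definitive answer: keep a single alias dictionary that maps every source-specific column name to its standard name, and apply it to each dataset before combining them. Only `user_number` and `user_id` come from the post; the `country` column, the sample rows, and the names `COLUMN_ALIASES` and `standardize` are hypothetical.

```python
# Hypothetical alias table: source-specific column name -> standard name.
# Only 'user_number' and 'user_id' appear in the question; the rest is illustrative.
COLUMN_ALIASES = {
    "user_number": "user_id",
    "user_id": "user_id",
}


def standardize(row, aliases=COLUMN_ALIASES):
    """Return a copy of `row` with known column names renamed to the standard form.

    Columns that are not listed in `aliases` keep their original name.
    """
    return {aliases.get(name, name): value for name, value in row.items()}


# Illustrative rows standing in for the two tables from the question.
table_1 = [{"user_number": 101, "country": "US"}]
table_2 = [{"user_id": 202, "country": "DE"}]

# After standardizing, both tables share one schema and can be combined.
combined = [standardize(row) for row in table_1 + table_2]
```

Keeping the mapping in one dictionary means a new source only needs a new alias entry. On Spark DataFrames the same dictionary could drive a loop of `withColumnRenamed` calls, assuming the tables are loaded as DataFrames.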

Yogic24
by New Contributor III
  • 450 Views
  • 2 replies
  • 1 kudos

Resolved! Regarding Certification renewal process

Hi team, can anyone guide me through the certification renewal process?

Latest Reply
daniel_sahal
Esteemed Contributor
  • 1 kudos

@Yogic24 It's in the certification FAQ: https://www.databricks.com/learn/certification/faq#certifications To recertify, you will need to take the full current live exam.

1 More Replies
mh-hsn
by New Contributor III
  • 241 Views
  • 0 replies
  • 0 kudos

Inconsistent behavior while loading pickle file

I have a pickle file "vectorizer.pkl" and I am currently facing inconsistent behavior when trying to load that file. Sometimes it loads successfully and sometimes I get an error. Here is how I am trying to load the file: from joblib import l...

Community Discussions
datascience
machine learning
pickle
python
mh-hsn
by New Contributor III
  • 753 Views
  • 5 replies
  • 8 kudos

Resolved! python multiprocessing hangs at map on one cluster but works fine on another

I have a simple Python script which has been running fine on my cluster, but recently the same script gets stuck at map. So I tried creating a new cluster with fewer resources and ran the same script on it, and it ran just fine. Here are th...

Community Discussions
datascience
machine learning
MAP
multiprocessing
python
Latest Reply
jacovangelder
Contributor III
  • 8 kudos

I agree with @raphaelblg. Most likely you're running out of memory. Multiprocessing and thread pools unfortunately do not benefit from extra workers, as they only run on your driver node. This is very annoying and not a widely known fact. Spark driver als...

4 More Replies
nileshtiwaari
by New Contributor
  • 629 Views
  • 1 replies
  • 1 kudos

Streaming Query

How can I remove duplicates in a streaming query on the basis of some id?

Latest Reply
daniel_sahal
Esteemed Contributor
  • 1 kudos

@nileshtiwaari Are you referring to Structured Streaming or DLT? In case of Structured Streaming: https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#streaming-deduplication About DLT, here's a thread from a couple of months...
