Knowledge Sharing Hub
Dive into a collaborative space where members like YOU can exchange knowledge, tips, and best practices. Join the conversation today and unlock a wealth of collective wisdom to enhance your experience and drive success.

Forum Posts

SumitSingh
by Contributor
  • 2636 Views
  • 6 replies
  • 8 kudos

From Associate to Professional: My Learning Plan to ace all Databricks Data Engineer Certifications

In today’s data-driven world, the role of a data engineer is critical in designing and maintaining the infrastructure that allows for the efficient collection, storage, and analysis of large volumes of data. Databricks certifications hold significan...

Latest Reply
jem
New Contributor II

This is great! I have worked with Databricks for almost three years and have decided to pursue the Databricks Data Engineer Professional certification. This will certainly help with setting up an effective plan.

5 More Replies
Emil_Kaminski
by Contributor
  • 8549 Views
  • 3 replies
  • 4 kudos

Materials to pass Databricks Data Engineering Associate Exam

Hi Guys, I passed it some time ago, but have just recently summarized all the materials that helped me do it. Pay special attention to the GitHub repository, which contains many great exercises prepared by the Databricks team. https://youtu.be...

Latest Reply
Charles_Wuds
New Contributor

I’m now certified! Huge thanks to ds for their excellent practice questions. Their detailed questions were spot-on and so helpful. Highly satisfied!

2 More Replies
WarrenO
by New Contributor
  • 40 Views
  • 1 reply
  • 1 kudos

Resolved! Log Custom Transformer with Feature Engineering Client

Hi everyone, I'm building a PySpark ML Pipeline where the first stage is to fill nulls with zero. I wrote a custom class to do this since I cannot find a Transformer that will do this imputation. I am able to log this pipeline using MLflow log model ...
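
For context, a minimal sketch of the kind of custom Transformer being described, assuming the goal is simply to replace nulls with zero in selected columns; the class name ZeroImputer and the column values are illustrative, not the poster's actual code:

```python
# Minimal sketch, not the poster's actual code: a custom Transformer that fills
# nulls with zero in the configured columns, written so it can be persisted as
# part of a Pipeline. Names (ZeroImputer, inputCols values) are illustrative.
from pyspark.ml import Transformer
from pyspark.ml.param.shared import HasInputCols
from pyspark.ml.util import DefaultParamsReadable, DefaultParamsWritable
from pyspark.sql import DataFrame


class ZeroImputer(Transformer, HasInputCols, DefaultParamsReadable, DefaultParamsWritable):
    """Replace nulls with 0 in the given input columns."""

    def __init__(self, inputCols=None):
        super().__init__()
        if inputCols is not None:
            self._set(inputCols=inputCols)

    def _transform(self, df: DataFrame) -> DataFrame:
        return df.fillna(0, subset=self.getInputCols())
```

Note that even when the class is made readable/writable like this, a class defined directly in a notebook cell lives in the __main__ module, so loading the logged model in another process can fail to resolve it, which appears to be the AttributeError discussed in the reply below.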

Knowledge Sharing Hub
Custom Transformer
feature engineering
ML FLow
pipeline
pyspark
Latest Reply
koji_kawamura
Databricks Employee

Hi @WarrenO, thanks for sharing that with the detailed code! I was able to reproduce the error, specifically the following: AttributeError: module '__main__' has no attribute 'CustomAdder', raised from File <command-1315887242804075> at the line evaluator = ...

himoshi
by New Contributor II
  • 3468 Views
  • 3 replies
  • 0 kudos

Error code 403 - Invalid access to Org

I am trying to make a GET /api/2.1/jobs/list call in a Notebook to get a list of all jobs in my workspace but am unable to do so due to a 403 "Invalid access to Org" error message. I am using a new PAT and the endpoint is correct. I also have workspa...
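
For reference, a minimal sketch of the call being described, with placeholder host and token values; a 403 "Invalid access to Org" response is often worth checking against whether the host URL really belongs to the workspace that issued the PAT:

```python
# Sketch of the Jobs API list call from a notebook; host and token are placeholders.
# The host must be the URL of the same workspace that issued the personal access token.
import requests

host = "https://<your-workspace-instance>.cloud.databricks.com"
token = "<personal-access-token>"

resp = requests.get(
    f"{host}/api/2.1/jobs/list",
    headers={"Authorization": f"Bearer {token}"},
)
resp.raise_for_status()
print(resp.json().get("jobs", []))
```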

Latest Reply
xmad2772
New Contributor

Hey did you make any progress on the error? I'm experiencing the same in my environment. Thanks! 

2 More Replies
Ajay-Pandey
by Esteemed Contributor III
  • 922 Views
  • 1 reply
  • 1 kudos

📊 Simplifying CDC with Databricks Delta Live Tables & Snapshots 📊

In the world of data integration, synchronizing external relational databases (like Oracle, MySQL) with the Databricks platform can be complex, especially when Change Data Feed (CDF) streams aren’t available. Using snapshots is a powerful way to mana...
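
As a rough illustration of the pull-based pattern the post covers, here is a sketch using the DLT Python API's apply_changes_from_snapshot; the table name, keys, and snapshot-loading logic are placeholders, and the exact signature should be checked against the current DLT documentation:

```python
# Rough sketch of snapshot-based CDC in Delta Live Tables; names and paths are placeholders.
import dlt

dlt.create_streaming_table("customers_scd")

def next_snapshot_and_version(latest_version):
    # Return (snapshot DataFrame, increasing version number), or None when there is
    # no newer snapshot to process.
    next_version = 1 if latest_version is None else latest_version + 1
    path = f"/mnt/source/customers/snapshot_v{next_version}"  # hypothetical landing location
    try:
        return (spark.read.parquet(path), next_version)
    except Exception:
        return None

dlt.apply_changes_from_snapshot(
    target="customers_scd",
    snapshot_and_version=next_snapshot_and_version,
    keys=["customer_id"],
    stored_as_scd_type=2,
)
```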

Latest Reply
BilalHaniff1
New Contributor II

Hi Ajay, can apply changes into snapshot handle re-processing of an older snapshot? Use case:
- The source has delivered data on day T, T1 and T2.
- Consumers realise there is an error in the day T data, and make a correction in the source. The source redel...

ChsAIkrishna
by New Contributor III
  • 683 Views
  • 1 reply
  • 2 kudos

Consideration Before Migrating Hive Tables to Unity Catalog

Databricks recommends four methods to migrate Hive tables to Unity Catalog, each with its pros and cons. The choice of method depends on specific requirements. SYNC: A SQL command that migrates schema or tables to Unity Catalog external tables. Howeve...
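
For the SYNC option specifically, a small sketch of how it is typically run from a notebook; the catalog and schema names below are placeholders:

```python
# Sketch of the SYNC command mentioned above; catalog/schema names are placeholders.
# DRY RUN previews which Hive tables would be upgraded to Unity Catalog external tables.
spark.sql("SYNC SCHEMA main.sales FROM hive_metastore.sales DRY RUN").show(truncate=False)

# Once the preview looks right, run it without DRY RUN to create the external tables
# in Unity Catalog pointing at the existing table locations.
spark.sql("SYNC SCHEMA main.sales FROM hive_metastore.sales")
```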

Latest Reply
Mantsama4
Contributor III

This is a great solution! The post effectively outlines the methods for migrating Hive tables to Unity Catalog while emphasizing the importance of not just performing a simple migration but transforming the data architecture into something more robus...

MichTalebzadeh
by Valued Contributor
  • 2347 Views
  • 3 replies
  • 3 kudos

Resolved! Feature Engineering for Data Engineers: Building Blocks for ML Success

For a UK Government Agency, I made a comprehensive presentation titled "Feature Engineering for Data Engineers: Building Blocks for ML Success". I wrote an article about it on LinkedIn together with the relevant GitHub code. In summary, the code delve...

Knowledge Sharing Hub
feature engineering
ML
python
Latest Reply
Mantsama4
Contributor III

This is a fantastic post! The detailed explanation of feature engineering, from handling missing values to using Variational Autoencoders (VAEs) for synthetic data generation, provides invaluable insights for improving machine learning models. The ap...

2 More Replies
Harun
by Honored Contributor
  • 5310 Views
  • 3 replies
  • 5 kudos

Comprehensive Guide to Databricks Optimization: Z-Order, Data Compaction, and Liquid Clustering

Optimizing data storage and access is crucial for enhancing the performance of data processing systems. In Databricks, several optimization techniques can significantly improve query performance and reduce costs: Z-Order Optimize, Optimize Compaction...
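
As a quick illustration of the commands behind those techniques (table and column names are placeholders; Z-Ordering and liquid clustering are alternatives, so a given table would use one or the other):

```python
# Compaction plus Z-Ordering: rewrite small files and co-locate rows by columns
# that are frequently filtered or joined on.
spark.sql("OPTIMIZE sales.events ZORDER BY (event_date, customer_id)")

# Liquid clustering: declare clustering keys on the table, then let OPTIMIZE
# maintain the layout incrementally.
spark.sql("ALTER TABLE sales.orders CLUSTER BY (order_date, customer_id)")
spark.sql("OPTIMIZE sales.orders")
```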

Latest Reply
Mantsama4
Contributor III

I also have the same question!

2 More Replies
Mantsama4
by Contributor III
  • 219 Views
  • 0 replies
  • 0 kudos

Rebuilding and Re-Platforming Your Databricks Lakehouse with Serverless Compute

Dear Databricks Community, in today’s fast-paced data landscape, managing infrastructure manually can slow down innovation, increase costs, and limit scalability. Databricks Serverless Compute solves these challenges by eliminating infrastructure over...

nk25
by New Contributor II
  • 774 Views
  • 3 replies
  • 0 kudos

Getting data from Databricks into Excel using Databricks Jobs API

If you have your data in Databricks but would like to analyse it in Excel, you can use the Web API connector in Power Query. It allows you not just to query an existing table, but also to trigger the execution of a PySpark notebook using the Databricks Jobs API and get the d...
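
The post drives this from Power Query's Web API connector; as an illustration of the same Jobs API sequence, here is a sketch in Python with placeholder host, token, and job_id values:

```python
# Sketch of the Jobs API flow described above; host, token and job_id are placeholders.
import time
import requests

host = "https://<your-workspace-instance>.cloud.databricks.com"
headers = {"Authorization": "Bearer <personal-access-token>"}

# 1. Trigger the job that runs the PySpark notebook.
run_id = requests.post(f"{host}/api/2.1/jobs/run-now", headers=headers,
                       json={"job_id": 123}).json()["run_id"]

# 2. Poll until the run finishes.
while True:
    run_info = requests.get(f"{host}/api/2.1/jobs/runs/get", headers=headers,
                            params={"run_id": run_id}).json()
    if run_info["state"]["life_cycle_state"] in ("TERMINATED", "SKIPPED", "INTERNAL_ERROR"):
        break
    time.sleep(10)

# 3. Read what the notebook returned via dbutils.notebook.exit(...);
#    runs/get-output expects the task-level run_id.
task_run_id = run_info["tasks"][0]["run_id"]
output = requests.get(f"{host}/api/2.1/jobs/runs/get-output", headers=headers,
                      params={"run_id": task_run_id}).json()
print(output.get("notebook_output", {}).get("result"))
```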

Latest Reply
NandiniN
Databricks Employee

Got it, yes you have specified the same in your message. Thanks for sharing.

2 More Replies

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group