- 3984 Views
- 7 replies
- 12 kudos
From Associate to Professional: My Learning Plan to ace all Databricks Data Engineer Certifications
In today’s data-driven world, the role of a data engineer is critical in designing and maintaining the infrastructure that allows for the efficient collection, storage, and analysis of large volumes of data. Databricks certifications hold significan...
- 12 kudos
As an additional tip for those working towards both the Associate and Professional certifications, I recommend avoiding a long gap between the two exams to maintain your momentum. If possible, try to schedule them back-to-back with just a few days in...
- 739 Views
- 0 replies
- 1 kudos
Use Query Patterns to Suggest Indexes Dynamically
Hey folks, ever notice how a query that used to run super fast suddenly starts dragging? We’ve all been there. As data grows, those little inefficiencies in your SQL start showing up, and they show up hard. That’s where something cool comes in: using...
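The core idea can be sketched in a few lines: mine recent query text for the columns that keep showing up in filters and treat those as index (or, on Databricks, Z-ORDER / clustering key) candidates. The sample queries, regex, and column names below are purely illustrative, not the author's implementation:

```python
import re
from collections import Counter

# Hypothetical sample of recent query text, e.g. pulled from a query history log.
queries = [
    "SELECT * FROM orders WHERE customer_id = 42 AND status = 'OPEN'",
    "SELECT sum(total) FROM orders WHERE customer_id = 7",
    "SELECT * FROM orders WHERE order_date > '2024-01-01' AND customer_id = 99",
]

# Naive extraction of columns referenced in WHERE clauses.
filter_cols = Counter()
for q in queries:
    match = re.search(r"WHERE\s+(.*)", q, flags=re.IGNORECASE)
    if match:
        filter_cols.update(re.findall(r"([a-z_]+)\s*[=<>]", match.group(1)))

# Columns filtered on most often are the first candidates for an index / clustering key.
print(filter_cols.most_common(3))
```

On Delta tables the follow-up action would typically be choosing Z-ORDER or liquid clustering columns rather than creating a traditional index.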
- 3772 Views
- 6 replies
- 4 kudos
My Journey with Schema Management in Databricks
When I first started handling schema management in Databricks, I realized that a little bit of planning could save me a lot of headaches down the road. Here’s what I’ve learned and some simple tips that helped me manage schema changes effectively. On...
- 4 kudos
Haha, glad it made sense, Joao! Try it out, and if you run into any issues, just let me know. Always happy to help! And best friends? You got it!
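One common pattern for handling schema changes in Delta, offered as a generic sketch rather than the exact tips from the post (the table, column, and `new_batch_df` DataFrame names are made up):

```python
# `new_batch_df` is a hypothetical DataFrame holding an incoming batch that adds one new column.
(new_batch_df.write
    .format("delta")
    .mode("append")
    .option("mergeSchema", "true")      # opt in to additive schema evolution for this write only
    .saveAsTable("main.sales.orders"))  # hypothetical Unity Catalog table

# For deliberate, reviewed changes, explicit DDL keeps the table history auditable.
spark.sql("ALTER TABLE main.sales.orders ADD COLUMNS (discount_pct DOUBLE)")
```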
- 613 Views
- 0 replies
- 1 kudos
Unit Testing for Data Engineering: How to Ensure Production-Ready Data Pipelines
In today’s data-driven world, the success of any business use case relies heavily on trust in the data. This trust is built upon key pillars such as data accuracy, consistency, freshness, and overall quality. When organizations release data into prod...
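A minimal, generic pytest setup for a PySpark transformation, with invented function and column names, looks roughly like this:

```python
# test_transforms.py -- run with `pytest`; names here are illustrative only.
import pytest
from pyspark.sql import SparkSession, functions as F


@pytest.fixture(scope="session")
def spark():
    # Small local session so the tests run without a cluster.
    return SparkSession.builder.master("local[1]").appName("unit-tests").getOrCreate()


def add_revenue(df):
    # The transformation under test: revenue = price * quantity.
    return df.withColumn("revenue", F.col("price") * F.col("quantity"))


def test_add_revenue(spark):
    df = spark.createDataFrame([(2.0, 3)], ["price", "quantity"])
    result = add_revenue(df).first()
    assert result["revenue"] == 6.0
```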
- 657 Views
- 0 replies
- 0 kudos
The Future of Data Engineering: Smarter, Faster, and More Automated
Data Engineering has come a long way. From the days of manual ETL scripts to the modern world of automated, AI-driven data pipelines, the evolution has been nothing short of fascinating. As a data engineer working across various platforms, I’ve seen ...
- 439 Views
- 0 replies
- 0 kudos
Optimizing Complex, Embedded Workflows with Databricks Cluster Pool
Managing complex, embedded workflows efficiently is a key challenge for enterprise architects. As organizations scale their data ecosystems, optimizing resource allocation becomes crucial. Databricks Cluster Pools offer a strategic solution to minimi...
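The preview cuts off before the specifics; as background, a pool is created once and then referenced by job clusters so they start from warm instances. A rough sketch via the Instance Pools REST API, with placeholder host, token, node type, and sizing values:

```python
import requests

host = "https://<workspace-url>"   # placeholder workspace URL
token = "<personal-access-token>"  # placeholder PAT; prefer a secret scope in practice

pool_spec = {
    "instance_pool_name": "etl-shared-pool",      # hypothetical name
    "node_type_id": "Standard_DS3_v2",            # hypothetical node type
    "min_idle_instances": 2,                      # keep a couple of warm nodes ready
    "idle_instance_autotermination_minutes": 30,  # release idle capacity after 30 minutes
}

resp = requests.post(
    f"{host}/api/2.0/instance-pools/create",
    headers={"Authorization": f"Bearer {token}"},
    json=pool_spec,
)
resp.raise_for_status()
print(resp.json()["instance_pool_id"])  # reference this id from job cluster specs
```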
- 5404 Views
- 2 replies
- 1 kudos
Optimizing Costs in Databricks by Dynamically Choosing Cluster Sizes
Databricks is a popular unified data analytics platform known for its powerful data processing capabilities and seamless integration with Apache Spark. However, managing and optimizing costs in Databricks can be challenging, especially when it comes ...
- 1 kudos
How can this actually be used to choose a cluster pool for a Databricks workflow dynamically, that is, at run time? In other words, what can you actually do with the value of `selected_pool` other than printing it out?
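One possible answer, sketched here rather than taken from the article: feed `selected_pool` into the cluster spec of a one-time run submitted through the Jobs API. The host, token, notebook path, runtime version, and worker count below are placeholders:

```python
import requests

host = "https://<workspace-url>"    # placeholder
token = "<personal-access-token>"   # placeholder
selected_pool = "pool-1234-abcd"    # value produced by whatever selection logic you use

run_spec = {
    "run_name": "dynamic-pool-run",
    "tasks": [{
        "task_key": "etl",
        "notebook_task": {"notebook_path": "/Repos/team/etl/main"},  # hypothetical notebook
        "new_cluster": {
            "spark_version": "15.4.x-scala2.12",  # example runtime, adjust as needed
            "instance_pool_id": selected_pool,    # the dynamically chosen pool
            "num_workers": 4,
        },
    }],
}

resp = requests.post(
    f"{host}/api/2.1/jobs/runs/submit",
    headers={"Authorization": f"Bearer {token}"},
    json=run_spec,
)
resp.raise_for_status()
print(resp.json()["run_id"])
```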
- 386 Views
- 0 replies
- 1 kudos
Migrating from MySQL to Databricks: Real-time Insights That Matter
We successfully migrated a client’s MySQL databases to Databricks using a dual approach that maintained 100% data integrity while enabling real-time analytics. After struggling with batch-based updates and analytics delays, we implemented: - One-time historica...
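As an illustration of the one-time historical load step only (connection details, secret scope, and table names here are invented, and the MySQL JDBC driver must be available on the cluster):

```python
# One-time historical load: copy a MySQL table into a Delta table.
historical = (spark.read.format("jdbc")
    .option("url", "jdbc:mysql://<mysql-host>:3306/sales")            # placeholder host/database
    .option("dbtable", "orders")
    .option("user", dbutils.secrets.get("migration", "mysql-user"))   # hypothetical secret scope/keys
    .option("password", dbutils.secrets.get("migration", "mysql-pass"))
    .load())

(historical.write
    .format("delta")
    .mode("overwrite")
    .saveAsTable("bronze.mysql_orders"))  # hypothetical target table
```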
- 4170 Views
- 8 replies
- 6 kudos
Library Management via Custom Compute Policies and ADF Job Triggering
This guide is intended for those looking to install libraries on a cluster using a Custom Compute Policy and trigger Databricks jobs from an Azure Data Factory (ADF) linked service. While many users rely on init scripts for library installation, it i...
- 6 kudos
Hi @hassan2, I had the same issue and found a solution. When I created the pool, I created it as on-demand (not spot), and the policy only worked when I removed the entire "azure_attributes.spot_bid_max_price" section from the policy. Looks like "azure_attributes.spot_bi...
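For illustration, a trimmed-down policy definition along those lines might look like the following; the pool ID and other values are placeholders, not the poster's actual policy:

```python
import json

# Policy rules for clusters that draw from an on-demand pool.
policy_definition = {
    "instance_pool_id": {"type": "fixed", "value": "pool-1234-abcd"},        # hypothetical pool id
    "autotermination_minutes": {"type": "range", "maxValue": 60},
    "spark_version": {"type": "unlimited", "defaultValue": "15.4.x-scala2.12"},
    # No "azure_attributes.spot_bid_max_price" entry: per the reply above, that rule
    # applies to spot capacity and broke the policy when the pool was created as on-demand.
}

print(json.dumps(policy_definition, indent=2))  # paste into the policy UI or send via the Cluster Policies API
```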
- 2798 Views
- 1 reply
- 1 kudos
Resolved! Log Custom Transformer with Feature Engineering Client
Hi everyone, I'm building a PySpark ML Pipeline where the first stage is to fill nulls with zero. I wrote a custom class to do this since I cannot find a Transformer that will do this imputation. I am able to log this pipeline using MLflow log model ...
- 1 kudos
Hi @WarrenO, thanks for sharing that with the detailed code! I was able to reproduce the error, specifically: AttributeError: module '__main__' has no attribute 'CustomAdder', raised at File <command-1315887242804075>, line 3935: evaluator = ...
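A minimal reconstruction of such a null-to-zero transformer is sketched below (the class name and behaviour are assumptions, not the poster's exact code). The practical point behind the AttributeError above is that the class needs to live in an importable module on the cluster rather than being defined only in the notebook's `__main__`:

```python
# zero_imputer.py -- keep this in a module/wheel on the cluster, not inside the notebook,
# so that loading the logged model does not look for the class under '__main__'.
from pyspark.ml import Transformer
from pyspark.ml.util import DefaultParamsReadable, DefaultParamsWritable


class ZeroImputer(Transformer, DefaultParamsReadable, DefaultParamsWritable):
    """Hypothetical reconstruction: replaces nulls with 0 in all numeric columns."""

    def _transform(self, dataset):
        return dataset.fillna(0)
```

In the notebook, import it with `from zero_imputer import ZeroImputer` and use it as the first pipeline stage; the same module must also be available wherever the logged model is later loaded.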
- 4464 Views
- 3 replies
- 0 kudos
Error code 403 - Invalid access to Org
I am trying to make a GET /api/2.1/jobs/list call in a Notebook to get a list of all jobs in my workspace but am unable to do so due to a 403 "Invalid access to Org" error message. I am using a new PAT and the endpoint is correct. I also have workspa...
- 0 kudos
Hey did you make any progress on the error? I'm experiencing the same in my environment. Thanks!
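For reference, the request itself looks roughly like this (workspace URL and token are placeholders); one common cause of the 403 is that the URL being called does not belong to the same workspace that issued the PAT:

```python
import requests

host = "https://<workspace-url>"    # must be the same workspace that issued the PAT
token = "<personal-access-token>"

resp = requests.get(
    f"{host}/api/2.1/jobs/list",
    headers={"Authorization": f"Bearer {token}"},
    params={"limit": 25},
)
resp.raise_for_status()  # a 403 here usually points to a token/workspace mismatch or missing entitlements

for job in resp.json().get("jobs", []):
    print(job["job_id"], job["settings"]["name"])
```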
- 287 Views
- 0 replies
- 0 kudos
The Hidden Security Risks in Stored Procedure Migrations—What Databricks Exposed
Your stored procedure migration to Databricks isn't just a 'copy-paste' job - it's a security nightmare waiting to happen. We discovered our 'trusted' stored procedures had hidden access patterns that nearly compromised our entire data governance model. Here'...
- 372 Views
- 0 replies
- 1 kudos
The Hidden Pitfalls of Snowflake to Databricks Migrations
Everyone's rushing their Snowflake to Databricks migration, and they're setting themselves up for failure. After leading multiple enterprise migrations to Databricks last quarter, here's what shocked me: the technical lift isn't the hard part. It's th...
- 1238 Views
- 1 reply
- 1 kudos
📊 Simplifying CDC with Databricks Delta Live Tables & Snapshots 📊
In the world of data integration, synchronizing external relational databases (like Oracle, MySQL) with the Databricks platform can be complex, especially when Change Data Feed (CDF) streams aren’t available. Using snapshots is a powerful way to mana...
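A rough sketch of the snapshot-driven pattern is below; the table name, key column, snapshot location, and the way snapshots are discovered are all hypothetical, and the exact `apply_changes_from_snapshot` arguments should be checked against the current DLT documentation:

```python
import dlt

dlt.create_streaming_table("customers_silver")


def next_snapshot(latest_version):
    # Hypothetical: full snapshots arrive as versioned folders; return None when caught up.
    version = 1 if latest_version is None else latest_version + 1
    path = f"/Volumes/raw/crm/customer_snapshots/v{version}"  # placeholder location
    try:
        return (spark.read.format("parquet").load(path), version)
    except Exception:
        return None  # no newer snapshot yet


dlt.apply_changes_from_snapshot(
    target="customers_silver",
    snapshot_and_version=next_snapshot,
    keys=["customer_id"],
    stored_as_scd_type=2,  # keep history of changes between snapshots
)
```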
- 1 kudos
Hi Ajay, can apply changes from snapshot handle re-processing of an older snapshot? Use case: the source has delivered data on days T, T1 and T2. Consumers realise there is an error in the day T data and make a correction in the source. The source redel...
- 965 Views
- 1 reply
- 4 kudos
Consideration Before Migrating Hive Tables to Unity Catalog
Databricks recommends four methods to migrate Hive tables to Unity Catalog, each with its pros and cons. The choice of method depends on specific requirements. SYNC: A SQL command that migrates schemas or tables to Unity Catalog external tables. Howeve...
- 4 kudos
This is a great solution! The post effectively outlines the methods for migrating Hive tables to Unity Catalog while emphasizing the importance of not just performing a simple migration but transforming the data architecture into something more robus...
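For anyone trying the SYNC route mentioned in the post, it can be previewed non-destructively before committing; the catalog, schema, and table names below are placeholders:

```python
# Preview what SYNC would do without changing anything (DRY RUN), then run it for real.
spark.sql("SYNC SCHEMA main.finance FROM hive_metastore.finance DRY RUN").show(truncate=False)
spark.sql("SYNC SCHEMA main.finance FROM hive_metastore.finance").show(truncate=False)

# A single external table can also be upgraded on its own:
spark.sql("SYNC TABLE main.finance.transactions FROM hive_metastore.finance.transactions")
```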
Labels:
- ADF Linked Service (1)
- ADF Pipeline (1)
- Advanced Data Engineering (3)
- ApacheSpark (1)
- Automation (1)
- AWS (1)
- Azure databricks (1)
- Azure devops integration (1)
- AzureDatabricks (1)
- Big data (1)
- CICDForDatabricksWorkflows (1)
- Cluster (1)
- Cluster Pools (1)
- Cost Optimization Effort (1)
- custom compute policy (1)
- CustomLibrary (1)
- Data (1)
- Data Engineering (1)
- Data Mesh (1)
- Data Processing (1)
- Databricks Community (1)
- Databricks Delta Table (1)
- Databricks Demo Center (1)
- Databricks Migration (1)
- Databricks Mlflow (1)
- Databricks spark (1)
- Databricks Support (1)
- Databricks Unity Catalog (2)
- Databricks Workflows (1)
- DatabricksML (1)
- DeepLearning (1)
- Delta Lake (4)
- Delta Time Travel (1)
- Devops (1)
- DimensionTables (1)
- Dns (1)
- Dynamic (1)
- Governance (1)
- Hive metastore (1)
- Library Installation (1)
- Medallion Architecture (1)
- MSExcel (1)
- Networking (1)
- Private Link (1)
- Pyspark Code (1)
- Pyspark Databricks (1)
- Pytest (1)
- Python (1)
- Scala Code (1)
- Serverless (1)
- Spark (5)
- SparkSQL (1)
- SQL Serverless (1)
- Support Ticket (1)
- Sync (1)
- Unit Test (1)
- Unity Catalog (3)
- Unity Catlog (1)
- Workflow Jobs (1)
- Workflows (2)