Community Articles

by data_turtle • New Contributor

04-13-2024 12:24:51 PM

1510 Views
0 replies
0 kudos

Feedback request for Gradient, a tool to help optimize and monitor jobs automatically

Hi Everyone, We built Gradient, a tool to automatically optimize and monitor Databricks jobs to hit your business objectives of cost or runtime. Gradient works by applying a reinforcement ML model to automatically learn and custom tune your jobs cluste...

by data_turtle • New Contributor

04-13-2024 12:16:32 PM

1162 Views
0 replies
0 kudos

Understand why your jobs' performances are changing over time

Hi Folks - We released a new metrics view for Databricks jobs in Gradient, which tracks and plots the metrics below over time to help engineers understand what's going on with their jobs: job cost (DBU + cloud fees), job runtime, number of co...

by Danny_Lee • Databricks Partner

04-07-2024 6:22:37 AM

3770 Views
1 reply
1 kudos

Jonathan Frankel at Sigma talk

Hi @Sujitha, just following up on your suggestion to share my takeaways from Jonathan Frankel's talk at Sigma in NYC. The key ideas I came away with are: Building in-house custom models is more than just possible; there are advantages to it. There's danger...

AI

ML

Latest Reply

Sujitha
Databricks Employee

04-07-2024 9:05:58 PM

1 kudos

@Danny_Lee This is super insightful! Really appreciate your time to share your key takeaways with us.

by Danny_Lee • Databricks Partner

03-31-2024 6:47:52 PM

2196 Views
0 replies
1 kudos

Databricks AI Security Framework

Today Databricks announced the release of the Databricks AI Security Framework (LinkedIn post). You can download the paper (PDF) from the blog post. Has anyone else downloaded this and have thoughts? My first thought is it's a great start and has an excellent G...

by avrm91 • Databricks Partner

03-26-2024 3:06:59 AM

1871 Views
0 replies
0 kudos

GCP - Initial External Location to GCP Bucket is wrong

When creating a new workspace in GCP, the default GCP external location is wrong. It's easily fixed: go to Catalog (on the left) > External Data (on the bottom) > External Locations, choose the connection, and edit the URL by deleting the second BucketId af...

by Hubert-Dudek • Databricks MVP

03-26-2024 2:35:49 AM

2092 Views
0 replies
0 kudos

Predictive optimization log

After you enable predictive optimization, it is good to look at the system table and see what is going on with your tables #databricks

by MichTalebzadeh • Valued Contributor

03-22-2024 9:45:27 AM

2274 Views
0 replies
0 kudos

Feature article: Leveraging Generative AI with Apache Spark: Transforming Data Engineering

I created this article on LinkedIn to give both this community and the Apache Spark user community access to it. It is particularly useful for data engineers who want a basic understanding of what Generative AI with Spark can do. Leverag...

Generative AI

spark

by Hubert-Dudek • Databricks MVP

03-14-2024 5:23:06 AM

4400 Views
1 reply
3 kudos

DBR 15.0 beta

Databricks Runtime 15 is out there! Some breaking changes. More info here: https://docs.databricks.com/en/release-notes/runtime/15.0.html

Latest Reply

jose_gonzalez
Databricks Employee

03-20-2024 2:18:34 PM

3 kudos

Thanks for sharing this information @Hubert-Dudek!!!

by Hubert-Dudek • Databricks MVP

03-16-2024 11:24:41 AM

4198 Views
1 reply
1 kudos

Notebook IDE

This is an excellent step for #databricks notebooks. The integrated debugger and the CLI in the notebook terminal are a big step towards a fully functional cloud IDE.

Latest Reply

jose_gonzalez
Databricks Employee

03-20-2024 2:17:35 PM

1 kudos

Thank you for sharing this @Hubert-Dudek!!!

by MichTalebzadeh • Valued Contributor

03-19-2024 4:40:39 AM

8106 Views
2 replies
0 kudos

Build a machine learning model to detect fraudulent transactions using PySpark's MLlib library

Introduction: Financial fraud is a significant concern for businesses and consumers alike. I have written about this concern a few times in LinkedIn articles. Machine learning offers powerful tools to combat this issue by automatically identifying sus...

Financial Fraud

PySpark MLlib

spark

Latest Reply

deborah621
New Contributor II

03-20-2024 2:42:27 AM

0 kudos

Looking to build a machine learning model for detecting fraudulent transactions using PySpark’s MLlib. Generate synthetic transaction data. Provides a dataset for model training without using sensitive real-world data. Enables the creation of diverse...

by alexgv12 • New Contributor III

03-13-2024 9:33:41 AM

2495 Views
1 reply
2 kudos

is it possible to have a class level separation in databricks or implement a design pattern in datab

If you have thought about making your code inside Databricks and notebooks more reusable and organized, and about implementing a design pattern or class-level separation in Databricks, the answer is yes. I am going to tell you the deta...

Latest Reply

-werners-
Esteemed Contributor III

03-20-2024 1:14:43 AM

2 kudos

Thanks! I have spent quite some time figuring out what the best way is. Your approach is certainly a valid one. Myself, I prefer to package reused classes in a JAR (we mainly code in Scala). Works fine too.

by MichTalebzadeh • Valued Contributor

03-06-2024 1:09:33 PM

7346 Views
1 reply
1 kudos

Building Event-Driven Real-Time Data Processor with Spark Structured Streaming and API Integration

I recently saw an article from Databricks titled "Scalable Spark Structured Streaming for REST API Destinations". A great article focusing on continuous Spark Structured Streaming (SSS). About a year old. I then decided, given customer demands to wo...

Event-driven architecture

Flask

spark

Spark Structure Streaming

Spark Structured Streaming

by Hubert-Dudek • Databricks MVP

02-22-2024 9:07:20 AM

2919 Views
0 replies
0 kudos

stored procedures

The plan for stored procedures in Databricks Spark has been announced in a few places. What might stored procedures look like in Spark SQL?

spark

by hanlinsun • Databricks Employee

02-07-2024 5:06:10 PM

1150 Views
0 replies
0 kudos

Redesigned Move File & Clone File Experiences

Hi everyone! We are redesigning the Move File and Clone File experiences. We want to make it as seamless as possible to organize your files, and would love your feedback on the designs! Move File: Move Option 1 Move Option 2: Clone File: Cl...

by Hubert-Dudek • Databricks MVP

07-20-2023 6:41:29 AM

3214 Views
1 reply
2 kudos

liquid partitioning

Based on my experience with data partitioning, it often diminishes performance rather than enhancing it. There are exceptions, like when handling tables over 1TB, or when EVERY single query utilizes partition in the WHERE clause - for instance, a Pow...

optimize

Partitions

Latest Reply

jose_gonzalez
Databricks Employee

01-10-2024 11:59:28 AM

2 kudos

Thank you for sharing this @Hubert-Dudek !!
