How can I build an AI system on Databricks where the model doesn’t just rely on its built-in knowledge, but also retrieves real information from a database or documents before answering (i.e., retrieval-augmented generation, RAG)?
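A minimal RAG sketch, assuming you already have a Databricks Vector Search index built over your documents; the endpoint, index, and served-model names below are placeholders, not real resources:

```python
# Retrieve-then-generate sketch on Databricks. Assumes a Vector Search
# index already exists; all endpoint/index/model names are hypothetical.
import os
from databricks.vector_search.client import VectorSearchClient
from openai import OpenAI

vsc = VectorSearchClient()
index = vsc.get_index(
    endpoint_name="my_vs_endpoint",           # hypothetical endpoint
    index_name="main.docs.my_docs_index",     # hypothetical index
)

question = "What is our refund policy?"

# 1. Retrieve: pull the most relevant document chunks for the question.
hits = index.similarity_search(
    query_text=question,
    columns=["chunk_text"],
    num_results=3,
)
context = "\n\n".join(row[0] for row in hits["result"]["data_array"])

# 2. Generate: pass the retrieved context to an LLM served on Databricks
#    (Foundation Model APIs expose an OpenAI-compatible endpoint).
client = OpenAI(
    api_key=os.environ["DATABRICKS_TOKEN"],
    base_url="https://<workspace-url>/serving-endpoints",
)
answer = client.chat.completions.create(
    model="databricks-meta-llama-3-3-70b-instruct",  # placeholder model
    messages=[
        {"role": "system", "content": "Answer using only the provided context."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ],
)
print(answer.choices[0].message.content)
```

The key point is the two-step shape: the similarity search grounds the model in your own data, and the prompt instructs it to answer from that context rather than from memory.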
If you are building ETL pipelines in Databricks (where data is Extracted, Transformed, and Loaded), what tips, methods, or best practices do you use to make those pipelines run faster and cheaper?
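A few of the usual levers, sketched as one illustrative job; the table and column names are made up, and `spark` is the session Databricks notebooks provide:

```python
# Common cost/performance levers for Databricks ETL, in one sketch.
from pyspark.sql import functions as F

# Adaptive Query Execution handles skewed joins and picks shuffle
# partition counts at runtime (on by default in recent runtimes).
spark.conf.set("spark.sql.adaptive.enabled", "true")

# Read only the columns and partitions you need; filters on partition
# columns are pruned before the scan, so they are effectively free.
orders = (
    spark.read.table("main.sales.orders")          # hypothetical table
    .where(F.col("order_date") >= "2024-01-01")
    .select("order_id", "customer_id", "amount", "order_date")
)

# Broadcast small dimension tables to avoid shuffling the big side.
customers = spark.read.table("main.sales.customers")
enriched = orders.join(F.broadcast(customers), "customer_id")

enriched.write.mode("overwrite").saveAsTable("main.sales.orders_enriched")

# Compact small files and cluster by a common filter column so later
# reads can skip irrelevant files.
spark.sql("OPTIMIZE main.sales.orders_enriched ZORDER BY (order_date)")
```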
Databricks Repos make collaborative development easy by connecting notebooks to Git. You can work on branches, track changes, and sync with your team. Plus, they integrate with CI/CD pipelines, allowing automated testing and deployment of notebooks...
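For the CI/CD side, here is one hedged sketch of a post-test pipeline step using the databricks-sdk package; the repo and job IDs are placeholders for your own resources:

```python
# One way to wire Repos into CI: after tests pass, point a workspace
# repo at the release branch and trigger a validation job. This is a
# sketch; REPO_ID and JOB_ID below are hypothetical.
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()  # reads DATABRICKS_HOST / DATABRICKS_TOKEN from env

REPO_ID = 123456   # hypothetical: ID of the workspace repo to sync
JOB_ID = 987654    # hypothetical: job that runs the notebook tests

# Fast-forward the workspace repo to the branch this CI run built.
w.repos.update(repo_id=REPO_ID, branch="release")

# Kick off the test job against the freshly synced code and wait for it.
run = w.jobs.run_now(job_id=JOB_ID).result()
print(run.state.result_state)
```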
Developing ETL pipelines in Databricks comes with challenges like managing diverse data sources, optimizing Spark performance, and controlling cloud costs. Ensuring data quality, handling errors, and maintaining security and compliance add complexity.
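Of those challenges, data quality at least has direct tooling: Delta Live Tables expectations declare row-level rules and track how many records violate them. A sketch with illustrative table and rule names:

```python
# Delta Live Tables expectations as row-level quality gates.
# Table names and rules are illustrative, not from a real pipeline.
import dlt
from pyspark.sql import functions as F

@dlt.table(comment="Orders with basic quality gates applied")
@dlt.expect_or_drop("valid_order_id", "order_id IS NOT NULL")  # drop bad rows
@dlt.expect_or_drop("positive_amount", "amount > 0")
@dlt.expect("recent_order", "order_date >= '2020-01-01'")      # warn only
def clean_orders():
    return (
        spark.readStream.table("main.raw.orders")   # hypothetical source
        .withColumn("ingested_at", F.current_timestamp())
    )
```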
Leveraging Databricks Marketplace and API integrations can significantly streamline app development. By using pre-built datasets, notebooks, and APIs, developers can accelerate data workflows, reduce redundant coding, and ensure seamless integration.
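Many Marketplace data products are delivered over Delta Sharing, so with Unity Catalog the purchased data shows up as a catalog you can query directly. A sketch with placeholder catalog and table names:

```python
# Querying a Marketplace-shared dataset alongside your own tables.
# The provider catalog and all table names below are placeholders.
listings = spark.read.table("marketplace_provider.weather.daily_forecast")

# Join the purchased dataset with your own data without copying it.
stores = spark.read.table("main.retail.stores")
enriched = stores.join(listings, on="postal_code", how="left")

enriched.select("store_id", "postal_code", "temperature_high").show(5)
```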
Developing and debugging Spark jobs in Databricks can be challenging due to the distributed nature of Spark and the volume of data processed. To streamline your workflow:
- Leverage Notebooks for Iterative Development: Use Databricks notebooks to write a...
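A sketch of that iterative notebook loop, with an illustrative table name:

```python
# Notebook-style debugging loop: develop against a small sample,
# inspect the plan, then scale up. The table name is illustrative.
df = spark.read.table("main.events.clickstream")

# 1. Iterate on a sample so each cell runs in seconds, not minutes.
sample = df.sample(fraction=0.01, seed=42).cache()
sample.printSchema()

# 2. Verify transformations cell by cell before chaining them.
by_user = sample.groupBy("user_id").count()
by_user.show(5)

# 3. Check the physical plan for surprises (e.g., an accidental
#    cartesian join or a missing partition filter) before a full run.
by_user.explain(mode="formatted")
```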