Databricks Platform Discussions
Dive into comprehensive discussions covering various aspects of the Databricks platform. Join the co...
Hi, I have created a model and pipeline using xgboost.spark's SparkXGBRegressor and pyspark.ml's Pipeline instance. However, I run into a "RuntimeError: _get_spark_session should not be invoked from executor side." when I try to save the predictions I...
Did you ever find a resolution to this? I've been running into the same error with a Spark XGBoost classification model, and haven't had any success in finding a solution. Setting it to a pyfunc model in logging resulted in an error, and clearly you ...
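For context, a minimal sketch of the setup the original poster describes, with hypothetical column and table names; this is not a fix, just the shape of the pipeline that reportedly triggers the error on write:

```python
# Minimal sketch of the reported setup (column/table names are hypothetical).
from pyspark.ml import Pipeline
from pyspark.ml.feature import VectorAssembler
from xgboost.spark import SparkXGBRegressor

assembler = VectorAssembler(inputCols=["f1", "f2"], outputCol="features")
xgb = SparkXGBRegressor(features_col="features", label_col="label")
pipeline = Pipeline(stages=[assembler, xgb])

model = pipeline.fit(train_df)   # train_df/test_df: existing Spark DataFrames
preds = model.transform(test_df)

# The RuntimeError reportedly surfaces here, when the predictions are saved:
preds.write.mode("overwrite").saveAsTable("main.default.predictions")
```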
Hello, the costs for the Databricks service shown in Cost Management in the Azure portal (45,869...) are not aligned with the costs calculated from the usage system table (75,34). The costs from the portal are filtered to the desired period (usage_date ...
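One common source of such gaps is that system.billing.list_prices holds pre-discount list prices and the usage table covers DBU charges only, while the Azure portal reflects actual billed amounts. For reference, a hedged sketch of how cost is typically estimated from the usage system table (the date range is illustrative):

```python
# Estimate DBU cost by joining usage with the list-price table; column
# names follow the documented system.billing schema.
estimated_cost = spark.sql("""
    SELECT
        u.usage_date,
        u.sku_name,
        SUM(u.usage_quantity * p.pricing.default) AS estimated_cost
    FROM system.billing.usage u
    JOIN system.billing.list_prices p
      ON u.sku_name = p.sku_name
     AND u.usage_start_time >= p.price_start_time
     AND (u.usage_end_time <= p.price_end_time OR p.price_end_time IS NULL)
    WHERE u.usage_date BETWEEN '2025-04-01' AND '2025-04-30'
    GROUP BY u.usage_date, u.sku_name
    ORDER BY u.usage_date
""")
estimated_cost.show()
```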
Hi, I completed the Databricks Certified Data Engineer Associate assessment on 15 April 2025, but I haven't received the certificate yet and it has been more than 48 hours.
I am not able to run the jobs/runnow endpoint. I am getting an error: Error fetching files: 403 - {"error_code":"PERMISSION_DENIED","message":"User xxxx-dxxxx-xxx-xxxx does not have Manage Run or Owner or Admin permissions on job 437174060919465",...
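For reference, a hedged sketch of the run-now call that produces this 403 (host and token are placeholders); per the error text, the caller needs Manage Run, Owner, or Admin permission on the job, which the job owner or a workspace admin can grant:

```python
# Trigger a job via the Jobs 2.1 run-now endpoint.
import requests

host = "https://<your-workspace>.azuredatabricks.net"
token = "<personal-access-token>"

resp = requests.post(
    f"{host}/api/2.1/jobs/run-now",
    headers={"Authorization": f"Bearer {token}"},
    json={"job_id": 437174060919465},
)
resp.raise_for_status()  # raises on 403 PERMISSION_DENIED
print(resp.json())       # {"run_id": ...} on success
```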
I have a very basic view with 3 inner joins that will only do a full refresh. Is there a limit to the number of joins you can have and still get an incremental refresh? "incrementalization_issues": [{"issue_type": "INCREMENTAL_PLAN_REJECTED_BY_COST_MO...
@GregTyndall Yes, the current limit is 2 by default, but it can be increased up to 5 by adding the flag below to the pipeline settings: pipelines.enzyme.numberOfJoinsThreshold 5
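For reference, a hedged sketch of where that flag lives in the pipeline settings JSON (the pipeline name and the rest of the spec are illustrative):

```json
{
  "name": "my_pipeline",
  "configuration": {
    "pipelines.enzyme.numberOfJoinsThreshold": "5"
  }
}
```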
I am using Delta Live Tables and Pub/Sub to ingest messages from 30 different topics in parallel. I noticed that initialization time can be very long, around 15 minutes. Does anyone know how to reduce initialization time in DLT? Thank you.
Hi! I'm using an R library, but it is only using one node. Is there a way to parallelize it? Thanks in advance!
To parallelize computations in R while using a Databricks environment, you can utilize two main approaches: SparkR or sparklyr. Both allow you to run R code in a distributed manner across multiple nodes in a cluster. Hope this helps. Louis.
I started using the Databricks trial version today. I want to explore the full end-to-end ML lifecycle on Databricks. I observed that only the 'serverless' option is available for compute. I was trying to execute the notebook posted on https://docs.datab...
It can take up to 15 minutes for the serving endpoint to be created. Once you initiate the "create endpoint" chunk of code, go grab a cup of coffee and wait 15 minutes. Then, before you use it, verify it is running (bottom-left menu "Serving") by g...
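If you prefer to check readiness in code rather than the UI, a minimal sketch with the Databricks Python SDK, assuming an endpoint named "my-endpoint":

```python
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()  # credentials resolved from the notebook/environment
endpoint = w.serving_endpoints.get(name="my-endpoint")
print(endpoint.state.ready)          # READY once the endpoint is up
print(endpoint.state.config_update)  # NOT_UPDATING once provisioning finished
```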
Currently trying to refresh a Delta Live Table using a Full Refresh, but an error keeps coming up saying that we have to use a shared cluster or a SQL warehouse. I've tried both a shared cluster and a SQL warehouse, and the same error keeps coming up. ...
You are not using "No Isolation Shared" mode, right? Also, can you share the chunk of code that is causing the failure? Thanks, Louis.
Databricks updated the Generative AI course https://partner-academy.databricks.com/learn/lp/315/generative-ai-engineering-pathway but the course material is missing in the partner academy. Does anybody know where to download the course material?
Do you have any update on why the course material is not appearing in the Partner Academy course version?
Hi friends, a quick question regarding how data and workspace controls work when using Azure Databricks. I am planning to use the Azure Databricks that comes as part of my employer's Azure subscriptions. I work for a public sector organization, which is ...
@SP_6721 - Thanks for the reply, but I have a small confusion about your last statement, "While Databricks manages the backend, your data and workloads stay fully inside Azure and your chosen region". As per my analysis of Azure Databricks, none of the artifacts l...
How do I deploy or run a single job if I have 2 or more jobs defined in my asset bundle? $ databricks bundle deploy job1 # does not work. I do not see a flag to identify which job to run.
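`databricks bundle deploy` always deploys the whole bundle; selecting one job happens at run time via its resource key. A hedged sketch, assuming the job is declared under resources.jobs with the key job1 and a target named dev:

```
databricks bundle deploy -t dev     # deploys every resource in the bundle
databricks bundle run -t dev job1   # triggers only the job with resource key "job1"
```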
Hey all, we've got a bunch of business objects we created as Delta Live Tables. These are important, as they're what business users will use in dashboards, Genie rooms, etc. We're trying to enrich the metadata for those, but the option is greyed out and I...
Thanks, but that doesn't answer the question, ashraf1395. The OP's question was how to create/update the field comments in the Catalog GUI. Also note that I've populated the `comment=` syntax in my Python DLT definition, but the table description or crea...
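For context, a minimal sketch of that `comment=` usage in a Python DLT definition (table, columns, and source names are hypothetical); since the pipeline owns the table's metadata, comments are expected to be set here rather than edited in the Catalog GUI:

```python
import dlt

@dlt.table(
    comment="Curated orders for business dashboards.",  # table-level description
    # Column-level comments can be attached via a DDL schema string:
    schema="""
        order_id BIGINT COMMENT 'Unique order identifier',
        amount   DOUBLE COMMENT 'Order total in EUR'
    """,
)
def orders_curated():
    return spark.read.table("raw.orders").select("order_id", "amount")
```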
Hello, the Spark UI Simulator has not been accessible for the last few days. I was able to refer to it last week at https://www.databricks.training/spark-ui-simulator/index.html. I already have access to the partner academy (if that is relevant). <Error...
Just a short update: the request I raised was closed saying there is no active support contract with the org (from the email I used) to look into this. Perhaps someone else could try raising a request using the link above.
Hi all! I need to migrate multiple notebooks from one workspace to another. Is there any way to do it without using Git? Since manual import and export is difficult for multiple notebooks and folders, I need an alternate solution. Please reply as so...
Hello @HaripriyaP, @rabia_farooq, check the Databricks CLI documentation here: https://docs.databricks.com/aws/en/dev-tools/cli/commands. You can work with databricks workspace <command_such_as_export> to move the notebooks/code files around. Cheer...
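A hedged sketch of that flow with the workspace commands (paths and CLI profiles are placeholders): export the folder tree from the source workspace, then import it into the target.

```
databricks workspace export-dir /Users/me@corp.com/project ./project --profile SOURCE
databricks workspace import-dir ./project /Users/me@corp.com/project --profile TARGET
```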