Databricks Platform Discussions
Dive into comprehensive discussions covering various aspects of the Databricks platform. Join the co...
Hi, I have created a model and pipeline using xgboost.spark's SparkXGBRegressor and pyspark.ml's Pipeline instance. However, I run into a "RuntimeError: _get_spark_session should not be invoked from executor side." when I try to save the predictions I...
Did you ever find a resolution to this? I've been running into the same error with a Spark XGBoost classification model, and haven't had any success in finding a solution. Setting it to a pyfunc model in logging resulted in an error, and clearly you ...
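For context, a minimal sketch of the setup the original poster describes, with hypothetical column and table names; this is not a fix, just the shape of the pipeline that reportedly triggers the error on write:

```python
# Minimal sketch of the reported setup (column/table names are hypothetical).
from pyspark.ml import Pipeline
from pyspark.ml.feature import VectorAssembler
from xgboost.spark import SparkXGBRegressor

assembler = VectorAssembler(inputCols=["f1", "f2"], outputCol="features")
xgb = SparkXGBRegressor(features_col="features", label_col="label")
pipeline = Pipeline(stages=[assembler, xgb])

model = pipeline.fit(train_df)   # train_df/test_df: existing Spark DataFrames
preds = model.transform(test_df)

# The RuntimeError reportedly surfaces here, when the predictions are saved:
preds.write.mode("overwrite").saveAsTable("main.default.predictions")
```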
Hello, the costs for the Databricks service shown in Cost Management in the Azure portal (45,869...) are not aligned with the costs calculated from the usage system table (75,34). The costs from the portal are filtered to the desired period (usage_date ...
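One common source of such gaps is that system.billing.list_prices holds pre-discount list prices and the usage table covers DBU charges only, while the Azure portal reflects actual billed amounts. For reference, a hedged sketch of how cost is typically estimated from the usage system table (the date range is illustrative):

```python
# Estimate DBU cost by joining usage with the list-price table; column
# names follow the documented system.billing schema.
estimated_cost = spark.sql("""
    SELECT
        u.usage_date,
        u.sku_name,
        SUM(u.usage_quantity * p.pricing.default) AS estimated_cost
    FROM system.billing.usage u
    JOIN system.billing.list_prices p
      ON u.sku_name = p.sku_name
     AND u.usage_start_time >= p.price_start_time
     AND (u.usage_end_time <= p.price_end_time OR p.price_end_time IS NULL)
    WHERE u.usage_date BETWEEN '2025-04-01' AND '2025-04-30'
    GROUP BY u.usage_date, u.sku_name
    ORDER BY u.usage_date
""")
estimated_cost.show()
```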
Hi, I completed the Databricks Certified Data Engineer Associate assessment on 15 April 2025, but I haven't received the certificate yet and it has been more than 48 hours.
I am not able to run the jobs/runnow endpoint. I am getting an error: Error fetching files: 403 - {"error_code":"PERMISSION_DENIED","message":"User xxxx-dxxxx-xxx-xxxx does not have Manage Run or Owner or Admin permissions on job 437174060919465",...
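For reference, a hedged sketch of the run-now call that produces this 403 (host and token are placeholders); per the error text, the caller needs Manage Run, Owner, or Admin permission on the job, which the job owner or a workspace admin can grant:

```python
# Trigger a job via the Jobs 2.1 run-now endpoint.
import requests

host = "https://<your-workspace>.azuredatabricks.net"
token = "<personal-access-token>"

resp = requests.post(
    f"{host}/api/2.1/jobs/run-now",
    headers={"Authorization": f"Bearer {token}"},
    json={"job_id": 437174060919465},
)
resp.raise_for_status()  # raises on 403 PERMISSION_DENIED
print(resp.json())       # {"run_id": ...} on success
```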
I have a very basic view with 3 inner joins that will only do a full refresh. Is there a limit to the number of joins you can have and still get an incremental refresh? "incrementalization_issues": [{"issue_type": "INCREMENTAL_PLAN_REJECTED_BY_COST_MO...
@GregTyndall Yes, the current limit is 2 by default, but it can be increased up to 5 by adding the flag below to the pipeline settings: pipelines.enzyme.numberOfJoinsThreshold 5
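For reference, a hedged sketch of where that flag lives in the pipeline settings JSON (the pipeline name and the rest of the spec are illustrative):

```json
{
  "name": "my_pipeline",
  "configuration": {
    "pipelines.enzyme.numberOfJoinsThreshold": "5"
  }
}
```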
I am using Delta Live Tables and Pub/Sub to ingest messages from 30 different topics in parallel. I noticed that initialization time can be very long, around 15 minutes. Does anyone know how to reduce initialization time in DLT? Thank you.
Hi! I'm using an R library, but it is only using one node. Is there a way to parallelize it? Thanks in advance!
To parallelize computations in R while using a Databricks environment, you can utilize two main approaches: SparkR or sparklyr. Both allow you to run R code in a distributed manner across multiple nodes in a cluster. Hope this helps. Louis.
I started using the Databricks trial version today. I want to explore the full end-to-end ML lifecycle on Databricks. I observed that only the 'serverless' option is available for compute. I was trying to execute the notebook posted on https://docs.datab...
It can take up to 15 minutes for the serving endpoint to be created. Once you initiate the "create endpoint" chunk of code, go grab a cup of coffee and wait 15 minutes. Then, before you use it, verify it is running (bottom-left menu "Serving") by g...
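If you prefer to check readiness in code rather than the UI, a minimal sketch with the Databricks Python SDK, assuming an endpoint named "my-endpoint":

```python
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()  # credentials resolved from the notebook/environment
endpoint = w.serving_endpoints.get(name="my-endpoint")
print(endpoint.state.ready)          # READY once the endpoint is up
print(endpoint.state.config_update)  # NOT_UPDATING once provisioning finished
```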
Currently trying to refresh a Delta Live Table using a Full Refresh, but an error keeps coming up saying that we have to use a shared cluster or a SQL warehouse. I've tried both a shared cluster and a SQL warehouse, and the same error keeps coming up. ...
You are not using "No Isolation Shared" mode, right? Also, can you share the chunk of code that is causing the failure? Thanks, Louis.
Databricks updated the Generative AI course https://partner-academy.databricks.com/learn/lp/315/generative-ai-engineering-pathway but the course material is missing in the partner academy. Does anybody know where to download the course material?
Do you have any update on why the course material is not appearing in the Partner Academy course version?
Hi friends, a quick question regarding how data and workspace controls work when using Azure Databricks. I am planning to use the Azure Databricks that comes as part of my employer's Azure subscriptions. I work for a public sector organization, which is ...
@SP_6721 - Thanks for the reply, but I have a small confusion about your last statement, "While Databricks manages the backend, your data and workloads stay fully inside Azure and your chosen region". As per my analysis of Azure Databricks, none of the artifacts l...
How do I deploy or run a single job if I have 2 or more jobs defined in my asset bundle? $ databricks bundle deploy job1 # does not work. I do not see a flag to identify which job to run.
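`databricks bundle deploy` always deploys the whole bundle; selecting one job happens at run time via its resource key. A hedged sketch, assuming the job is declared under resources.jobs with the key job1 and a target named dev:

```
databricks bundle deploy -t dev     # deploys every resource in the bundle
databricks bundle run -t dev job1   # triggers only the job with resource key "job1"
```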
Hey all, we've got a bunch of business objects we created as Delta Live Tables. These are important, as they're what business users will use in dashboards, Genie rooms, etc. We're trying to enrich the metadata for those, but the option is greyed out and I...
Thanks, but that doesn't answer the question, ashraf1395. The OP's question was how to create/update the field comments in the Catalog GUI. Also note that I've populated the `comment=` syntax in my Python DLT definition, but the table description or crea...
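For context, a minimal sketch of that `comment=` usage in a Python DLT definition (table, columns, and source names are hypothetical); since the pipeline owns the table's metadata, comments are expected to be set here rather than edited in the Catalog GUI:

```python
import dlt

@dlt.table(
    comment="Curated orders for business dashboards.",  # table-level description
    # Column-level comments can be attached via a DDL schema string:
    schema="""
        order_id BIGINT COMMENT 'Unique order identifier',
        amount   DOUBLE COMMENT 'Order total in EUR'
    """,
)
def orders_curated():
    return spark.read.table("raw.orders").select("order_id", "amount")
```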
Hello, the Spark UI Simulator has not been accessible for the last few days. I was able to refer to it last week at https://www.databricks.training/spark-ui-simulator/index.html. I already have access to the partner academy (if that is relevant). <Error...
Just a short update: the request I raised was closed saying there is no active support contract with the org (from the email I used) to look into this. Perhaps someone else could try raising a request using the link above.
Hi all! I need to migrate multiple notebooks from one workspace to another. Is there any way to do it without using Git? Since manual import and export is difficult for multiple notebooks and folders, I need an alternate solution. Please reply as so...
Hello @HaripriyaP, @rabia_farooq, check the Databricks CLI documentation here: https://docs.databricks.com/aws/en/dev-tools/cli/commands. You can work with databricks workspace <command_such_as_export> to move the notebooks/code files around. Cheer...
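A hedged sketch of that flow with the workspace commands (paths and CLI profiles are placeholders): export the folder tree from the source workspace, then import it into the target.

```
databricks workspace export-dir /Users/me@corp.com/project ./project --profile SOURCE
databricks workspace import-dir ./project /Users/me@corp.com/project --profile TARGET
```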