Databricks vs. Competitors: Key Features
Hi everyone, can anyone share insights on the key features that differentiate Databricks from its competitors? Looking forward to your thoughts! Thanks!
- 2327 Views
- 0 replies
- 0 kudos
Hello, we are running a workflow as a service principal that is an AAD managed identity. This results in the following issue: running a Databricks workflow as a service principal that reads from an Azure DevOps repo fails with "Failed to checkout Git repository: PERMISSION_DENIED..."
We managed to solve this problem, but it is not an elegant solution; Databricks should simplify this. The steps that have to be done are listed below. We are using a user-assigned managed identity (MI), but I assume this should also work for Azure Servic...
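The detailed steps above are truncated, but a central piece of this kind of setup is registering a Git credential for the service principal, since it cannot do so through the UI. A hedged sketch against the Git Credentials REST API, assuming a placeholder workspace URL, a token obtained as the service principal, and a placeholder Azure DevOps PAT:

```python
import requests

HOST = "https://adb-1234567890123456.7.azuredatabricks.net"  # placeholder workspace URL
SP_TOKEN = "<token-obtained-as-the-service-principal>"       # placeholder
DEVOPS_PAT = "<azure-devops-personal-access-token>"          # placeholder

# Register a Git credential on behalf of the service principal so that
# repo checkouts against Azure DevOps can authenticate.
resp = requests.post(
    f"{HOST}/api/2.0/git-credentials",
    headers={"Authorization": f"Bearer {SP_TOKEN}"},
    json={
        "git_provider": "azureDevOpsServices",
        "git_username": "not-used",  # Azure DevOps ignores the username when a PAT is supplied
        "personal_access_token": DEVOPS_PAT,
    },
)
resp.raise_for_status()
print(resp.json())
```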
Are there any recommended practices for setting cluster auto-termination for cost optimization?
Hello @dataailearner, greetings of the day! Here are a few steps you can follow for cost optimization: 1. Choose the most efficient compute size: Databricks runs one executor per worker node. The total number of cores across all executors is a...
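Right-sizing pairs naturally with auto-termination. A minimal sketch using the databricks-sdk Python client; the cluster name, node type, runtime version, and the 30-minute idle timeout are illustrative values, not recommendations:

```python
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()  # reads host/token from the environment or ~/.databrickscfg

# Create a right-sized cluster that shuts itself down after 30 idle minutes.
cluster = w.clusters.create(
    cluster_name="cost-optimized-cluster",   # placeholder name
    spark_version="14.3.x-scala2.12",        # pick a runtime valid in your workspace
    node_type_id="Standard_DS3_v2",          # pick a node type valid in your cloud
    num_workers=2,
    autotermination_minutes=30,              # the auto-termination knob
).result()
print(cluster.cluster_id)
```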
Due to the limitation that all output data needs to be stored in one target, we have stopped using DLT until more flexibility is added. If anyone has a workaround, we are open to suggestions.
Hi Zavi, one potential workaround is to establish multiple DLT pipelines, with each pipeline specifically configured to point to a unique target. This approach effectively allows a diverse range of output data to be stored across various targets. T...
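A sketch of that workaround with the databricks-sdk Python client; the schema names and notebook paths are placeholders, and the key point is simply that each pipeline gets its own `target`:

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.pipelines import NotebookLibrary, PipelineLibrary

w = WorkspaceClient()

# One pipeline per output schema: each pipeline writes to its own target.
for schema in ("sales_bronze", "marketing_bronze"):  # placeholder schema names
    w.pipelines.create(
        name=f"dlt_{schema}",
        target=schema,  # tables from this pipeline land in this schema
        libraries=[
            PipelineLibrary(
                notebook=NotebookLibrary(path=f"/Repos/etl/dlt_{schema}")  # placeholder path
            )
        ],
    )
```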
Hello all, I have created a custom model serving endpoint in Azure Databricks. This endpoint connects to the Azure OpenAI model and an Azure Postgres instance. All of these Azure services use private endpoints. When I run this notebook, I am able ...
Hi all, I've been trying to sync my VSCode IDE with our Databricks GCP workspace using the Databricks extension. I am able to authenticate my account and workspace and find our clusters. However, when I try to sync a destination it throws a st...
@Retired_mod thanks for your response. I am not running through a proxy. At least, not on purpose. How do I know if I am running through a proxy? And where can I find the values of <proxy_url> and <port> so that I can try restarting my VSCode? I have tr...
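One quick way to see whether your shell or VSCode inherits a proxy is to inspect the standard proxy environment variables; if any of them print a value, that value is your <proxy_url>:<port>. A minimal check:

```python
import os

# Standard proxy environment variables respected by most HTTP clients.
for var in ("HTTP_PROXY", "HTTPS_PROXY", "http_proxy", "https_proxy", "NO_PROXY"):
    value = os.environ.get(var)
    if value:
        print(f"{var} = {value}")  # e.g. http://<proxy_url>:<port>
```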
I created a table in Databricks using a dbt model pre-hook: CREATE TABLE IF NOT EXISTS accounts (account_id BIGINT GENERATED ALWAYS AS IDENTITY, description STRING, other columns). I use the same dbt model to merge values into this table in the post...
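Cleaned up, the pre-hook DDL presumably looks like the sketch below (any columns beyond the two shown are placeholders). One thing worth noting: with GENERATED ALWAYS AS IDENTITY, the post-hook MERGE must not assign account_id explicitly, or Delta will reject the write.

```python
# Run from a notebook; the same statement works as SQL in a dbt pre-hook.
spark.sql("""
    CREATE TABLE IF NOT EXISTS accounts (
        account_id  BIGINT GENERATED ALWAYS AS IDENTITY,
        description STRING
        -- additional columns go here
    ) USING DELTA
""")
```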
While downloading and installing ucx from a shell script, I am facing the error below. Can anyone provide a solution? [i] Creating isolated Virtualenv with Python: /c/Program Files/Python312/python. Actual environment location may have moved due to redirect...
Hi, I am trying to fetch CPU and memory details from Databricks. Are there any APIs I can connect to using Postman to fetch these details?
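I am not aware of a single REST endpoint for live CPU/memory usage, but per-node hardware capacity is available from the Clusters API. A hedged sketch you could mirror in Postman; the host and token are placeholders:

```python
import requests

HOST = "https://<workspace-host>"   # placeholder
TOKEN = "<personal-access-token>"   # placeholder

# Each node type reports its hardware capacity (cores and memory).
resp = requests.get(
    f"{HOST}/api/2.0/clusters/list-node-types",
    headers={"Authorization": f"Bearer {TOKEN}"},
)
resp.raise_for_status()
for nt in resp.json()["node_types"]:
    print(nt["node_type_id"], nt["num_cores"], "cores,", nt["memory_mb"], "MB")
```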
Trying to create a metastore that will be connected to external storage (ADLS), but we don't have the option to create a new metastore in the 'Catalog' tab in the UI. Based on some research, we see that we'll have to go into "Manage Account" and then c...
I have been wrestling with this question for days now. I seem to be the only one with this question, so I am sure I am doing something wrong. I am trying to create a UC metastore, but there is no option in "Catalog" to create a metastore. This s...
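For reference, metastore creation is also exposed over REST (account-admin rights are required, which is usually why the UI option is missing). A hedged sketch against the Unity Catalog API, with placeholders for host, token, storage root, and region:

```python
import requests

HOST = "https://<workspace-host>"  # placeholder
TOKEN = "<account-admin-token>"    # placeholder

# Create a UC metastore backed by an ADLS container (placeholder values).
resp = requests.post(
    f"{HOST}/api/2.1/unity-catalog/metastores",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "name": "primary-metastore",
        "storage_root": "abfss://<container>@<account>.dfs.core.windows.net/metastore",
        "region": "<region>",
    },
)
resp.raise_for_status()
print(resp.json()["metastore_id"])
```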
I'm encountering an issue in my .gitlab-ci.yml file when attempting to execute databricks bundle deploy -t prod. The error message I receive is: Error: Request failed for POST <path>/state/deploy.lock. Interestingly, when I run the same command locally...
Hi, we are trying to load data from a Delta table into a dataframe (a copy of the original table). Initially the Delta table has a count of 911. The dataframe into which the data is loaded also has the same count. Now, we are deleting some records from the Delta...
Hi, there is a way to retain a copy of the dataframe even if the data in the underlying table is manipulated, but it's a memory-expensive operation, so be careful while using it: df1 = spark.createDataFrame(df.rdd.map(lambda x: x), schema=df.schema). Here we a...
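A cheaper alternative, since the source is a Delta table, is to pin a snapshot with time travel rather than materializing the dataframe; the table name below is a placeholder:

```python
# Find the latest version, then read the table as of that fixed version.
# Later deletes in the table won't affect this dataframe.
latest = spark.sql("DESCRIBE HISTORY my_delta_table LIMIT 1").collect()[0]["version"]
df_snapshot = spark.sql(f"SELECT * FROM my_delta_table VERSION AS OF {latest}")
print(df_snapshot.count())  # stays at the original count, e.g. 911
```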
Team, initially our team created the databases with the environment name appended, e.g. cust_dev, cust_qa, cust_prod. I am looking to standardize on a consistent database name across environments; I want to rename to "cust". All of my tables are ...
You can also use "CASCADE" to drop the schema and its tables as well; it is recursive.
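For example, once the tables have been recreated under the new name, the old schema can be dropped in one statement. Double-check before running this, since it deletes the schema's data:

```python
# Recursively drops the schema and every table inside it.
spark.sql("DROP SCHEMA IF EXISTS cust_dev CASCADE")
```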
Hello! We are integrating with Databricks and we get the API key, workspace ID, and host from our users in order to connect to Databricks. We need to validate the workspace ID because we need it outside of the context of the API key (with webh...
Hi mates! In my company, we are moving our pipelines to Databricks bundles. Our pipelines use a notebook that receives some parameters; this notebook uses a custom Python package to apply the business logic based on the parameters it receives. The thi...
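For what it's worth, the usual shape of such a notebook looks like the sketch below; the parameter names and the package are hypothetical, not taken from the post:

```python
# Declare and read the job/bundle parameters inside the notebook.
dbutils.widgets.text("run_date", "")
dbutils.widgets.text("environment", "dev")

run_date = dbutils.widgets.get("run_date")
environment = dbutils.widgets.get("environment")

# Hand them to the (hypothetical) business-logic package.
# from my_company.etl import run_pipeline
# run_pipeline(run_date=run_date, environment=environment)
```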