Administration & Architecture
Explore discussions on Databricks administration, deployment strategies, and architectural best practices.
If I grant all privileges on my schema, does that automatically give users access to all underlying objects? Or should I grant access separately for all the objects?
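For reference, a minimal sketch of the two grant styles in question (catalog, schema, table, and principal names are all hypothetical). Unity Catalog's privilege model is inheritance-based, so a grant on a schema is inherited by the securables inside it:

```python
# Hypothetical names throughout. In Unity Catalog, privileges granted on a
# schema are inherited by the objects below it, so the schema-level grant
# also covers the schema's tables and views.
spark.sql("GRANT ALL PRIVILEGES ON SCHEMA main.my_schema TO `data_team`")

# The per-object alternative, if access should be scoped more narrowly:
spark.sql("GRANT SELECT ON TABLE main.my_schema.my_table TO `data_team`")
```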
Dear all, (migrating from an on-premise Oracle ...) The question is in the subject: what is the equivalent of Oracle's CLOB in Databricks? I saw that the "string" type can go up to 50 thousand characters, which is quite good in most of our cases, but...
Hello, thanks for the answer. The concatenation itself is not an issue. My question is: does Databricks support something bigger than the 'string' data type? Thanks
Hi, I have kept the default partition size of 128 MB. I am reading a 3.8 GB file and checking the number of partitions using df.rdd.getNumPartitions() as given below, and I find a partition size of 159 MB. Why does the partition size differ from the default after reading the file?...
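For anyone reproducing this, a minimal sketch (file path hypothetical) of how the input-partition cap is set and checked. Note that `spark.sql.files.maxPartitionBytes` bounds the *split* size Spark plans, so the observed per-partition size can differ, especially for non-splittable compressed formats:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Default input-partition cap is 128 MB; Spark packs file splits up to this
# size (plus a per-file open cost) into each input partition.
print(spark.conf.get("spark.sql.files.maxPartitionBytes"))

# Hypothetical path standing in for the 3.8 GB file.
df = spark.read.json("/mnt/data/big_file.json")
print(df.rdd.getNumPartitions(), "partitions")

# Rough per-partition size = total file bytes / partition count. If the file
# is in a non-splittable format (e.g. gzip), a partition can exceed the cap.
```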
Using workflows, is there a way to obtain the task name from within a task? Ex: I have a workflow with a notebook task. From within that notebook task I would like to retrieve the task name so I can use it for a variety of purposes. Currently, we're re...
Hi @EWhitley, would {{task.name}} help in getting the current task name? https://docs.databricks.com/en/workflows/jobs/parameter-value-references.html (Pass context about job runs into job tasks)
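A minimal sketch of that approach, assuming a notebook task with a job parameter (hypothetically named `task_name`) whose value is the {{task.name}} reference:

```python
# In the job/task definition, add a notebook parameter (hypothetical name):
#   key:   task_name
#   value: {{task.name}}
# Databricks substitutes the actual task name at run time.

# Inside the notebook task, read the resolved value back through a widget
# (dbutils is available implicitly in Databricks notebooks):
task_name = dbutils.widgets.get("task_name")
print(f"Current task: {task_name}")
```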
Hi, when using the `query_history.list` function of the Python SDK workspace client, queries that have more than 153,596 characters are truncated. I could not find this limit anywhere in the documentation, so I wanted to know whether it is documented s...
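For context, a rough sketch of the call in question using the databricks-sdk WorkspaceClient (authentication assumed to come from the environment; the exact pagination shape depends on the SDK version). The truncation would show up in the length of `query_text`:

```python
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()  # auth picked up from env vars or a config profile

# Inspect recent queries; very long statements appear to come back truncated.
for q in w.query_history.list(max_results=25):
    text = q.query_text or ""
    print(q.query_id, len(text))
```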
Hi all, we are facing a performance issue and I need your help on the best approach to follow here. Existing: for each region, we have a view (Reg1_View, Reg2_View, ...) to pull data from a table (we don't have direct access to the table). And ...
Does any table hold data for all regions?
1. If yes, get a materialized view created (replacing all_reg_view).
2. I see you already tried creating a staging table replacing the all_reg_view. Try creating a cluster key along with the partition, as sketched below. Cluster key on the...
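A rough sketch of both suggestions (all object names hypothetical). Note the caveats: materialized views must be created in a serverless/pro SQL warehouse environment, and CLUSTER BY below refers to Delta liquid clustering:

```python
# 1. Materialized view replacing the union-all view (hypothetical names;
#    requires a serverless or pro SQL warehouse to create).
spark.sql("""
    CREATE MATERIALIZED VIEW IF NOT EXISTS all_reg_mv AS
    SELECT * FROM reg1_view
    UNION ALL
    SELECT * FROM reg2_view
""")

# 2. Staging table with a liquid-clustering key on the commonly filtered column.
spark.sql("""
    CREATE TABLE IF NOT EXISTS all_reg_staging
    CLUSTER BY (region)
    AS SELECT * FROM all_reg_view
""")
```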
Hi team, I want to add some shared libs which might be used by many repos, e.g. some util functions that could be used by any repo.
1. What is the recommended way to add those libs? E.g. create a separate repo and reference it in another repo?
2. How ...
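One common pattern (paths and module names hypothetical): check the shared repo out into the workspace and put its source folder on sys.path from any consuming notebook. Publishing the utilities as a wheel to a volume or artifact feed is the more robust alternative:

```python
import sys

# Hypothetical checkout path of the shared-utilities repo in the workspace.
sys.path.append("/Workspace/Repos/shared/common-utils/src")

# Hypothetical module living in that repo.
from common_utils import date_helpers
```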
Hi, I am trying to upload a wheel file to the Databricks workspace using an Azure DevOps release pipeline, to use it on an interactive cluster. I tried the "databricks workspace import" command, but it looks like it does not support .whl files. Hence, I tried to u...
Hi @vvk - The HTTP 403 error typically indicates a permissions issue. Ensure that the SP has the necessary permissions to perform the fs cp operation on the specified path. Verify that the path specified in the fs cp command is correct and that the v...
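If the CLI route keeps failing, an alternative sketch using the Python SDK to put the wheel on a Unity Catalog volume (all paths hypothetical), from which an interactive cluster can then install it:

```python
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()  # service-principal auth assumed via environment

# Hypothetical local and volume paths for the wheel.
with open("dist/my_lib-0.1.0-py3-none-any.whl", "rb") as f:
    w.files.upload(
        "/Volumes/main/default/wheels/my_lib-0.1.0-py3-none-any.whl",
        f,
        overwrite=True,
    )
```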
I'm trying to read ~500 million small JSON files into a Spark Auto Loader pipeline, and I seem to be slowed down massively by S3 request limits, so I want to explore using AWS EFS instead. I found this blog post: https://www.databricks.com/blog/20...
Hi @stvayers, please refer to this doc: https://docs.databricks.com/api/workspace/clusters/create. It has instructions on how to mount EFS.
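For context, a minimal Auto Loader sketch over the JSON dataset (bucket paths hypothetical). Before resorting to EFS, switching Auto Loader to file-notification mode is another documented way to avoid S3 list-request throttling:

```python
# Hypothetical S3 paths. cloudFiles.useNotifications swaps directory listing
# for SQS/SNS notifications, which sidesteps S3 list-request limits.
df = (spark.readStream
      .format("cloudFiles")
      .option("cloudFiles.format", "json")
      .option("cloudFiles.useNotifications", "true")
      .option("cloudFiles.schemaLocation", "s3://my-bucket/_schemas/events")
      .load("s3://my-bucket/events/"))

(df.writeStream
   .option("checkpointLocation", "s3://my-bucket/_checkpoints/events")
   .trigger(availableNow=True)
   .toTable("main.default.events_bronze"))
```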
Long story short, I'm not sure if this is a known problem, but the Auto Stop feature on SQL Warehouses after minutes of inactivity is not working properly. We started using SQL Warehouses more aggressively this December when we scaled up one ...
Is this still being investigated by Databricks? I'm seeing similar behavior that's costing us a lot of money.
Hello, I have an issue where even a query like "select 1" does not finish; the SQL warehouse runs indefinitely. I have no idea where to look for any issues because I can't see any error in the Spark UI. What is interesting is that all-purpose clusters also (...
Hi @Bepposbeste1993, Do you have the case ID raised for this issue?
The current UDF implementation stores UDFs in a catalogue.schema location. This requires referencing/calling the UDF by that location, e.g. `select my_catalogue.my_schema.my_udf()`, or having the SQL execute from that schema. In Snowflake, UDFs are globally a...
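To illustrate the behavior being described (the catalogue/schema/function names come from the post; the function body is hypothetical):

```python
# A Unity Catalog SQL function lives at a three-level name.
spark.sql("""
    CREATE OR REPLACE FUNCTION my_catalogue.my_schema.my_udf()
    RETURNS STRING
    RETURN 'hello'
""")

# The fully qualified call works from any context:
spark.sql("SELECT my_catalogue.my_schema.my_udf()").show()

# Or set the current schema so the short name resolves:
spark.sql("USE my_catalogue.my_schema")
spark.sql("SELECT my_udf()").show()
```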
Issue summary: when running multiple jobs on the same compute cluster, over time I see an increase in memory utilization that is seemingly never fully released, even when jobs finish. This eventually leads to some jobs stalling out as memory hits the...
I am getting the below error while creating an external Delta table in Databricks, even though there is an external location created. [NO_PARENT_EXTERNAL_LOCATION_FOR_PATH] No parent external location was found for path 'abfss://destination@datalakeprojectsid.dfs.co...
@Siddalinga If the path specified during table creation is outside the scope of the external location, you may encounter the [NO_PARENT_EXTERNAL_LOCATION_FOR_PATH] error. Is the external location correctly defined to scope the directory, such as abfss...
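A sketch of the check and fix (table name and storage path are hypothetical): list the registered external locations, then make sure the table path sits underneath one of their URLs:

```python
# List external locations registered in Unity Catalog.
spark.sql("SHOW EXTERNAL LOCATIONS").show(truncate=False)

# Hypothetical: the LOCATION must be a child of a registered location's URL.
spark.sql("""
    CREATE TABLE main.default.my_ext_table (id INT)
    USING DELTA
    LOCATION 'abfss://destination@myaccount.dfs.core.windows.net/tables/my_ext_table'
""")
```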
Setting up UCX from the Databricks web terminal: for cases when your desktop or laptop will not support the `databricks labs install ucx` technical requirements, such as Python 3.10, administrative access, Python package access, or network access, but your Databric...
I’ve found that using DBR 15.4, Personal Compute, and a single node is the most effective setup for installing UCX via the web terminal. Note: the Databricks CLI is the recommended and most efficient option for managing the UCX installation. Always try t...