Engage in discussions on data warehousing, analytics, and BI solutions within the Databricks Community. Share insights, tips, and best practices for leveraging data for informed decision-making.
Here's your Data + AI Summit 2024 - Warehousing & Analytics recap as you use intelligent data warehousing to improve performance and increase your organization’s productivity with analytics, dashboards and insights.
Keynote: Data Warehouse presente...
Hey Support Team, We are experiencing an issue with our Databricks warehouse where the auto-stop feature does not seem to be working as expected. Despite setting an idle timeout, the warehouse continues running after the configured auto-stop time has ...
We were able to diagnose and resolve the problem. It was caused by a cube.js JDBC connection repeatedly connecting to our Databricks SQL Warehouse. Databricks SQL Warehouse does not scale down when there are repeated new connections made...
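To picture why repeated connections can keep a warehouse running, here is a toy model in Python. It is only a sketch, not Databricks' actual scheduling logic: it assumes an idle timer that restarts whenever a new connection arrives, and the function name `would_auto_stop` and its parameters are hypothetical.

```python
def would_auto_stop(connection_times, idle_timeout, horizon):
    """Return the minute at which a hypothetical idle timer fires, or None.

    connection_times: minutes at which a client opens a new connection
    idle_timeout:     minutes of inactivity required before stopping
    horizon:          end of the observation window, in minutes
    Assumption (not confirmed by the thread): each new connection
    resets the idle timer.
    """
    last_activity = 0
    for t in sorted(connection_times):
        if t > horizon:
            break
        if t - last_activity >= idle_timeout:
            # The idle gap was long enough, so the timer fired
            # before this connection arrived.
            return last_activity + idle_timeout
        last_activity = t
    stop_time = last_activity + idle_timeout
    return stop_time if stop_time <= horizon else None


# A client that reconnects every 5 minutes never leaves a 10-minute gap.
polling_result = would_auto_stop(list(range(5, 61, 5)), idle_timeout=10, horizon=60)

# A single connection followed by silence lets the timer fire.
quiet_result = would_auto_stop([5], idle_timeout=10, horizon=60)

print(polling_result, quiet_result)
```

In this model the polling client keeps the timer from ever firing inside the window, while the single connection lets it fire ten minutes later.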
I have applied row-level security (RLS) on the department column in the table so that users can only see data related to their own department. The security policy works perfectly when I query the table in Databricks SQL. Now, I have built a Databricks...
Hi @Akshay_Petkar, which credential mode did you use to publish the dashboard?
In order to make access control work based on the viewer's credential, "Don't embed credentials" mode should be used. If you published the dashboard with "Embed credentia...
Dear all, (migrating from an on-premise Oracle ...) The question is in the subject: "What is the equivalent of Oracle's CLOB in Databricks?" I saw that the "string" type can go up to 50 thousand characters, which is quite good in most of our cases, but...
Hello; Thanks for the answer. For the concatenation itself, it is not an issue. My question is "does Databricks support something bigger than the 'string' data type?" Thanks
Hi all, (migrating from an on-premise Oracle) Currently on Oracle, I have a "library" of let's say 300 tables to load, sequentially, based on views (some tables being fed potentially by several views, therefore the number of underlying views is highe...
The common scenario for data processing in Oracle is based on PL/SQL and cursors. PySpark has no such concept as cursors, and iterating over DataFrames can lead to poor performance. I migrated Oracle to Databricks and I learned tha...
Hello, I maintain a Spark plugin with users who are moving from Spark to Databricks and want to use shared access mode. My library has a bunch of custom datasources and users are seeing errors when using them. Is there a way to write a custom datasour...
Hello @bcb44,
Thanks for your question.
Data Source V2 Relations: These are not supported in clusters that are configured with Table ACL or Credential Passthrough. Can you validate if ACL or Passthrough are enabled on the cluster? Also this should w...
Hi All, Has anyone noticed that if you "Run all" on a notebook the first time, and later click "Run all below" on a cell, it no longer works and requires clicking "Run all" again? This didn't happen to me a couple of weeks ago; I used to ru...
Hello @Paddy_chu!
This doesn't happen with every notebook, but it's likely due to cell dependencies. When you Run All, the final cell might modify or clear the data. Later, if you use Run All Below from a middle cell, it may not work if the required ...
Hey all, I have encountered a strange issue while validating my dashboard visual against the source table. The visual shows the count(distinct VAL) per day, and when I ran a query that does the same calculation I get a difference of 84 (on average) and ...
Hello, I'm not sure if this is the correct place to post this, sorry. Migrating from an on-premise Oracle to Databricks, we are wondering about the following functionality: From the reporting tool in place (currently, PowerBI), users are able to send bac...
Hello Mantu, Thanks for your answer. This is clear, and as our colleague on PowerBI is already dealing with Power Automate, he should be able to test this. If we are allowed to use the Databricks REST API (our infra guys will tell us), I guess we will b...
We are unable to create a Serverless Warehouse Cluster in our own Databricks workspace. The same settings do work on other Azure tenants that I have access to. The workspace is running in Azure on a Premium Plan in West Europe. Features enabled: Automat...
Hi, we have a Delta table in our Unity Catalog called dream_team.stern_portfolio.location_info. We are trying to use row-level security to filter our data based on a user's group membership. This way, when users look at our dashboard, they can only see th...
Hey @HeathDG1, I think your function isn’t behaving the way you expect because of how the logic is set up:
• If the user is in Stern MA, they only see rows where state = 'MA' (which is good).
• BUT for everyone else, the function returns true, meaning th...
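The logic described in this reply can be mimicked in plain Python to see the effect. This is only an illustrative sketch: `row_visible` is a hypothetical stand-in for the SQL filter function (whose actual definition is not shown here), and the sample states are made up.

```python
def row_visible(user_groups, state):
    """Toy version of the filter logic described in the reply."""
    if "Stern MA" in user_groups:
        # Members of the group only see Massachusetts rows.
        return state == "MA"
    # Everyone else falls through to True, so nothing is filtered for them.
    return True


rows = ["MA", "NY", "CA"]  # hypothetical values of the state column

ma_view = [s for s in rows if row_visible({"Stern MA"}, s)]
other_view = [s for s in rows if row_visible({"Some Other Group"}, s)]

print(ma_view)
print(other_view)
```

With this structure, a user outside the group sees every row, because the fall-through branch admits everything.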
I would like to know if the cluster size of a Serverless Warehouse can automatically scale up and down, and what determines the number of workers used when executing queries. Does it always use all workers within the cluster size, or does it use par...
@fishingrod My understanding is that Intelligent Workload Management (IWM) in Serverless SQL Warehouses adjusts the number of clusters, but it does not automatically scale the cluster size. This means that if you need to improve the execution performa...
I notice it is very easy to get a visualization from SQL inside Databricks. Say you run a SQL query that returns a table; you can then easily turn that table into plots.
How about in Python, when we ...
Yes! You can visualize a Python DataFrame in Databricks easily using: display(df) This works like SQL visualizations, offering built-in charts. For more customization, Matplotlib, Seaborn, or Plotly can be used. Would love to see even more native sup...
Hi all, I have a table named employee in Databricks. I ran the following query to keep only rows where the salary is greater than 25000. This query returns 10 rows. I want to find the size of these 10 rows in bytes, and I would like to calculate or re...
Hi @Akshay_Petkar,
You can try with this query:
SELECT SUM(LENGTH(CAST(employee.* AS STRING))) AS total_size_in_bytes FROM employee WHERE salary > 25000;
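The idea behind this suggestion, turning each selected row into text and adding up the lengths, can be tried outside Spark. The following is a minimal sketch in plain Python with made-up sample rows; it measures the size of a JSON text rendering of each row, which is an assumption for illustration and not the storage size of the table. It also shows that a character count and a byte count can differ once non-ASCII text is involved.

```python
import json

# Hypothetical sample rows standing in for the employee table.
employees = [
    {"name": "Ana", "salary": 30000},
    {"name": "Bo", "salary": 20000},
    {"name": "Zo\u00eb", "salary": 26000},  # contains one non-ASCII character
]


def as_text(row):
    """Serialise a row to JSON text, keeping non-ASCII characters as-is."""
    return json.dumps(row, ensure_ascii=False)


# Same filter as the query: salary greater than 25000.
selected = [r for r in employees if r["salary"] > 25000]

total_chars = sum(len(as_text(r)) for r in selected)
total_bytes = sum(len(as_text(r).encode("utf-8")) for r in selected)

print(total_chars, total_bytes)
```

Here the byte total is one larger than the character total, because the single accented character takes two bytes in UTF-8.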
I am using a Shared Databricks Compute and trying to read data from an S3 bucket via an Instance Profile. However, I am encountering the following error: [INSUFFICIENT_PERMISSIONS] Insufficient privileges: User does not have permission SELECT on any ...
Hi @vidya_kothavale, Greetings!
Can you please refer to this article and check if it helps you resolve your issue: https://kb.databricks.com/en_US/data/user-does-not-have-permission-select-on-any-file
Please note that these permissions are only ...