Warehousing & Analytics
Engage in discussions on data warehousing, analytics, and BI solutions within the Databricks Community. Share insights, tips, and best practices for leveraging data for informed decision-making.
Data + AI Summit 2024 - Data Warehousing, Analytics, and BI

Forum Posts

MadelynM
by Databricks Employee
  • 907 Views
  • 0 replies
  • 0 kudos

[Recap] Data + AI Summit 2024 - Warehousing & Analytics | Improve performance and increase insights

Here's your Data + AI Summit 2024 - Warehousing & Analytics recap as you use intelligent data warehousing to improve performance and increase your organization’s productivity with analytics, dashboards and insights.  Keynote: Data Warehouse presente...

Warehousing & Analytics
AI BI Dashboards
AI BI Genie
Databricks SQL
qwerty3
by Contributor
  • 2273 Views
  • 21 replies
  • 3 kudos

Spark dataframe performing poorly

I have huge datasets. Transformations, display, print, and show all work well on this data when it is read into a pandas DataFrame. But the same DataFrame, when converted to a Spark DataFrame, takes minutes to display even a single row and hours to write th...

Latest Reply
gchandra
Databricks Employee
  • 3 kudos

I understand you want it sooner. Did it at least write the data in 10 minutes, compared to not writing at all before? There are more knobs you can tweak, like spark.sql.shuffle.partitions=auto. Do you have any index columns in your spatial data that can be us...

20 More Replies
qwerty3
by Contributor
  • 464 Views
  • 3 replies
  • 0 kudos

Unable to obtain count of dataframe

I am unable to obtain a count of a DataFrame; it always gets stuck at one stage. I have tried reducing the size. What could be the issue? How can I read the cluster logs to identify it?

Latest Reply
qwerty3
Contributor
  • 0 kudos

Driver memory is sufficient: it can handle 90 lakh (9 million) records, and what I am giving it is definitely less than that. What can I do about skewed data and shuffling?

2 More Replies
Aminsnh
by New Contributor
  • 262 Views
  • 0 replies
  • 0 kudos

Adding customized shortcut keys

Hi all, I need to add a shortcut key for R's pipe operator (%>%) to my Databricks notebook. I want the operator to be inserted into my code when I press the shortcut keys (Shift + Ctrl + M). Is there a straightforward way to add such shortcut...

DataFarmer
by New Contributor II
  • 4499 Views
  • 4 replies
  • 1 kudos

Resolved! How to let Business Users edit tables in Databricks

Hi Community! I have the requirement that business users shall be able to edit/update tables in Unity Catalog, e.g. master data records and mapping tables. I also want these actions to be logged for auditing/troubleshooting. Is there any simple solution to...

Latest Reply
kenwong
Databricks Employee
  • 1 kudos

We do have a few partners that offer solutions in this space (e.g. Retool).  Recently, Sigma added support for their InputTable feature which was designed for this use case: https://www.sigmacomputing.com/blog/bring-your-own-data-to-databricks-with-s...

3 More Replies
JS_L
by New Contributor II
  • 483 Views
  • 2 replies
  • 1 kudos

ERROR: key not found in SQL when trying to pass the result of a CTE as a function parameter

Hi Community, I am trying to pass the result of a CTE as a function parameter, as in the code below: WITH t1 AS ( SELECT array_join(collect_list(output), ',') AS x FROM my_catalog.my_db.get_x(:startTime, :endTime) ) SELECT 'AM_offline' as Type, CASE WHEN off...

Latest Reply
JS_L
New Contributor II
  • 1 kudos

Hi @szymon_dybczak, thanks for replying. I don't think the issue is related to the data type, since the query works if I pass the subquery to the _x parameter without a CTE. Please see the code below: SELECT 'AM_offline' as Type, CASE WHEN offline_ratio > 1.5 THEN 'no-Go...

1 More Replies
sachamourier
by New Contributor II
  • 615 Views
  • 3 replies
  • 0 kudos

Importing Python files into another Workspace Python file does not work

I have created Python modules containing some Python functions and I would like to import them from a notebook contained in the Workspace. For example, I have an "etl" directory, containing a "snapshot.py" file with some Python functions, and an empty...

Warehousing & Analytics
Databricks
Modules
python
Latest Reply
filipniziol
Contributor III
  • 0 kudos

Hi @sachamourier, it will work, but you need to carefully craft the path passed to sys.path.append(); you do not even need __init__.py to make it work. Try to hard-code the path to snapshot.py in the workspace. Add this to your notebook: import sys import os absolu...
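
For readers who want to try the sys.path.append() idea outside a workspace, here is a minimal sketch. It is not the code from the truncated reply above: the temporary directory stands in for the Workspace folder, and the function name take_snapshot is hypothetical. It only shows that appending the absolute path of the folder that contains "etl" makes "from etl import snapshot" work without an __init__.py.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical stand-in for the Workspace folder: <root>/etl/snapshot.py
root = tempfile.mkdtemp()
etl_dir = os.path.join(root, "etl")
os.makedirs(etl_dir)
with open(os.path.join(etl_dir, "snapshot.py"), "w") as f:
    # take_snapshot is an illustrative function, not one from the original post
    f.write("def take_snapshot(name):\n    return f'snapshot of {name}'\n")

# Append the absolute path of the folder that CONTAINS the etl directory
absolute_path = os.path.abspath(root)
if absolute_path not in sys.path:
    sys.path.append(absolute_path)

# Make sure the import system notices the newly created files
importlib.invalidate_caches()

# No __init__.py exists in etl/; Python treats it as a namespace package
from etl import snapshot

result = snapshot.take_snapshot("orders")
print(result)
```

In a real notebook the hard-coded absolute path of the Workspace folder would replace the temporary directory.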

2 More Replies
LeoRickli
by New Contributor II
  • 939 Views
  • 4 replies
  • 2 kudos

Serverless SQL warehouses on GCP?

According to the official Databricks documentation on GCP, I should be able to deploy a serverless SQL warehouse inside Databricks. The documentation says to turn on Serverless SQL warehouses (On), but there is nothing ...

Latest Reply
LeoRickli
New Contributor II
  • 2 kudos

Hello @filipniziol, thanks for the response. I'm the workspace owner. I just gave myself the account admin (Metastore admin) role but still see nothing new.

3 More Replies
EmmaP
by New Contributor III
  • 1163 Views
  • 5 replies
  • 2 kudos

Understand cluster activity Serverless SQL

Hello, following abnormally high costs when using serverless SQL on September 9 and 10, I noticed that the cluster sometimes stays on for an hour even though it is not receiving any new requests and the auto-stop is set to 5 minutes of inactivit...

Latest Reply
RCo
New Contributor III
  • 2 kudos

Hi @EmmaP! I have encountered this. Even though the UI says the queries are complete, they actually are not. While the query itself has completed, the client is still fetching the data from the SQL Warehouse. To check if this is your issue, from the monitori...

4 More Replies
Hubert-Dudek
by Esteemed Contributor III
  • 3451 Views
  • 2 replies
  • 2 kudos

PDF report from databricks

You can now send PDF reports from Lakeview dashboards. Just hit Subscribe (you can also add subscribers yourself in the schedule settings). #databricks

Latest Reply
bernsb
New Contributor II
  • 2 kudos

Cool. This is a very convenient feature since most people now use the PDF format when working with text files. If anyone has ever had any issues with this format, I can say that I recently needed to merge several PDF files into one, and with the help...

1 More Replies
xwen
by New Contributor II
  • 603 Views
  • 1 replies
  • 1 kudos

how to modify data type of a column explicitly via DBSQL

Is there a SQL equivalent of overwriteSchema? https://docs.databricks.com/en/delta/update-schema.html#explicitly-update-schema-to-change-column-type-or-name

Latest Reply
xwen
New Contributor II
  • 1 kudos

In-place schema adjustment => Then ALTER TABLE XXX ADD/DROP COLUMN XXX INT. Example: create table test (id int, first_name string, last_name string); insert into test values (1, 'john', 'smith'); alter table test add column age int; select * from test Cr...
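
The ADD COLUMN statements quoted in the reply above can be tried locally. The sketch below is only an approximation: it runs them through Python's built-in sqlite3 module as a stand-in, on the assumption that no SQL warehouse is at hand. SQLite is not Databricks SQL, so the type names and ALTER TABLE behaviour on a Delta table may differ. It shows only that the existing row gets NULL in the newly added column.

```python
import sqlite3

# SQLite is a local stand-in here (an assumption), not a Databricks SQL warehouse
conn = sqlite3.connect(":memory:")

# Statements taken from the reply above
conn.execute("create table test (id int, first_name string, last_name string)")
conn.execute("insert into test values (1, 'john', 'smith')")
conn.execute("alter table test add column age int")

# Read back the column names and the rows after the schema change
cur = conn.execute("select * from test")
columns = [d[0] for d in cur.description]
rows = cur.fetchall()
conn.close()

print(columns)
print(rows)
```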

msolcuadrado
by New Contributor II
  • 1125 Views
  • 1 replies
  • 0 kudos

SQL warehouse autostop not working

I'm using a SQL warehouse with autostop after 5 minutes of inactivity.  However, the cluster is constantly activating and deactivating without any explanation. There are no queries being executed, and I can't identify any reasons why it is happening,...

Latest Reply
szymon_dybczak
Contributor III
  • 0 kudos

Hi @msolcuadrado, in your case I would try contacting the Databricks support team directly. This is a serious issue and I feel your pain. They should help you pinpoint the exact cause, and maybe you'll get a refund.

Rich85
by New Contributor
  • 758 Views
  • 1 replies
  • 0 kudos

Incorrect syntax near '=' error that I can't solve

Hi, I'm receiving the error "Incorrect syntax near '='" when I run simple queries like the example below. This only happens when I use a column created using a CASE statement in the WHERE clause. I can use any other column in the WHERE clause, includi...

Latest Reply
Kayla
Valued Contributor
  • 0 kudos

What jumps out to me at first is the backticks on `Peak Vertical Force / BW`, but I'm assuming that's just a column name and not an attempt at division. The next thing that jumps out is TestType and TestTypeName being aliased as testType and testTypeName - spark...

PabloCSD
by Contributor II
  • 553 Views
  • 0 replies
  • 0 kudos

Can't access a directory for installing .whl

I was trying to install a custom .whl file located in the "shared" folder but I'm getting this error: org.apache.spark.SparkException: Process List(/bin/su, libraries, -c, bash /local_disk0/.ephemeral_nfs/cluster_libraries/python/python_start_...

Warehousing & Analytics
dbx
library installation
mathiaskvist
by New Contributor III
  • 800 Views
  • 4 replies
  • 0 kudos

SQL Warehouse REST statement execution validation fails with DECLARE SET

Hi, I'm using the REST API for SQL Warehouse to execute queries. I have seen query validation fail over the REST API multiple times, while executing the same query in the Databricks UI on the same cluster succeeds. An example: [P...

Latest Reply
adriennn
Contributor II
  • 0 kudos

I had to try this for myself, and it seems the SQL execution context in the REST API is different from that of a *.sql script, notebook, or query made against a SQL warehouse through the UI. The error stems from the fact that the SET command can also be use...

3 More Replies
JosephX
by New Contributor
  • 411 Views
  • 1 replies
  • 0 kudos

optimize query from power bi desktop

How do I tune Databricks query performance from Power BI Desktop?

Latest Reply
Brahmareddy
Valued Contributor III
  • 0 kudos

Hi Joseph, how are you doing today? Give the suggestions below a try and let me know if they work well. Filter and aggregate data in Databricks to reduce load before it reaches Power BI. Use DirectQuery carefully, simplify measures, and reduce the number of vi...

