Warehousing & Analytics
Engage in discussions on data warehousing, analytics, and BI solutions within the Databricks Community. Share insights, tips, and best practices for leveraging data for informed decision-making.

Forum Posts

MadelynM
by Databricks Employee
  • 3060 Views
  • 0 replies
  • 0 kudos

[Recap] Data + AI Summit 2024 - Warehousing & Analytics | Improve performance and increase insights

Here's your Data + AI Summit 2024 - Warehousing & Analytics recap as you use intelligent data warehousing to improve performance and increase your organization’s productivity with analytics, dashboards and insights.  Keynote: Data Warehouse presente...

Warehousing & Analytics
AI BI Dashboards
AI BI Genie
Databricks SQL
MoJaMa
by Databricks Employee
  • 2468 Views
  • 0 replies
  • 0 kudos

What are the Best Practices for SQL Endpoint sizing?

I see that cluster sizes are mentioned here https://docs.databricks.com/sql/admin/sql-endpoints.html#cluster-size but I would like to know when to pick which cluster size (based on data size, users, and concurrency) without having to do too much trial and error...

brickster_2018
by Databricks Employee
  • 2404 Views
  • 1 replies
  • 0 kudos

Resolved! Unable to run 2 different applications with the same class name on a cluster

I have two jars with the same class name. It works fine on YARN. When trying to run these jars on the Databricks cluster, I run into issues. Why does Databricks have this limitation?

Latest Reply
brickster_2018
Databricks Employee
  • 0 kudos

When you run the jobs on YARN, they are submitted as two different applications, so each application has its own Spark driver JVM. In Databricks, a cluster has one JVM for the Spark driver. When applications with the same na...

User16826994223
by Databricks Employee
  • 1086 Views
  • 1 replies
  • 0 kudos

Resolved! Does Koalas support Structured Streaming?

Does Koalas support Structured Streaming?

Latest Reply
User16826994223
Databricks Employee
  • 0 kudos

No, Koalas does not officially support Structured Streaming. As a workaround, you can use Koalas APIs with foreachBatch in Structured Streaming, which allows batch APIs: >>> def func(batch_df, batch_id): ... koalas_df = ks.DataFrame(batch_df) .....
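
The reply preview is cut off, so here is a minimal sketch of how the foreachBatch workaround might be wired up. The helper name `make_batch_handler` and the `process` callback are hypothetical, and the sketch assumes Koalas is installed on the cluster; the converter is injectable only so the wiring can be tried without Spark.

```python
def make_batch_handler(process, to_koalas=None):
    """Build a foreachBatch callback that passes each micro-batch to Koalas code.

    process:   hypothetical user function that receives a Koalas DataFrame.
    to_koalas: converter from a Spark DataFrame; defaults to ks.DataFrame,
               imported lazily so this module loads without Koalas installed.
    """
    def func(batch_df, batch_id):
        convert = to_koalas
        if convert is None:
            # Assumption: Koalas is available on the cluster.
            import databricks.koalas as ks
            convert = ks.DataFrame
        koalas_df = convert(batch_df)
        # foreachBatch ignores the return value; it is returned for convenience.
        return process(koalas_df)

    return func


# Possible usage on a cluster (not executed here):
# handler = make_batch_handler(lambda kdf: print(kdf.head()))
# spark.readStream.format("rate").load().writeStream.foreachBatch(handler).start()
```

Each micro-batch arrives as an ordinary (non-streaming) Spark DataFrame, which is why batch-only APIs such as Koalas can be applied inside the callback.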

christys
by Databricks Employee
  • 1012 Views
  • 1 replies
  • 2 kudos
Latest Reply
Taha
Databricks Employee
  • 2 kudos

So if you've got an S3 bucket with your data in it, the first thing you'll need to do is connect it to a Databricks workspace to grant access. Then you can start querying the contents of the bucket from notebooks (or running jobs) by using clusters (...

Anonymous
by Not applicable
  • 1029 Views
  • 1 replies
  • 0 kudos
Latest Reply
sajith_appukutt
Databricks Employee
  • 0 kudos

The frequency of logs in stdout/stderr depends on the code you run on the Databricks clusters. The default log level for log4j is INFO; you can change it by following the instructions here
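
The linked instructions did not survive in this preview. As one possible approach (an assumption, not necessarily what the link describes), the log level can be changed at runtime from a notebook through `SparkContext.setLogLevel`. The helper name `set_spark_log_level` is hypothetical.

```python
# Levels accepted by SparkContext.setLogLevel.
VALID_LOG_LEVELS = ("ALL", "DEBUG", "ERROR", "FATAL", "INFO", "OFF", "TRACE", "WARN")


def set_spark_log_level(spark, level="WARN"):
    """Validate the level name, then apply it to the running Spark application.

    spark: a SparkSession (on Databricks, the predefined `spark` object).
    """
    normalized = level.strip().upper()
    if normalized not in VALID_LOG_LEVELS:
        raise ValueError(f"Unknown log level: {level!r}")
    spark.sparkContext.setLogLevel(normalized)
    return normalized


# Possible usage in a notebook (not executed here):
# set_spark_log_level(spark, "warn")
```

This only affects the current application; a cluster-wide default would need to be configured separately.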

Digan_Parikh
by Databricks Employee
  • 2078 Views
  • 1 replies
  • 0 kudos

Resolved! DBSQL connection to other BI tools

How do I connect DBSQL to other BI tools?

Latest Reply
Digan_Parikh
Databricks Employee
  • 0 kudos

Generally, you can connect to a SQL endpoint using an ODBC or JDBC driver. More information can be found here: https://docs.databricks.com/integrations/bi/index-sqla.html
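
For illustration, a rough sketch of assembling a JDBC connection string for a BI tool. The helper name `build_jdbc_url` is hypothetical, the hostname, HTTP path and token are placeholders you would copy from the endpoint's connection details, and the `jdbc:spark://` form with these property names is an assumption about the driver version, so check it against the linked documentation.

```python
def build_jdbc_url(server_hostname, http_path, token, port=443):
    """Assemble a JDBC URL for a SQL endpoint (legacy `jdbc:spark://` form).

    The property names below are an assumption about the driver version.
    """
    for name, value in (("server_hostname", server_hostname),
                        ("http_path", http_path),
                        ("token", token)):
        if not value:
            raise ValueError(f"{name} must not be empty")
    properties = [
        "transportMode=http",
        "ssl=1",
        f"httpPath={http_path}",
        "AuthMech=3",   # user/password auth; UID is the literal string "token"
        "UID=token",
        f"PWD={token}",  # personal access token (placeholder)
    ]
    return f"jdbc:spark://{server_hostname}:{port}/default;" + ";".join(properties)
```

Keeping the token out of source control (for example by reading it from an environment variable) is advisable, since the URL embeds it in plain text.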

User16826992666
by Databricks Employee
  • 2740 Views
  • 2 replies
  • 0 kudos
Latest Reply
Digan_Parikh
Databricks Employee
  • 0 kudos

When sizing, this is the recommendation.
Data set             Cluster Size
1TB / rows           X-Large+
500GB / 1B rows      X-Large
50GB / 100M+ rows    Large
100GB / rows         Medium
10GB / -M rows       Small
This table maps SQL endpoint cluster sizes to Databricks cluster driver sizes and wo...

1 More Replies
User16826987838
by Databricks Employee
  • 1209 Views
  • 1 replies
  • 0 kudos
Latest Reply
User16826994223
Databricks Employee
  • 0 kudos

Second question: yes. Just don't grant CAN_RUN to a user/group. https://docs.databricks.com/sql/user/security/access-control/dashboard-acl.html#dashboard-permissions

User16826992666
by Databricks Employee
  • 5459 Views
  • 1 replies
  • 0 kudos

Resolved! Can I implement Row Level Security for users when using SQL Endpoints?

I'd like to be able to limit the rows users see when querying tables in Databricks SQL based on what access level each user is supposed to be granted. Is this possible in the SQL environment?

Latest Reply
sajith_appukutt
Databricks Employee
  • 0 kudos

Using dynamic views you can specify permissions down to the row or field level e.g. CREATE VIEW sales_redacted AS SELECT user_id, country, product, total FROM sales_raw WHERE CASE WHEN is_member('managers') THEN TRUE ELSE total <= 1...
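
To make the row filter in that view easier to follow, here is a plain-Python model of the same predicate. It is not a Databricks API: `visible_rows` is a hypothetical name, the sample rows are made up, and `limit` stands in for the threshold that is cut off in the preview.

```python
def visible_rows(rows, user_groups, limit, privileged_group="managers"):
    """Model of: WHERE CASE WHEN is_member('managers') THEN TRUE ELSE total <= limit END.

    rows:        iterable of dicts with a "total" key (stand-in for sales_raw).
    user_groups: set of group names the querying user belongs to.
    limit:       hypothetical threshold; the real value is truncated in the post.
    """
    if privileged_group in user_groups:
        return list(rows)  # members of the privileged group see every row
    return [row for row in rows if row["total"] <= limit]


# Illustrative data only.
sales_raw = [
    {"user_id": 1, "country": "US", "product": "A", "total": 500},
    {"user_id": 2, "country": "DE", "product": "B", "total": 50},
]
```

In the actual view the check runs inside the query, so the same view returns different rows depending on who queries it.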

User16826992666
by Databricks Employee
  • 2419 Views
  • 1 replies
  • 0 kudos

Resolved! Should I enable Photon on my SQL Endpoint?

I see the option to enable Photon when creating a new SQL Endpoint. The description says that enabling it helps speed up queries, which sounds good, but are there any downsides I need to be aware of?

Latest Reply
Ryan_Chynoweth
Databricks Employee
  • 0 kudos

Generally, yes, you should enable Photon. Most functionality is available and performs extremely well. There are some limitations, which can be found here. Limitations: Works on Delta and Parquet tables only for both read and writ...

User16826992666
by Databricks Employee
  • 2999 Views
  • 1 replies
  • 0 kudos

Resolved! How can I see the performance of individual queries in Databricks SQL?

If I want to get more information about how an individual query is performing in the Databricks SQL environment, is there anywhere I can see that?

Latest Reply
sajith_appukutt
Databricks Employee
  • 0 kudos

You can see details of the queries that ran against an endpoint in the query history section.

User16826992666
by Databricks Employee
  • 6067 Views
  • 1 replies
  • 0 kudos

Resolved! Can you run Structured Streaming on a job cluster?

I need to know whether I can use job clusters to start and run streaming jobs, or whether it has to be an interactive cluster.

Latest Reply
sajith_appukutt
Databricks Employee
  • 0 kudos

Yes. Here is a doc containing some info on running Structured Streaming in production using Databricks jobs.

User16788316451
by Databricks Employee
  • 1503 Views
  • 1 replies
  • 0 kudos

How to troubleshoot SSL certificate errors while connecting Business Intelligence (BI) tools to Databricks in a Private Cloud (PVC) environment?

How to troubleshoot SSL certificate errors while connecting Business Intelligence (BI) tools to Databricks in a Private Cloud (PVC) environment?

Latest Reply
User16788316451
Databricks Employee
  • 0 kudos

See attached for steps to inspect the certificate chain using openssl
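
The attachment is not part of this listing. As a rough stand-in, the sketch below uses only the Python standard library: one helper splits PEM blocks out of text such as the output of `openssl s_client -connect <host>:443 -showcerts`, and another attempts a verified TLS handshake and reports why verification failed. The function names, and the host you pass in, are hypothetical.

```python
import re
import socket
import ssl

_PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----", re.DOTALL
)


def split_pem_chain(text):
    """Return every PEM certificate block found in the given text, in order."""
    return _PEM_BLOCK.findall(text)


def check_tls(host, port=443, cafile=None, timeout=10.0):
    """Try a verified handshake; report the issuer or the verification error.

    cafile: optional path to a CA bundle (for example a private CA used in PVC).
    """
    context = ssl.create_default_context(cafile=cafile)
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                cert = tls.getpeercert()
                return {"ok": True,
                        "subject": cert.get("subject"),
                        "issuer": cert.get("issuer")}
    except ssl.SSLCertVerificationError as exc:
        # Typical causes: missing intermediate, untrusted root, hostname mismatch.
        return {"ok": False, "error": exc.verify_message}
    except OSError as exc:
        # Network-level failure (DNS, refused connection, timeout).
        return {"ok": False, "error": str(exc)}
```

If the handshake only succeeds when `cafile` points at your organization's CA bundle, the BI tool most likely needs that same bundle added to its trust store.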
