Administration & Architecture
Explore discussions on Databricks administration, deployment strategies, and architectural best practices. Connect with administrators and architects to optimize your Databricks environment for performance, scalability, and security.

Forum Posts

JohannaAK
by New Contributor
  • 1209 Views
  • 1 replies
  • 1 kudos

Restrict access to Hive Metastore in environments

Hi, we have issues with users creating tables and storing data in the Hive metastore in environments with UC. How do you restrict this access?

Latest Reply
karthik_p
Esteemed Contributor
  • 1 kudos

@JohannaAK Please restrict users at the catalog level by granting only SELECT; it looks like all permissions are currently enabled for users.
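A minimal sketch of the catalog-level restriction suggested in this reply. It assumes legacy table access control is enforced on the compute in use and that the legacy metastore is exposed as the `hive_metastore` catalog. The helper only builds SQL text; the group name `data_consumers` is hypothetical, and the privilege names should be checked against the Hive metastore privileges documentation before running anything.

```python
def read_only_statements(principal: str, catalog: str = "hive_metastore") -> list[str]:
    """Build SQL that removes existing grants and leaves read-only access.

    Assumption: legacy table ACL syntax (REVOKE/GRANT ... ON CATALOG) applies
    to the catalog named here. `principal` is a hypothetical group or user.
    """
    quoted = f"`{principal}`"  # backticks allow names with dots or '@'
    return [
        f"REVOKE ALL PRIVILEGES ON CATALOG {catalog} FROM {quoted}",
        f"GRANT USAGE ON CATALOG {catalog} TO {quoted}",
        f"GRANT SELECT ON CATALOG {catalog} TO {quoted}",
    ]


for stmt in read_only_statements("data_consumers"):
    # In a notebook, each statement could be passed to spark.sql(...) instead.
    print(stmt)
```

Generating the statements first makes them easy to review before an admin runs them against a workspace.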

  • 1 kudos
rmennes
by New Contributor II
  • 750 Views
  • 0 replies
  • 0 kudos

API to access query data plan

I'm working on a tool to visualise who accessed which (Unity) catalogs, schemas and tables. To do that, I would like to access the query plan of the queries in the query history. Unfortunately, it seems like the REST API doesn't support accessing tho...

spallinti
by New Contributor
  • 2576 Views
  • 1 replies
  • 0 kudos

Bronze layer tables

What type of table should we use for the Bronze layer, managed or external, when our raw data is in CSV files?

Latest Reply
Vinay_M_R
Databricks Employee
  • 0 kudos

You can create either a managed or an unmanaged (external) table in the bronze layer, depending on your preference and use case. If you choose to create a managed table, Databricks will manage both the metadata and the data for the table. If you choo...
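A rough illustration of the managed versus external distinction described in this reply, assuming Delta tables: omitting `LOCATION` yields a managed table, while supplying one yields an external (unmanaged) table. The table name, column list, and bucket path below are hypothetical, and the helper only builds the DDL string.

```python
from typing import Optional


def create_table_sql(table: str, columns: str, location: Optional[str] = None) -> str:
    """Return CREATE TABLE DDL; a LOCATION clause makes the table external."""
    sql = f"CREATE TABLE IF NOT EXISTS {table} ({columns}) USING DELTA"
    if location is not None:
        # With an explicit path, the platform tracks only metadata;
        # dropping the table leaves the files at this path in place.
        sql += f" LOCATION '{location}'"
    return sql


# Managed: both metadata and data files are handled by the platform.
print(create_table_sql("bronze.orders_raw", "order_id STRING, payload STRING"))

# External: hypothetical bucket path owned and managed by you.
print(create_table_sql(
    "bronze.orders_raw_ext",
    "order_id STRING, payload STRING",
    "s3://example-bucket/bronze/orders_raw",
))
```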

  • 0 kudos
Anonymous
by Not applicable
  • 3671 Views
  • 2 replies
  • 0 kudos
Latest Reply
Joe_Suarez
New Contributor III
  • 0 kudos

Redis offers various data structures such as strings, lists, sets, and hashes. Depending on your use case, select the appropriate data structure for storing the crm enrich data. For example, if you need to store key-value pairs, Redis hashes (HSET, H...

  • 0 kudos
1 More Replies
yuyang
by New Contributor
  • 3418 Views
  • 1 replies
  • 0 kudos

How can I use Delta Lake with AWS Athena and AWS Glue catalog?

Currently we use AWS Athena and the AWS Glue catalog for our data lake. We would like to evaluate Delta Lake for data management. How should we try this with the existing setup?

Latest Reply
OlivierAllovon
New Contributor III
  • 0 kudos

Actually, the Glue Hive Metastore integration with Unity Catalog has been announced today at the Databricks Summit. Give it a try here: https://docs.databricks.com/archive/external-metastores/aws-glue-metastore.html

  • 0 kudos
carroll_q2
by New Contributor III
  • 3467 Views
  • 2 replies
  • 0 kudos

Resolved! Connect Spark Cluster to SQL Endpoint

Hello! Is it possible to retrieve data from a SQL Endpoint in the Databricks SQL persona using the Data Science and Engineering persona? In other words, I would like to use pyspark in DS&E to query a table in Databricks SQL. #DatabricksSQL #Databricks...

Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

You do not need the SQL warehouse itself for that. For DS&E you need a classic cluster (not a SQL endpoint) anyway, so you can easily read the tables from the metastore using spark.read.table(). Connecting the SQL endpoint to the DS cluster seems od...
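A small sketch of the `spark.read.table()` approach mentioned in this reply. It assumes the code runs on a classic cluster where a `SparkSession` is available as `spark` (as in a Databricks notebook); the helper name and the table and column names in the usage comment are hypothetical.

```python
def read_metastore_table(spark, table_name: str, columns=None):
    """Read a metastore table into a DataFrame via an existing SparkSession.

    `spark` is the session object the notebook provides; no SQL warehouse
    connection is involved.
    """
    df = spark.read.table(table_name)
    if columns:
        # Project early so only the needed columns are carried forward.
        df = df.select(*columns)
    return df


# Usage in a notebook (hypothetical names):
# orders = read_metastore_table(spark, "sales.orders", ["order_id", "amount"])
# orders.show(5)
```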

  • 0 kudos
1 More Replies
User16826992666
by Valued Contributor
  • 4732 Views
  • 5 replies
  • 1 kudos

Resolved! Use different instance types in pools

I am wondering if it's possible to create a pool that has a mix of instance types in it?

Latest Reply
abagshaw
New Contributor III
  • 1 kudos

AWS Fleet instance types are now GA and available for clusters and instance pools. You can see more details here: https://docs.databricks.com/compute/aws-fleet-instances.html

  • 1 kudos
4 More Replies
User16826992666
by Valued Contributor
  • 2560 Views
  • 3 replies
  • 0 kudos

Resolved! Are there any scenarios where it doesn't make sense to use Spot Instances?

It seems like using spot instances makes a lot of sense for cost savings. But are there any risks to using them? Or things to consider before enabling them?

Latest Reply
abagshaw
New Contributor III
  • 0 kudos

On AWS, to further improve the chance of acquiring spot instances, you can use the newly GA'd feature Fleet instance types: https://docs.databricks.com/compute/aws-fleet-instances.html
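As a hedged illustration of the trade-off raised in this thread, the fragment below builds the `aws_attributes` part of a cluster specification that requests spot workers but falls back to on-demand capacity and keeps the first node on-demand. It is not a complete cluster spec (fields such as the Spark version are omitted), and the fleet node type `m-fleet.xlarge` is an example value to verify against the linked documentation.

```python
import json


def spot_cluster_spec(node_type_id: str, num_workers: int, first_on_demand: int = 1) -> dict:
    """Build a partial cluster spec that prefers spot instances on AWS."""
    return {
        "num_workers": num_workers,
        "node_type_id": node_type_id,
        "aws_attributes": {
            # The first N nodes (the driver comes first) stay on-demand,
            # so losing spot capacity does not take down the driver.
            "first_on_demand": first_on_demand,
            # Use on-demand instances when spot capacity cannot be acquired.
            "availability": "SPOT_WITH_FALLBACK",
            "spot_bid_price_percent": 100,
        },
    }


print(json.dumps(spot_cluster_spec("m-fleet.xlarge", 4), indent=2))
```

Keeping the driver on-demand is a common choice for jobs that cannot tolerate a full restart; for workloads where any interruption is unacceptable, an all on-demand cluster may be the safer option.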

  • 0 kudos
2 More Replies
Avvar2022
by Contributor
  • 1956 Views
  • 1 replies
  • 1 kudos

Limited number of workspaces vs. a workspace per department or line of business?

We are just getting started with Databricks. Currently we have 1 workspace for each environment (DEV, QA and PRD). We started with 1 workspace, but we are already getting flooded with new workspace requests. Is there any checklist/criteria fo...

Latest Reply
Mounika_Tarigop
Databricks Employee
  • 1 kudos

I believe DEV, QA and PRD is the right segregation. We may need this because it makes it easy to categorize the production and QA workloads, based on the amount of data the clusters process (meaning more DBUs), which we can restrict by the company work...

  • 1 kudos
KKo
by Contributor III
  • 2018 Views
  • 1 replies
  • 0 kudos

Delta Live Table tables in Data Tab

If I use this code (CREATE STREAMING LIVE TABLE Employee) in a DLT pipeline, where does the Employee table get created by default if no storage location is specified? How can I create this table in the Data tab within a database (a.k.a. schema), let's say ...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Kris Koirala: When you create a streaming live table in Databricks Delta Lake using the code CREATE STREAMING LIVE TABLE Employee, the table is created in the default database called default. If no storage location is specified, the table is stored...

  • 0 kudos
AK031
by New Contributor II
  • 3057 Views
  • 3 replies
  • 0 kudos

If I come via Databricks Partner Connect and subscribe to a partner product, how is the billing done, and what API is used for publishing usage information to Databricks?

If I come via Databricks Partner Connect and subscribe to a partner product, how is the billing done, and what API is used for publishing usage information to Databricks?

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Atul Karn, we haven't heard from you since the last response from @Kaniz Fatma, and I was checking back to see if her suggestions helped you. Or else, if you have any solution, please share it with the community, as it can be helpful to others....

  • 0 kudos
2 More Replies
mriccardi
by New Contributor II
  • 10249 Views
  • 1 replies
  • 0 kudos

Spark Streaming: Checkpoint corrupted

Hi everyone! Today 4 streaming jobs started to fail out of nowhere due to: StreamingQueryException: [STREAM_FAILED] Query [id = ####, runId = ####] terminated with exception: dbfs:/mnt/path/my_table/sources/0/0 doesn't exist (latestId: 8, compactIn...

Latest Reply
Vartika
Databricks Employee
  • 0 kudos

Hi @Martin Riccardi, we haven't heard from you since the last response from @Kaniz Fatma, and I was checking back to see if her suggestions helped you. Or else, if you have any solution, please share it with the community, as it can be helpful to ot...

  • 0 kudos