Discussions
Engage in dynamic conversations covering diverse topics within the Databricks Community. Explore discussions on data engineering, machine learning, and more. Join the conversation and expand your knowledge base with insights from experts and peers.
3620 Posts

Activity in Discussions

by Yunky007 (Visitor)
  • 21 Views
  • 2 replies
  • 0 kudos

ETL pipeline

I have an ETL pipeline in Workflows which I am using to create a materialized view. I want to schedule the pipeline to run for 10 hours only, starting from 10 AM. How can I schedule that? I can only see an hourly schedule or cron syntax. I want the compute ...

Latest Reply by Isi (Contributor)

Hey @Yunky007, you should use the cron expression 0 10 * * * to start the process at 10 AM. Then, inside your script, implement a loop or mechanism that keeps the logic running for 10 hours; that's the trick. import time from datetime import datetime, ...

1 More Reply
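Isi's snippet is cut off above; here is a minimal sketch of the idea, assuming the job itself is started by the 0 10 * * * cron schedule and that a fixed 10-hour wall-clock window with a 15-minute refresh interval is what's wanted (both values are illustrative):

import time
from datetime import datetime, timedelta

def do_work():
    # Placeholder for the real logic, e.g. triggering the
    # materialized view refresh; substitute your own code here.
    pass

RUN_FOR = timedelta(hours=10)   # the 10-hour window from the question
SLEEP_SECONDS = 15 * 60         # illustrative polling interval

start = datetime.now()
while datetime.now() - start < RUN_FOR:
    do_work()
    time.sleep(SLEEP_SECONDS)

Note that this keeps the job's compute alive for the full window, so it is billed accordingly; a second cron-scheduled job that stops the pipeline at 8 PM is an alternative worth weighing.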
by BhavyaSreeBanga (New Contributor)
  • 158 Views
  • 1 reply
  • 0 kudos

Missing Genie - Upload File Feature in Preview Section

Despite having admin privileges for both the workspace and Genie Workspace, we are unable to see the "Genie - Upload File" feature under the Preview section, even though the documentation indicates it should be available. We also attempted switching r...

Latest Reply by Advika (Databricks Employee)

Hello @BhavyaSreeBanga! The Genie - Upload File option might need to be explicitly enabled by your Databricks account team, even if you have admin access and can see other Genie features. It’s worth checking with your account team to see if it’s been...

by BigAlThePal (New Contributor)
  • 326 Views
  • 2 replies
  • 0 kudos

.py file stuck on "waiting" when run

Hello, hope you are doing well. We are facing an issue when running .py files. This is fairly recent and we were not experiencing this issue last week. As shown in the screenshots below, the .py file hangs on "waiting" after we press "run all". No matt...

Latest Reply by humpy_reddy (New Contributor II)

Hey @BigAlThePal, it looks like a UI bug, especially in Microsoft Edge. The code actually runs, but the output doesn't show until you refresh. A few quick things you can try: run cells individually instead of using "Run All", or switch to Chrome or Firefox...

1 More Reply
by jakobhaggstrom (New Contributor)
  • 182 Views
  • 1 reply
  • 0 kudos

Issue when using M2M authentication with the Azure Databricks JDBC driver 2.7.1

Hi! I am trying to connect to an Azure Databricks SQL warehouse in DBeaver, which uses the Azure Databricks JDBC driver version 2.7.1, and I cannot get M2M authentication to work. I get a 'Not Authorized' (401) response when I try to connect, and it seems...

Latest Reply by Renu_ (New Contributor III)

Hi @jakobhaggstrom, this error likely occurs due to the type of secret you're using. For M2M authentication, the Databricks JDBC driver requires a Databricks-generated OAuth secret, not a Microsoft Entra ID client secret. While your service principal...

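For anyone hitting the same 401, a hedged sketch of the connection settings in question: the host, HTTP path, and client ID below are placeholders, and AuthMech=11 with Auth_Flow=1 is my understanding of the driver's OAuth M2M properties (verify against the 2.7.1 driver documentation):

# All values are placeholders; OAuth2Secret must be the
# Databricks-generated OAuth secret, not an Entra ID client secret.
host = "adb-1234567890123456.7.azuredatabricks.net"
http_path = "/sql/1.0/warehouses/abcdef1234567890"

jdbc_url = (
    f"jdbc:databricks://{host}:443;"
    f"httpPath={http_path};"
    "AuthMech=11;"   # OAuth 2.0
    "Auth_Flow=1;"   # client-credentials (M2M) flow
    "OAuth2ClientId=<service-principal-application-id>;"
    "OAuth2Secret=<databricks-oauth-secret>"
)
print(jdbc_url)  # paste into DBeaver's JDBC URL field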
by petitregny (New Contributor)
  • 68 Views
  • 2 replies
  • 0 kudos

Reading from an S3 bucket using boto3 on serverless cluster

Hello All, I am trying to read a CSV file from my S3 bucket in a notebook running on serverless. I am using the two standard functions below, but I get a credentials error (Error reading CSV from S3: Unable to locate credentials). I don't have this issu...

Latest Reply by Isi (Contributor)

Hi @petitregny, the issue you're encountering is likely due to the access mode of your cluster. Serverless compute uses standard/shared access mode, which does not allow you to directly access AWS credentials (such as the instance profile) in the sam...

1 More Reply
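The rest of Isi's reply is cut off; one commonly suggested workaround is to pass credentials to boto3 explicitly rather than relying on an instance profile. A sketch, assuming the keys live in a Databricks secret scope (the scope and key names here are hypothetical):

import boto3
import pandas as pd
from io import BytesIO

# Hypothetical secret scope and key names; adjust to your setup.
aws_key = dbutils.secrets.get(scope="aws", key="access_key_id")
aws_secret = dbutils.secrets.get(scope="aws", key="secret_access_key")

s3 = boto3.client(
    "s3",
    aws_access_key_id=aws_key,
    aws_secret_access_key=aws_secret,
)

# Placeholder bucket and object key.
obj = s3.get_object(Bucket="my-bucket", Key="path/to/file.csv")
df = pd.read_csv(BytesIO(obj["Body"].read()))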
by notwarte (New Contributor II)
  • 136 Views
  • 4 replies
  • 0 kudos

Unity Catalog storage amounts

Hi, I am using Azure and I have predictive optimization enabled on the catalog. I wrote a script to calculate the data volume of all of the tables, looping over all of the tables and running "describe detail". All of the tables amount to ~ 1....

Latest Reply by Isi (Contributor)

Hey @notwarte, using the __databricks_internal catalog to trace the underlying storage location is a solid approach for investigating the footprint. Regarding your question about storage duplication: yes, materialized views in Databricks do store a p...

3 More Replies
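For anyone wanting to reproduce the measurement, a minimal sketch of the loop the poster describes, summing sizeInBytes from DESCRIBE DETAIL over the managed tables of one catalog (the catalog name is a placeholder):

# Enumerate managed tables, then sum DESCRIBE DETAIL's sizeInBytes.
tables = spark.sql("""
    SELECT table_catalog, table_schema, table_name
    FROM my_catalog.information_schema.tables
    WHERE table_type = 'MANAGED'
""").collect()

total_bytes = 0
for t in tables:
    detail = spark.sql(
        f"DESCRIBE DETAIL `{t.table_catalog}`.`{t.table_schema}`.`{t.table_name}`"
    ).collect()[0]
    total_bytes += detail["sizeInBytes"] or 0

print(f"~{total_bytes / 1024**3:.1f} GiB across {len(tables)} tables")

Keep in mind that sizeInBytes reflects the current table version only; files retained for time travel and materialized-view storage under __databricks_internal are not counted, which is one plausible source of the gap being discussed.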
by KIRKQUINBAR (New Contributor II)
  • 56 Views
  • 1 reply
  • 0 kudos

Predictive Optimization with multiple workspaces

We currently have an older instance of Azure Databricks that I migrated to Unity Catalog. Unfortunately I ran into some weird issues that don't seem fixable, so I created a new instance and pointed it to the same metastore. The setting at the metastor...

Latest Reply by Renu_ (New Contributor III)

Hi @KIRKQUINBAR, if you enable Predictive Optimization at the metastore level in Unity Catalog, it automatically applies to all Unity Catalog managed tables within that metastore, no matter which workspace is accessing them. PO runs centrally, so the...

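One way to verify that PO is running for these tables regardless of workspace is the predictive optimization system table; this sketch is from memory of the table and column names, so verify them against the system tables reference:

# Recent predictive optimization operations, newest first.
spark.sql("""
    SELECT start_time, catalog_name, schema_name, table_name,
           operation_type, operation_status
    FROM system.storage.predictive_optimization_operations_history
    ORDER BY start_time DESC
    LIMIT 50
""").show(truncate=False)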
by antonionuzzo (New Contributor II)
  • 15 Views
  • 0 replies
  • 0 kudos

Unexpected Behavior with Azure Databricks and Entra ID SCIM Integration

Hi everyone, I'm currently running some tests for a company that uses Entra ID as the backbone of its authentication system. Every employee with a corporate email address is mapped within the organization's Entra ID. Our company's Azure Databricks is c...

by Tuno986 (New Contributor)
  • 668 Views
  • 1 reply
  • 0 kudos

Implementing Federated Governance in Databricks Unity Catalog

Hi, I am working for a large company that is implementing a Databricks solution. We have multiple domains, each responsible for its own data products, following a data mesh approach. As part of a federated governance model, we need a way to communicate...

Latest Reply by antonionuzzo (New Contributor II)

Hi, all the information about the creation and modification of assets is recorded in the system tables. So when a catalog is created, a possible solution could be to trigger a job that notifies the central team about the event.

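A sketch of that detection step against the audit system table; the service_name and action_name values reflect my understanding of the Unity Catalog audit events and should be checked against your own logs:

# Catalog-creation events from the last 7 days.
spark.sql("""
    SELECT event_time,
           user_identity.email    AS created_by,
           request_params['name'] AS catalog_name
    FROM system.access.audit
    WHERE service_name = 'unityCatalog'
      AND action_name  = 'createCatalog'
      AND event_time  >= current_timestamp() - INTERVAL 7 DAYS
    ORDER BY event_time DESC
""").show(truncate=False)

A scheduled job could run this query periodically and notify the central team when new rows appear.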
by antonionuzzo (New Contributor II)
  • 18 Views
  • 0 replies
  • 0 kudos

Monitor workspace admin activities

Hello everyone, I am conducting tests on Databricks AWS and have noticed that in an organization with multiple workspaces, each with different workspace admins, a workspace admin can invite a user who is not mapped within their workspace but is alread...

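While this one waits for a reply: the audit system table can surface user additions across workspaces. A sketch, with the caveat that the action names for membership changes below are an assumption to verify against your own audit events:

# User-management events per workspace (action names assumed).
spark.sql("""
    SELECT event_time, workspace_id,
           user_identity.email AS actor,
           action_name, request_params
    FROM system.access.audit
    WHERE service_name = 'accounts'
      AND action_name IN ('add', 'addPrincipalToGroup')
    ORDER BY event_time DESC
    LIMIT 100
""").show(truncate=False)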
by Malthe (New Contributor II)
  • 16 Views
  • 1 reply
  • 0 kudos

Parametrize DLT pipeline

If I'm using Databricks Asset Bundles, how would I parametrize a DLT pipeline based on a static configuration file? In pseudo-code, I would have a .py file: import dlt # Something that pulls a pipeline resource (or artifact) and parses from JSON table...

Latest Reply by Emmitt18Lefebvr

Hello! To parametrize a Databricks DLT pipeline with a static configuration file using Asset Bundles, include your JSON/YAML config file in the bundle. In your DLT pipeline code, read this file using Python's file I/O (referencing its deployed path). ...

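A minimal sketch of the pattern the reply describes, assuming the bundle deploys a tables.json next to this source file (the file name, config layout, and source format are all hypothetical):

import json
import os
import dlt

# Hypothetical config deployed by the bundle alongside this file,
# e.g. [{"name": "orders", "path": "/Volumes/main/raw/orders"}]
config_path = os.path.join(os.path.dirname(__file__), "tables.json")
with open(config_path) as f:
    tables = json.load(f)

def make_table(name: str, path: str):
    # Factory function so each loop iteration binds its own values.
    @dlt.table(name=name)
    def _table():
        # `spark` is provided by the DLT runtime.
        return spark.read.format("json").load(path)

for entry in tables:
    make_table(entry["name"], entry["path"])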
by YuriS (Visitor)
  • 25 Views
  • 0 replies
  • 0 kudos

VACUUM with Azure Storage Inventory Report is not working

Could someone please advise regarding VACUUM with an Azure Storage inventory report, as I have failed to make it work. DBR 15.4 LTS; the VACUUM command is being run with the USING INVENTORY clause, as follows: VACUUM schema.table USING INVENTORY ( select 'https://...

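For comparison with the poster's attempt, the general shape of the command as I understand it from the Delta documentation: the inventory subquery must project path, length, isDir, and modificationTime, and the mapping from the Azure inventory report columns below is an assumption:

# VACUUM driven by an inventory subquery (shape only; names are placeholders).
spark.sql("""
    VACUUM my_schema.my_table USING INVENTORY (
      SELECT concat('https://myaccount.blob.core.windows.net/mycontainer/', Name)
               AS path,
             `Content-Length` AS length,
             false            AS isDir,
             `Last-Modified`  AS modificationTime  -- must be epoch millis; convert if the report stores a timestamp string
      FROM inventory_report   -- hypothetical table loaded from the report
    )
    RETAIN 168 HOURS
""")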
by Yuki (New Contributor III)
  • 11 Views
  • 0 replies
  • 0 kudos

When does everyone utilize the model register?

Hi, I'm Yuki. I'm considering when I should use register_model. In my case, I'm running the training batch once a week, and if the model is good, I want to update the champion. I have created the code to register the model if the score is the best. # star...

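Yuki's snippet is truncated; the usual shape of that logic (register only on improvement, then move a champion alias) looks roughly like this, assuming the Unity Catalog model registry and placeholder names and metrics:

import mlflow
from mlflow import MlflowClient

MODEL_NAME = "main.models.weekly_model"        # hypothetical UC model name
run_id = "<run-id-from-this-weeks-training>"   # placeholder
new_score, best_score = 0.92, 0.90             # placeholder metrics

client = MlflowClient()
if new_score > best_score:
    # Register the new version, then point "champion" at it.
    mv = mlflow.register_model(f"runs:/{run_id}/model", MODEL_NAME)
    client.set_registered_model_alias(MODEL_NAME, "champion", mv.version)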
by Pranav29 (New Contributor)
  • 11653 Views
  • 4 replies
  • 0 kudos

Exam Suspended

Hi, I am extremely disappointed with Databricks and its testing partners. I am having a pathetic experience taking the certification exam. Databricks and its partners are wasting the time and effort that I have put into preparing for the certificati...

Latest Reply by rahul_mehta (New Contributor II)

Why doesn't Databricks engage with external Prometric centres like other vendors, so that exams can be taken without glitches?

3 More Replies
by AP01 (Visitor)
  • 49 Views
  • 0 replies
  • 0 kudos

Databricks JDBC Error: Job Aborted Due to Stage Failure (Executor OOM - Error Code 52)

java.sql.SQLException: [Databricks][JDBCDriver](500051) ERROR processing query/statement. Error Code: 0, SQL state: null, Query: SELECT `ma***, Error message from Server: org.apache.hive.service.cli.HiveSQLException: Error running query: org.apache.s...

Posted in: Warehousing & Analytics
Labels: Databricks JDBC, SparkSQL, OOM, HiveThriftServer, Error 500051, Databricks SQL, JDBC Driver, sql