Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

drag7ter
by Contributor
  • 3228 Views
  • 4 replies
  • 0 kudos

Resolved! Not able to set run_as service_principal_name

I'm trying to run: databricks bundle deploy -t prod --profile PROD_Service_Principal. My bundle looks like:
bundle:
  name: myproject
include:
  - resources/jobs/bundles/*.yml
targets:
  # The 'dev' target, for development purposes. This target is the de...

Latest Reply
reidwil
New Contributor II
  • 0 kudos

Building on this situation: if I deploy a job using a service principal this way, something is prepended to the job name like `[dev f46583c2_8c9e_499f_8d41_823332bfd4473] `. Is there a different way for me via bundling to change this...

3 More Replies
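For context, the run_as mapping in a Databricks Asset Bundle is set per target. A minimal sketch, assuming a bundle shaped like the one in the post (the host URL and service principal application ID are placeholders):

```yaml
# Hedged sketch of a databricks.yml; substitute your own workspace host
# and service principal application ID.
bundle:
  name: myproject

include:
  - resources/jobs/bundles/*.yml

targets:
  prod:
    mode: production
    workspace:
      host: https://adb-1234567890123456.7.azuredatabricks.net  # placeholder
    run_as:
      service_principal_name: "00000000-0000-0000-0000-000000000000"  # placeholder
```

Regarding the prefix seen in the reply: targets deployed with `mode: development` prepend a `[dev ...]` tag to resource names by design, while `mode: production` deploys names as-is.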
kfoster
by Contributor
  • 3848 Views
  • 6 replies
  • 7 kudos

Azure DevOps Repo - Invalid Git Credentials

I have a Repo in Databricks connected to Azure DevOps Repositories.The repo has been working fine for almost a month, until last week. Now when I try to open the Git settings in Databricks, I am getting "Invalid Git Credentials". Nothing has change...

Latest Reply
tbMark
New Contributor II
  • 7 kudos

Same symptoms, same issue. Azure support hasn't figured it out

5 More Replies
anshi_t_k
by New Contributor III
  • 896 Views
  • 3 replies
  • 1 kudos

Data engineering professional exam

Each configuration below is identical in that each cluster has 400 GB total of RAM, 160 total cores, and only one Executor per VM.Given an extremely long-running job for which completion must be guaranteed, which cluster configuration will be able to...

Latest Reply
filipniziol
Contributor III
  • 1 kudos

Hi @anshi_t_k, The key consideration here is fault tolerance. How do you protect against a VM failure? By having more VMs, as the impact of a single VM failure will be the lowest. For example, answer C - the crash of the VM is losing 1/1, so 100% capa...

2 More Replies
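The fault-tolerance argument can be made concrete with a little arithmetic. The cluster shapes below are hypothetical stand-ins (each totaling 400 GB RAM and 160 cores), not the actual exam answer choices:

```python
# Fraction of total capacity lost when a single VM fails, for several
# hypothetical cluster shapes that all total 400 GB RAM / 160 cores.
configs = {
    "1 VM x 400 GB / 160 cores": 1,
    "2 VMs x 200 GB / 80 cores": 2,
    "8 VMs x 50 GB / 20 cores": 8,
    "16 VMs x 25 GB / 10 cores": 16,
}

for name, num_vms in configs.items():
    lost_fraction = 1 / num_vms
    print(f"{name}: one VM crash loses {lost_fraction:.1%} of capacity")
```

More, smaller VMs means each individual failure removes a smaller slice of capacity, which is why the many-VM configuration is the safer choice for a job whose completion must be guaranteed.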
stefanberreiter
by New Contributor III
  • 3002 Views
  • 7 replies
  • 3 kudos

[Azure Databricks] Create an External Location to Microsoft Fabric Lakehouse

Hi, I want to create an external location from Azure Databricks to a Microsoft Fabric Lakehouse, but it seems I am missing something. What did I do: I created an "Access Connector for Azure Databricks" in the Azure Portal. I created a storage credential for the ...

Latest Reply
stefanberreiter
New Contributor III
  • 3 kudos

I'm now looking into Lakehouse Federation for the SQL endpoint of the Fabric Lakehouse, which I guess comes closest to the experience of the external table. Running Federated Queries from Unity Catalog on Microsoft Fabric SQL Endpoint | by Ai...

6 More Replies
Stellar
by New Contributor II
  • 4748 Views
  • 1 reply
  • 0 kudos

Databricks CI/CD Azure Devops

Hi all, I am looking for advice on what would be the best approach when it comes to CI/CD in Databricks and repos in general. What would be the best approach - to have a main branch and branch off of it, or something else? How will changes be propagated from dev to qa an...

maddan80
by New Contributor
  • 597 Views
  • 2 replies
  • 0 kudos

Oracle Essbase connectivity

Team, I wanted to understand the best way of connecting to Oracle Essbase to ingest data into the Delta Lake.

Latest Reply
daniel_sahal
Esteemed Contributor
  • 0 kudos

@maddan80 I see that Essbase supports ODBC/JDBC connector. Try utilizing one of those.

1 More Reply
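Building on daniel_sahal's ODBC/JDBC suggestion, a Spark JDBC read might be sketched as below. The driver class, URL format, and source object are placeholders, not verified Essbase values - the exact strings depend on the JDBC driver you license:

```python
# Hedged sketch: ingesting from Essbase into Delta via a generic JDBC driver.
# Every option value here is a placeholder, not a verified Essbase endpoint.
jdbc_options = {
    "url": "jdbc:essbase://<host>:<port>/<app>",   # hypothetical URL format
    "driver": "com.example.essbase.Driver",        # hypothetical driver class
    "dbtable": "SALES_CUBE_EXPORT",                # hypothetical source object
    "user": "<user>",
    "password": "<password>",
}

# On a real cluster you would then run something like:
# df = spark.read.format("jdbc").options(**jdbc_options).load()
# df.write.format("delta").mode("append").saveAsTable("bronze.essbase_sales")
```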
Harsha777
by New Contributor III
  • 1079 Views
  • 3 replies
  • 2 kudos

Resolved! Sub-Query behavior in sql statements

Hi Team, I have a query with the below construct in my project: SELECT count(*) FROM `catalog`.`schema`.`t_table` WHERE _col_check IN (SELECT DISTINCT _col_check FROM `catalog`.`schema`.`t_check_table`). Actually, there is no column "_col_check" in the sub-que...

Latest Reply
filipniziol
Contributor III
  • 2 kudos

Hi @Harsha777, What occurs is called column shadowing. The column names in the main query and the sub-query are identical, and after not finding the column in the sub-query, the Databricks engine searches for it in the main query. The simplest way to avoid the...

2 More Replies
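filipniziol's explanation can be reproduced end-to-end. The sketch below uses SQLite as a stand-in engine (table and column names invented), since the same outer-scope name resolution applies:

```python
import sqlite3

# Minimal reproduction of column shadowing: the subquery's table has no
# col_check column, so the name resolves against the OUTER table and the
# IN predicate matches every row instead of filtering.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t_table (col_check INTEGER);
    INSERT INTO t_table VALUES (1), (2), (3);
    CREATE TABLE t_check_table (some_other_col INTEGER);
    INSERT INTO t_check_table VALUES (99);
""")

# Looks like a filter, but silently matches all 3 rows:
shadowed_count = conn.execute("""
    SELECT count(*) FROM t_table
    WHERE col_check IN (SELECT DISTINCT col_check FROM t_check_table)
""").fetchone()[0]

# Qualifying the column with an alias turns the silent bug into an error:
try:
    conn.execute("""
        SELECT count(*) FROM t_table
        WHERE col_check IN (SELECT DISTINCT c.col_check FROM t_check_table AS c)
    """)
    qualified_raises = False
except sqlite3.OperationalError:
    qualified_raises = True
```

Always qualifying subquery columns with a table alias is the cheap insurance here: a typo then fails loudly at parse time instead of returning a wrong count.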
m_weirath
by New Contributor
  • 708 Views
  • 2 replies
  • 0 kudos

DLT-META requires ddl when using cdc_apply_changes

We are setting up new DLT Pipelines using the DLT-Meta package. Everything is going well in bringing our data in from Landing to our Bronze layer when we keep the onboarding JSON fairly vanilla. However, we are hitting an issue when using the cdc_app...

Latest Reply
dbuser17
New Contributor II
  • 0 kudos

Please check these details: https://github.com/databrickslabs/dlt-meta/issues/90

1 More Reply
VasuKumarT
by New Contributor
  • 432 Views
  • 1 reply
  • 0 kudos

Unity Catalog: Metastore 3 level Hierarchy

I have data files categorized by application and region. I want to know the best way to load them into the Bronze and Silver layers while maintaining proper segregation. For example, in our landing zone, we have a structure of raw files to be loaded usi...

Latest Reply
Shazaamzaa
New Contributor III
  • 0 kudos

If I understand it correctly, you have source files partitioned by application and region in cloud storage that you want to load and would like some suggestions on the Unity Catalog structure. It will definitely depend on how you want the data to be ...

mjar
by New Contributor III
  • 3366 Views
  • 7 replies
  • 2 kudos

ModuleNotFoundError when using foreachBatch on runtime 14 with Unity

Recently we ran into an issue using foreachBatch after upgrading our Databricks cluster on Azure to runtime version 14 (Spark 3.5) with Shared access mode and Unity Catalog. The issue manifested as a ModuleNotFoundError being throw...

Latest Reply
ananddanny
New Contributor II
  • 2 kudos

I am facing this issue with Scala Spark streaming on a shared cluster with the 15.4 LTS runtime. Is there any fix or alternative for this? I can't use an assigned cluster as my table has masked columns, and my company hasn't enabled serverless yet in our wor...

6 More Replies
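One commonly suggested workaround for this class of error (hedged - whether it applies depends on how your module is distributed) is to make the foreachBatch handler self-contained, importing its dependencies inside the function body so resolution happens wherever the function is deserialized:

```python
# Sketch: a self-contained foreachBatch handler. `json` stands in for
# whatever module raised ModuleNotFoundError in your environment.
def process_batch(batch_df, batch_id):
    import json  # imported locally so it resolves where the function runs

    # On a real stream batch_df is a DataFrame; here we only demonstrate
    # that the handler carries its own imports.
    return json.dumps({"batch_id": batch_id})

# In a real pipeline this would be wired up as:
# query = df.writeStream.foreachBatch(process_batch).start()
```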
Sudheer89
by New Contributor
  • 1066 Views
  • 1 reply
  • 0 kudos

Where is Data tab and DBFS in Premium Databricks workspace

Currently I can see a Catalog tab instead of a Data tab in the left-side navigation. I am unable to find the Data tab -> File browser where I would like to upload a sample orders CSV file. Later I want to refer to that path in Databricks notebooks as /FileStore/t...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @Sudheer89, By default the DBFS tab is disabled. As an admin user, you can manage your users’ ability to browse data in the Databricks File System (DBFS) using the visual browser interface. Go to the admin console. Click the Workspace Settings tab. In ...

valesexp
by New Contributor II
  • 821 Views
  • 1 reply
  • 1 kudos

Enforce tags to Jobs

Does anyone know how I can enforce job tags (not the custom tags for clusters)? I want to enforce that jobs have certain tags so we can filter our jobs. We are not using Unity Catalog yet.

Latest Reply
Walter_C
Databricks Employee
  • 1 kudos

Currently, enforcing job tags is not a built-in feature in Databricks. However, you can add tags to your jobs when creating or updating them and filter jobs by these tags on the jobs list page.

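While enforcement isn't built in, tags can at least be applied consistently at creation time and checked in CI. A hedged sketch of a Jobs API 2.1 payload (the job name, tag keys, and task are illustrative):

```python
# Sketch of a Jobs API 2.1 create-job payload carrying tags; field values
# are placeholders - only the top-level "tags" field is the point.
job_payload = {
    "name": "nightly-etl",                       # hypothetical job name
    "tags": {"team": "data-eng", "env": "prod"},
    "tasks": [
        {
            "task_key": "main",
            "notebook_task": {"notebook_path": "/Jobs/nightly_etl"},
        }
    ],
}

# A simple pre-deploy check your CI could run to "enforce" required tags:
required = {"team", "env"}
missing = required - set(job_payload.get("tags", {}))
assert not missing, f"missing required job tags: {missing}"
```

Since the platform won't reject untagged jobs, moving the check into the deployment pipeline is the closest available substitute for enforcement.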
Nathant93
by New Contributor III
  • 652 Views
  • 1 reply
  • 0 kudos

Constructor public org.apache.spark.ml.feature.Bucketizer(java.lang.String) is not whitelisted.

Hi,I am getting the error Constructor public org.apache.spark.ml.feature.Bucketizer(java.lang.String) is not whitelisted. when using a serverless compute cluster. I have seen in some other articles that this is due to high concurrency - does anyone k...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

The error you're encountering, "Constructor public org.apache.spark.ml.feature.Bucketizer(java.lang.String) is not whitelisted", typically arises when using a shared mode cluster. This is because Spark ML is not supported in shared clusters due to se...

Padmaja
by New Contributor II
  • 564 Views
  • 1 reply
  • 0 kudos

Need Help with SCIM Provisioning URL and Automation

Hi Databricks Community, I'm working on setting up SCIM provisioning and need some assistance with the SCIM provisioning URL: can anyone confirm the correct process to obtain the SCIM provisioning URL from the Databricks account console? I need to ensure I'm re...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

Which provider are you using? You can use the doc for Okta provisioning to guide you through the process https://docs.databricks.com/en/admin/users-groups/scim/okta.html

637858
by New Contributor II
  • 4418 Views
  • 2 replies
  • 3 kudos

How to disable users to create personal compute using notebook?

A Databricks account administrator can disable account-wide access to the Personal Compute default policy using the following steps: Navigate to the Databricks Account Console. Click the “Settings” icon. Click the “Feature enablement” tab. Switch the...

Latest Reply
mggl
New Contributor II
  • 3 kudos

Is there no way to prevent using the Personal Compute policy from a notebook? Or does my question make sense? In other words, is it by design right/immutable to have this policy when creating a notebook?

1 More Reply
