Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

by Sergecom (New Contributor III)
  • 3266 Views
  • 9 replies
  • 4 kudos

Databricks SQL EXISTS does not work correctly?

Can someone explain to me how this is possible?  

[screenshots attached: Sergecom_0-1742460269477.png, Sergecom_1-1742460306964.png]
Latest Reply
Sergecom
New Contributor III
  • 4 kudos

@Shua42 I've realized that I didn’t mention the subquery issue in my first post, so I guess this can be handled as a separate ticket.

8 More Replies
by T_I (New Contributor II)
  • 2703 Views
  • 5 replies
  • 0 kudos

Connect Databricks to Airflow

Hi, I have Databricks on top of AWS. I have a Databricks connection in Airflow (MWAA). I am able to connect and execute a Databricks job via Airflow using a personal access token. I believe the best practice is to connect using a service principal. I und...

Latest Reply
Sloka
New Contributor II
  • 0 kudos

https://airflow.apache.org/docs/apache-airflow-providers-databricks/6.9.0/connections/databricks.html
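To make the service-principal approach concrete: Databricks service principals authenticate with an OAuth client-credentials grant against the workspace's /oidc/v1/token endpoint, and the resulting token replaces the personal access token. A minimal sketch (host, client ID and secret below are placeholders) that builds the token request without sending it:

```python
# Sketch only: constructs the OAuth M2M token request a service principal
# would use. Host and credentials are hypothetical placeholders.
import base64
import urllib.parse
import urllib.request

def build_token_request(host: str, client_id: str, client_secret: str) -> urllib.request.Request:
    """Return a POST request for the workspace's OAuth token endpoint."""
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": "all-apis"}
    ).encode()
    creds = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    return urllib.request.Request(
        url=f"https://{host}/oidc/v1/token",
        data=body,
        headers={
            "Authorization": f"Basic {creds}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )

req = build_token_request("example.cloud.databricks.com", "my-sp-client-id", "my-sp-secret")
```

Sending this request returns a short-lived bearer token to use in the Airflow connection instead of a PAT; the provider docs linked above cover wiring it into the connection itself.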

4 More Replies
by Mailendiran (New Contributor III)
  • 6209 Views
  • 3 replies
  • 0 kudos

Unity Catalog - Storage Account Data Access

I was exploring the Unity Catalog option on a Databricks premium workspace. I understood that I need to create storage account credentials and an external connection in the workspace. Later, I can access the cloud data using 'abfss://storage_account_details'. I ...

Latest Reply
DouglasMoore
Databricks Employee
  • 0 kudos

Databricks' strategic direction is to deprecate mount points in favor of Unity Catalog Volumes. Set up a STORAGE CREDENTIAL and an EXTERNAL LOCATION to define how to get to your cloud storage account. To access data on the account, define a Tab...

2 More Replies
by ChrisLawford (New Contributor II)
  • 4921 Views
  • 4 replies
  • 2 kudos

PyTest working in Repos but not in Databricks Asset Bundles

Hello, I am trying to run PyTest from a notebook or Python file that exists due to being deployed by a Databricks Asset Bundle (DAB). I have a repository that contains a number of files, with the end goal of trying to run PyTest in a directory to valida...

Latest Reply
cinyoung
Databricks Employee
  • 2 kudos

@ChrisLawford You can run pytest through a job: "databricks bundle run -t dev pytest_job". I was able to work around it this way, with resource/pytest.job.yml:

resources:
  jobs:
    pytest_job:
      name: pytest_job
      tasks:
        - task_key: pytest_task
          ...

3 More Replies
by Aquib (New Contributor)
  • 3809 Views
  • 2 replies
  • 0 kudos

How to migrate DBFS from one tenant to another tenant

I am working on a Databricks workspace migration, where I need to copy the Databricks workspace including DBFS from source to target (source and target are in different subscriptions/accounts). Can someone suggest what the approach could be to migrate D...

Latest Reply
arjunappani
New Contributor II
  • 0 kudos

Hi @jose_gonzalez, how can we migrate the data from the managed storage account in Azure Databricks to a new tenant?

1 More Replies
by hdu (New Contributor II)
  • 954 Views
  • 1 reply
  • 1 kudos

Resolved! Change cluster owner API call failed

I am trying to change a cluster's owner using an API call, but I get the following error. I am positive that host, cluster_id and owner_username are all correct. The error message says No API found. Is this related to the compute I am using? Or something else...

[screenshot attached: hdu_0-1742837197352.png]
Latest Reply
Brahmareddy
Esteemed Contributor
  • 1 kudos

Hi hdu, as per my understanding, it sounds like you're really close! That "No API found" error usually means either the wrong API endpoint is being used, or the cluster type doesn't support ownership changes, for example shar...
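For reference, a sketch of how that call is typically shaped. The /api/2.1/clusters/change-owner path is an assumption based on the current Clusters API and may differ on older workspaces; host, token, cluster ID and username are placeholders. It only builds the request so the endpoint and payload can be inspected:

```python
# Builds (without sending) a change-cluster-owner request.
# Endpoint version prefix is an assumption; all values are placeholders.
import json
import urllib.request

def build_change_owner_request(host: str, token: str, cluster_id: str, owner: str) -> urllib.request.Request:
    payload = json.dumps({"cluster_id": cluster_id, "owner_username": owner}).encode()
    return urllib.request.Request(
        url=f"https://{host}/api/2.1/clusters/change-owner",
        data=payload,
        headers={"Authorization": f"Bearer {token}", "Content-Type": "application/json"},
        method="POST",
    )

req = build_change_owner_request(
    "example.cloud.databricks.com", "token-placeholder", "0123-456789-abc123", "new.owner@example.com"
)
```

Comparing this shape against the failing call (especially the path segment after the host) is usually enough to spot a "No API found" mismatch.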

by Shivap (New Contributor III)
  • 1688 Views
  • 4 replies
  • 3 kudos

What's the recommended way of creating tables in Databricks with unity catalog (External/Managed)

I have Databricks with Unity Catalog enabled and created an external ADLS location. When I create the catalog/schema it uses the external location. When I try to create the table it uses the external location, but they are managed tables. What's the r...

Latest Reply
Brahmareddy
Esteemed Contributor
  • 3 kudos

Hi Shivap, as per my understanding, in Unity Catalog, if you want to create an external table, you just need to make sure the external location is registered and approved first. Even though you're specifying a path with LOCAT...
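As an illustration (catalog, schema, columns and path below are made up), the difference comes down to whether the CREATE TABLE statement carries a LOCATION clause pointing at a registered external location; a small helper that builds the external-table statement to run via spark.sql() in a notebook:

```python
# Hypothetical names throughout; run the result via spark.sql() on a
# Databricks cluster. The LOCATION clause is what makes the table
# external; leaving it out yields a managed table.

def create_external_table_sql(table: str, path: str) -> str:
    return (
        f"CREATE TABLE {table} (id BIGINT, name STRING) "
        f"LOCATION '{path}'"
    )

sql = create_external_table_sql(
    "main.sales.orders",
    "abfss://container@account.dfs.core.windows.net/orders",
)
```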

3 More Replies
by bidek56 (Contributor)
  • 2680 Views
  • 8 replies
  • 2 kudos

Resolved! When will DB release runtime with Scala 2.13

When will DB release runtime with Scala 2.13? Thx

Latest Reply
JoseSoto
New Contributor III
  • 2 kudos

Spark 4 is coming and it's only going to support Scala 2.13, so a Databricks Runtime with Spark 3.5.x and Scala 2.13 should be released soonish.

7 More Replies
by samye760 (New Contributor II)
  • 3286 Views
  • 1 reply
  • 1 kudos

Job Retry Wait Policy and Cluster Shutdown

Hi all, I have a Databricks Workflow job in which the final task makes an external API call. Sometimes this API will be overloaded and the call will fail. In the spirit of automation, I want this task to retry the call an hour later if it fails in the...

Data Engineering
clusters
jobs
retries
Workflows
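One option for the retry timing described above is the Jobs API's per-task retry fields. A sketch of the task settings, assuming the documented max_retries and min_retry_interval_millis fields (3,600,000 ms = 1 hour); the values here are examples:

```python
# Sketch of Jobs API task-level retry settings for "retry one hour later".
# Field names follow the Databricks Jobs API; values are examples.

def retry_settings(max_retries: int = 3, wait_hours: int = 1) -> dict:
    """Task settings asking the scheduler to wait before re-running a failed task."""
    return {
        "max_retries": max_retries,
        # Minimum wait between the failed run and the retry, in milliseconds.
        "min_retry_interval_millis": wait_hours * 3_600_000,
        "retry_on_timeout": False,
    }

settings = retry_settings()
```

These fields go on the task object in the job definition; note they control scheduling only, and what happens to the cluster between attempts depends on the compute configuration.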
Latest Reply
rmartinezdezaya
New Contributor II
  • 1 kudos

What about this? Any reply? Any alternative? I'm facing the same issue.

by Jennifer (New Contributor III)
  • 1973 Views
  • 6 replies
  • 0 kudos

Can external tables be created backed by current cloud files without ingesting files in Databricks?

Hi, we have a huge amount of parquet files in S3 with the path pattern <bucket>/<customer>/yyyy/mm/dd/hh/.*.parquet. The question is: can I create an external table in Unity Catalog from this external location without actually ingesting the files? Like wha...

Latest Reply
Data_Mavericks
New Contributor III
  • 0 kudos

I think the issue is that you are trying to create a DELTA table in Unity Catalog from a Parquet source without converting it to Delta format first, as Unity Catalog will not allow a Delta table to be created in a non-empty location. Since you want t...
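To illustrate the register-in-place alternative (table and path names are hypothetical): an external table created with USING PARQUET points Unity Catalog at the existing files without rewriting them to Delta:

```python
# Table and path are made up; run the result via spark.sql() in a notebook.
# USING PARQUET registers the existing files in place instead of
# requiring a Delta rewrite of the data.

def parquet_external_table_sql(table: str, path: str) -> str:
    return f"CREATE TABLE {table} USING PARQUET LOCATION '{path}'"

sql = parquet_external_table_sql(
    "main.raw.events",
    "s3://my-bucket/customer-a",
)
```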

5 More Replies
by Rakesh007 (New Contributor II)
  • 1919 Views
  • 3 replies
  • 0 kudos

Maven library installation issue on 15.4 LTS

Recently I upgraded from the 10.4 LTS Databricks runtime version to 15.4 LTS. While installing a Maven library I was facing an issue like: Library installation attempted on the driver node of cluster 0415-115331-dune977 and failed. Library resolution fa...

Latest Reply
User16611530679
Databricks Employee
  • 0 kudos

Hi @Rakesh007, good day! This seems to be a compatibility issue with the Apache Spark version, as DBR 15.4 LTS supports 3.5.0. Please try installing the below version and let us know how it goes. Version: com.crealytics:spark-excel_2.12:3.5.0_0.20.3...

2 More Replies
by Brad (Contributor II)
  • 7331 Views
  • 2 replies
  • 0 kudos

Why "Error: Invalid access to Org: xxx"?

Hi team, I installed the Databricks CLI and ran "databricks auth login --profile xxx" successfully. I can also connect from VS Code to Databricks. "databricks clusters list -p xxx" also works. But when I tried to run "databricks bundle validate" I got "Error:...

Latest Reply
swhite
New Contributor II
  • 0 kudos

I just ran into this issue (in Azure Databricks) and found that it was caused by an incorrect `host` value specified in my databricks.yml file:

targets:
  dev:
    default: true
    mode: production
    workspace:
      host: https://adb-<workspace-id...

1 More Replies
by afisl (New Contributor II)
  • 15860 Views
  • 8 replies
  • 5 kudos

Resolved! Apply unitycatalog tags programmatically

Hello, I'm interested in the "Tags" feature of columns/schemas/tables of Unity Catalog (described here: https://learn.microsoft.com/en-us/azure/databricks/data-governance/unity-catalog/tags). I've been able to play with them by hand and would now lik...

Data Engineering
tags
unitycatalog
Latest Reply
Jiri_Koutny
New Contributor III
  • 5 kudos

Hi, running ALTER TABLE SET TAGS works on views too!
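A small helper showing the statement shape, assuming the documented ALTER ... SET TAGS syntax; the securable kind, object name and tag here are examples:

```python
# Hypothetical helper producing an ALTER ... SET TAGS statement to run
# via spark.sql() in a Databricks notebook; names below are examples.

def set_tags_sql(securable: str, name: str, tags: dict) -> str:
    """Build a SET TAGS statement for a table, view, schema, etc."""
    pairs = ", ".join(f"'{k}' = '{v}'" for k, v in tags.items())
    return f"ALTER {securable} {name} SET TAGS ({pairs})"

sql = set_tags_sql("VIEW", "main.sales.v_orders", {"pii": "true"})
```

The same builder covers the programmatic case from the original question: loop over the columns or tables you discover and execute each generated statement.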

7 More Replies
by Sadam97 (New Contributor III)
  • 912 Views
  • 3 replies
  • 0 kudos

GCE cluster chokes the secret api server.

Hi, we upgraded the GKE cluster to a GCE cluster as per the Databricks documentation. It works fine with one or two notebooks in a job. Our production job has more than 40 notebooks, each notebook accesses the secret API, and it seems like the secret API server ...

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @Sadam97, this looks to be a known issue with NAT and GKE. I will share more details soon and we'll follow up offline.

2 More Replies
by DataGeek_JT (New Contributor II)
  • 4060 Views
  • 4 replies
  • 4 kudos

Is it possible to use Liquid Clustering on Delta Live Tables / Materialised Views?

Is it possible to use Liquid Clustering on Delta Live Tables? If it is available what is the Python syntax for adding liquid clustering to a Delta Live Table / Materialised view please? 

Latest Reply
surajitDE
New Contributor III
  • 4 kudos

@dlt.table(
    name=table_name,
    comment="just_testing",
    table_properties={"quality": "gold", "mergeSchema": "true"},
    cluster_by=["test_id", "find_date"],  # Optimizes for queries filtering on these columns
)
def testing_table():
    return create_testing_table(df_fin...

3 More Replies