Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

T_I
by New Contributor II
  • 2439 Views
  • 5 replies
  • 0 kudos

Connect Databricks to Airflow

Hi, I have Databricks on top of AWS. I have a Databricks connection in Airflow (MWAA). I am able to connect and execute a Databricks job via Airflow using a personal access token. I believe the best practice is to connect using a service principal. I und...

Latest Reply
Sloka
New Contributor II
  • 0 kudos

https://airflow.apache.org/docs/apache-airflow-providers-databricks/6.9.0/connections/databricks.html

4 More Replies
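The provider documentation linked above covers configuring the Airflow Databricks connection with a service principal (OAuth) instead of a personal access token. Once such a connection exists, a minimal sketch of triggering an existing Databricks job from MWAA could look like the following; the connection ID, job ID, and DAG settings are placeholders, and Airflow 2.4+ with apache-airflow-providers-databricks installed is assumed.

# Trigger an existing Databricks job through a pre-configured Databricks connection.
from datetime import datetime

from airflow import DAG
from airflow.providers.databricks.operators.databricks import DatabricksRunNowOperator

with DAG(
    dag_id="run_databricks_job",
    start_date=datetime(2024, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    run_job = DatabricksRunNowOperator(
        task_id="run_job",
        databricks_conn_id="databricks_default",  # connection holding the service-principal credentials
        job_id=123456,                            # placeholder job ID
    )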
Mailendiran
by New Contributor III
  • 5974 Views
  • 3 replies
  • 0 kudos

Unity Catalog - Storage Account Data Access

I was exploring the Unity Catalog option on a Databricks premium workspace. I understood that I need to create storage account credentials and an external connection in the workspace. Later, I can access the cloud data using 'abfss://storage_account_details'. I ...

Latest Reply
DouglasMoore
Databricks Employee
  • 0 kudos

Databricks' strategic direction is to deprecate mount points in favor of Unity Catalog Volumes. Set up a STORAGE CREDENTIAL and an EXTERNAL LOCATION to define how to get to your cloud storage account. To access data on the account, define a Tab...

2 More Replies
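To make the reply above concrete, here is a minimal sketch run from a Databricks notebook (where spark is predefined) that registers an external location and defines a table on it. All object names, the container, and the storage account are placeholders, and a storage credential named adls_credential is assumed to already exist (created, for example, in Catalog Explorer).

# Register the cloud path as an external location backed by an existing storage credential.
spark.sql("""
  CREATE EXTERNAL LOCATION IF NOT EXISTS my_ext_location
  URL 'abfss://container@storageaccount.dfs.core.windows.net/path'
  WITH (STORAGE CREDENTIAL adls_credential)
""")

# Define an external table directly on a path under that location.
spark.sql("""
  CREATE TABLE IF NOT EXISTS my_catalog.my_schema.my_table
  USING DELTA
  LOCATION 'abfss://container@storageaccount.dfs.core.windows.net/path/my_table'
""")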
ChrisLawford
by New Contributor II
  • 4701 Views
  • 4 replies
  • 2 kudos

PyTest working in Repos but not in Databricks Asset Bundles

Hello, I am trying to run PyTest from a notebook or Python file that exists because it was deployed by a Databricks Asset Bundle (DAB). I have a repository that contains a number of files, with the end goal of running PyTest in a directory to valida...

Latest Reply
cinyoung
Databricks Employee
  • 2 kudos

@ChrisLawford You can run pytest through a job: databricks bundle run -t dev pytest_job. I was able to work around it this way, with resource/pytest.job.yml: resources: jobs: pytest_job: name: pytest_job tasks: - task_key: pytest_task ...

3 More Replies
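An alternative to the job-based workaround above is to invoke pytest programmatically from a Python entry point deployed by the bundle. This is only a sketch under assumptions about the project layout; the tests directory name is a placeholder, and __file__ requires a Python file rather than a notebook.

# Run pytest from a bundle-deployed Python file.
import os
import sys

import pytest

project_root = os.path.dirname(os.path.abspath(__file__))
sys.path.insert(0, project_root)       # make project packages importable from the tests
sys.dont_write_bytecode = True         # avoid writing .pyc files on read-only deployment paths

exit_code = pytest.main(["-v", os.path.join(project_root, "tests")])
assert exit_code == 0, f"pytest failed with exit code {exit_code}"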
Aquib
by New Contributor
  • 3756 Views
  • 2 replies
  • 0 kudos

How to migrate DBFS from one tenant to another tenant

I am working on a Databricks workspace migration, where I need to copy the Databricks workspace including DBFS from source to target (the source and target are in different subscriptions/accounts). Can someone suggest what the approach could be to migrate D...

Latest Reply
arjunappani
New Contributor II
  • 0 kudos

Hi @jose_gonzalez, how can we migrate the data from the managed storage account of Azure Databricks to a new tenant?

1 More Reply
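For the DBFS portion of such a migration, one hedged approach is to copy the DBFS contents to intermediate cloud storage that both tenants can reach, then copy them into the target workspace the same way. The paths below are placeholders, and for large volumes cloud-native tooling (for example azcopy) or Delta deep clones are usually a better fit than dbutils.

# Copy a DBFS folder to an intermediate storage account reachable from both tenants.
src = "dbfs:/FileStore/tables/"
dst = "abfss://migration@intermediatestorage.dfs.core.windows.net/dbfs-export/"

for entry in dbutils.fs.ls(src):
    print("copying", entry.path)
    dbutils.fs.cp(entry.path, dst + entry.name, recurse=True)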
mydefaultlogin
by New Contributor II
  • 787 Views
  • 1 reply
  • 0 kudos

Inconsistent PYTHONPATH, Git folders vs DAB

Hello Databricks Community, I'm encountering an issue related to Python paths when working with notebooks in Databricks. I have the following structure in my project: my_notebooks - my_notebook.py /my_package - __init__.py - hello.py databricks.yml...

Latest Reply
Brahmareddy
Esteemed Contributor
  • 0 kudos

Hi mydefaultlogin, how are you doing today? As per my understanding, you're right: this happens because when you're running notebooks from your Git folder, Python knows exactly where your project root is and can easily find my_package. But when you de...

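A common workaround consistent with the explanation above is to put the deployed bundle root on sys.path at the top of the notebook so my_package resolves the same way it does in the Git folder. This is a sketch only; the relative position of the notebook within the bundle is an assumption and may need adjusting.

# Make the deployed bundle root importable from the notebook.
import os
import sys

notebook_dir = os.getcwd()                                       # working directory of the running notebook
bundle_root = os.path.abspath(os.path.join(notebook_dir, ".."))  # assumes notebooks sit one level below the root

if bundle_root not in sys.path:
    sys.path.insert(0, bundle_root)

from my_package import hello  # package from the project structure in the post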
hdu
by New Contributor II
  • 890 Views
  • 1 reply
  • 1 kudos

Resolved! Change cluster owner API call failed

I am trying to change a cluster's owner using an API call, but I get the following error. I am positive that host, cluster_id and owner_username are all correct. The error message says "No API found". Is this related to the compute I am using, or something else...

Latest Reply
Brahmareddy
Esteemed Contributor
  • 1 kudos

Hi hdu, how are you doing today? As per my understanding, it sounds like you're really close! That "No API found" error usually means either the wrong API endpoint is being used, or the cluster type doesn't support ownership changes, for example shar...

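For reference, here is a hedged sketch of the REST call the thread is about, using the Clusters API change-owner endpoint. The host, token, cluster ID, and username are placeholders, and the endpoint path should be confirmed against the current REST API reference, since an outdated or mistyped path is one common cause of a "No API found" response.

# Change a cluster's owner via the REST API (values are placeholders).
import requests

host = "https://<workspace-host>"
token = "<pat-or-oauth-token>"

resp = requests.post(
    f"{host}/api/2.1/clusters/change-owner",
    headers={"Authorization": f"Bearer {token}"},
    json={"cluster_id": "<cluster-id>", "owner_username": "new.owner@example.com"},
)
resp.raise_for_status()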
Shivap
by New Contributor III
  • 1587 Views
  • 4 replies
  • 3 kudos

What's the recommended way of creating tables in Databricks with Unity Catalog (External/Managed)?

I have Databricks with Unity Catalog enabled and created an external ADLS location. When I create the catalog/schema it uses the external location. When I try to create the table it also uses the external location, but the tables are managed. What's the r...

Latest Reply
Brahmareddy
Esteemed Contributor
  • 3 kudos

Hi Shivap, how are you doing today? As per my understanding, in Unity Catalog, if you want to create an external table, you just need to make sure the external location is registered and approved first. Even though you're specifying a path with LOCAT...

3 More Replies
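To illustrate the distinction from the reply, a minimal notebook sketch follows; the catalog, schema, table names, and abfss path are placeholders, and the path must fall under an external location you have been granted access to. Omitting LOCATION yields a managed table, while an explicit LOCATION yields an external table.

# Managed table: Unity Catalog decides where the data lives.
spark.sql("""
  CREATE TABLE IF NOT EXISTS my_catalog.my_schema.orders_managed (id INT, amount DOUBLE)
""")

# External table: the data is pinned to an explicit cloud path.
spark.sql("""
  CREATE TABLE IF NOT EXISTS my_catalog.my_schema.orders_external (id INT, amount DOUBLE)
  USING DELTA
  LOCATION 'abfss://container@storageaccount.dfs.core.windows.net/tables/orders_external'
""")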
bidek56
by Contributor
  • 2520 Views
  • 8 replies
  • 2 kudos

Resolved! When will DB release runtime with Scala 2.13

When will DB release runtime with Scala 2.13? Thx

Latest Reply
JoseSoto
New Contributor III
  • 2 kudos

Spark 4 is coming and it's only going to support Scala 2.13, so a Databricks Runtime with Spark 3.5.x and Scala 2.13 should be released soonish.

7 More Replies
samye760
by New Contributor II
  • 3187 Views
  • 1 reply
  • 1 kudos

Job Retry Wait Policy and Cluster Shutdown

Hi all, I have a Databricks Workflows job in which the final task makes an external API call. Sometimes this API will be overloaded and the call will fail. In the spirit of automation, I want this task to retry the call an hour later if it fails in the...

Data Engineering
clusters
jobs
retries
Workflows
Latest Reply
rmartinezdezaya
New Contributor II
  • 1 kudos

What about this? Any reply? Any alternative? I'm facing the same issue.

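Since the thread is unanswered, here is a hedged sketch of the task-level retry settings exposed by the Jobs API, which is one way to express "retry an hour later". The field names follow Jobs API 2.1 as I understand it and should be verified against the current schema; the task key and notebook path are placeholders, and whether the job cluster stays up during the wait is a separate question not settled here.

# Task-level retry settings in a Jobs API 2.1 payload fragment.
task_settings = {
    "task_key": "call_external_api",
    "notebook_task": {"notebook_path": "/Workspace/jobs/call_external_api"},
    "max_retries": 3,
    "min_retry_interval_millis": 60 * 60 * 1000,  # wait roughly one hour between attempts
    "retry_on_timeout": False,
}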
Jennifer
by New Contributor III
  • 1768 Views
  • 6 replies
  • 0 kudos

Can external tables be created on top of existing cloud files without ingesting the files into Databricks?

Hi, we have a huge amount of Parquet files in S3 with the path pattern <bucket>/<customer>/yyyy/mm/dd/hh/.*.parquet. The question is: can I create an external table in Unity Catalog from this external location without actually ingesting the files? Like wha...

Latest Reply
Data_Mavericks
New Contributor III
  • 0 kudos

I think the issue is that you are trying to create a DELTA table in Unity Catalog from a Parquet source without converting it to Delta format first, as Unity Catalog will not allow a Delta table to be created in a non-empty location. Since you want t...

5 More Replies
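Expanding on the reply, there are two hedged options for exposing the existing files without re-ingesting them; the bucket, prefix, and table names are placeholders, the location must sit under a registered external location, and the non-Hive-style yyyy/mm/dd/hh layout may need extra partition handling.

# Option 1: external Parquet table directly over the existing files.
spark.sql("""
  CREATE TABLE IF NOT EXISTS my_catalog.my_schema.events_parquet
  USING PARQUET
  LOCATION 's3://my-bucket/customer_a/'
""")

# Option 2: convert the directory to Delta in place, then register an external Delta table on it.
spark.sql("CONVERT TO DELTA parquet.`s3://my-bucket/customer_a/`")
spark.sql("""
  CREATE TABLE IF NOT EXISTS my_catalog.my_schema.events_delta
  USING DELTA
  LOCATION 's3://my-bucket/customer_a/'
""")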
Rakesh007
by New Contributor II
  • 1795 Views
  • 3 replies
  • 0 kudos

Maven library installation issue on 15.4 LTS

I recently upgraded from the 10.4 LTS Databricks runtime version to 15.4 LTS. While installing a Maven library I was facing an issue like: Library installation attempted on the driver node of cluster 0415-115331-dune977 and failed. Library resolution fa...

Latest Reply
User16611530679
Databricks Employee
  • 0 kudos

Hi @Rakesh007, good day! This seems to be a compatibility issue with the Apache Spark version, as DBR 15.4 LTS runs Spark 3.5.0. Please try installing the version below and let us know how it goes. Version: com.crealytics:spark-excel_2.12:3.5.0_0.20.3...

2 More Replies
Brad
by Contributor II
  • 6933 Views
  • 2 replies
  • 0 kudos

Why "Error: Invalid access to Org: xxx"

Hi team, I installed the Databricks CLI and ran "databricks auth login --profile xxx" successfully. I can also connect from VS Code to Databricks. "databricks clusters list -p xxx" also works. But when I tried to run databricks bundle validate I got "Error:...

Latest Reply
swhite
New Contributor II
  • 0 kudos

I just ran into this issue (in Azure Databricks) and found that it was caused by an incorrect `host` value specified in my databricks.yml file: targets: dev: default: true mode: production workspace: host: https://adb-<workspace-id...

1 More Reply
afisl
by New Contributor II
  • 15297 Views
  • 8 replies
  • 5 kudos

Resolved! Apply unitycatalog tags programmatically

Hello, I'm interested in the "Tags" feature of columns/schemas/tables of Unity Catalog (described here: https://learn.microsoft.com/en-us/azure/databricks/data-governance/unity-catalog/tags). I've been able to play with them by hand and would now lik...

Data Engineering
tags
unitycatalog
Latest Reply
Jiri_Koutny
New Contributor III
  • 5 kudos

Hi, running ALTER TABLE SET TAGS works on views too!

7 More Replies
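To make the answer programmatic end to end, a small notebook sketch follows; the object names, column, and tag values are placeholders, and per the latest reply the same ALTER ... SET TAGS form also works when the target is a view.

# Table- or view-level tags.
spark.sql("""
  ALTER TABLE my_catalog.my_schema.my_view
  SET TAGS ('sensitivity' = 'pii', 'owner_team' = 'data-eng')
""")

# Column-level tags.
spark.sql("""
  ALTER TABLE my_catalog.my_schema.my_table
  ALTER COLUMN email SET TAGS ('sensitivity' = 'pii')
""")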
Sadam97
by New Contributor III
  • 845 Views
  • 3 replies
  • 0 kudos

GCE cluster chokes the secret API server.

Hi, we upgraded the GKE cluster to a GCE cluster as per the Databricks documentation. It works fine with one or two notebooks in a job. Our production job has more than 40 notebooks, each notebook accesses the secret API, and it seems like the secret API server ...

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @Sadam97, this looks to be a known issue with NAT and GKE. I will share more details soon, and we'll follow up offline.

2 More Replies
DataGeek_JT
by New Contributor II
  • 3918 Views
  • 4 replies
  • 4 kudos

Is it possible to use Liquid Clustering on Delta Live Tables / Materialised Views?

Is it possible to use Liquid Clustering on Delta Live Tables? If it is available, what is the Python syntax for adding Liquid Clustering to a Delta Live Table / materialised view, please?

Latest Reply
surajitDE
New Contributor III
  • 4 kudos

@dlt.table(
    name=table_name,
    comment="just_testing",
    table_properties={"quality": "gold", "mergeSchema": "true"},
    cluster_by=["test_id", "find_date"]  # Optimizes for queries filtering on these columns
)
def testing_table():
    return create_testing_table(df_fin...

3 More Replies
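For a self-contained version of the pattern in the reply, here is a hedged sketch of a DLT table that uses cluster_by for Liquid Clustering; the source table, column names, and comment are placeholders.

# Delta Live Tables definition with Liquid Clustering via cluster_by.
import dlt
from pyspark.sql import functions as F

@dlt.table(
    name="events_clustered",
    comment="Example table with Liquid Clustering on commonly filtered columns",
    cluster_by=["event_id", "event_date"],  # columns used for Liquid Clustering
)
def events_clustered():
    return (
        spark.read.table("my_catalog.my_schema.raw_events")
             .withColumn("event_date", F.to_date("event_ts"))
    )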
