Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

espenol
by Databricks Partner
  • 30336 Views
  • 11 replies
  • 13 kudos

input_file_name() not supported in Unity Catalog

Hey, so our notebooks reading a bunch of JSON files from storage typically use input_file_name() when moving from raw to bronze, but after upgrading to Unity Catalog we get an error message: AnalysisException: [UC_COMMAND_NOT_SUPPORTED] input_file_n...

Latest Reply
ramanpreet
New Contributor II
  • 13 kudos

'input_file_name' is not supported because this function belongs to older versions of Databricks Runtime. It has been deprecated from Databricks Runtime 13.3 LTS onwards.
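
For readers hitting the same error, here is a minimal sketch of one possible replacement. It assumes the data is read from files (JSON here) on a runtime that exposes the hidden `_metadata` column for file sources. The helper name `add_source_file` and the output column name `source_file` are made up for illustration.

```python
def add_source_file(df, column_name="source_file"):
    """Return df with an extra column holding the path of the file each row came from.

    Reads the hidden `_metadata.file_path` field instead of calling
    input_file_name(). `df` is expected to be a Spark DataFrame that was
    read from files.
    """
    return df.selectExpr("*", f"_metadata.file_path AS {column_name}")


# Example usage in a notebook (the path is a placeholder):
# raw = spark.read.json("/path/to/raw/json")
# bronze = add_source_file(raw)
```

The helper only builds a projection, so it can be dropped into an existing raw-to-bronze step without changing how the files are read.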

10 More Replies
mydefaultlogin
by New Contributor II
  • 1366 Views
  • 2 replies
  • 0 kudos

Inconsistent PYTHONPATH, Git folders vs DAB

Hello Databricks Community, I'm encountering an issue related to Python paths when working with notebooks in Databricks. I have the following structure in my project: my_notebooks - my_notebook.py /my_package - __init__.py - hello.py databricks.yml...

Latest Reply
kenny_hero
New Contributor III
  • 0 kudos

I have a related question. I'm new to the Databricks platform and struggle with the same PYTHONPATH issue the original poster raised. I understand that using sys.path.append(...) is one approach for notebooks. This is acceptable for an ad-hoc interactive session, but thi...
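
A minimal sketch of the sys.path approach mentioned above, written so the path is not hard-coded. It assumes the project root is the directory that contains `databricks.yml`, as in the original poster's layout. The function name `add_project_root` is hypothetical, and how a notebook obtains its own directory is left to the caller.

```python
import sys
from pathlib import Path


def add_project_root(start, marker="databricks.yml"):
    """Walk up from `start` until a directory containing `marker` is found.

    The directory is put at the front of sys.path (once) and returned,
    so that `import my_package` resolves from the project root.
    """
    start = Path(start).resolve()
    for candidate in (start, *start.parents):
        if (candidate / marker).is_file():
            root = str(candidate)
            if root not in sys.path:
                sys.path.insert(0, root)
            return candidate
    raise FileNotFoundError(f"no {marker} found in or above {start}")


# Example usage from a notebook under my_notebooks/ (start directory is up to the caller):
# add_project_root(Path.cwd())
# from my_package import hello
```

Searching for a marker file keeps the same code working whether the notebook runs from a Git folder or from a deployed bundle, as long as the marker file is present in both places.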

1 More Replies
kALYAN5
by Databricks Partner
  • 866 Views
  • 4 replies
  • 3 kudos

Service Principal

Can two service principals have the same name but unique IDs?

Latest Reply
emma_s
Databricks Employee
  • 3 kudos

Hi @kALYAN5, here is an explanation of why service principals can share a name while their IDs are unique: Names Are for Human Readability: Organizations use human-friendly names like "automation-batch-job" or "databricks-ci-cd" to make it easy for admins to re...
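
To make the distinction concrete, here is a small illustrative sketch in plain Python. The records, the field names `application_id` and `display_name`, and the ID values are all hypothetical; the point is only that lookups should key on the ID, because a display name can map to more than one principal.

```python
from collections import defaultdict

# Hypothetical records: display names may repeat, IDs may not.
service_principals = [
    {"application_id": "11111111-aaaa-4000-8000-000000000001", "display_name": "databricks-ci-cd"},
    {"application_id": "22222222-bbbb-4000-8000-000000000002", "display_name": "databricks-ci-cd"},
    {"application_id": "33333333-cccc-4000-8000-000000000003", "display_name": "automation-batch-job"},
]


def ids_by_name(principals):
    """Group IDs under each display name."""
    grouped = defaultdict(list)
    for sp in principals:
        grouped[sp["display_name"]].append(sp["application_id"])
    return dict(grouped)


def ambiguous_names(principals):
    """Return display names that refer to more than one principal."""
    return sorted(name for name, ids in ids_by_name(principals).items() if len(ids) > 1)
```

Automation that grants permissions or rotates secrets would therefore reference the ID, and treat the name as a label only.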

3 More Replies
Askenm
by New Contributor
  • 2156 Views
  • 6 replies
  • 4 kudos

Docker tab missing in create compute

I am running Databricks Premium and looking to create a compute running conda. It seems that the best way to do this is to boot the compute from a Docker image. However, in the ```create_compute > advanced``` I cannot see the Docker option nor ca...

Data Engineering
conda
Docker
Latest Reply
mukul1409
Contributor II
  • 4 kudos

Hi @Askenm, in Databricks Premium, the Docker option for custom images is not available on all compute types and is not controlled by user-level permissions. Custom Docker images are only supported on Databricks clusters that use the legacy VM-based c...

5 More Replies
CHorton
by New Contributor II
  • 1113 Views
  • 3 replies
  • 2 kudos

Resolved! Calling a function with parameters via Spark ODBC driver

Hi All, I am having an issue with calling a Databricks SQL user-defined function with parameters from my client application using the Spark ODBC driver. I have been able to execute a straight SQL statement using parameters, e.g. SELECT * FROM Customer W...

Latest Reply
iyashk-DB
Databricks Employee
  • 2 kudos

Hi @CHorton The Databricks SQL engine does not support positional (?) parameters inside SQL UDF calls.  When Spark SQL parses GetCustomerData(?), the parameter is unresolved at analysis time, so you get [UNBOUND_SQL_PARAMETER]. This is not an ODBC bu...

2 More Replies
Harun
by Honored Contributor
  • 13394 Views
  • 2 replies
  • 4 kudos

How to change the number of executor instances in Databricks

I know that Databricks runs one executor per worker node. Can I change the number of executors by adding a parameter (spark.executor.instances) in the cluster's advanced options? Also, can I pass this parameter when I schedule a task, so that particular task wi...

Latest Reply
RandiMacGyver
New Contributor II
  • 4 kudos

In Databricks, the executor model is largely managed by the platform itself. On Databricks clusters, each worker node typically runs a single Spark executor, and this behavior is intentional.

1 More Replies
liquibricks
by Databricks Partner
  • 830 Views
  • 3 replies
  • 3 kudos

Resolved! Spark version errors in "Build an ETL pipeline with Lakeflow Spark Declarative Pipelines"

I'm trying to define a job for a pipeline using the Asset Bundle Python SDK. I created the pipeline first (using the SDK) and I'm now trying to add the job. The DAB validates and deploys successfully, but when I run the job I get an error: UNAUTHORIZ...

Latest Reply
mukul1409
Contributor II
  • 3 kudos

This happens because the job is not actually linked to the deployed pipeline and the pipeline id is None at runtime. When using Asset Bundles, the pipeline id is only resolved after deployment, so referencing my_pipeline.id in code does not work. Ins...
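
A hedged sketch of one way to avoid reading `my_pipeline.id` at definition time: write the pipeline ID as a bundle substitution string, which is resolved during deployment. It assumes the pipeline is declared in the same bundle under the resource key `my_pipeline`. The function name, the job name, and the task key are hypothetical, and the dict simply mirrors what a YAML job resource would contain; how it is handed to the Python bundle API is left out.

```python
def job_with_pipeline_task(job_name, pipeline_key):
    """Return a job resource whose pipeline task points at a bundle-managed pipeline.

    The pipeline ID is a substitution placeholder, not a real ID, so it
    does not depend on the pipeline already existing when this code runs.
    """
    return {
        "name": job_name,
        "tasks": [
            {
                "task_key": "run_pipeline",
                "pipeline_task": {
                    # Resolved by the bundle at deploy time.
                    "pipeline_id": f"${{resources.pipelines.{pipeline_key}.id}}"
                },
            }
        ],
    }


job = job_with_pipeline_task("etl_job", "my_pipeline")
```

Because the placeholder is a plain string, the job definition stays valid even before the pipeline has ever been deployed.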

2 More Replies
mukul1409
by Contributor II
  • 1582 Views
  • 3 replies
  • 1 kudos

Resolved! Iceberg interoperability between Databricks and external catalogs

I would like to understand the current approach for Iceberg interoperability in Databricks. Databricks supports Iceberg using Unity Catalog, but many teams also use Iceberg tables managed outside Databricks. Are there recommended patterns today for s...

Latest Reply
Yogesh_Verma_
Contributor II
  • 1 kudos

Great

2 More Replies
hnnhhnnh
by New Contributor II
  • 642 Views
  • 1 replies
  • 0 kudos

How to handle type widening (int→bigint) in DLT streaming tables without dropping the table

Setup: Bronze source table (external to DLT, CDF & type widening enabled): # Source table properties: # delta.enableChangeDataFeed: "true" # delta.enableDeletionVectors: "true" # delta.enableTypeWidening: "true" # delta.minReaderVersion: "3" # delta.minWrite...

Latest Reply
mukul1409
Contributor II
  • 0 kudos

Hi @hnnhhnnh DLT streaming tables that use apply changes do not support widening the data type of key columns such as changing an integer to a bigint after the table is created. Even though Delta and Unity Catalog support type widening in general, DL...

Sunil_Patidar
by Databricks Partner
  • 2830 Views
  • 3 replies
  • 2 kudos

Unable to read from or write to Snowflake Open Catalog via Databricks

I have Snowflake Iceberg tables whose metadata is stored in Snowflake Open Catalog. I am trying to read these tables from the Open Catalog and write back to the Open Catalog using Databricks. I have explored the available documentation but haven’t bee...

Latest Reply
mukul1409
Contributor II
  • 2 kudos

Databricks does not currently provide official support to read from or write to Snowflake Open Catalog. Although Snowflake Open Catalog is compatible with the Iceberg REST catalog and open source Spark can work with it, this integration is not suppor...
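
For context on the open source Spark route mentioned above, here is a sketch of the generic Iceberg REST catalog settings. It assumes open source Spark with the Iceberg Spark runtime on the classpath, not a supported Databricks integration. The catalog name `open_catalog` and every value shown are placeholders, and any Snowflake-specific properties are deliberately left out.

```python
def iceberg_rest_catalog_conf(catalog_name, uri, warehouse, credential):
    """Build Spark config entries for an Iceberg REST catalog.

    The keys follow the generic Iceberg Spark catalog configuration;
    the values must come from the catalog provider.
    """
    prefix = f"spark.sql.catalog.{catalog_name}"
    return {
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.type": "rest",
        f"{prefix}.uri": uri,
        f"{prefix}.warehouse": warehouse,
        f"{prefix}.credential": credential,
    }


conf = iceberg_rest_catalog_conf(
    "open_catalog",                      # hypothetical Spark catalog name
    "https://<account>/<rest-path>",     # placeholder REST endpoint
    "<catalog-name>",                    # placeholder warehouse
    "<client-id>:<client-secret>",       # placeholder credential
)

# Applying it when building a session (open source Spark):
# builder = SparkSession.builder
# for key, value in conf.items():
#     builder = builder.config(key, value)
```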

2 More Replies
Loinguyen318
by New Contributor II
  • 4055 Views
  • 4 replies
  • 0 kudos

Resolved! Public DBFS root is disabled in Databricks free edition

I am using a notebook to run a sample Spark job that writes a Delta table to DBFS on the Free Edition. However, I cannot access the public DBFS after the code has executed. The Spark code is: data = spark.range(0, 5) data.write.format("d...

Latest Reply
mukul1409
Contributor II
  • 0 kudos

Yes, use UC Volumes instead of DBFS. As Databricks moves toward a serverless architecture, DBFS access is being increasingly restricted and is not intended for long-term or production usage. UC Volumes are a better choice than DBFS.
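
A small sketch of how a Unity Catalog volume path is put together, since volumes are addressed as `/Volumes/<catalog>/<schema>/<volume>/...` rather than under `dbfs:/`. The helper name `volume_path` and the catalog, schema, and volume names in the example are hypothetical; the volume has to exist already.

```python
import posixpath


def volume_path(catalog, schema, volume, *parts):
    """Build a /Volumes/<catalog>/<schema>/<volume>/... path."""
    for name in (catalog, schema, volume):
        if not name or "/" in name:
            raise ValueError(f"invalid name: {name!r}")
    return posixpath.join("/Volumes", catalog, schema, volume, *parts)


target = volume_path("main", "default", "scratch", "demo", "out.csv")
```

The resulting string can be passed wherever a file path is expected, in place of a DBFS root path.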

3 More Replies
Phani1
by Databricks MVP
  • 2431 Views
  • 2 replies
  • 1 kudos

Resolved! Databricks - Calling a dashboard from another dashboard

Hi Team, can we call a dashboard from another dashboard? An example screenshot is attached. The main dashboard has 3 buttons that point to 3 different dashboards, and clicking any of the buttons should redirect to the respective dashboard.

Latest Reply
thains
New Contributor III
  • 1 kudos

I would also like to see this feature added.

1 More Replies
ciaran
by New Contributor
  • 862 Views
  • 1 replies
  • 0 kudos

Is GCP Workload Identity Federation supported for BigQuery connections in Azure Databricks?

I’m trying to set up a BigQuery connection in Azure Databricks (Unity Catalog / Lakehouse Federation) using GCP Workload Identity Federation (WIF) instead of a GCP service account key. Environment: Azure Databricks workspace; BigQuery query federation via...

Latest Reply
Hubert-Dudek
Databricks MVP
  • 0 kudos

I guess that is the only option accepted, as the docs say "Google service account key json".

pavelhym
by New Contributor
  • 683 Views
  • 1 replies
  • 1 kudos

Usage of MLflow models inside Streamlit app in Databricks

I have an issue with loading a registered MLflow model into a Streamlit app inside Databricks. This is the sample code used for model load: import mlflow; from mlflow.tracking import MlflowClient; mlflow.set_tracking_uri("databricks"); mlflow.set_registry_uri...

Latest Reply
iyashk-DB
Databricks Employee
  • 1 kudos

Authentication context isn’t automatically available in Apps. Notebooks automatically inject workspace host and token for mlflow when you use mlflow.set_tracking_uri("databricks") and mlflow.set_registry_uri("databricks-uc"). In Databricks Apps, you ...
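
One way to see whether the app has an authentication context at all is to check the environment before calling MLflow. The sketch below assumes the app's credentials are exposed through the environment variables `DATABRICKS_HOST`, `DATABRICKS_CLIENT_ID`, and `DATABRICKS_CLIENT_SECRET`; treat those names as an assumption to verify against your own app, and `missing_auth_env` as a hypothetical helper.

```python
import os

# Assumed variable names; verify them in your own app environment.
REQUIRED_ENV = ("DATABRICKS_HOST", "DATABRICKS_CLIENT_ID", "DATABRICKS_CLIENT_SECRET")


def missing_auth_env(env=None):
    """Return the names of required auth variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV if not env.get(name)]


# Example usage at app start-up:
# missing = missing_auth_env()
# if missing:
#     raise RuntimeError(f"missing auth settings: {missing}")
```

Failing early with the list of missing settings tends to be easier to debug than an authentication error raised deep inside the model load.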

JothyGanesan
by New Contributor III
  • 775 Views
  • 2 replies
  • 4 kudos

Resolved! Vacuum on DLT

We are currently using DLT tables for our target tables. The tables are loaded by continuous job pipelines. Liquid clustering is enabled on the tables. Will VACUUM work on these tables while they are being loaded in continuous mode? How to run t...

Latest Reply
iyashk-DB
Databricks Employee
  • 4 kudos

VACUUM works fine on DLT tables running in continuous mode. DLT does automatic maintenance (OPTIMIZE + VACUUM) roughly every 24 hours if the pipeline has a maintenance cluster configured. Q: The liquid cluster is enabled in the tables. Will Vacuum wo...
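
If a manual run is wanted in addition to the automatic maintenance, a minimal sketch of building the statement looks like this. The table name `main.silver.events` is hypothetical, and the 168-hour value is simply 7 days expressed in hours; adjust it to your own retention needs.

```python
def vacuum_statement(table_name, retain_hours=168):
    """Build a VACUUM statement with an explicit retention window.

    `table_name` is interpolated as-is, so it should come from trusted
    configuration, not from user input.
    """
    if retain_hours < 0:
        raise ValueError("retain_hours must be non-negative")
    return f"VACUUM {table_name} RETAIN {retain_hours} HOURS"


stmt = vacuum_statement("main.silver.events")

# In a notebook or job:
# spark.sql(stmt)
```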

1 More Replies