Connect with fellow community members to discuss general topics related to the Databricks platform, industry trends, and best practices. Share experiences, ask questions, and foster collaboration within the community.
I have a pipeline that gave me no problems until today, when it failed with the following error message: com.databricks.pipelines.common.errors.deployment.DeploymentException: Failed to launch pipeline cluster 0307-134831-tgq587us: Attempt to launch cluster w...
@SB93 The error message you are seeing indicates that the cluster failed to launch because the Spark driver was unresponsive, with possible causes being library conflicts, incorrect metastore configuration, or other configuration issues. Given that t...
I'm using a job cluster and have created compute policies for library management, and now I'm trying to use pools in Databricks. I'm getting an error like this: Cluster validation error: Validation failed for azure_attributes.spot_bid_max_price from pool, the ...
@n1399 The error "Validation failed for azure_attributes.spot_bid_max_price from pool, the value must be present" suggests that the spot bid max price is required, but it’s either missing or not correctly inherited from the compute policy when using ...
Hello, I am trying to get the table lineage, i.e. the upstreams and downstreams of all tables in Unity Catalog, into my local database using API calls. I need my db to be up to date: if the lineage is updated in Databricks, I have to update sam...
Hi @Rachana2,
As @Alberto_Umana has mentioned, I'd check the table_lineage / column_lineage tables, as maintaining lineage through a bespoke pipeline/tooling may not be the right approach. Can you please explain your use case which explains why you don't wa...
I have the exact same issue. It seems that limiting the display() method works as a temporary solution, but I wonder if there's a long-term one. The idea would be to have the possibility of displaying larger datasets within a notebook. How to achi...
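A minimal sketch of the system-table approach suggested above, assuming the lineage table is exposed as `system.access.table_lineage` with `source_table_full_name`, `target_table_full_name`, and `event_time` columns (verify these names in your workspace). The helper name `build_lineage_query` is hypothetical; it only builds the SQL text for an incremental pull of lineage edges recorded after a given timestamp, which a local database sync could then run on a schedule.

```python
from datetime import datetime, timezone

# Assumption: name of the Unity Catalog lineage system table.
TABLE_LINEAGE = "system.access.table_lineage"


def build_lineage_query(since: datetime) -> str:
    """Build SQL that selects distinct upstream -> downstream table edges
    recorded after `since` (hypothetical helper, not a Databricks API)."""
    # Normalize to UTC so the literal is unambiguous.
    ts = since.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
    return (
        "SELECT DISTINCT source_table_full_name, target_table_full_name "
        f"FROM {TABLE_LINEAGE} "
        f"WHERE event_time > TIMESTAMP '{ts}' "
        "AND source_table_full_name IS NOT NULL "
        "AND target_table_full_name IS NOT NULL"
    )
```

Storing the latest `event_time` seen after each run and passing it as `since` on the next run would keep the local copy incremental rather than re-reading the whole table.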
We are currently entering data into Excel and then uploading it into Databricks. Is there a built-in spreadsheet-like UI within Databricks that can update data directly in Databricks?
Hello, @j_h_robinson!
Databricks doesn’t have a built-in spreadsheet-like UI for direct data entry or editing. Are you manually uploading the Excel files or using an ODBC driver setup? If you’re doing it manually, you might find this helpful: Connect...
I’m currently working with Databricks autoscaling configurations and trying to better understand how Spark decides when to spin up additional worker nodes. My cluster has a minimum of one worker and can scale up to five. I know that tasks are assigne...
Hi @h_h_ak,
Short Answer: Autoscaling primarily depends on the number of pending tasks. Workspaces on the Premium plan use optimized autoscaling, while those on the Standard plan use standard autoscaling.
Long Answer: Databricks autoscaling responds main...
In this document, https://docs.databricks.com/aws/en/notebooks/notebook-format, Jupyter (.ipynb) format is recommended.
> Select File from the workspace menu, select Notebook format, and choose the format you want. You can choose either Jupyter (.ipynb...
Hi @Yuki,
One other risk that we foresee / encountered recently is how the notebooks will look in the pull requests of external repos (Azure DevOps or GitHub). It will be very hard for a pull request reviewer to understand on the code / notebook read...
Hi, I'm not sure if this is a possible scenario, but is there by any chance a way to query all the columns of a table to search for a value? Explanation: I want to search for a specific value in all the columns of a Databricks table. I don't know whi...
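For reference, the one-to-five worker range described in the question maps onto the `autoscale` block of a cluster specification. The sketch below only builds that fragment; the helper name `autoscale_spec` and its validation rule are illustrative assumptions, and the remaining cluster fields are deliberately omitted.

```python
def autoscale_spec(min_workers: int, max_workers: int) -> dict:
    """Return the `autoscale` fragment of a cluster specification
    (hypothetical helper; other required cluster fields are not included)."""
    # Illustrative sanity check, not a statement of the platform's own limits.
    if not 0 <= min_workers <= max_workers:
        raise ValueError("expected 0 <= min_workers <= max_workers")
    return {"autoscale": {"min_workers": min_workers, "max_workers": max_workers}}


# The range from the question: at least one worker, at most five.
spec = autoscale_spec(1, 5)
```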
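One possible way to approach this, sketched under assumptions: generate a query that casts every column to a string and compares it against a named parameter marker. The helper names `quote_ident` and `build_search_query` are hypothetical, and the sketch assumes a runtime where `spark.sql` accepts named parameters through `args`.

```python
def quote_ident(name: str) -> str:
    """Backtick-quote a column name, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def build_search_query(table: str, columns: list[str]) -> str:
    """Build SQL matching rows where any column, cast to STRING, equals :needle."""
    if not columns:
        raise ValueError("at least one column is required")
    predicate = " OR ".join(
        f"CAST({quote_ident(c)} AS STRING) = :needle" for c in columns
    )
    return f"SELECT * FROM {table} WHERE {predicate}"


# Hypothetical usage in a notebook (table name is a placeholder):
#   df = spark.table("my_catalog.my_schema.my_table")
#   sql = build_search_query("my_catalog.my_schema.my_table", df.columns)
#   spark.sql(sql, args={"needle": "value-to-find"}).show()
```

Passing the search value as a parameter rather than splicing it into the SQL text avoids quoting problems; on wide tables this still scans every column, so it is best suited to ad hoc investigation.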
I have a Databricks App that I need to integrate with volumes using local Python os functions. I've set up a simple test:
def __init__(self, config: ObjectStoreConfig):
    self.config = config
    # Ensure our required paths are created
...
If you use the Databricks Python SDK, you can access volume files using the built-in app credentials. All you need to do is instantiate the workspace client from the SDK, and you can use its methods to operate on volumes.
I'm using the Community Edition. Trying to create a storage folder inside DBFS -> FileStore for my datasets. I click on Create, give a folder name, and poof. Nothing. No new folder. Tried refreshing, logging out and logging in. Tried to create folder mu...
I am trying to install a wheel file from my volume on a serverless cluster and am getting the error below. @ken @Retired_mod
Note: you may need to restart the kernel using %restart_python or dbutils.library.restartPython() to use updated packages.
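A minimal sketch of that suggestion, assuming the `databricks-sdk` package is installed in the app and that the app's credentials are picked up automatically by `WorkspaceClient()`. The helper names and the volume path are hypothetical; the helpers take the client as an argument so they can be exercised without a workspace.

```python
import io


def write_volume_file(client, path: str, data: bytes) -> None:
    """Upload bytes to a volume path such as /Volumes/<catalog>/<schema>/<volume>/<file>."""
    client.files.upload(path, io.BytesIO(data), overwrite=True)


def read_volume_file(client, path: str) -> bytes:
    """Download a volume file and return its contents as bytes."""
    return client.files.download(path).contents.read()


# Hypothetical usage inside the app (path is a placeholder):
#   from databricks.sdk import WorkspaceClient
#   w = WorkspaceClient()  # assumed to resolve the app's built-in credentials
#   write_volume_file(w, "/Volumes/my_catalog/my_schema/my_volume/hello.txt", b"hello")
#   print(read_volume_file(w, "/Volumes/my_catalog/my_schema/my_volume/hello.txt"))
```

This goes through the Files API rather than local `os` calls, so code written against `os.makedirs` or `open()` would need to be adapted to these client methods.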
WARN...
Hi, I'm trying to deploy a RAG model from GCP Databricks. I've added an external gpt4o endpoint and enabled inference tables in settings. But when I'm trying to deploy agents I'm still getting the inference table not enabled error. (I've registered the...
Hello, I recently learned about the DatabricksWorkflowTaskGroup operator for Airflow that allows one to run multiple Notebook tasks on a shared job compute cluster from Airflow. Is a similar feature possible to run multiple non-Notebook tasks from Airf...
Currently we are facing a challenge with the use case below: The Airflow DAG has 4 tasks (Task1, Task2, Task3 and Task4) and the dependency is like this: Task1 >> Task2 >> Task3 >> Task4 (all tasks are spark-jar task types). In the Airflow DAG for Task2, there is ...
Hi @anil_reddaboina,
Databricks allows you to add control flow logic to tasks based on the success, failure, or completion of their dependencies. This can be achieved using the "Run if" dependencies field: https://docs.databricks.com/aws/en/jobs/run-i...
In my solution I am planning to bring an Azure SQL Database into Azure Databricks Unity Catalog as a foreign catalog. Are table row filters and column masks supported in my scenario?
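As a rough illustration of the "Run if" idea, the sketch below builds the task list of a job definition for the four-task chain from the question, using the `depends_on` and `run_if` fields of a Jobs task. The helper name `chain_tasks` is hypothetical, the per-task payload (for example the spark-jar settings) is omitted, and the choice of `ALL_DONE` for Task3 is only an example because the original condition is not shown.

```python
# Run-if values assumed from the Jobs task settings; verify against the docs.
VALID_RUN_IF = {
    "ALL_SUCCESS", "AT_LEAST_ONE_SUCCESS", "NONE_FAILED",
    "ALL_DONE", "AT_LEAST_ONE_FAILED", "ALL_FAILED",
}


def chain_tasks(task_keys, run_if_overrides=None):
    """Build a linear chain of task dicts, each depending on the previous one.
    `run_if_overrides` maps a task key to a non-default run-if condition."""
    overrides = run_if_overrides or {}
    tasks = []
    for i, key in enumerate(task_keys):
        task = {"task_key": key}
        if i > 0:
            run_if = overrides.get(key, "ALL_SUCCESS")
            if run_if not in VALID_RUN_IF:
                raise ValueError(f"unknown run_if value: {run_if}")
            task["depends_on"] = [{"task_key": task_keys[i - 1]}]
            task["run_if"] = run_if
        tasks.append(task)
    return tasks


# Example: Task3 runs once Task2 has finished, whether it succeeded or not.
tasks = chain_tasks(["Task1", "Task2", "Task3", "Task4"], {"Task3": "ALL_DONE"})
```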
Hi @Arindam19,
Yes. Certain operations, including filtering, can be pushed down from Databricks to SQL Server. This is managed by querying the SQL Server directly via a federated connection, allowing SQL Server to handle the filter criteria and retur...