Get Started Discussions
Start your journey with Databricks by joining discussions on getting started guides, tutorials, and introductory topics. Connect with beginners and experts alike to kickstart your Databricks experience.

Forum Posts

Dimitry
by Valued Contributor
  • 6581 Views
  • 2 replies
  • 0 kudos

How to fix "Python versions in the Spark Connect client and server are different." in UDF

I've read all the relevant articles, but none has a solution that I could understand. Sorry, I'm new to this. I have a simple UDF to demonstrate the problem: df = spark.createDataFrame([(1, 1.0, 'a'), (1, 2.0, 'b'), (2, 3.0, 'c'), (2, 5.0, 'd'), (2, 10.0, 'e')]...

Latest Reply
SP_6721
Honored Contributor II
  • 0 kudos

Hi @Dimitry, the error you're seeing indicates that the Python version in your notebook (3.11) doesn't match the version used by Databricks Serverless, which is typically Python 3.12. Since Serverless environments use a fixed Python version, this mis...
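
A quick way to see the mismatch described in this reply is to compare the client interpreter's (major, minor) version against the version you expect on the server. This is only a minimal sketch: the expected value 3.12 is taken from the reply's "typically Python 3.12" and is an assumption about your environment, and both helper names are hypothetical.

```python
import sys

# Assumption: the server side runs Python 3.12, as the reply says is typical
# for serverless. Change this tuple to match your own environment.
EXPECTED_SERVER_PYTHON = (3, 12)


def python_minor_mismatch(client, server=EXPECTED_SERVER_PYTHON):
    """Return True when the (major, minor) parts of the two versions differ."""
    return tuple(client[:2]) != tuple(server[:2])


def describe_client_python():
    """Format the local interpreter version as 'major.minor'."""
    return f"{sys.version_info.major}.{sys.version_info.minor}"
```

Calling `python_minor_mismatch(sys.version_info)` on the client tells you whether the local interpreter differs from the assumed server version before you define any UDFs.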

  • 0 kudos
1 More Replies
anilsampson
by New Contributor III
  • 1259 Views
  • 1 replies
  • 1 kudos

Databricks Dashboard run from Job issue

Hello, I am trying to trigger a Databricks dashboard via a workflow task. 1. When I deploy the job triggering the dashboard task via the local "Deploy bundle" command, deployment is successful. 2. When I try to deploy to a different environment via CI/CD while ...

Latest Reply
SP_6721
Honored Contributor II
  • 1 kudos

Hi @anilsampson, the error means your dashboard_task is not properly nested under the tasks section. tasks: - task_key: dashboard_task dashboard_task: dashboard_id: ${resources.dashboards.nyc_taxi_trip_analysis.id} warehouse_id: ${var.warehouse_...
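
The nesting rule in this reply can be checked mechanically once the job definition has been parsed into a dictionary. The sketch below is illustrative only: `misplaced_dashboard_tasks` is a hypothetical helper, not part of any Databricks tool, and it assumes the job spec uses the `tasks`, `task_key`, and `dashboard_task` keys shown in the reply.

```python
def misplaced_dashboard_tasks(job):
    """Return a list of problems with dashboard_task placement in a job spec dict."""
    problems = []
    # A dashboard_task directly on the job is the mistake the reply describes.
    if "dashboard_task" in job:
        problems.append(
            "dashboard_task is at the job level; it belongs inside an entry of tasks"
        )
    # Each entry under tasks that carries a dashboard_task also needs a task_key.
    for i, task in enumerate(job.get("tasks", [])):
        if "dashboard_task" in task and "task_key" not in task:
            problems.append(f"tasks[{i}] has dashboard_task but no task_key")
    return problems
```

An empty list means the `dashboard_task` entries are nested the way the reply shows.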

  • 1 kudos
amit_jbs
by New Contributor II
  • 5791 Views
  • 6 replies
  • 2 kudos

In Databricks deployment, .py files are getting converted to notebooks

A critical issue has arisen that is impacting our deployment planning for our client. We have encountered a challenge with our Azure CI/CD pipeline integration, specifically concerning the deployment of Python files (.py). Despite our best efforts, w...

Latest Reply
AGivenUser
New Contributor II
  • 2 kudos

Another option is Databricks Asset Bundles.

  • 2 kudos
5 More Replies
Dimitry
by Valued Contributor
  • 2353 Views
  • 1 replies
  • 2 kudos

Resolved! Cannot run merge statement in the notebook

Hi all, I'm trialing Databricks for running complex Python integration scripts. There will be different data sources (MS SQL, CSV files, etc.) that I need to push to a target system via GraphQL. So I selected Databricks over MS Fabric as it can handle comple...

Latest Reply
SP_6721
Honored Contributor II
  • 2 kudos

Hi @Dimitry, the issue you're seeing is due to delta.enableRowTracking = true. This feature adds hidden _metadata columns, which serverless compute doesn't support; that's why the MERGE fails there. Try this: you can disable row tracking with: ALTER...
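
The reply's statement is cut off in this listing, so the following is not a reconstruction of it. As a hedged sketch, one standard way to change a Delta table property is `ALTER TABLE ... SET TBLPROPERTIES`; the helper below (a hypothetical name) only builds that SQL string for the `delta.enableRowTracking` property named in the reply, and the table name is a placeholder.

```python
def disable_row_tracking_sql(table_name):
    """Build an ALTER TABLE statement that sets delta.enableRowTracking to false."""
    if "`" in table_name:
        raise ValueError("table name must not contain backticks")
    # Quote each part of a dotted name such as catalog.schema.table.
    quoted = ".".join(f"`{part}`" for part in table_name.split("."))
    return (
        f"ALTER TABLE {quoted} "
        "SET TBLPROPERTIES ('delta.enableRowTracking' = 'false')"
    )
```

In a notebook where `spark` is already defined, the string could be passed to `spark.sql(...)`; check the table's current properties first, since other features may depend on row tracking.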

  • 2 kudos
pargit2
by New Contributor II
  • 1641 Views
  • 2 replies
  • 0 kudos

feature store

I need to build a feature store for the data science team that will return one big DataFrame after one-hot encoding for almost every dimension, join, and group by. Should I create one feature store for the final output that contains all the relevant data, or create featur...

Latest Reply
Louis_Frolio
Databricks Employee
  • 0 kudos

Here are some things to consider:   The best practice for designing a feature store in your scenario depends on balancing scalability, maintainability, and the dynamic nature of some dimensions like doctor names. Here's an outlined recommendation bas...

  • 0 kudos
1 More Replies
VigneshJaisanka
by New Contributor II
  • 1924 Views
  • 2 replies
  • 0 kudos

Databricks DLT ADLS Access issue

We have a DLT pipeline configured with an SPN inside the notebook, which was working fine. After the credentials expired, we created new ones and updated them in the notebook. Now the pipeline is not able to read from ADLS. The SPN and my user ID have co...

Latest Reply
SP_6721
Honored Contributor II
  • 0 kudos

Hi @VigneshJaisanka, the issue likely comes from a permissions or configuration mismatch. Here are a few things worth checking: Make sure the SPN is set as the pipeline owner and has the necessary permissions on the ADLS resource. If you’re using Unity ...

  • 0 kudos
1 More Replies
mooze456
by New Contributor
  • 901 Views
  • 1 replies
  • 0 kudos

Delta Sharing & UC: Understanding the Initial Empty Predicate Query

We're testing our Delta Sharing server with Unity Catalog (UC) and noticed a behavior where a simple query like SELECT COUNT(1) FROM table_name WHERE col1 = 'value' triggers two /query requests to our server. The initial request arrives with empty pre...

Latest Reply
Louis_Frolio
Databricks Employee
  • 0 kudos

The initial /query request during a Delta Sharing operation with Unity Catalog serves a critical purpose in the query lifecycle. It is intended to retrieve the schema and basic metadata of the table, which helps in query planning and optimization. Th...

  • 0 kudos
Pratikmsbsvm
by Contributor
  • 2092 Views
  • 2 replies
  • 0 kudos

Migration of Power BI reports from Synapse to Databricks SQL (DBSQL)

We have 250 Power BI reports built on top of Azure Synapse, and we are now migrating from Azure Synapse to Databricks (DB SQL). How should we plan the cutover and strategy for Power BI? I'm just seeking high-level points we have to take care of in planning. Any techie ...

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

While your account Solution Architect (SA) will be able to guide you, if you still want to check what peers did, see https://community.databricks.com/t5/warehousing-analytics/migrate-azure-synapse-analytics-data-to-databricks/td-p/90663 and here http...

  • 0 kudos
1 More Replies
NIK251
by New Contributor III
  • 3149 Views
  • 3 replies
  • 1 kudos

Resolved! Delta Live Table Pipeline

I get an error message when I try to create a Delta Live Tables pipeline. My error is: com.databricks.pipelines.common.errors.deployment.DeploymentException: Failed to launch pipeline cluster 1207-112912-8e84v9h5: Encountered Quota Exhaustion issue in ...

Latest Reply
NIK251
New Contributor III
  • 1 kudos

Thanks sir, I solved it.

  • 1 kudos
2 More Replies
DebIT2011
by New Contributor III
  • 17562 Views
  • 4 replies
  • 9 kudos

Choosing between Azure Data Factory (ADF) and Databricks PySpark notebooks

I’m working on a project where I need to pull large datasets from Cosmos DB into Databricks for further processing, and I’m trying to decide whether to use Azure Data Factory (ADF) or Databricks PySpark notebooks for the extraction and processing tas...

Latest Reply
Johns404
New Contributor II
  • 9 kudos

Hi @DebIT2011, you're facing a classic architectural decision between orchestration with ADF versus direct transformation using Databricks PySpark notebooks. Both tools are powerful but serve different purposes depending on your project needs. Below i...

  • 9 kudos
3 More Replies
makerandcoder12
by New Contributor
  • 1405 Views
  • 1 replies
  • 0 kudos

How can I leverage Databricks for building end-to-end machine learning pipelines?

I’ve been following practical tutorials on makerandcoder, which often showcase hands-on machine learning projects using Python, scikit-learn, and Spark. I’m looking to scale my projects using the Databricks platform for better collaboration, data han...

Latest Reply
Louis_Frolio
Databricks Employee
  • 0 kudos

Databricks enables the creation of scalable, end-to-end machine learning (ML) pipelines by providing a comprehensive and collaborative platform that integrates key components for data handling, experimentation, and model deployment. Here’s how Databr...

  • 0 kudos
rafal_walisko
by New Contributor II
  • 2999 Views
  • 1 replies
  • 0 kudos

Optimal Strategies for downloading large query results with Databricks API

Hi everyone, I'm currently facing an issue with handling a large amount of data using the Databricks API. Specifically, I have a query that returns a significant volume of data, sometimes resulting in over 200 chunks. My initial approach was to retriev...
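
When a result arrives in many chunks, one common pattern is to stream rows chunk by chunk instead of collecting everything first. The sketch below is only an illustration under stated assumptions: the post does not say which API is in use, `fetch_chunk` is a hypothetical callable you would supply, and the `data_array` and `next_chunk_index` field names are assumed, modeled on the chunk responses of the SQL Statement Execution API.

```python
from typing import Callable, Dict, Iterator, List, Optional


def iter_rows(fetch_chunk: Callable[[int], Dict], first_index: int = 0) -> Iterator[List]:
    """Yield rows one chunk at a time so only a single chunk is held in memory."""
    index: Optional[int] = first_index
    while index is not None:
        # fetch_chunk is assumed to return a dict for the requested chunk index.
        chunk = fetch_chunk(index)
        for row in chunk.get("data_array", []):
            yield row
        # A missing next_chunk_index is treated as the end of the result.
        index = chunk.get("next_chunk_index")
```

Because this is a generator, rows can be written to a file or forwarded to a client as they arrive, which keeps memory use bounded by the size of one chunk.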

Latest Reply
Datagyan
New Contributor II
  • 0 kudos

I am also facing the same issue. One approach I will try tomorrow: I will create a job that uses a serverless job cluster. Then, whenever a user clicks the download button in the UI, this should trigger the job. The job will read the table as dat...

  • 0 kudos
arnas
by New Contributor II
  • 1480 Views
  • 3 replies
  • 0 kudos

S3 limited bucket permissions

Hi, can I run Databricks on a limited/restricted S3 bucket folder, with no access to the bucket root level, as access is restricted per project folder in IAM? i.e. s3://mybucket/myproject_abc/ Now I configured all permissions as per the documentation https://docs.databricks....

Latest Reply
arnas
New Contributor II
  • 0 kudos

Thanks, but no thanks, spam resides in JUNK folder

  • 0 kudos
2 More Replies