Data Engineering
Forum Posts

dbx-user7354
by New Contributor III
  • 738 Views
  • 3 replies
  • 3 kudos

Create a Job via SDK with JobSettings Object

Hey, I want to create a Job via the Python SDK with a JobSettings object.

import os
import time
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import jobs
from databricks.sdk.service.jobs import JobSettings

w = WorkspaceClien...

Latest Reply
nenetto
New Contributor II
  • 3 kudos

I just faced the same problem. The issue is that when you call JobSettings.as_dict(), the settings are parsed to a dict where all the values are also parsed recursively. When you pass the parameters as **params, the create method again tries to parse...

2 More Replies
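The double-parsing pitfall nenetto describes can be illustrated with a small self-contained mock. The `JobSettings`, `Task`, and `create` names here only mirror the shape of the real `databricks.sdk` Jobs API; this sketch is hypothetical and runs without the SDK:

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    task_key: str
    def as_dict(self):
        return {"task_key": self.task_key}

@dataclass
class JobSettings:
    name: str
    tasks: list = field(default_factory=list)
    def as_dict(self):
        # Recursively serializes nested objects, as the real SDK does.
        return {"name": self.name,
                "tasks": [t.as_dict() for t in self.tasks]}

def create(name, tasks=()):
    # Mimics jobs.create(): it expects typed Task objects and calls
    # as_dict() on each of them itself.
    return {"name": name, "tasks": [t.as_dict() for t in tasks]}

settings = JobSettings(name="demo", tasks=[Task("t1")])

# Failing pattern: as_dict() already turned each task into a plain dict,
# so create() then calls as_dict() on a dict and raises.
try:
    create(**settings.as_dict())
except AttributeError as e:
    print("double-parse error:", e)

# Working pattern: pass the still-typed fields straight through.
job = create(name=settings.name, tasks=settings.tasks)
print(job)
```

The takeaway matches the thread: unpack the typed fields of the settings object rather than its fully serialized dict.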
noname123
by New Contributor III
  • 550 Views
  • 2 replies
  • 0 kudos

Resolved! Delta table version protocol

I do:

df.write.format("delta").mode("append").partitionBy("timestamp").option("mergeSchema", "true").save(destination)

If the table doesn't exist, it creates a new table with "minReaderVersion":3,"minWriterVersion":7. Yesterday it was creating the table with "min...

Latest Reply
noname123
New Contributor III
  • 0 kudos

Thanks for the help. The issue was caused by the "Auto-Enable Deletion Vectors" setting.

1 More Replies
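The workspace-level "Auto-Enable Deletion Vectors" toggle that resolved this thread can also be overridden per table with a Delta table property, which keeps the older protocol versions for new tables. A sketch, with an assumed catalog/schema/table name:

```sql
-- Inspect the protocol versions of an existing table
DESCRIBE DETAIL my_catalog.my_schema.my_table;

-- Create a table with deletion vectors explicitly disabled, which avoids
-- the minReaderVersion 3 / minWriterVersion 7 upgrade
CREATE TABLE my_catalog.my_schema.my_table (id BIGINT, ts TIMESTAMP)
TBLPROPERTIES ('delta.enableDeletionVectors' = 'false');
```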
nihar_ghude
by New Contributor II
  • 649 Views
  • 2 replies
  • 0 kudos

OSError: [Errno 107] Transport endpoint is not connected

Hi, I am facing this error when performing a write operation in foreach() on a DataFrame. The piece of code was working fine for over 3 months but started failing last week. To give some context, I have a DataFrame extract_df which contains 2 colum...

[Attachment: nihar_ghude_0-1710175215407.png]
Data Engineering
ADLS
azure
python
spark
Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @nihar_ghude, instead of using foreach(), consider using foreachBatch(). This method allows you to apply custom logic to the output of each micro-batch, which can help address parallelism issues. Unlike foreach(), which operates on individual rows...

1 More Replies
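The shift Kaniz suggests, from per-row foreach() to micro-batch foreachBatch(), amounts to writing a function that receives a whole batch plus a batch id instead of one row at a time. A Spark-free sketch of that shape (process_batch and the sample rows are hypothetical; in Spark you would hand a function with the analogous (batch_df, batch_id) signature to df.writeStream.foreachBatch):

```python
def process_batch(rows, batch_id):
    # In Spark this would be (batch_df, batch_id): you write once per
    # micro-batch, reusing connections instead of paying per-row overhead.
    return [{"batch_id": batch_id, **row} for row in rows]

# Stand-in for two micro-batches arriving from a stream.
stream = [
    [{"file_name": "a.json", "content": "{}"}],
    [{"file_name": "b.json", "content": "{}"}],
]

results = []
for batch_id, rows in enumerate(stream):
    results.extend(process_batch(rows, batch_id))
print(results)
```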
oussValrho
by New Contributor
  • 596 Views
  • 1 replies
  • 0 kudos

Cannot resolve due to data type mismatch: incompatible types ("STRING" and ARRAY<STRING>

Hey, I have been getting this error for a while: Cannot resolve "(needed_skill_id = needed_skill_id)" due to data type mismatch: the left and right operands of the binary operator have incompatible types ("STRING" and "ARRAY<STRING>"). SQLSTATE: 42K09; and these ...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @oussValrho, The error message you’re encountering indicates a data type mismatch in your SQL query. Specifically, it states that the left and right operands of the binary operator have incompatible types: a STRING and an ARRAY<STRING>. Let’s bre...

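One common way out of this kind of mismatch, assuming one side is a single skill string and the other an ARRAY<STRING> of skills (the column name comes from the error message, the table layout is assumed), is a membership test or an explode instead of a direct equality:

```sql
-- Membership test instead of STRING = ARRAY<STRING>
SELECT *
FROM jobs j
JOIN skills s
  ON array_contains(j.needed_skill_id, s.skill_id);

-- Or flatten the array so both sides are STRING
SELECT j.*, skill
FROM jobs j
LATERAL VIEW explode(j.needed_skill_id) exploded AS skill
WHERE skill = 'some_skill_id';
```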
Lightyagami
by New Contributor
  • 1874 Views
  • 1 replies
  • 0 kudos

Save workbook with macros

Hi, is there any way to save a workbook without losing the macros in Databricks?

Data Engineering
Databricks
pyspark
Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @Lightyagami, when working with Databricks and dealing with macros, there are a few approaches you can consider to save a workbook without losing the macros. Export to Excel with macros enabled: you can generate an Excel file directly from PyS...

philipkd
by New Contributor III
  • 387 Views
  • 1 replies
  • 0 kudos

Cannot get past Query Data tutorial for Azure Databricks

I created a new workspace on Azure Databricks, and I can't get past this first step in the tutorial:

DROP TABLE IF EXISTS diamonds;
CREATE TABLE diamonds USING CSV OPTIONS (path "/databricks-datasets/Rdatasets/data-001/csv/ggplot2/diamonds.csv", hea...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @philipkd, It appears you’ve encountered an issue while creating a table in Azure Databricks using the Unity Catalog. Let’s address this step by step: URI Format: The error message indicates that the URI for your CSV file is missing a cloud f...

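On a Unity Catalog-enabled workspace, the tutorial's path-based CSV table often fails for the reason Kaniz hints at. One workaround, sketched here with an assumed catalog and schema, is to query the sample CSV through read_files, or to materialize it as a managed table:

```sql
-- Query the sample CSV directly
SELECT * FROM read_files(
  'dbfs:/databricks-datasets/Rdatasets/data-001/csv/ggplot2/diamonds.csv',
  format => 'csv',
  header => true
);

-- Or materialize it as a managed Delta table
CREATE TABLE main.default.diamonds AS
SELECT * FROM read_files(
  'dbfs:/databricks-datasets/Rdatasets/data-001/csv/ggplot2/diamonds.csv',
  format => 'csv',
  header => true
);
```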
alxsbn
by New Contributor III
  • 639 Views
  • 1 replies
  • 0 kudos

Resolved! Compute pool and AWS instance profiles

Hi everyone, we're looking at using the compute pool feature. Right now we mostly rely on all-purpose and job compute, and on both we use instance profiles to let the clusters access our S3 buckets and more. We don't see anything related to insta...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @alxsbn , Let’s delve into the details of compute pools and instance profiles. Compute Pools: Compute pools in Databricks allow you to manage and allocate compute resources efficiently. They provide a way to organize and share compute resource...

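To my understanding, the instance profile is typically specified on the cluster that draws from the pool rather than on the pool itself, which may be why alxsbn sees nothing pool-side. A sketch of such a cluster spec; the IDs, ARN, and Spark version below are placeholders, so verify against your own workspace:

```json
{
  "cluster_name": "pool-backed-cluster",
  "spark_version": "14.3.x-scala2.12",
  "instance_pool_id": "pool-1234567890abcdef",
  "num_workers": 2,
  "aws_attributes": {
    "instance_profile_arn": "arn:aws:iam::123456789012:instance-profile/my-databricks-s3-profile"
  }
}
```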
BjarkeM
by New Contributor II
  • 1471 Views
  • 6 replies
  • 0 kudos

Schema migration of production delta tables

Goal: We would like to be in control of schema migrations of delta tables in all dev and production environments, and it must be automatically deployed. I anticipated this to be a common problem with a well-known standard solution. But unfortunately, I ...

Latest Reply
zerobugs
New Contributor II
  • 0 kudos

Hello, so does this mean that it's necessary to migrate away from hive_metastore to unity_catalog in order to be able to use schema migrations?

5 More Replies
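For the simple additive case, Delta schema migrations can be expressed as plain DDL scripts that are checked into version control and run per environment; a minimal sketch, with assumed table and column names and a hypothetical migrations-log table:

```sql
-- Additive migration: widen the schema without rewriting data
ALTER TABLE prod.sales.orders
  ADD COLUMNS (discount_pct DOUBLE COMMENT 'added in migration 2024_03');

-- Record which migration ran where, so deployments stay reproducible
CREATE TABLE IF NOT EXISTS prod.meta.schema_migrations (id STRING, applied_at TIMESTAMP);
INSERT INTO prod.meta.schema_migrations VALUES ('2024_03_add_discount', current_timestamp());
```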
GOW
by New Contributor II
  • 200 Views
  • 2 replies
  • 1 kudos

Databricks to s3

I am new to data engineering in Databricks and need some guidance on moving data from Databricks to S3. Can I get an example job or approach for this?

Latest Reply
GOW
New Contributor II
  • 1 kudos

Thank you for the reply. Can I apply this to dbt or using a dbt macro to unload the data? So dbt models running in Databricks?

1 More Replies
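A common shape for the job GOW asks about, writing query results out to S3, is a CTAS against an external location or a directory export. Bucket, catalog, and table names below are placeholders, and the workspace needs S3 credentials (e.g. an instance profile or a Unity Catalog external location):

```sql
-- Write query results to S3 as an external Delta table
CREATE TABLE my_catalog.exports.orders_snapshot
LOCATION 's3://my-bucket/exports/orders_snapshot'
AS SELECT * FROM my_catalog.sales.orders;

-- Or dump plain CSV files to a directory
INSERT OVERWRITE DIRECTORY 's3://my-bucket/exports/orders_csv'
USING CSV OPTIONS (header 'true')
SELECT * FROM my_catalog.sales.orders;
```

Since dbt models ultimately compile to SQL run on Databricks, the CTAS form is also the pattern a dbt model or macro would emit.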
exilon
by New Contributor
  • 413 Views
  • 1 replies
  • 0 kudos

DLT streaming with sliding window missing last windows interval

Hello, I have a DLT pipeline where I want to calculate the rolling average of a column for the last 24 hours, updated every hour. I'm using the below code to achieve this:

@dlt.table()
def gold():
    df = dlt.read_stream("silver_table")...

Data Engineering
dlt
spark
streaming
window
Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @exilon, It seems like you’re trying to calculate a rolling average for a specific time window in your DLT pipeline. Let’s address the issue you’re facing. The behavior you’re observing is due to the way the window specification is defined. Whe...

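The 24-hour rolling average with an hourly slide that exilon describes maps to a sliding window aggregation; a SQL sketch of the equivalent grouping, with assumed table and column names. Note that in streaming mode the most recent window stays open (and thus appears "missing") until the watermark passes its end:

```sql
-- 24h windows sliding every 1h: each event lands in 24 overlapping windows
SELECT
  window(event_time, '24 hours', '1 hour') AS w,
  avg(value) AS rolling_avg
FROM silver_table
GROUP BY window(event_time, '24 hours', '1 hour');
```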
dbph
by New Contributor
  • 507 Views
  • 1 replies
  • 0 kudos

Databricks asset bundles error "failed to instantiate provider"

Hi all, I'm trying to deploy with Databricks Asset Bundles. When running bundle deploy, the process fails with the following error message:

failed execution pid=25092 exit_code=1 error="terraform apply: exit status 1\n\nError: failed to read schema for dat...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @dbph , It seems you’re encountering an issue with deploying Databricks Asset Bundles. Let’s troubleshoot this step by step. Terraform Provider Issue: The error message indicates a problem with the Terraform provider for Databricks. Specifical...

Stellar
by New Contributor II
  • 444 Views
  • 1 replies
  • 0 kudos

DLT DataPlane Error

Hi everyone, I am trying to build the pipeline, but when I run it I receive an error:

DataPlaneException: Failed to start the DLT service on the cluster. Please check the driver logs for more details or contact Databricks support.

This is from the driver ...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @Stellar, the error message you're encountering indicates an issue with starting the Delta Live Tables (DLT) service on your cluster. Let's break it down: DataPlaneException is a generic exception related to data plane operations. Failed ...

dasiekr
by New Contributor II
  • 365 Views
  • 3 replies
  • 0 kudos

Merge operation replaces most of the underlying parquets

Hello, I have the following situation which I would like to fully understand. I have a delta table that consists of 10k active parquet files. Every day I run a merge operation based on new deliveries, joining by the product_id key attribute. I checked me...

Latest Reply
Ajay-Pandey
Esteemed Contributor III
  • 0 kudos

Hi @dasiekr, please refer to the content below, which might help. MERGE: under the hood. Delta Lake completes a MERGE in two steps: first, perform an inner join between the target table and the source table to select all files that have matches; then, perform an outer...

2 More Replies
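The two-step behavior Ajay describes (inner join to find matching files, outer join to rewrite them) is why a broad merge condition can end up rewriting most of a table's parquet files; tightening the join with an extra pruning predicate limits which files match in step one. A sketch with assumed table and column names:

```sql
MERGE INTO target t
USING deliveries s
  ON t.product_id = s.product_id
 AND t.delivery_date = s.delivery_date  -- extra predicate prunes the files to rewrite
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *;
```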
Avinash_Narala
by New Contributor III
  • 1743 Views
  • 3 replies
  • 1 kudos

Resolved! export notebook

Hi, I want to export a notebook programmatically in Python. Is there a way to leverage the Databricks CLI from Python? Or any other way to export the notebook to my local PC?

Latest Reply
Kaniz
Community Manager
  • 1 kudos

Hi @Avinash_Narala, Let’s break down the steps for exporting a Databricks Notebook and explore options for leveraging the Databricks CLI in Python. Exporting a Databricks Notebook: Databricks allows you to import and export notebooks in various f...

2 More Replies
NarenderKumar
by New Contributor II
  • 952 Views
  • 2 replies
  • 1 kudos

Resolved! Unable to read data from ADLS using databricks serverless sql pool

I have a Databricks workspace and an Azure Data Lake Storage account. Both are present in the same VNet. Unity Catalog is enabled in the workspace, and I have created some tables in it. I am able to query the data from the tables when I use the a...

Latest Reply
Kaniz
Community Manager
  • 1 kudos

Hi @NarenderKumar, Configuring an ETL framework using Delta Live Tables (DLT) can be powerful, especially when you want to maintain flexibility and avoid hardcoding configurations directly in your notebook. Let’s explore some options for managing yo...

1 More Replies