Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Skully
by New Contributor
  • 999 Views
  • 1 reply
  • 0 kudos

Workflow Fail safe query

I have a large SQL query that includes multiple Common Table Expressions (CTEs) and joins across various tables, totaling approximately 2,500 lines. I want to ensure that if any part of the query or a specific CTE fails—due to a missing table or colu...

Latest Reply
LingeshK
Databricks Employee
  • 0 kudos

There are a few options you can try. Based on the information shared, I am assuming a skeleton for your complicated query as follows:
WITH cte_one AS (SELECT * FROM view_one),
-- Other CTEs...
-- Your main query logic
SELECT
FROM cte_one
-- Joins and other cl...

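The reply above suggests guarding the big query up front. As a rough illustration of that idea (the table names and the catalog listing below are made up, not from the thread), you can check that every table a CTE depends on actually exists before submitting the 2,500-line statement, so a missing table fails fast with a clear message instead of deep inside the query:

```python
# Sketch: fail fast if any table a CTE depends on is missing.
# `available` would come from your catalog (e.g. the output of
# SHOW TABLES); the names here are hypothetical.
def missing_tables(required, available):
    """Return the required tables that are not present, in order."""
    available = set(available)
    return [t for t in required if t not in available]

required = ["view_one", "orders", "customers"]
available = ["view_one", "customers"]

gaps = missing_tables(required, available)
# gaps == ["orders"]; an empty list means it is safe to run the query
```

If `gaps` is non-empty you can skip or rewrite the failing CTE instead of letting the whole statement error out.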
Krizofe
by New Contributor II
  • 8101 Views
  • 7 replies
  • 5 kudos

Resolved! Migrating data from synapse to databricks

Hello team, I have a requirement of moving all the tables from Azure Synapse (dedicated SQL pool) to Databricks. We have data coming from source into Azure Data Lake frequently, and we have Azure Data Factory to load data (data flow does the basic transfo...

Latest Reply
thelogicplus
Contributor II
  • 5 kudos

Hi @Krizofe , I just went through your details and thought I'd share our similar experience with an Azure Synapse to Databricks migration. We faced a similar situation and were initially hesitant. One of my colleagues recommended using Travinto Technologies acc...

6 More Replies
somedeveloper
by New Contributor III
  • 1553 Views
  • 3 replies
  • 0 kudos

Databricks Setting Dynamic Local Configuration Properties

It seems that Databricks is somehow setting the properties of local spark configurations for each notebook. Can someone point me to exactly how and where this is being done? I would like to set the scheduler to utilize a certain pool by default, but ...

Latest Reply
Louis_Frolio
Databricks Employee
  • 0 kudos

You will need to leverage cluster-level Spark configurations or global init scripts. This will allow you to set the "spark.scheduler.pool" property automatically for all workloads on the cluster. You can try navigating to "Compute", select the cluster y...

2 More Replies
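To make the reply above concrete, a cluster-level Spark config sketch follows. The two property names are standard Apache Spark settings, but the allocation-file path is an assumption, not taken from the thread; a notebook can also pick a pool per session with `spark.sparkContext.setLocalProperty("spark.scheduler.pool", "my_pool")`.

```
spark.scheduler.mode FAIR
spark.scheduler.allocation.file /databricks/driver/conf/fairscheduler.xml
```

These lines go under the cluster's Advanced options > Spark config; the XML file itself defines the pools and their weights.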
Sega2
by New Contributor III
  • 2665 Views
  • 1 replies
  • 0 kudos

cannot import name 'Buffer' from 'typing_extensions' (/databricks/python/lib/python3.10/site-package

I am trying to add messages to an Azure Service Bus from a notebook, but I get the error from the title. Any suggestions how to solve this?
import asyncio
from azure.servicebus.aio import ServiceBusClient
from azure.servicebus import ServiceBusMessage
from azure...

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

@Sega2 it sounds like the error occurs because the typing_extensions library version in your Databricks environment is outdated and does not include the Buffer class, which is being imported by one of the Azure libraries. Can you first try: %pip inst...

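The upgrade advice above can be paired with a quick version check before reinstalling. A minimal sketch in plain Python follows; the 4.6.0 minimum is an assumption about when `typing_extensions.Buffer` appeared, not a verified release number.

```python
# Sketch: decide whether `%pip install -U typing_extensions` is needed.
# The minimum version passed in is an assumption for illustration.
from importlib.metadata import version, PackageNotFoundError

def parse(v):
    # naive "X.Y.Z" parse; ignores pre-release suffixes
    return tuple(int(p) for p in v.split(".") if p.isdigit())

def needs_upgrade(installed, minimum):
    return parse(installed) < parse(minimum)

def pkg_needs_upgrade(pkg, minimum):
    try:
        return needs_upgrade(version(pkg), minimum)
    except PackageNotFoundError:
        return True  # not installed at all

# e.g. needs_upgrade("4.5.0", "4.6.0") is True
```

After upgrading in a notebook, remember to restart the Python process (`dbutils.library.restartPython()` on Databricks) so the new version is actually imported.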
Sudic29
by New Contributor
  • 1615 Views
  • 1 replies
  • 0 kudos

Bookmark in pdf

I am creating a pdf using pyspark and trying to make bookmarks for each table in the pages. All the bookmarks end up pointing to the first table in the first page. Please help me out here.

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

@Sudic29 can you please share more about what you have implemented so far? This requires dynamically tracking the page number during the PDF creation process. Example in Python:
from PyPDF2 import PdfReader, PdfWriter

def add_bookmarks_to_pdf(inpu...

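The reply's point, that every bookmark ends up on the first page when the page index is not tracked at write time, can be sketched without any PDF library. `PdfBuilder` below is a hypothetical stand-in for whatever writer you use; the fix is recording the current page index at the moment each table starts:

```python
# Sketch: record each table's starting page while building the PDF,
# instead of adding all bookmarks afterwards (which points them all
# at the first page). PdfBuilder is a stand-in for a real PDF writer.
class PdfBuilder:
    def __init__(self):
        self.page = 0        # zero-based index of the current page
        self.bookmarks = []  # (title, starting_page) pairs

    def add_table(self, title, pages_used):
        # bookmark the page this table STARTS on, before advancing
        self.bookmarks.append((title, self.page))
        self.page += pages_used

b = PdfBuilder()
b.add_table("customers", 2)  # fills pages 0-1
b.add_table("orders", 1)     # starts on page 2
# b.bookmarks == [("customers", 0), ("orders", 2)]
```

With a real library you would make the bookmark call (e.g. an outline-item API) at the same point `add_table` records the pair here.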
pemidexx
by New Contributor III
  • 1212 Views
  • 2 replies
  • 0 kudos

AI_QUERY does not accept modelParameters argument

I am trying to pass a column of data from python/pandas to Spark, then run AI_QUERY. However, when I attempt to pass modelParameters (such as temperature), the function fails. Below is a minimal example:
import pandas as pd
queries = pd.DataFrame([ ...

Latest Reply
pemidexx
New Contributor III
  • 0 kudos

Hi @Walter_C , yes, I am receiving this error when only attempting to set temperature, which should be supported on most if not all models, including the specific models I'm working with. The error message seems to indicate this is a problem with AI_...

1 More Replies
ChristianRRL
by Valued Contributor III
  • 1342 Views
  • 1 reply
  • 1 kudos

Resolved! Databricks Workflows - Generate Tasks Programmatically

Hi there, I've used Databricks Workflows to explicitly create tasks with known input parameters (either user input or default parameters). But I'm wondering, what if I want the output of one task to be a list of specific IDs (e.g. id = [7,8,10,13,27]...

Latest Reply
cgrant
Databricks Employee
  • 1 kudos

This sounds like a great fit for the For Each task type! Here is the blog, and the documentation

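As a rough shape of what the For Each task type mentioned above looks like in a job spec: a first task emits the ID list as a task value, and the loop task fans out over it. The field names and notebook paths below are a sketch from memory, so verify them against the blog and documentation the reply links before relying on them.

```yaml
tasks:
  - task_key: get_ids              # emits the ID list as a task value
    notebook_task:
      notebook_path: /jobs/get_ids   # hypothetical path
  - task_key: process_ids
    depends_on:
      - task_key: get_ids
    for_each_task:
      inputs: "{{tasks.get_ids.values.ids}}"  # list produced upstream
      concurrency: 5
      task:
        task_key: process_one_id
        notebook_task:
          notebook_path: /jobs/process_one    # hypothetical path
          base_parameters:
            id: "{{input}}"                   # one element per iteration
```

The upstream notebook would set the list with something like `dbutils.jobs.taskValues.set(key="ids", value=[7, 8, 10, 13, 27])`.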
jeroaranda
by New Contributor II
  • 2429 Views
  • 1 replies
  • 0 kudos

How to pass task name as parameter in scheduled job that will be used as a schema name in query

I want to run a parametrized SQL query in a task. Query: select * from {{client}}.catalog.table, with the client value being {{task.name}}. If client is a string parameter, it is replaced with quotes, which throws an error. If table is a dropdown list parame...

Latest Reply
Zach_Jacobson23
Databricks Employee
  • 0 kudos

Try this: select * from identifier(:catalog||'.schema.table'). The :catalog is a parameter within DBSQL. Replace schema and table with the actual names.

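Since identifier() still interpolates whatever string the parameter carries, it can help to validate a task-name-derived schema before building the query. A plain-Python sketch follows; the regex policy is an assumption for illustration, so tighten it to your own naming rules:

```python
import re

# Sketch: accept only simple unquoted-identifier characters before a
# task name is interpolated into identifier(:catalog||'.schema.table').
IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")

def safe_identifier(name):
    """Return name unchanged if it is a plain identifier, else raise."""
    if not IDENT.match(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name

# safe_identifier("client_a") returns "client_a";
# safe_identifier("bad-name;drop") raises ValueError
```

This keeps quoting out of the picture entirely: anything that would need backticks or quotes is rejected rather than interpolated.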
VicS
by Contributor
  • 2245 Views
  • 3 replies
  • 2 kudos

How to use custom whl file + pypi repo with a job cluster in asset bundles?

I tried looking through the documentation but it is confusing at best and misses important parts at worst.  Is there any place where the entire syntax and ALL options for asset bundle YAMLs are described? I found this https://docs.databricks.com/en/d...

Latest Reply
VicS
Contributor
  • 2 kudos

It took me a while to realize the distinction of the keys inside the task, so for anyone else looking into this: only one of the following keys can exist in a task definition:
tasks:
  - task_key: ingestion_delta
    # existing_c...

2 More Replies
ns_casper
by New Contributor II
  • 2816 Views
  • 5 replies
  • 1 kudos

Databricks Excel ODBC driver bug

Hello! I might have experienced a bug with the ODBC driver. We have an issue where, given certain privileges in Databricks, the ODBC driver is unable to show any schemas/tables. When we click the 'expand' button on any catalog in the list (of which we ...

Latest Reply
jbibs
New Contributor II
  • 1 kudos

Following this post - we are also facing the same issue. @KTheJoker - when I'm connecting and trying to expand a catalog, I do see the query fire off in the SQL Warehouse query history, but in Excel nothing is returned. I can see the schemas/tables...

4 More Replies
JissMathew
by Valued Contributor
  • 2508 Views
  • 6 replies
  • 1 kudos

Structured streaming in Databricks using delta table

Hi everyone, I’m new to Databricks and exploring its features. I’m trying to implement Change Data Capture (CDC) from the bronze layer to the silver layer using streaming. Could anyone share sample code or reference materials for implementing CDC wit...

Latest Reply
Mike_Szklarczyk
Contributor
  • 1 kudos

You can also look at https://www.databricks.com/resources/demos#tutorials 

5 More Replies
guangyi
by Contributor III
  • 1428 Views
  • 2 replies
  • 0 kudos

DLT pipeline observability questions (and maybe suggestions)

All my questions are around this code block:
@dlt.append_flow(target="target_table")
def flow_01():
    df = spark.readStream.table("table_01")

@dlt.append_flow(target="target_table")
def flow_02():
    df = spark.readStream.table("table_02")
The first qu...

Latest Reply
Nam_Nguyen
Databricks Employee
  • 0 kudos

Hello @guangyi , I am getting back to you with some insights. Regarding your first question about checkpointing: you can manually check the checkpointing location of your streaming table. The checkpoints of your Delta Live Tables are under Storage locatio...

1 More Replies
madams
by Contributor III
  • 1230 Views
  • 1 reply
  • 0 kudos

Databricks Asset Bundle - variables for job trigger

I'm using the Databricks CLI to deploy an asset bundle for a job. I'm attempting to set up the configuration such that the "dev" target does not have a trigger on it, and the "prod" target does. Essentially, the dev job is not scheduled to run and th...

Latest Reply
SigaEd
New Contributor II
  • 0 kudos

In your dev target, you can add mode to pause all triggers:
targets:
  dev:
    mode: development
DAB also has a new update; you can also use presets to handle different target settings:
targets:
  dev:
    presets:
      name_prefix: "testing_" # pre...

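Putting the reply above together, here is a fuller sketch of the two targets. The prefix and the `trigger_pause_status` preset are illustrative, not from the thread, so check them against the current Databricks Asset Bundles reference:

```yaml
targets:
  dev:
    mode: development        # development mode pauses triggers/schedules
    presets:
      name_prefix: "dev_"    # illustrative prefix for deployed resources
  prod:
    presets:
      trigger_pause_status: UNPAUSED  # prod triggers run on schedule
```

This keeps a single job definition in the bundle while letting each target decide whether its trigger is live.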
AWS1567
by New Contributor III
  • 25001 Views
  • 10 replies
  • 6 kudos

We've encountered an error logging you in.

I've been trying to log in for the past two days and I'm still facing this error: "We've encountered an error logging you in." I've tried to reset the password multiple times and nothing happened. My friend is also not able to log in. I request you to resolve t...

Latest Reply
rmutili
New Contributor II
  • 6 kudos

Hey, I am not able to log in to my work Databricks account. I am getting the above errors.

9 More Replies
JensV
by New Contributor II
  • 1205 Views
  • 1 reply
  • 0 kudos

Resolved! Using sql inside Notebook using question marks

Hi all, I have a very quick question that I hope someone can help with. I want to execute a very simple SQL statement like %sql select * from json.`/Volumes/adfmeta/Objects.json` where ObjectName like '%SGm$RITWebsader$911a%' However, the SQL does not ...

Latest Reply
JAHNAVI
Databricks Employee
  • 0 kudos

Hi @JensV , question marks are used as parameter placeholders, so could you please try to escape them using backslashes? select * from json.`/Volumes/adfmeta/Objects.json` where ObjectName like '%SGm\\$RITWebsader\\$911a%' Alternatively, w...

