Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

PB-Data
by New Contributor III
  • 932 Views
  • 2 replies
  • 1 kudos

right semi join

Hi All, I am having an issue running a simple right semi join in my Databricks Community Edition. select * from Y right semi join X on Y.y = X.a; Error: [PARSE_SYNTAX_ERROR] Syntax error at or near 'semi': extra input 'semi'. Not sure what is the issue wi...
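
A hedged note on the query above: as far as the documented Spark SQL join syntax goes, a semi join is only available in its LEFT SEMI form, which would explain the parser rejecting `right semi`. Assuming the intent is "rows of X that have at least one match in Y", the usual rewrite is to swap the table order and use LEFT SEMI JOIN. The sketch below is not run on Databricks; it uses Python's built-in sqlite3 with an equivalent EXISTS query to show the semantics, and the sample rows are made up for illustration.

```python
import sqlite3

# Toy tables named after the ones in the post; the rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE X (a INTEGER);
    CREATE TABLE Y (y INTEGER);
    INSERT INTO X VALUES (1), (2), (3);
    INSERT INTO Y VALUES (2), (2), (4);
    """
)

# Semi join semantics: keep each row of X that has at least one match in Y,
# return only X's columns, and do not duplicate rows when Y matches twice.
semi_join_rows = conn.execute(
    "SELECT a FROM X WHERE EXISTS (SELECT 1 FROM Y WHERE Y.y = X.a) ORDER BY a"
).fetchall()

# The same intent in Spark SQL syntax (a string only, not executed here):
# swap the tables and use LEFT SEMI JOIN instead of RIGHT SEMI JOIN.
SPARK_SQL_REWRITE = "SELECT * FROM X LEFT SEMI JOIN Y ON Y.y = X.a"

print(semi_join_rows)
print(SPARK_SQL_REWRITE)
```

Only the row with a = 2 survives, and it appears once even though Y contains the value 2 twice.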

Latest Reply
PB-Data
New Contributor III
  • 1 kudos

Thanks @szymon_dybczak 

  • 1 kudos
1 More Replies
NCat
by New Contributor III
  • 5946 Views
  • 4 replies
  • 0 kudos

ipywidgets: Uncaught ReferenceError: require is not defined

Hi, when I try to use ipywidgets, it returns the following error. I'm using Databricks with PrivateLink enabled on AWS, and the Runtime version is 12.2 LTS. Is there something I need in order to use ipywidgets in my environment?

Latest Reply
jvjvjvjvjv
New Contributor II
  • 0 kudos

I am currently experiencing the same error on Azure Databricks, Runtime version 15.3 ML, default Notebook Editor.

  • 0 kudos
3 More Replies
Avinash_Narala
by Contributor
  • 1949 Views
  • 1 replies
  • 0 kudos

Resolved! Liquid clustering vs partitioning

Hi, is liquid clustering a replacement for partitioning? Should we still use partitioning when we use liquid clustering? Can we use liquid clustering for all cases and ignore partitioning?

Latest Reply
szymon_dybczak
Contributor III
  • 0 kudos

Hi @Avinash_Narala, yeah, you can think of it as a partitioning replacement. According to the documentation (https://learn.microsoft.com/en-us/azure/databricks/delta/clustering): Delta Lake liquid clustering replaces table partitioning and ZORDER to simpli...

  • 0 kudos
PushkarDeole
by New Contributor III
  • 560 Views
  • 1 replies
  • 0 kudos

State store configuration with applyInPandasWithState for optimal performance

Hello, we are using a stateful pipeline for data processing and analytics. For the state store, we are using the applyInPandasWithState function; however, the state needs to be persistent across node restarts etc. At this point, we are not sure how the state ca...

Avinash_Narala
by Contributor
  • 552 Views
  • 0 replies
  • 0 kudos

Serverless Cluster Issue

Hi, while using a Serverless cluster I'm not able to access DBFS files; it says I don't have permission to the file. But when accessing them using an All-purpose cluster I'm able to access them. Why am I facing this issue?

youcanlearn
by New Contributor III
  • 611 Views
  • 2 replies
  • 2 kudos

Saving failed records with failed expectation name(s)

Hi all, I am using Databricks expectations to manage my data quality. But I want to save the failed records alongside the expectation name(s), one or many, that each record failed. The only way I figured out is not to use Databricks expectati...
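
To make the goal in this question concrete, here is a minimal plain-Python sketch of tagging each failed record with the names of the rules it fails. It does not use the Delta Live Tables expectations API, and it is not necessarily the approach the poster ended up with; the rule names, the predicates, the `failed_expectations` field, and the sample records are all hypothetical.

```python
# Hypothetical rules: expectation name -> predicate that returns True when the record passes.
rules = {
    "valid_id": lambda r: r["id"] is not None,
    "positive_amount": lambda r: r["amount"] > 0,
}


def failed_expectation_names(record, rules):
    """Return the names of every rule the record fails, in rule order."""
    return [name for name, passes in rules.items() if not passes(record)]


# Made-up sample records for illustration.
records = [
    {"id": 1, "amount": 10},
    {"id": None, "amount": -5},
    {"id": 3, "amount": 0},
]

# Keep only records that fail at least one rule, with the failed names attached.
quarantined = []
for record in records:
    failed = failed_expectation_names(record, rules)
    if failed:
        quarantined.append({**record, "failed_expectations": failed})

for row in quarantined:
    print(row)
```

The same shape carries over to a DataFrame: evaluate each rule as a boolean column, collect the names of the rules that evaluate to false into an array column, and write rows with a non-empty array to a quarantine table.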

Latest Reply
iakshaykr
New Contributor III
  • 2 kudos

@youcanlearn Have you explored this: https://docs.databricks.com/en/delta-live-tables/expectations.html

  • 2 kudos
1 More Replies
seefoods
by New Contributor III
  • 436 Views
  • 0 replies
  • 0 kudos

use dbutils outside a notebook

Hello everyone, I want to use dbutils functions outside my notebook, so I will use them in my external jar. I have added the dbutils library to my build.sbt file: "com.databricks" %% "dbutils-api" % "0.0.6". I have imported the library at the top of my code: import c...

erwingm10
by New Contributor
  • 384 Views
  • 1 replies
  • 0 kudos

Get Level Cluster Metrics

I'm looking for a way to optimize the consumption of the jobs in my company, and the last piece of data I need for this is the cluster-level metric called Active Tasks over time. Do we have any way to get this? Seems easy when is alr...

Latest Reply
szymon_dybczak
Contributor III
  • 0 kudos

Hi @erwingm10, unfortunately there is currently no direct endpoint in the REST API to get cluster metrics. You can extract some Ganglia metrics through custom scripting, but they're not as detailed as the ones you're looking for. Look at the links below ...

  • 0 kudos
Avinash_Narala
by Contributor
  • 362 Views
  • 1 replies
  • 0 kudos

Mosaic AI

Hi, while going through recent Databricks releases, I came to know about Mosaic AI, and I am a little confused about what Mosaic AI exactly is. What is it offering? From a data engineering point of view, what benefits can I expect? Anyone please answe...

Latest Reply
szymon_dybczak
Contributor III
  • 0 kudos

Hi @Avinash_Narala, I think it is targeted more at data scientists and people who are creating machine learning models. Here you can read about it, and as you can see it's all related to ML models, Gen AI, RAG etc.: https://www.databricks.com/product/mac...

  • 0 kudos
wallco26
by New Contributor III
  • 4684 Views
  • 3 replies
  • 0 kudos

Databricks External Data SQL Server Connection Dirty Reads

I've connected a SQL Server database as an external connection in Unity Catalog. It looks like when I write SELECT queries to that connection I end up locking my tables on the SQL Server. Is there a way to query these tables using a "with (nolock)" c...

Data Engineering
Database
SQL Server
Latest Reply
wallco26
New Contributor III
  • 0 kudos

Thanks Slash - where would the "with (nolock)" command fall into the SQL Syntax...within the OPTIONS section? What would the specific command look like? 

  • 0 kudos
2 More Replies
Devsql
by New Contributor III
  • 1071 Views
  • 4 replies
  • 1 kudos

What is the difference between _RAW tables and _APPEND_RAW tables in the Bronze layer of Azure Databricks

Hi Team, I would like to know the difference between _RAW tables and _APPEND_RAW tables in the Bronze layer. As both are STREAMING tables, why do we need 2 separate tables? Note: we are following the Medallion Architecture. Also, the above tables are created via Delta L...

Data Engineering
Azure Databricks
Delta Live Table
Delta Live Table Pipeline
Latest Reply
Devsql
New Contributor III
  • 1 kudos

Hi @Retired_mod, I saw your replies to other posts, so I thought I'd ask you: would you be able to help me with this?

  • 1 kudos
3 More Replies
PP09
by New Contributor II
  • 954 Views
  • 1 replies
  • 1 kudos

Job failing with the error message below

Caused by: HTTP Error -1; url='https://login.microsoftonline.com/271df5c2-953a-497b-93ad-7adf7a4b3cd7/oauth2/token' AzureADAuthenticator.getTokenCall threw java.net.UnknownHostException : login.microsoftonline.com shaded.databricks.azurebfs.org.apache...

Latest Reply
jenshumrich
Contributor
  • 1 kudos

For me, this was caused by the line (pyspark): children = [f for f in dbutils.fs.ls(node)] with node being "dbfs:/mnt/lifestrategy-blob/scada/", which is a mounted directory. It seems like the implementation of dbutils.fs is done with the same qualit...

  • 1 kudos
guangyi
by Contributor III
  • 1265 Views
  • 5 replies
  • 6 kudos

Resolved! Why is the workflow trigger status always paused?

I created a workflow job via Asset Bundle. However, after deploying the job to Databricks, the trigger status is always paused, no matter how I update the cron expression. I can manually trigger it successfully. I cannot figure out why. Am I mi...

Latest Reply
jacovangelder
Honored Contributor
  • 6 kudos

Next to the cron expression, you also need the following property: pause_status. For example:
schedule:
  quartz_cron_expression: 0 0 6 * * ?
  timezone_id: Europe/Amsterdam
  pause_status: UNPAUSED
The property can be set to PAUSED and UNPAUSED. Hope th...

  • 6 kudos
4 More Replies
ksenija
by Contributor
  • 643 Views
  • 2 replies
  • 1 kudos

Resolved! DLT pipeline - reading from external tables

Hello! I created a DLT pipeline where my sources are external tables. I have to apply changes (stored_as_scd_type = 1). However, when I run my pipeline, I don't see any incremental uploads. The data remains in the same state as when I first created th...

Latest Reply
lucasrocha
Databricks Employee
  • 1 kudos

Hello @ksenija, I hope this message finds you well. Is your source table receiving new records? If so, are the fields (operation/sequenceNum) being filled? If possible, please provide a sample of the code you are using to create your target table wit...

  • 1 kudos
1 More Replies
Avinash_Narala
by Contributor
  • 545 Views
  • 0 replies
  • 0 kudos

shared serverless vs dedicated serverless?

Hi All, I went through https://docs.databricks.com/en/admin/system-tables/serverless-billing.html and am wondering how serverless compute is shared across workloads. Is there an option to set that up? difference between shared serverless vs dedicated serve...

