Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

seefoods
by New Contributor III
  • 37 Views
  • 1 replies
  • 0 kudos

use dbutils outside a notebook

Hello everyone, I want to use the dbutils functions outside my notebook, in an external jar. I have added the dbutils library to my build.sbt file:
"com.databricks" %% "dbutils-api" % "0.0.6"
I have imported the library at the top of my code: import c...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @seefoods, in order to use the dbutils functions, you'll need to initialize the dbutils instance. You can do this by adding the following code at the beginning of your jar's main function:
val dbutils = com.databricks.dbutils_v1.DBUtilsHolder...

Avinash_Narala
by Contributor
  • 60 Views
  • 1 replies
  • 1 kudos

Resolved! Serverless Cluster Issue

Hi, while using a Serverless cluster I'm not able to access DBFS files; it says I don't have permission to the file. But while accessing them using an All-purpose cluster I'm able to access them. Why am I facing this issue?

Latest Reply
Kaniz_Fatma
Community Manager
  • 1 kudos

Hi @Avinash_Narala, when you use a Serverless cluster, it's associated with a Databricks-managed IAM role that accesses AWS resources. However, this role might lack the necessary permissions to access DBFS resources in your account. On the other hand...

Poovarasan
by New Contributor III
  • 356 Views
  • 7 replies
  • 1 kudos

Error while installing ODBC to shared cluster

I previously used the following script to install and configure the ODBC driver on our shared cluster in Databricks, and it was functioning correctly. However, I am currently experiencing issues where the installation is not working as expected. Plea...

Latest Reply
imsabarinath
New Contributor
  • 1 kudos

The below approach is working for me... I had to download the packages upfront and place them on a volume though.
#!/bin/bash
set -euxo pipefail
echo 'debconf debconf/frontend select Noninteractive' | debconf-set-selections
sudo ACCEPT_EULA=Y dpkg -i odbci...

Skr7
by New Contributor II
  • 701 Views
  • 2 replies
  • 0 kudos

Databricks Asset Bundles

Hi, I'm implementing Databricks Asset Bundles. My scripts are in GitHub, and my /resource folder has all the .yml files of my Databricks workflows, which point to the main branch:
git_source:
  git_url: https://github.com/xxxx
  git_provider: ...

Data Engineering
Databricks
Latest Reply
JacekLaskowski
New Contributor III
  • 0 kudos

Why not use Substitutions and Custom variables, which can be specified on the command line using --var="<key>=<value>"? With all these features your databricks.yml would look as follows:
variables:
  git_branch:
    default: main
git_source:
  git_url: https://git...
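A fuller sketch of what the reply suggests, assuming current Databricks Asset Bundles variable substitution; the job name and branch value are illustrative, and the git_url is the placeholder from the question:

```yaml
# databricks.yml -- sketch, not a verified configuration
variables:
  git_branch:
    description: Git branch the workflow should check out
    default: main

resources:
  jobs:
    my_workflow:                      # hypothetical job name
      git_source:
        git_url: https://github.com/xxxx
        git_provider: gitHub
        git_branch: ${var.git_branch}
```

Deploying with `databricks bundle deploy --var="git_branch=my-feature"` would then override the default without editing the YAML.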

PB09
by New Contributor II
  • 420 Views
  • 2 replies
  • 1 kudos

right semi join

Hi All, I am having an issue running a simple right semi join in my Databricks Community Edition.
select * from Y right semi join X on Y.y = X.a;
Error: [PARSE_SYNTAX_ERROR] Syntax error at or near 'semi': extra input 'semi'. Not sure what is the issue wi...
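For context, Spark SQL's grammar has historically accepted only LEFT SEMI JOIN (and LEFT ANTI), which would explain the parse error; swapping the table order and writing `X left semi join Y` gives the same result. The semi-join semantics can be illustrated in plain Python (the table contents below are made up):

```python
# Semi-join: keep rows of one side that have at least one match on the
# join key in the other side; the other side's columns are dropped.
x = [("a", 1), ("b", 2), ("c", 3)]     # rows of X as (a, other_col)
y = [("a", 10), ("c", 30), ("d", 40)]  # rows of Y as (y, other_col)

# `Y right semi join X on Y.y = X.a` keeps the rows of X that match
# some row of Y -- equivalent to `X left semi join Y`:
y_keys = {k for k, _ in y}
result = [row for row in x if row[0] in y_keys]
print(result)  # [('a', 1), ('c', 3)]
```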

Latest Reply
PB09
New Contributor II
  • 1 kudos

Thanks @Slash 

NCat
by New Contributor III
  • 4853 Views
  • 6 replies
  • 0 kudos

ipywidgets: Uncaught ReferenceError: require is not defined

Hi, when I tried to use ipywidgets, it returned the following error. I'm using Databricks with PrivateLink enabled on AWS, and the runtime version is 12.2 LTS. Is there something I need to do to use ipywidgets in my environment?

Latest Reply
jvjvjvjvjv
New Contributor
  • 0 kudos

I am currently experiencing the same error: Azure Databricks, runtime version 15.3 ML, default notebook editor.

Avinash_Narala
by Contributor
  • 55 Views
  • 1 replies
  • 0 kudos

Resolved! Liquid clustering vs partitioning

Hi, is liquid clustering a replacement for partitioning? Should we still use partitioning when we use liquid clustering? Can we use liquid clustering in all cases and ignore partitioning?

Latest Reply
Slash
New Contributor II
  • 0 kudos

Hi @Avinash_Narala, yeah, you can think of it as a partitioning replacement. According to the documentation (https://learn.microsoft.com/en-us/azure/databricks/delta/clustering): Delta Lake liquid clustering replaces table partitioning and ZORDER to simpli...
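As a quick illustration of the syntax difference (table and column names are made up), liquid clustering is declared with CLUSTER BY instead of PARTITIONED BY, and the clustering keys can be changed later without rewriting the table:

```sql
-- Partitioning: fixed at creation; changing it means rewriting the table
CREATE TABLE events_partitioned (id BIGINT, event_date DATE)
PARTITIONED BY (event_date);

-- Liquid clustering: keys can be altered later
CREATE TABLE events_clustered (id BIGINT, event_date DATE)
CLUSTER BY (event_date);

ALTER TABLE events_clustered CLUSTER BY (id);
```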

PushkarDeole
by New Contributor
  • 81 Views
  • 2 replies
  • 0 kudos

State store configuration with applyInPandasWithState for optimal performance

Hello, we are using a stateful pipeline for data processing and analytics. For the state store, we are using the applyInPandasWithState function; however, the state needs to be persistent across node restarts etc. At this point, we are not sure how the state ca...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @PushkarDeole, to leverage RocksDB as the state store with `applyInPandasWithState` in Databricks, configure your Spark session with the following setting:
spark.conf.set("spark.sql.streaming.stateStore.providerClass", "com.databricks.sql.streamin...
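For reference, the Databricks-specific class name in the reply is truncated; the equivalent configuration in open-source Spark uses the RocksDB provider that ships with Spark 3.2+. This is a config fragment that assumes an existing SparkSession named `spark`:

```python
# Config fragment -- OSS Spark RocksDB state store provider (not the
# truncated Databricks-specific class from the reply)
spark.conf.set(
    "spark.sql.streaming.stateStore.providerClass",
    "org.apache.spark.sql.execution.streaming.state.RocksDBStateStoreProvider",
)
```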

MyTrh
by New Contributor
  • 248 Views
  • 6 replies
  • 2 kudos

Delta table with unique columns incremental refresh

Hi Team, we have one huge streaming table from which we want to create another streaming table, picking a few columns from the original. But in this new table the rows must be unique. Can someone please help me with the imple...

Latest Reply
Slash
New Contributor II
  • 2 kudos

Hi @MyTrh, OK, I think I created a similar use case to yours. I have a streaming table with a column structure based on your example:
CREATE OR REFRESH STREAMING TABLE clicks_raw AS
SELECT *, current_timestamp() as load_time
FROM cloud_files('/Volumes/dev/d...
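The deduplication the thread asks about can be sketched in plain Python, independent of DLT; the field names follow the clicks example and are assumptions:

```python
# Keep the first occurrence of each key -- the behaviour dropDuplicates()
# gives on a streaming DataFrame (with a watermark bounding the state).
rows = [
    {"user_id": 1, "url": "/home"},
    {"user_id": 2, "url": "/docs"},
    {"user_id": 1, "url": "/home"},  # duplicate key: dropped
]
seen, unique_rows = set(), []
for row in rows:
    if row["user_id"] not in seen:
        seen.add(row["user_id"])
        unique_rows.append(row)
print(unique_rows)  # two rows, user_id 1 and 2
```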

Monsem
by New Contributor III
  • 5435 Views
  • 10 replies
  • 3 kudos

Resolved! No Course Materials Widget below Lesson

Hello everyone, in my Databricks Partner Academy account there is no course material, though it should be under the lesson video. How can I resolve this problem? Does anyone else face the same problem? I had submitted a ticket to ask the Databricks team bu...

Latest Reply
Medhat_Elassi
New Contributor II
  • 3 kudos

I have the same problem, can't find the course materials, only the slides in the last section.

youcanlearn
by New Contributor III
  • 163 Views
  • 2 replies
  • 2 kudos

Saving failed records with failed expectation name(s)

Hi all, I am using Databricks expectations to manage my data quality. But I wanted to save the failed records alongside the name(s) of the expectation(s), one or many, that the record failed. The only way I can figure out is not to use Databricks expectati...

Latest Reply
iakshaykr
New Contributor
  • 2 kudos

@youcanlearn Have you explored this: https://docs.databricks.com/en/delta-live-tables/expectations.html

erwingm10
by New Contributor
  • 62 Views
  • 1 replies
  • 0 kudos

Get Level Cluster Metrics

I'm looking for a way to optimize the consumption of the jobs in my company, and the last piece of data I need to achieve this is the cluster-level metric called Active Tasks over time. Do we have any way to get this? Seems easy when it's alr...

Latest Reply
Slash
New Contributor II
  • 0 kudos

Hi @erwingm10, unfortunately there is currently no direct endpoint in the REST API to get cluster metrics. You can extract some Ganglia metrics through custom scripting, but they're not as detailed as the ones you're looking for. Look at the links below ...

Avinash_Narala
by Contributor
  • 60 Views
  • 1 replies
  • 0 kudos

shared serverless vs dedicated serverless?

Hi All, I went through https://docs.databricks.com/en/admin/system-tables/serverless-billing.html and am wondering: how is serverless compute shared across workloads? Is there an option to set that up? What is the difference between shared serverless and dedicated serve...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @Avinash_Narala, Serverless Compute Overview: serverless compute allows you to run jobs and notebooks without managing infrastructure. It's designed for simplicity and efficiency. With serverless compute, you focus on implementing your data pr...

Avinash_Narala
by Contributor
  • 51 Views
  • 1 replies
  • 0 kudos

Mosaic AI

Hi, while going through recent Databricks releases, I came to know about Mosaic AI, and I am a little bit confused about what Mosaic AI exactly is. What is it offering? From a data engineering point of view, what benefits can I expect? Anyone please answe...

Latest Reply
Slash
New Contributor II
  • 0 kudos

Hi @Avinash_Narala, I think it is targeted more at people who are creating machine learning models, or data scientists. Here you can read about it, and as you can see it's all related to ML models, Gen AI, RAG, etc.: https://www.databricks.com/product/mac...

Prasad_Koneru
by New Contributor III
  • 62 Views
  • 1 replies
  • 0 kudos

How to export metadata of catalog objects

Hi All, I want to export the metadata of catalog objects (schemas, tables, volumes, functions, models) and import the metadata into another catalog. Do we have any ready-made process/notebook/method/API available to do this? Please help on this. Thanks in ...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @Prasad_Koneru, There is no direct import functionality in the Databricks Unity Catalog.
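While there is no ready-made export, the metadata itself is queryable through each Unity Catalog catalog's INFORMATION_SCHEMA, so a sketch of an export could start from something like this (the catalog name is illustrative):

```sql
-- Sketch: enumerate table metadata for one catalog (my_catalog is made up)
SELECT table_catalog, table_schema, table_name, table_type
FROM my_catalog.information_schema.tables;
```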
