Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

kwasi
by New Contributor II
  • 20953 Views
  • 10 replies
  • 2 kudos

Kafka timeout

Hello, I am trying to read topics from a Kafka stream but I am getting the timeout error below. java.util.concurrent.ExecutionException: kafkashaded.org.apache.kafka.common.errors.TimeoutException: Timed out waiting to send the call. Call: describeT...

Latest Reply
VZLA
Databricks Employee
  • 2 kudos

What's your Kafka broker version, and which Kafka client is in use (Spark's, kafka-python, confluent-kafka, ...)?

9 More Replies
himanshu_k
by New Contributor
  • 6720 Views
  • 3 replies
  • 0 kudos

Clarification Needed: Ensuring Correct Pagination with Offset and Limit in PySpark

Hi community, I hope you're all doing well. I'm currently engaged in a PySpark project where I'm implementing pagination-like functionality using the offset and limit functions. My aim is to retrieve data between a specified starting_index and ending_...

Latest Reply
Mathias_Peters
Contributor II
  • 0 kudos

Hi, did you find an answer to this question? I am having similar problems and have a slow solution that I need to improve upon. Thanks in advance.

2 More Replies
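
For readers following the pagination question above, here is a minimal sketch of turning an index range into an offset/limit pair. The names `start`, `stop`, `page_bounds`, and `fetch_page` are hypothetical and not from the post, and the range is assumed to be zero-based and inclusive. `fetch_page` assumes PySpark 3.4 or later, where `DataFrame.offset` is available, and it assumes the sort columns uniquely identify a row; without a total ordering, pages are not guaranteed to be stable between runs.

```python
from typing import Any, List, Tuple


def page_bounds(start: int, stop: int) -> Tuple[int, int]:
    """Convert a zero-based, inclusive [start, stop] range into (offset, limit).

    The inclusive convention is an assumption made for this sketch.
    """
    if start < 0 or stop < start:
        raise ValueError("expected 0 <= start <= stop")
    return start, stop - start + 1


def fetch_page(df: Any, order_cols: List[str], start: int, stop: int) -> Any:
    """Return the rows of a PySpark DataFrame that fall in [start, stop].

    Sorting first gives offset/limit a deterministic order to work on,
    provided order_cols form a unique key for the rows.
    """
    offset, limit = page_bounds(start, stop)
    return df.orderBy(*order_cols).offset(offset).limit(limit)
```

Calling `fetch_page(df, ["id"], 10, 19)` would request ten rows starting at the eleventh row in `id` order, assuming `id` is a unique column of `df`.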
Reza
by New Contributor III
  • 13828 Views
  • 11 replies
  • 6 kudos

Resolved! How can I search in a specific folder in Databricks?

There is a keyword search option in Databricks that searches for a command or word in the entire workspace. How can I search for a command in a specific folder or repository?

Latest Reply
Jensz007
New Contributor II
  • 6 kudos

@Atanu I agree with nelsoncardenas: the problem is not solved, and the answer currently only tells us we need to raise a feature request. Would it be possible to at least link the feature requested by nelsoncardenas to this post/answer? ...

10 More Replies
nayan_wylde
by Esteemed Contributor
  • 922 Views
  • 3 replies
  • 0 kudos

Installing Maven packages on a UC-enabled Standard mode cluster

Curious whether anyone has faced issues installing Maven packages on a UC-enabled cluster. Traditionally we used to install Maven packages from an Artifactory repo. I am trying to install the same package on a UC-enabled cluster (Standard mode). It worked whe...

Latest Reply
lingareddy_Alva
Honored Contributor III
  • 0 kudos

Hi @nayan_wylde. Yes, this is a common challenge when transitioning to Unity Catalog (UC) enabled clusters. The installation of Maven packages from Artifactory repositories does work differently in UC environments, but there are several approaches you c...

2 More Replies
PedroFaria2135
by New Contributor II
  • 1959 Views
  • 1 reply
  • 0 kudos

Resolved! How to add permissions to a Databricks Workflow deployed via Asset Bundle YAML?

Hey! I was deploying a new Databricks Workflow into my workspace via Databricks Asset Bundles. Currently, I have a very simple workflow, defined in a YAML file like this:
resources:
  jobs:
    example_job:
      name: example_job
      schedule:
        ...

Latest Reply
nikhilj0421
Databricks Employee
  • 0 kudos

Hi @PedroFaria2135, this can be done using the permissions key in the YAML file. Please refer to this document: https://learn.microsoft.com/en-us/azure/databricks/dev-tools/bundles/reference#permissions
permissions:
  - level: CAN_VIEW
    group_name: te...

Sangamswadik
by New Contributor III
  • 3080 Views
  • 5 replies
  • 2 kudos

Resolved! Unable to see all-purpose compute

In the workspace, I can only see SQL warehouses and apps; I've attached a screenshot. I don't see an option to create all-purpose compute. Can you please tell me if there is a way to create one? Under user entitlements page look Identity and access >...

Latest Reply
Execute
New Contributor II
  • 2 kudos

Please let us know how you resolved this.

4 More Replies
karthikmani
by New Contributor
  • 864 Views
  • 1 reply
  • 1 kudos

Resolved! How to log the errors?

We have a notebook with a generic framework that we created to run for multiple tables every day. We want to log errors, successes, and exceptions; any such errors need to be recorded in a log table so that we can troubleshoot based on the error log f...

Latest Reply
nayan_wylde
Esteemed Contributor
  • 1 kudos

You can create some custom functions to log the events and write them to a data lake, and then use Structured Streaming to read the data from the data lake into a Delta table.
%scala
// Functions
def set_local_variables() = {
  // get the variables ...
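
The reply's Scala code is truncated, so the following is only a rough Python sketch of the general idea it describes: wrap each table's processing in a function that records success or failure as a log record. All names here (`make_log_record`, `run_with_logging`, and the record fields) are hypothetical and are not taken from the reply.

```python
import traceback
from datetime import datetime, timezone
from typing import Callable, Dict, List


def make_log_record(table_name: str, status: str, error: str = "") -> Dict[str, str]:
    """Build one log entry; the field names are illustrative only."""
    return {
        "table_name": table_name,
        "status": status,
        "error": error,
        "logged_at": datetime.now(timezone.utc).isoformat(),
    }


def run_with_logging(
    table_name: str,
    process: Callable[[str], None],
    log: List[Dict[str, str]],
) -> bool:
    """Run process(table_name) and append a SUCCESS or FAILED record to log.

    Exceptions are caught so one failing table does not stop the others;
    the full traceback is kept in the record for troubleshooting.
    """
    try:
        process(table_name)
    except Exception:
        log.append(make_log_record(table_name, "FAILED", traceback.format_exc()))
        return False
    log.append(make_log_record(table_name, "SUCCESS"))
    return True
```

In a notebook, the collected records could then be appended to a log table, for example by building a DataFrame from the list and writing it in append mode; the target table and its schema would be choices specific to your workspace.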

OODataEng
by New Contributor III
  • 2074 Views
  • 6 replies
  • 1 kudos

Liquid clustering performance issue

Hello, I have a table with approximately 300 million records. It is 3.4 GB in size and consists of 305 files. I wanted to create liquid clustering for it and chose a date column as the clustering key. When I created a new table with the above details b...

Latest Reply
Yogesh_Verma_
Contributor II
  • 1 kudos

Hey @OODataEng, to create a new table in Databricks using the schema and data from an existing table, you can use the CREATE TABLE AS SELECT command. This command allows you to define a new table based on the results of a SELECT query executed on the...

5 More Replies
JohanS
by New Contributor III
  • 5855 Views
  • 2 replies
  • 1 kudos

Resolved! WorkspaceClient authentication fails when running on a Docker cluster

from databricks.sdk import WorkspaceClient
w = WorkspaceClient()
ValueError: default auth: cannot configure default credentials ...
I'm trying to instantiate a WorkspaceClient in a notebook on a cluster running a Docker image, but authentication fails.T...

Latest Reply
kyle_scherer1_5
New Contributor II
  • 1 kudos

Any progress here? Same issue, over a year later

1 More Replies
OODataEng
by New Contributor III
  • 821 Views
  • 2 replies
  • 0 kudos

Resolved! Git credentials for service principal running jobs

Hello, I have a permission issue when trying to access Azure DevOps and run a job using a Service Principal. I’ve read about the whole credentials topic, and indeed, when I create a PAT (Personal Access Token) through my personal user account, I can s...

Latest Reply
loui_wentzel
Contributor
  • 0 kudos

Using a PAT is how you authenticate as a user, so that you can configure your Service Principal (SP). If you follow this link, there's a guide to the next steps (you're on step 3 now). This article explains a bit more on how to set up the SP in Azur...

1 More Replies
KristiLogos
by Contributor
  • 1735 Views
  • 4 replies
  • 1 kudos

Simba JDBC Exception When Querying Tables via BigQuery Databricks Connection

Hello, I have a federated connection to BigQuery that has GA events tables for each of our projects. I'm trying to query each daily table, which contains about 400,000 each day, and load it into another table, but I keep seeing this Simba JDBC exception. ...

Latest Reply
tsekityam_2
New Contributor II
  • 1 kudos

I also have this issue, and I resolved it by casting all the record columns in BigQuery to strings before I dump the data. I first create a view like: create view xxx as select string_1, string_2, string_3, to_json_string(record_1) as record_1, to_json_s...

3 More Replies
mkwparth
by New Contributor III
  • 2063 Views
  • 3 replies
  • 1 kudos

Resolved! How to increase REPL time to prevent timeout error

Hi everyone, I've tried setting the Spark configuration spark.databricks.repl.timeout to 300, but I’m still getting a REPL timeout error saying it took longer than 60 seconds. It seems like the configuration might be incorrect. Can someone guide me o...

Latest Reply
mkwparth
New Contributor III
  • 1 kudos

Hi @Saritha_S, yes! I've configured the Spark config that you suggested. I'll observe for a few days and let you know. Thanks for your help!

2 More Replies
mickniz
by Contributor
  • 25382 Views
  • 8 replies
  • 2 kudos

Connect to Databricks from PowerApps

Hi all, I am currently trying to connect to Databricks Unity Catalog from a Power Apps Dataflow by using the Spark connector, specifying the HTTP URL and using a Databricks personal access token as shown in the screenshot below: I am able to connect but the issue is when...

Latest Reply
Toussaint_Webb
Databricks Employee
  • 2 kudos

If you are an Azure Databricks customer, there is now a connector for Power Platform (Power Apps, Copilot Studio, and Power Automate) in Public Preview. Blog | Documentation

7 More Replies
data4life
by New Contributor II
  • 1054 Views
  • 4 replies
  • 5 kudos

Relative Path Reading Ambiguity in running nested run commands

Hello all, I came across an unusual error while using the %run and dbutils.notebook.run() functionalities of the notebook in tandem; the particular scenarios are listed below. I have the directory structure below (simplified), where all 3 notebooks are loc...

Latest Reply
jameshughes
Contributor II
  • 5 kudos

I'm going to run an experiment in my workspace and let you know if I see the same thing. I'm not sure if I have seen this, but also not sure if my use of relative pathing previously had notebooks in different directories as you have listed. General...

3 More Replies
ashokv
by New Contributor II
  • 760 Views
  • 2 replies
  • 0 kudos

Range join hint does not help in faster execution of spark sql

Spark SQL execution did not complete even after 12 hours. I ran it on i3.xlarge with 4 worker nodes. Only two worker nodes showed as running, with CPU at 100%. What should I do differently?
--SQL
INSERT into attribute_results...
SELECT /*+ BROADCAST(t) ...

Latest Reply
saiprasadambati
New Contributor III
  • 0 kudos

Can you share the result of the query below?
select count(1) from transaction_attributes where analysis_start_date = '2025-05-01' and analysis_end_date = '2025-05-01'
If it has multiple entries, the join condition will lead to a cross join and henc...

1 More Replies
