Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Erik
by Valued Contributor III
  • 204 Views
  • 1 reply
  • 0 kudos

Use unity catalog access connector for autoloader file notification events

We have a Databricks access connector, and we have granted it access to file events. But how do we now use that access connector in cloudFiles/Auto Loader with file notifications? If I provide the ID in the "cloudFiles.clientId" option, I am asked to...

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

Thanks for your question!  If the access connector is still prompting for a secret or certificate when used in cloudFiles.clientId, this typically indicates that the authentication method is not being properly recognized. Here's what to check: Access...
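
For context, this is roughly what a working service-principal configuration for file-notification mode looks like; the option names come from the public cloudFiles documentation, while every ID, the secret scope, and the storage path below are placeholders:

# Sketch: Auto Loader file-notification mode on Azure with a service principal.
# All IDs, the secret scope, and the storage path are placeholders.
df = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.useNotifications", "true")
    .option("cloudFiles.clientId", "<application-client-id>")
    .option("cloudFiles.clientSecret", dbutils.secrets.get("my-scope", "sp-secret"))
    .option("cloudFiles.tenantId", "<tenant-id>")
    .option("cloudFiles.subscriptionId", "<azure-subscription-id>")
    .option("cloudFiles.resourceGroup", "<resource-group>")
    .load("abfss://container@account.dfs.core.windows.net/landing/")
)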

vbajaj1
by New Contributor II
  • 207 Views
  • 1 reply
  • 1 kudos

Resolved! Integrating Databricks Table with Web Page

Hi guys, we need to integrate a Databricks table with a web page, where I want to read the Databricks table and show it in a grid on the web page, and also give the ability to update the table from the web page. The Databricks table is in Unity Catalog. Has anyone trie...

Latest Reply
VZLA
Databricks Employee
  • 1 kudos

Thank you for your question! To integrate a Databricks table with a web page, you can follow these basic steps: Read the Table: Use a Databricks SQL endpoint or JDBC/ODBC driver to query the table from your web application backend. For example, if y...
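
As an illustration of that first step, a minimal sketch using the databricks-sql-connector Python package from a web backend; the hostname, HTTP path, token, and table name are placeholders:

from databricks import sql  # pip install databricks-sql-connector

# Query a Unity Catalog table from a web application backend; the connection
# details below are placeholders for your own SQL warehouse.
with sql.connect(
    server_hostname="<workspace-hostname>",
    http_path="/sql/1.0/warehouses/<warehouse-id>",
    access_token="<access-token>",
) as conn:
    with conn.cursor() as cursor:
        cursor.execute("SELECT * FROM main.default.my_table LIMIT 100")
        rows = cursor.fetchall()  # hand these rows to the web page's grid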

MattHeidebrecht
by New Contributor II
  • 417 Views
  • 3 replies
  • 1 kudos

Resolved! Translations from T-SQL: TOP 1 OUTER APPLY or LEFT JOIN

Hi all, I am wondering how you would go about translating either of the below to Spark SQL in Databricks. They are more or less equivalent statements in T-SQL. Please note that I am attempting to pair each unique Policy (IPI_ID) record with its highes...

Latest Reply
MattHeidebrecht
New Contributor II
  • 1 kudos

Thanks filipniziol!  I'll start running with that when I run into cases where I need an embedded TOP 1.
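
For readers landing on this thread, the usual Spark SQL stand-in for an embedded TOP 1 is a ROW_NUMBER window; apart from IPI_ID (from the question), the table and column names here are hypothetical:

# Keep only the highest-ranked related row per policy (IPI_ID), preserving
# policies with no match, like OUTER APPLY ... TOP 1. The policies and
# transactions tables and the ORDER BY column are hypothetical.
spark.sql("""
    SELECT *
    FROM (
        SELECT p.IPI_ID AS policy_id,
               t.*,
               ROW_NUMBER() OVER (
                   PARTITION BY p.IPI_ID
                   ORDER BY t.effective_date DESC
               ) AS rn
        FROM policies AS p
        LEFT JOIN transactions AS t
          ON t.IPI_ID = p.IPI_ID
    )
    WHERE rn = 1
""").show()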

2 More Replies
smit_tw
by New Contributor III
  • 313 Views
  • 3 replies
  • 2 kudos

Resolved! Creating a Databricks Asset Bundle with Sequential Pipelines and Workflow using YAML

Is it possible to create a repository with a Databricks asset bundle that includes the following: Test1 (Delta Live Tables pipeline), Test2 (Delta Live Tables pipeline), Test3 (Delta Live Tables pipeline), and a workflow job to execute the above pi...

Latest Reply
filipniziol
Contributor III
  • 2 kudos

Hi @smit_tw, great! If this resolves your question, please consider marking it as the solution. It helps others in the community find answers more easily.
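
For anyone searching for the general shape of such a bundle, an abbreviated databricks.yml sketch; the resource keys, names, and notebook paths are illustrative, and Test3 would follow the same pattern as Test1 and Test2:

# Abbreviated databricks.yml: DLT pipelines plus a job that runs them in
# sequence via depends_on. All names and paths are illustrative.
bundle:
  name: sequential_dlt_bundle

resources:
  pipelines:
    test1:
      name: Test1
      libraries:
        - notebook:
            path: ./src/test1_pipeline.py
    test2:
      name: Test2
      libraries:
        - notebook:
            path: ./src/test2_pipeline.py
  jobs:
    workflow_job:
      name: Workflow Job
      tasks:
        - task_key: run_test1
          pipeline_task:
            pipeline_id: ${resources.pipelines.test1.id}
        - task_key: run_test2
          depends_on:
            - task_key: run_test1
          pipeline_task:
            pipeline_id: ${resources.pipelines.test2.id}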

2 More Replies
satyasamal
by New Contributor II
  • 412 Views
  • 1 reply
  • 1 kudos

Resolved! org.apache.spark.SparkException: [TASK_WRITE_FAILED] Task failed while writing rows

Hello all, my DataFrame has 1 million records and contains XML documents as column values. I am trying to parse the XML using an XPath function. It works fine for small record counts, but it failed when running on 1 million records. Error message: ...

Latest Reply
VZLA
Databricks Employee
  • 1 kudos

Thank you for your question. The error is likely caused by memory issues or inefficient processing of the large dataset. Parsing XML with XPath is resource-intensive, and handling 1 million records requires optimization. You can try df = df.repartiti...
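
A small sketch of that suggestion, assuming the XML sits in a string column; the partition count, column name, and XPath expression are placeholders:

# Spread the XPath parsing across more, smaller tasks before extracting values;
# the partition count, column name, and XPath below are placeholders.
df = df.repartition(200)
parsed = df.selectExpr(
    "xpath_string(xml_payload, '/record/customer/name/text()') AS customer_name"
)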

fzlrfk
by New Contributor II
  • 323 Views
  • 2 replies
  • 0 kudos

Databricks internal error

I am new to Databricks. I am trying to debug an existing notebook which fails intermittently (a few times a month). Once it is re-run, it runs fine. Any help will be appreciated. I have attached sample code. Environment: Databricks Runtime Version: 14....

Latest Reply
fzlrfk
New Contributor II
  • 0 kudos

Hi, thanks for your response. As I said, the assistant gives a recommendation to change the code. Would changing the code help? When will a fix be released? Error: NoSuchElementException: None.get at scala.None$.get(Option.scala:527). Assistant fix: The e...

1 More Replies
ashraf1395
by Valued Contributor
  • 202 Views
  • 2 replies
  • 0 kudos

Issue while migrating from Hive metastore to UC with UCX

Tables: 4 tables in the schema databricks_log are not being migrated; the error shows they are in the DBFS root location and cannot be managed, while they are in a DBFS mount location. For example, this model_notebook_logs table's location is dbfs:/mnt but in...

Latest Reply
MuthuLakshmi
Databricks Employee
  • 0 kudos

@ashraf1395 To migrate managed tables stored at DBFS root to UC, you can do it through Deep Clone or Create Table As Select (CTAS). This also means that the HMS table data needs to be moved to the cloud storage location governed by UC. Please ensure ...
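
Sketched in SQL, with placeholder catalog names around the databricks_log schema from the question, the two alternatives (run one or the other) look like:

# Option 1: Deep Clone copies the data and metadata into a UC-managed table.
spark.sql("""
    CREATE TABLE main.databricks_log.model_notebook_logs
    DEEP CLONE hive_metastore.databricks_log.model_notebook_logs
""")

# Option 2: CTAS rewrites the data into the UC-governed storage location.
spark.sql("""
    CREATE TABLE main.databricks_log.model_notebook_logs AS
    SELECT * FROM hive_metastore.databricks_log.model_notebook_logs
""")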

1 More Replies
Barbarossa
by New Contributor II
  • 246 Views
  • 2 replies
  • 1 kudos

Resolved! Issue with Using Parameters for Table Version Selection in Databricks Dashboards

Hello Databricks Community, I have a quick question regarding the dashboarding functionality in Databricks: I’m utilizing table versioning, but I’m having trouble using a parameter to select a specific version as an input filter for my dashboard. Despi...

Latest Reply
Walter_C
Databricks Employee
  • 1 kudos

It seems that there is indeed a restriction on the use of params with the VERSION AS OF functionality, as the same behavior is shown when running it directly in the SQL editor. Per the docs, only date or timestamp strings are accepted...
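
Given that restriction, the form that does accept a string value is time travel by timestamp; the table name and timestamp here are placeholders:

# Time travel by timestamp string rather than a parameterized VERSION AS OF;
# the table name and timestamp value are placeholders.
spark.sql(
    "SELECT * FROM main.sales.orders TIMESTAMP AS OF '2024-12-01T00:00:00'"
).show()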

1 More Replies
mac08_flo
by New Contributor
  • 417 Views
  • 1 reply
  • 0 kudos

Log creation

Good afternoon. I am trying to add logging as I build out my code. The issue is that I still haven't found a way to write the logs to a separate file, rather than having them go to the terminal; I want them stored in a file (example.l...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

To store logs in a file instead of the terminal, you can use Python's basic logging configuration. Here's an example of how to do it: import logging # Basic logging configuration logging.basicConfig( fi...
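
A complete version of that snippet, assuming the example.log filename from the question; the level and format are likewise just examples:

import logging

# Route log records to a file instead of the terminal; the filename,
# level, and format are example values.
logging.basicConfig(
    filename="example.log",
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)

logging.info("This message is written to example.log, not the terminal.")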

costi9992
by New Contributor III
  • 383 Views
  • 1 reply
  • 0 kudos

Missing Fields in Databricks REST API Documentation & SDK Due to OpenAPI Spec Gaps

Hi Community, I've been working with the Databricks REST APIs and noticed some inconsistencies between the API documentation and the actual API responses. Specifically, there are a few fields returned in the API responses that are not documented but a...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

Hello, thanks for your question. Regarding last_time_activity and disk_spec: these fields have been deprecated, which is why they no longer appear in the API docs. You can refer to https://kb.databricks.com/clusters/databricks-api-la...

Leszek1
by New Contributor II
  • 315 Views
  • 1 reply
  • 0 kudos

Workflow job tasks wait

Hi, I've been having issues with Workflow Pipelines for the past 3-4 days. Performance is degraded, and the very strange behavior of the pipeline is that tasks wait ~2-3 minutes to start executing code in the notebook. This is visible when you look at one of the ta...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

Hello, are you still experiencing this issue? Are you counting the time it takes for the cluster to start up, or was the cluster already running or using Serverless?

flamezi2
by New Contributor
  • 330 Views
  • 1 reply
  • 0 kudos

Invalid request when using the Manual generation of an account-level access token

I need to generate an access token using the REST API and was following the guide here: manually-generate-an-account-level-access-token. When I try this cURL in Postman, I get an error, but the error description is not helpful. Error: I don't know what I'm missi...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

Are you replacing the Account_id with the actual account ID associated with your subscription? Also, what token are you using to authenticate and run this API call?
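
For reference, this is the general shape of the OAuth client-credentials request that guide describes, sketched with the Python requests library; the account ID and service-principal credentials are placeholders, and the host shown assumes the AWS accounts console (Azure uses accounts.azuredatabricks.net):

import requests

# Exchange service-principal credentials for an account-level OAuth token;
# the account ID, client ID, and secret below are placeholders.
ACCOUNT_ID = "<databricks-account-id>"
resp = requests.post(
    f"https://accounts.cloud.databricks.com/oidc/accounts/{ACCOUNT_ID}/v1/token",
    auth=("<service-principal-client-id>", "<oauth-secret>"),
    data={"grant_type": "client_credentials", "scope": "all-apis"},
)
resp.raise_for_status()
access_token = resp.json()["access_token"]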

GodSpeed
by New Contributor
  • 424 Views
  • 1 reply
  • 0 kudos

Postman Collection Alternatives for Data-Centric API Management?

I’ve been using Postman collections to manage APIs in my data projects, but I’m exploring alternatives. Are there tools like Apidog or Insomnia that perform better for API management, particularly when working with large data sets or data-driven work...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

Insomnia: Insomnia is another strong alternative that is frequently recommended. It is known for its simplicity and effectiveness in making REST API requests. Insomnia supports the import of Postman collections and is praised for its performance and ...

Jcowell
by New Contributor II
  • 244 Views
  • 2 replies
  • 0 kudos

Is Limit input rate Docs not correct?

In the Databricks docs it says, "If you use maxBytesPerTrigger in conjunction with maxFilesPerTrigger, the micro-batch processes data until either the maxFilesPerTrigger or maxBytesPerTrigger limit is reached." But based on the source code this is not true...

Latest Reply
ozaaditya
Contributor
  • 0 kudos

In my opinion, the reason for not using both options simultaneously is that the framework would face a logical conflict: should it stop reading after the maximum number of files is reached, even if the size limit hasn’t been exceeded? Or should it stop ...
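
For reference, this is how the two caps are set together on an Auto Loader stream; the format, limit values, and path are placeholders, and per the docs quoted in the question the micro-batch would be bounded by whichever limit is hit first:

# Both rate limits on one Auto Loader stream; values and path are placeholders.
df = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "parquet")
    .option("cloudFiles.maxFilesPerTrigger", "1000")
    .option("cloudFiles.maxBytesPerTrigger", "10g")
    .load("abfss://container@account.dfs.core.windows.net/data/")
)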

1 More Replies

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group