Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

labromb
by Contributor
  • 11505 Views
  • 9 replies
  • 4 kudos

How to pass configuration values to a Delta Live Tables job through the Delta Live Tables API

Hi Community, I have successfully run a job through the API, but I need to be able to pass parameters (configuration) to the DLT workflow via the API. I have tried passing JSON in this format: { "full_refresh": "true", "configuration": [ ...

Latest Reply
Manjula_Ganesap
Contributor
  • 4 kudos

@Mo - it worked. Thank you so much.

8 More Replies
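
For readers of this thread, a minimal sketch of the pattern that works, assuming the standard pipelines REST endpoints: pipeline-level configuration lives in the pipeline settings (edited via PUT /api/2.0/pipelines/{pipeline_id}), while POST /api/2.0/pipelines/{pipeline_id}/updates starts a run and accepts flags such as full_refresh (a boolean, not a string). The host, token, and configuration key below are placeholders.

```python
import requests

HOST = "https://<workspace-host>"       # placeholder
TOKEN = "<personal-access-token>"       # placeholder
PIPELINE_ID = "<pipeline-id>"           # placeholder
headers = {"Authorization": f"Bearer {TOKEN}"}

# 1) Merge new key/value pairs into the pipeline's "configuration" settings.
spec = requests.get(f"{HOST}/api/2.0/pipelines/{PIPELINE_ID}", headers=headers).json()["spec"]
spec["configuration"] = {**spec.get("configuration", {}), "my.start_date": "2024-01-01"}
requests.put(f"{HOST}/api/2.0/pipelines/{PIPELINE_ID}", headers=headers, json=spec)

# 2) Trigger the update; inside the pipeline, read the value with
#    spark.conf.get("my.start_date").
requests.post(f"{HOST}/api/2.0/pipelines/{PIPELINE_ID}/updates",
              headers=headers, json={"full_refresh": True})
```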
Ulman
by New Contributor II
  • 1928 Views
  • 8 replies
  • 0 kudos

Switching to File Notification Mode with ADLS Gen2 - Encountering StorageException

Hello, we are currently using Auto Loader in file listing mode for a stream, which is experiencing significant latency due to the non-incremental naming of files in the directory (a condition that cannot be altered). In an effort to mitigate this...

Data Engineering
ADLS gen2
autoloader
file notification mode
Latest Reply
Rah_Cencora
New Contributor II
  • 0 kudos

You should also reevaluate your use of premium storage for your landing-area files. Typically, storage for raw files does not need to be the fastest, most resilient, and most expensive tier. Unless you have a compelling reason for premium storage for la...

7 More Replies
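
As a reference for this thread, a hedged sketch of the Auto Loader options involved in switching to file notification mode on ADLS Gen2. The service-principal values are placeholders, and the principal needs permission to create the Event Grid subscription and queue that notification mode relies on.

```python
# Sketch: Auto Loader in file notification mode on ADLS Gen2.
# All credential values are placeholders; keep secrets in a secret scope.
df = (spark.readStream.format("cloudFiles")
      .option("cloudFiles.format", "json")
      .option("cloudFiles.useNotifications", "true")  # switch from listing mode
      .option("cloudFiles.subscriptionId", "<azure-subscription-id>")
      .option("cloudFiles.resourceGroup", "<storage-resource-group>")
      .option("cloudFiles.tenantId", "<tenant-id>")
      .option("cloudFiles.clientId", "<sp-client-id>")
      .option("cloudFiles.clientSecret", dbutils.secrets.get("<scope>", "<key>"))
      .load("abfss://<container>@<account>.dfs.core.windows.net/landing/"))
```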
ibrahim21124
by New Contributor III
  • 1049 Views
  • 7 replies
  • 0 kudos

Autoloader File Notification Mode not working as expected

I am using the following code to read from a source location in an ADLS Gen2 Azure storage container: core_df = (spark.readStream.format("cloudFiles").option("cloudFiles.format", "json").option("multiLine", "false").option(...

Latest Reply
Rishabh_Tiwari
Community Manager
  • 0 kudos

Hi @ibrahim21124, thank you for reaching out to our community! We're here to help you. To ensure we provide you with the best support, could you please take a moment to review the responses and choose the one that best answers your question? Your fe...

6 More Replies
jindalharsh2511
by New Contributor
  • 226 Views
  • 1 reply
  • 0 kudos

Facing frequent session expiration in Databricks Community Edition

I have been facing frequent session expiration in Databricks Community Edition since 15 Aug. Is it a bug, or is there a technical update going on? Please confirm. Thanks

Latest Reply
Ajay203
New Contributor II
  • 0 kudos

Facing the same issue. Is it still going on for you?

YFL
by New Contributor III
  • 5560 Views
  • 12 replies
  • 6 kudos

Resolved! When Delta is a streaming source, how can we get the consumer lag?

Hi, I want to keep track of the streaming lag from the source table, which is a Delta table. I see that in the query progress logs there is some information about the last version and the last file in the version for the end offset, but this doesn't give ...

Latest Reply
Anonymous
Not applicable
  • 6 kudos

Hey @Yerachmiel Feltzman, I hope all is well. Just wanted to check in: were you able to resolve your issue, or do you need more help? We'd love to hear from you. Thanks!

11 More Replies
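
One way to approximate the lag, sketched under the assumption that the Delta source offset in query.lastProgress exposes a reservoirVersion field (it does in current runtimes, but the offset layout is not a stable API): compare the last processed version with the table's latest version.

```python
import json
from delta.tables import DeltaTable

def version_lag(spark, query, source_table):
    """Rough lag, in table versions, between the Delta source and the stream."""
    progress = query.lastProgress
    if not progress or not progress["sources"]:
        return None
    end_offset = progress["sources"][0]["endOffset"]
    if isinstance(end_offset, str):        # Delta offsets surface as JSON text
        end_offset = json.loads(end_offset)
    processed = end_offset["reservoirVersion"]
    latest = DeltaTable.forName(spark, source_table).history(1).first()["version"]
    return latest - processed
```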
lshar
by New Contributor III
  • 23303 Views
  • 8 replies
  • 5 kudos

Resolved! How do I pass arguments/variables from widgets to notebooks?

Hello, I am looking for a solution to this problem, which has been known for 7 years: https://community.databricks.com/s/question/0D53f00001HKHZfCAP/how-do-i-pass-argumentsvariables-to-notebooks What I need is to parametrize my notebooks using widget infor...

Latest Reply
T_Ash
New Contributor II
  • 5 kudos

Can we create paginated reports with multiple parameters (where one parameter can dynamically change another), or pass a variable from one dataset to another dataset, like a Power BI paginated report, using a Databricks dashboard? Please let me know...

7 More Replies
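
For anyone landing on this resolved thread, a minimal sketch of the standard pattern with dbutils (the notebook path and widget name are illustrative):

```python
# Parent notebook: forward a widget value to a child notebook.
run_date = dbutils.widgets.get("run_date")
result = dbutils.notebook.run("./child_notebook", 600, {"run_date": run_date})

# Child notebook: arguments passed to dbutils.notebook.run() populate
# same-named widgets, so declare and read one.
dbutils.widgets.text("run_date", "")
run_date = dbutils.widgets.get("run_date")
```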
rameshybr
by New Contributor II
  • 351 Views
  • 3 replies
  • 0 kudos

Workflow - How to find the task ID at run time in the current notebook

There are four tasks in the workflow. How can I get the task ID at the beginning of the notebook, store it after finishing all the code cells in the notebook, and then save it into a table?

Latest Reply
menotron
Valued Contributor
  • 0 kudos

Hi @rameshybr, you can capture these as parameters in the task configuration, and from within the notebook you can use the widget utils to get their values.

2 More Replies
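
A sketch of what menotron describes, assuming jobs dynamic value references: configure a task parameter such as task_run_id = {{task.run_id}} (or {{job.run_id}}) in the task settings, then read and persist it from the notebook. The audit table name below is illustrative.

```python
# The task defines a base parameter: name "task_run_id", value "{{task.run_id}}".
dbutils.widgets.text("task_run_id", "")
task_run_id = dbutils.widgets.get("task_run_id")

# ... run the notebook's work, then record the run at the end.
(spark.createDataFrame([(task_run_id,)], "task_run_id string")
      .write.mode("append").saveAsTable("audit.task_runs"))
```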
Divs2308
by New Contributor II
  • 387 Views
  • 2 replies
  • 1 kudos

Apply changes in Delta Live tables

Hi, I have created Delta Live Tables (@dlt) and need to capture all CDC events (inserts, updates, deletes) happening on our source. I tried creating a streaming live table, but still wasn't able to achieve this. Do Delta Live Tables (@dlt) support only appe...

Latest Reply
Rishabh-Pandey
Esteemed Contributor
  • 1 kudos

Check this: The APPLY CHANGES APIs: Simplify change data capture with Delta Live Tables | Databricks on AWS

1 More Reply
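
A minimal sketch of the APPLY CHANGES pattern from that page (the source view, keys, and sequencing column are placeholders):

```python
import dlt
from pyspark.sql.functions import expr

dlt.create_streaming_table("customers")

dlt.apply_changes(
    target="customers",
    source="cdc_events",            # streaming source carrying CDC rows
    keys=["customer_id"],           # primary key used to match records
    sequence_by="sequence_num",     # ordering column for out-of-order events
    apply_as_deletes=expr("operation = 'DELETE'"),
    stored_as_scd_type=1,           # 1 = update in place, 2 = keep history
)
```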
Phani1
by Valued Contributor II
  • 449 Views
  • 3 replies
  • 2 kudos

Multi-language support in Databricks

Hi Team, how can I set up multiple languages in Databricks? For example, if I connect from Germany, the workspace and data should support German. If I connect from China, it should support Chinese, and if I connect from the US, it should be in English...

Latest Reply
Rishabh-Pandey
Esteemed Contributor
  • 2 kudos

1. In that case, you need to encode the data in that language's format. For example, if the data is in Japanese, then you need to encode it as UTF-8: CREATE OR REPLACE TEMP VIEW japanese_data AS SELECT * FROM csv.`path/to/japanese_data.csv` OPTIONS ('encoding'='UTF-8') al...

2 More Replies
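
On the data-encoding half of that answer, a cleaner sketch in PySpark (the path and charset are placeholders):

```python
# "encoding" is a standard CSV reader option; pick the charset the file
# was actually written in (e.g. "Shift_JIS" for some Japanese exports).
df = (spark.read.format("csv")
      .option("header", "true")
      .option("encoding", "UTF-8")
      .load("path/to/japanese_data.csv"))
```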
databricks8923
by New Contributor
  • 203 Views
  • 0 replies
  • 0 kudos

DLT Pipeline, Autoloader, Streaming Query Exception: Could not find ADLS Gen2 Token

I have set up Auto Loader to form a streaming table in my DLT pipeline: import dlt @dlt.table def streamFiles_new(): return (spark.readStream.format("cloudFiles").option("cloudFiles.format", "json").op...

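
No replies yet, but this error usually means the DLT cluster has no credentials for ABFS. One hedged sketch, assuming a service principal and the standard Hadoop ABFS OAuth keys: put entries like these in the pipeline's "configuration" settings (shown as a Python dict; account, tenant, and secret names are placeholders).

```python
# Hypothetical DLT pipeline "configuration" block, expressed as a Python dict.
dlt_configuration = {
    "fs.azure.account.auth.type.<account>.dfs.core.windows.net": "OAuth",
    "fs.azure.account.oauth.provider.type.<account>.dfs.core.windows.net":
        "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id.<account>.dfs.core.windows.net": "<sp-client-id>",
    "fs.azure.account.oauth2.client.secret.<account>.dfs.core.windows.net":
        "{{secrets/<scope>/<key>}}",  # secret-scope reference, not a literal
    "fs.azure.account.oauth2.client.endpoint.<account>.dfs.core.windows.net":
        "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
}
```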
explorer
by New Contributor III
  • 3458 Views
  • 4 replies
  • 1 kudos

Resolved! Deleting records manually in a Databricks streaming table

Hi Team, let me know if there is any way I can delete records manually from a Databricks streaming table without corrupting the table and data. Can we delete a few records (based on some condition) manually in a Databricks streaming table (having checkpoi...

Latest Reply
JunYang
Contributor
  • 1 kudos

If you use the applyChanges method in DLT for Change Data Capture (CDC), you can delete records manually without affecting the consistency of the table, as applyChanges respects manual deletions. You must configure your DLT pipeline to respect manu...

3 More Replies
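
Related to the accepted answer: if another stream reads from this table, manual deletes will normally make that downstream reader fail. The skipChangeCommits reader option (DBR 12.1 and later) is one documented way to let it ignore non-append commits; the table name below is illustrative.

```python
# Downstream reader that tolerates upstream deletes/updates by skipping
# commits that change or remove existing records (it only sees appends).
df = (spark.readStream
      .option("skipChangeCommits", "true")
      .table("main.bronze.events"))
```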
alexkychen
by New Contributor II
  • 317 Views
  • 2 replies
  • 0 kudos

How to read CSV files stored in my Databricks workspace using a Python script on my local computer?

I am developing a Python app on my local computer, and I would like to let it read some data stored in my Databricks workspace, preferably using Pandas. The data are stored in .csv files in the workspace. How can I make this happen? Is it possible to ...

Latest Reply
alexkychen
New Contributor II
  • 0 kudos

Hi Eni, thank you very much for your reply. I also did some research and realized that storing sensitive data (which is my case) in DBFS is no longer recommended by Databricks for security reasons, as stated here: https://docs.databricks.com/e...

1 More Reply
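
Since DBFS is discouraged here, one alternative is to pull the file over the REST API with the databricks-sdk (pip install databricks-sdk). A sketch with an illustrative path; authentication comes from environment variables or ~/.databrickscfg.

```python
import io
import pandas as pd
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()  # picks up DATABRICKS_HOST / DATABRICKS_TOKEN

# Download a workspace file into memory, then load it with pandas.
with w.workspace.download("/Users/me@example.com/data/sample.csv") as f:
    df = pd.read_csv(io.BytesIO(f.read()))
print(df.head())
```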
BeardyMan
by New Contributor III
  • 4991 Views
  • 10 replies
  • 3 kudos

Resolved! MLFlow Serve Logging

When using Azure Databricks and serving a model, we have received requests to capture additional logging. In some instances, they would like to capture input and output or even some of the steps from a pipeline. Is there any way we can extend the lo...

Latest Reply
Dan_Z
Honored Contributor
  • 3 kudos

Another word from a Databricks employee: """You can use the custom model approach, but configuring it is painful. Plus, you have to embed every loggable model in the custom model. Another, less intrusive solution would be to have a proxy server do the loggi...

9 More Replies
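
For the custom-model route mentioned above, a bare-bones sketch of a pyfunc wrapper that logs inputs and outputs around an inner model (the logging sink and the wrapped model are placeholders; a production setup would log asynchronously, as the reply suggests):

```python
import logging
import mlflow.pyfunc

class LoggingModel(mlflow.pyfunc.PythonModel):
    """Wraps any object exposing .predict() and logs request/response."""

    def __init__(self, inner_model):
        self.inner_model = inner_model

    def predict(self, context, model_input):
        logging.info("serving request: %d rows", len(model_input))
        output = self.inner_model.predict(model_input)
        logging.info("serving response: %s", output)
        return output

# Logged like any pyfunc model:
# mlflow.pyfunc.log_model("model", python_model=LoggingModel(trained_model))
```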
berk
by New Contributor II
  • 419 Views
  • 2 replies
  • 1 kudos

Delete Managed Table from S3 Bucket

Hello, I am encountering an issue with our managed tables in Databricks. The tables are stored in an S3 bucket. When I drop a managed table (either through the UI or by running a DROP TABLE statement in a notebook), the associated data is not being deleted f...

Latest Reply
berk
New Contributor II
  • 1 kudos

@kenkoshaw, thank you for your reply. It is indeed interesting that the data isn't immediately deleted after the table is dropped, and that there's no way to force this process. I suppose I'll have to manually delete the files from the S3 bucket if I...

1 More Reply
Ian_Neft
by New Contributor
  • 9675 Views
  • 3 replies
  • 0 kudos

Data Lineage in Unity Catalog not Populating

I have been trying to get data lineage to populate with the simplest of queries on a Unity-enabled catalog with a Unity-enabled cluster. I am essentially running the example provided with more data to see how it works with various aggregates dow...

Latest Reply
AlexYu
New Contributor III
  • 0 kudos

You might need to update your outbound firewall rules to allow connectivity to the Amazon Kinesis / Event Hubs endpoint. https://docs.databricks.com/en/data-governance/unity-catalog/data-lineage.html#:~:text=To%20view%20lineage%20for%20a,Runtime%2...

2 More Replies

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group