Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

AL1
by Contributor
  • 1666 Views
  • 3 replies
  • 2 kudos

In the spirit of the holiday season, share with us a picture of the reward(s) you received from the Databricks Community Rewards Store below!

databricks shirt
Latest Reply
Priyag1
Honored Contributor II
  • 2 kudos

Your T-shirt is super cool and awesome.

  • 2 kudos
2 More Replies
PriyaAnanthram
by Contributor III
  • 4199 Views
  • 6 replies
  • 0 kudos

Resolved! change data feed on delta live tables

I have a Delta Live Table where I read CDC data and merge it into silver using APPLY CHANGES. In silver, can I find out which data has changed since the last run, similar to the change data feed's table_changes?

Latest Reply
PriyaAnanthram
Contributor III
  • 0 kudos

I also have a requirement where I write to a live table (materialized view) with CDF enabled. I want to see the changes, but here too I see overwrites happening after the DLT pipeline runs.

  • 0 kudos
5 More Replies
rlink
by New Contributor II
  • 2662 Views
  • 3 replies
  • 2 kudos

Resolved! Data Science & Engineering Dashboard Refresh Issue Using Databricks

Hi everyone, I created a Data Science & Engineering notebook in Databricks to display some visualizations and also set up a schedule for the notebook to run every hour. I can see that the scheduled run is successful every hour, but the dashboard I crea...

Latest Reply
luis_herrera
Contributor
  • 2 kudos

To schedule a dashboard to refresh at a specified interval, schedule the notebook that generates the dashboard graphs. PS: Check the #DAIS2023 talks.

  • 2 kudos
2 More Replies
Prannu
by New Contributor II
  • 1600 Views
  • 2 replies
  • 1 kudos

Location of files previously uploaded on DBFS

I uploaded a CSV data file and used it in a Spark job three months ago. I am now running the same Spark job on a newly created cluster, and the program runs properly. I want to know where I can see the previously uploaded CSV data file.

Latest Reply
karthik_p
Esteemed Contributor
  • 1 kudos

@Pranay Gupta, you can see it in the DBFS root directory, based on the path you provided in the job. Please go to Data Explorer and select the option shown in the screenshot below.

  • 1 kudos
1 More Replies
Gopal269673
by Contributor
  • 1514 Views
  • 2 replies
  • 0 kudos

Calling jobs inside another job

Hi all, I have created two job flows, one for the transaction layer and another for the datamart layer. I need to specify the dependency between job1 and job2 and trigger job2 after job1 completes, without using any other orchestration tool o...

Latest Reply
Priyag1
Honored Contributor II
  • 0 kudos

Verify with documentation

  • 0 kudos
1 More Replies
SK21
by New Contributor II
  • 1753 Views
  • 3 replies
  • 1 kudos

CICD for Jobs @ WorkFlows

I have created jobs to trigger the respective notebooks in Databricks Workflows. Now I need to move them to further environments. Could you please help me with a CI/CD process to promote jobs to further environments?

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 1 kudos

Please use Jobs API 2.1. You can get the job and save its JSON definition to Git. In Git, set variables that define the Databricks workspaces (URL and token), and configure it so that a push triggers an API call with the JSON stored in Git.
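A minimal sketch of that export-and-recreate flow, assuming the `GET /api/2.1/jobs/get` and `POST /api/2.1/jobs/create` endpoints and bearer-token authentication. The helper names (`api_request`, `call`, `export_job_settings`, `create_job`) are hypothetical, and in practice the exported settings may need environment-specific adjustments (cluster IDs, notebook paths) before they are posted to the target workspace.

```python
import json
import urllib.request


def api_request(host, token, method, path, payload=None):
    """Build an authenticated request for the Databricks REST API."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    return urllib.request.Request(
        url=f"{host.rstrip('/')}{path}",
        data=data,
        method=method,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def call(request):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def export_job_settings(source_host, source_token, job_id):
    """Fetch a job from the source workspace and keep only its settings."""
    request = api_request(
        source_host, source_token, "GET", f"/api/2.1/jobs/get?job_id={job_id}"
    )
    return call(request)["settings"]


def create_job(target_host, target_token, settings):
    """Create a job in the target workspace from previously saved settings."""
    request = api_request(
        target_host, target_token, "POST", "/api/2.1/jobs/create", settings
    )
    return call(request)["job_id"]
```

A CI step could call `export_job_settings` once to write the JSON into the repository, then call `create_job` on each push with the host and token taken from pipeline variables.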

  • 1 kudos
2 More Replies
fijoy
by Contributor
  • 6936 Views
  • 1 replies
  • 2 kudos

Resolved! Using widget values in a shell script cell

I have a Databricks notebook containing a mix of SQL, Python, and shell script cells. I know I can retrieve and use values of widgets in Python cells using dbutils.widgets.get('key') and in SQL cells using ${key}. How can I use widget values in shell ...

Latest Reply
fijoy
Contributor
  • 2 kudos

For those interested, I found and am for now using this workaround: https://stackoverflow.com/questions/54662605/how-to-pass-a-python-variables-to-shell-script-in-azure-databricks-notebookbles while I wait for a more direct method.
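One common workaround of this kind is to copy the value into an environment variable from a Python cell and read it from the shell. The sketch below illustrates the idea outside Databricks: a literal stands in for `dbutils.widgets.get('key')`, a subprocess stands in for a later `%sh` cell, and the helper `export_for_shell` and the variable name `KEY` are hypothetical. It assumes the shell cell inherits the environment of the notebook's Python process.

```python
import os
import subprocess


def export_for_shell(name, value):
    """Expose a value to child shell processes as an environment variable."""
    os.environ[name] = str(value)


# In a Databricks Python cell the value would come from the widget, e.g.
# export_for_shell("KEY", dbutils.widgets.get("key")); a literal stands in here.
export_for_shell("KEY", "2024-01-01")

# Stand-in for a later %sh cell: the shell reads the variable as $KEY.
result = subprocess.run(
    ["sh", "-c", 'echo "widget value: $KEY"'],
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout.strip())
```

Because the value travels through the process environment, it arrives as a string; the shell side has to parse anything that is not plain text.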

  • 2 kudos
AmanSehgal
by Honored Contributor III
  • 15533 Views
  • 6 replies
  • 15 kudos

Job cluster vs All purpose cluster

Environment: Azure. I have a workflow that takes approximately a minute to execute, and I want to run the job every 2 minutes. All-purpose cluster: On attaching an all-purpose cluster to the job, it takes approx. 60 seconds to execute. Using job cluster: On at...

Latest Reply
Priyag1
Honored Contributor II
  • 15 kudos

Thanks for sharing

  • 15 kudos
5 More Replies
Siddu07
by New Contributor II
  • 5535 Views
  • 3 replies
  • 1 kudos

How to change the audit log delivery Service Account?

Hi Team, I'm trying to set up audit log delivery based on the documentation "https://docs.gcp.databricks.com/administration-guide/account-settings-gcp/log-delivery.html". As per the document, I've created a multi-region storage bucket; however, I'm not ...

Latest Reply
Priyag1
Honored Contributor II
  • 1 kudos

Documentation helps in many tasks

  • 1 kudos
2 More Replies
Mike_016978
by New Contributor II
  • 11452 Views
  • 3 replies
  • 3 kudos

Resolved! What are differences between Materialized view and Streaming table in delta live table?

Hi, I was wondering what the differences are between a materialized view and a streaming table. Which one should I use when I extract data from a bronze table to a silver table? I found that both CREATE LIVE TABLE and CREATE STREAMING LIVE TABLE could a...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Mike Chen, thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs. Please help us select the best solution by clicking on "Select As Best" if it does. Your feedback wi...

  • 3 kudos
2 More Replies
J_M_W
by Contributor
  • 4165 Views
  • 3 replies
  • 3 kudos

Resolved! Can you use %run or dbutils.notebook.run in a Delta Live Table pipeline?

Hi there, can you use %run or dbutils.notebook.run() in a Delta Live Table (DLT) pipeline? When I try, I get the following error: "IllegalArgumentException: requirement failed: To enable notebook workflows, please upgrade your Databricks subscriptio...

Latest Reply
J_M_W
Contributor
  • 3 kudos

Hi all. @Kaniz Fatma, thanks for your answer. I am on the premium pricing tier in Azure. After digging around the logs, it would seem that you cannot run magic commands in a Delta Live Table pipeline. Therefore, you cannot use %run in a DLT pipeline - w...

  • 3 kudos
2 More Replies
logan0015
by Contributor
  • 4163 Views
  • 6 replies
  • 4 kudos

Resolved! Getting a key mismatch error with Delta Live Tables.

I am attempting to create a streaming Delta Live Table. The main issue I am experiencing is the error below.
com.databricks.sql.cloudfiles.errors.CloudFilesIllegalStateException: Found mismatched event: key
I have an AWS AppFlow that is creating a fold...

Latest Reply
VijaC_97468
New Contributor II
  • 4 kudos

Hi, I am also facing the same issue, but I found nothing in the documentation to fix it.

  • 4 kudos
5 More Replies
Mikki007
by New Contributor II
  • 6057 Views
  • 3 replies
  • 0 kudos

How to extract the start and end time of the command line cell of the notebook using REST API in Azure Databricks?

Hi, I have a notebook with many command line cells in it. I want to extract the execution time of each cell using the Databricks REST API. How can I do that? Please note: I managed to get the start and end time of the job using the REST API (/2.1/jobs/runs/get) f...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Milind Keer, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers y...

  • 0 kudos
2 More Replies
g96g
by New Contributor III
  • 1818 Views
  • 3 replies
  • 0 kudos

data is not written back to data lake

I have a strange case where data is not written back to the data lake. I have 3 containers: Bronze, Silver, and Gold. I have done the mounting and have no problem reading the source data and writing it to the Bronze layer (using the Hive metastore catalog). T...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Givi Salu, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we ...

  • 0 kudos
2 More Replies

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group