cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
cancel
Showing results for 
Search instead for 
Did you mean: 

Forum Posts

User16835756816
by Valued Contributor
  • 2446 Views
  • 4 replies
  • 11 kudos

How can I extract data from different sources and transform it into a fresh, reliable data pipeline?

Tip: These steps are built out for AWS accounts and workspaces that are using Delta Lake. If you would like to learn more watch this video and reach out to your Databricks sales representative for more information.Step 1: Create your own notebook or ...

  • 2446 Views
  • 4 replies
  • 11 kudos
Latest Reply
Ajay-Pandey
Esteemed Contributor III
  • 11 kudos

Thanks @Nithya Thangaraj​ 

  • 11 kudos
3 More Replies
BigJay
by New Contributor II
  • 3036 Views
  • 5 replies
  • 5 kudos

Capture num_affected_rows in notebooks

If I run some code, say for an ETL process to migrate data from bronze to silver storage, when a cell executes it reports num_affected_rows in a table format. I want to capture that and log it in my logger. Is it stored in a variable or syslogged som...

  • 3036 Views
  • 5 replies
  • 5 kudos
Latest Reply
-werners-
Esteemed Contributor III
  • 5 kudos

Good one Dan! I never thought of using the delta api for this but there you go.

  • 5 kudos
4 More Replies
User16826994223
by Honored Contributor III
  • 706 Views
  • 1 replies
  • 0 kudos

How is the ETL process different than trigger once stream

I am little confused between what to use between structured stream(trigger once) and etl batch jobs, can I get help here on which basis i should make my decision.

  • 706 Views
  • 1 replies
  • 0 kudos
Latest Reply
sajith_appukutt
Honored Contributor II
  • 0 kudos

In Structured Streaming, triggers are used to specify how often a streaming query should produce results. A RunOnce trigger will fire only once and then will stop the query - effectively running it like a batch job.Now, If your source data is a strea...

  • 0 kudos
Labels