<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic How to process images and video through structured streaming using Delta Lake? in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/how-to-process-images-and-video-through-structured-streaming/m-p/18738#M12477</link>
    <description>&lt;P&gt;Can we scan through videos and identify and alert in real time if something goes wrong? What are the best practices for this kind of use case?&lt;/P&gt;</description>
    <pubDate>Fri, 25 Jun 2021 20:37:39 GMT</pubDate>
    <dc:creator>Srikanth_Gupta_</dc:creator>
    <dc:date>2021-06-25T20:37:39Z</dc:date>
    <item>
      <title>How to process images and video through structured streaming using Delta Lake?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-process-images-and-video-through-structured-streaming/m-p/18738#M12477</link>
      <description>&lt;P&gt;Can we scan through videos and identify and alert in real time if something goes wrong? What are the best practices for this kind of use case?&lt;/P&gt;</description>
      <pubDate>Fri, 25 Jun 2021 20:37:39 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-process-images-and-video-through-structured-streaming/m-p/18738#M12477</guid>
      <dc:creator>Srikanth_Gupta_</dc:creator>
      <dc:date>2021-06-25T20:37:39Z</dc:date>
    </item>
    <item>
      <title>Re: How to process images and video through structured streaming using Delta Lake?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-process-images-and-video-through-structured-streaming/m-p/18739#M12478</link>
      <description>&lt;P&gt;Yes, you can use &lt;A href="https://docs.databricks.com/data/data-sources/image.html" alt="https://docs.databricks.com/data/data-sources/image.html" target="_blank"&gt;Delta Lake to process images&lt;/A&gt;. With video you would typically process each frame, which is very similar to image processing.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;To do what you are describing, you would either stream the image data through a message bus that Databricks can read from, or drop image/video files into cloud storage and load them into a streaming DataFrame.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Once you have a streaming DataFrame, you would likely use a foreachBatch function to score each image/frame with the ML/DL model you previously trained, identifying any alerts you are interested in.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Code below:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;// imports used by the snippet below
import io.delta.tables.DeltaTable
import org.apache.spark.sql.streaming.Trigger

// function to upsert data from silver to gold
def upsertBatchData(microBatchDF: org.apache.spark.sql.Dataset[org.apache.spark.sql.Row], batchId: scala.Long) = {
    // apply transformations as needed

    // write the data
    if (DeltaTable.isDeltaTable(gold_data)) {
        val deltaTable = DeltaTable.forPath(spark, gold_data) // set the delta table for upsert

        (deltaTable.alias("delta_table")
        .merge(microBatchDF.alias("updates"), "updates.word = delta_table.word") // join dataframe 'updates' with delta table 'delta_table' on the key column
        .whenMatched().updateAll() // if we match a key then we update all columns
        .whenNotMatched().insertAll() // if we do not match a key then we insert all columns
        .execute() )
    } else {
        // first batch: no Delta table exists at gold_data yet, so create it
        microBatchDF.write.format("delta").mode("overwrite").option("mergeSchema", "true").save(gold_data)
    }
}

val silverDF = spark.readStream.format("delta").option("ignoreChanges", "true").load(silver_data) // read the silver data as a stream
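// Sketch (not from the original post) of the other ingestion option described
// above: loading image/video-frame files dropped into cloud storage as a
// streaming DataFrame with Databricks Auto Loader. The path "/mnt/raw/images"
// and the jpg filter are placeholder assumptions.
val imageStreamDF = spark.readStream
    .format("cloudFiles")                      // Databricks Auto Loader
    .option("cloudFiles.format", "binaryFile") // rows carry path, length and raw content bytes
    .option("pathGlobFilter", "*.jpg")         // pick up only jpeg frames
    .load("/mnt/raw/images")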

// write the silver data as a stream update
silverDF.writeStream
    .format("delta")
    .option("checkpointLocation", silver_checkpoint)
    .trigger(Trigger.Once())
    .foreachBatch(upsertBatchData _)
    .outputMode("update")
    .start()

display(spark.read.format("delta").load(gold_data))&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 30 Jul 2021 15:59:56 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-process-images-and-video-through-structured-streaming/m-p/18739#M12478</guid>
      <dc:creator>Ryan_Chynoweth</dc:creator>
      <dc:date>2021-07-30T15:59:56Z</dc:date>
    </item>
  </channel>
</rss>

