Data Engineering

Forum Posts

azera
by New Contributor II
  • 778 Views
  • 2 replies
  • 2 kudos

Stream-stream window join after time window aggregation not working in 13.1

Hey, I'm trying to perform a time-window aggregation on two different streams, followed by a stream-stream window join as described here. I'm running Databricks Runtime 13.1, exactly as advised. However, when I reproduce the following code: clicksWindow = c...

Latest Reply
Happyfield7
New Contributor II
  • 2 kudos

Hey, I'm currently facing the same problem, so I would like to know if you've made any progress in resolving this issue.

1 More Reply
chanansh
by Contributor
  • 611 Views
  • 1 reply
  • 0 kudos

Stream from Azure with credentials

I am trying to read a stream from Azure: (spark.readStream .format("cloudFiles") .option('cloudFiles.clientId', CLIENT_ID) .option('cloudFiles.clientSecret', CLIENT_SECRET) .option('cloudFiles.tenantId', TENANT_ID) .option("header", "true") .opti...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Hanan Shteingart: It looks like you're using the Azure Blob Storage connector for Spark to read data from Azure. The error message suggests that the credentials you provided are not being used by the connector. To specify the credentials, you can se...

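The reply above is truncated; a common way to supply service-principal credentials so the ABFS driver can read from ADLS Gen2 is via Spark conf, using the hadoop-azure OAuth settings. The `cloudFiles.clientId` / `clientSecret` / `tenantId` options in the question configure Auto Loader's file-notification service, not storage access, which would explain credentials "not being used". Storage account name, secrets, and the path below are placeholders.

```python
# Sketch (Databricks/ADLS-specific; not runnable as-is): service-principal
# auth for ABFS via Spark conf. All values are placeholders.
from pyspark.sql import SparkSession

# On Databricks, use the notebook-provided `spark` session instead.
spark = SparkSession.builder.getOrCreate()

storage_account = "mystorageaccount"   # placeholder
CLIENT_ID = "<application-id>"         # placeholder
CLIENT_SECRET = "<client-secret>"      # placeholder
TENANT_ID = "<directory-id>"           # placeholder

host = f"{storage_account}.dfs.core.windows.net"
spark.conf.set(f"fs.azure.account.auth.type.{host}", "OAuth")
spark.conf.set(f"fs.azure.account.oauth.provider.type.{host}",
               "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set(f"fs.azure.account.oauth2.client.id.{host}", CLIENT_ID)
spark.conf.set(f"fs.azure.account.oauth2.client.secret.{host}", CLIENT_SECRET)
spark.conf.set(f"fs.azure.account.oauth2.client.endpoint.{host}",
               f"https://login.microsoftonline.com/{TENANT_ID}/oauth2/token")

# With storage auth in place, Auto Loader only needs the format options.
df = (spark.readStream
      .format("cloudFiles")
      .option("cloudFiles.format", "csv")
      .option("header", "true")
      .load(f"abfss://container@{host}/path"))  # placeholder path
```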
Daba
by New Contributor III
  • 3792 Views
  • 5 replies
  • 5 kudos

DLT streaming table and LEFT JOIN

I'm trying to build a gold-level streaming live table based on two silver streaming live tables with a left join. This attempt fails with the following error: "Append mode error: Stream-stream LeftOuter join between two streaming DataFrame/Datasets is not suppo...

Latest Reply
Daba
New Contributor III
  • 5 kudos

Thanks Fatma, I do understand the need for watermarks, but I'm just wondering if this is supported by SQL syntax?

4 More Replies
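To the question of SQL support: Databricks SQL does have a `WATERMARK` clause for streaming sources, so a stream-stream left join can in principle be expressed without Python. The sketch below follows the documented `WATERMARK <column> DELAY OF INTERVAL` shape; table names, column names, delays, and the join interval are all illustrative assumptions, not taken from the thread.

```sql
-- Sketch: gold streaming live table from two silver streaming tables with
-- a left join; WATERMARK is the SQL counterpart of withWatermark().
CREATE OR REFRESH STREAMING LIVE TABLE gold_orders
AS SELECT o.order_id, o.order_ts, p.payment_ts
FROM STREAM(LIVE.silver_orders)
       WATERMARK order_ts DELAY OF INTERVAL 10 MINUTES AS o
LEFT JOIN STREAM(LIVE.silver_payments)
       WATERMARK payment_ts DELAY OF INTERVAL 30 MINUTES AS p
  ON o.order_id = p.order_id
 AND p.payment_ts BETWEEN o.order_ts AND o.order_ts + INTERVAL 1 HOUR;
```

Without watermarks on both sides (and a time bound on the join), append mode cannot know when a left row is final, which is what the quoted error is about.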
Mado
by Valued Contributor II
  • 826 Views
  • 3 replies
  • 1 kudos

Resolved! When should I use STREAM() when defining a DLT table?

Hi, I am a little confused about when I should use STREAM() when defining a table based on another DLT table. There is a pattern explained in the documentation: CREATE OR REFRESH STREAMING LIVE TABLE streaming_bronze AS SELECT * FROM cloud_files( "s3://p...

Latest Reply
Mado
Valued Contributor II
  • 1 kudos

Thanks @Landan George. Since "streaming_silver" is a streaming live table, I expected the last line of the code to be: AS SELECT count(*) FROM STREAM(LIVE.streaming_silver) GROUP BY user_id. But, as you can see, "live_gold" is defined by: AS SELECT c...

2 More Replies
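The rule of thumb behind this thread: wrap a source in STREAM() only when the table being defined is itself declared STREAMING (i.e., it reads the source incrementally); a plain live table recomputes completely and reads the source without STREAM(). A sketch of the two cases, with illustrative names echoing the documentation pattern quoted above:

```sql
-- Streaming silver table: reads the bronze table incrementally, so the
-- source is wrapped in STREAM().
CREATE OR REFRESH STREAMING LIVE TABLE streaming_silver
AS SELECT * FROM STREAM(LIVE.streaming_bronze);

-- Gold table: an aggregation like this is typically defined as a plain
-- (non-streaming) live table, which recomputes completely and therefore
-- reads the source WITHOUT STREAM().
CREATE OR REFRESH LIVE TABLE live_gold
AS SELECT user_id, count(*) AS cnt
   FROM LIVE.streaming_silver
   GROUP BY user_id;
```

That is why "live_gold" in the thread reads `LIVE.streaming_silver` directly even though the silver table is a streaming table.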
LearnerShahid
by New Contributor II
  • 2861 Views
  • 6 replies
  • 4 kudos

Resolved! Lesson 6.1 of Data Engineering. Error when reading stream - java.lang.UnsupportedOperationException: com.databricks.backend.daemon.data.client.DBFSV1.resolvePathOnPhysicalStorage(path: Path)

The function below executes fine: def autoload_to_table(data_source, source_format, table_name, checkpoint_directory): query = (spark.readStream .format("cloudFiles") .option("cloudFiles.format", source_format) .option("cloudFile...

I have verified that source data exists.
Latest Reply
Anonymous
Not applicable
  • 4 kudos

Auto Loader is not supported on Community Edition.

5 More Replies
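For reference, a filled-out version of the truncated function from this thread, in the shape usually used with Auto Loader (read with cloudFiles, write with a checkpoint to a table). Everything beyond the quoted snippet — the schemaLocation option, mergeSchema, and the `.table()` sink — is an assumption, and this only runs on a supported Databricks runtime (not Community Edition, per the accepted answer).

```python
# Sketch (Databricks-only; not runnable locally): completed
# autoload_to_table in the usual Auto Loader read/write pattern.
# `spark` is the SparkSession provided by the Databricks notebook.
def autoload_to_table(data_source, source_format, table_name, checkpoint_directory):
    query = (spark.readStream
             .format("cloudFiles")
             .option("cloudFiles.format", source_format)
             # Assumed: schema tracking colocated with the checkpoint.
             .option("cloudFiles.schemaLocation", checkpoint_directory)
             .load(data_source)
             .writeStream
             .option("checkpointLocation", checkpoint_directory)
             .option("mergeSchema", "true")
             .table(table_name))
    return query
```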
DarshilDesai
by New Contributor II
  • 9007 Views
  • 3 replies
  • 3 kudos

Resolved! How to Efficiently Read Nested JSON in PySpark?

I am having trouble efficiently reading and parsing a large number of stream files in PySpark! Context: here is the schema of the stream file that I am reading in JSON. Blank spaces are edits for confidentiality purposes. root |-- location_info: ar...

Latest Reply
Kaniz
Community Manager
  • 3 kudos

Hi @Darshil Desai, how are you? Were you able to resolve your problem?

2 More Replies
User16826994223
by Honored Contributor III
  • 492 Views
  • 1 reply
  • 0 kudos

Stream not starting from Kafka after 2 hours of cluster start

Hi Team, I am setting up a Kafka cluster on Databricks to ingest data into Delta. The cluster has been running for the last 2 hours, but the stream still has not started, and I am not seeing any failure either.

Latest Reply
User16826994223
Honored Contributor III
  • 0 kudos

This type of issue happens if you have a firewall on your cloud account and your IP is not whitelisted. Please whitelist the IP and the issue will be resolved.

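A quick way to check the firewall explanation from the cluster side is a plain TCP reachability test against the broker; host and port below are placeholders. A timeout here, with no other error in the stream, is consistent with an IP that has not been allowlisted.

```shell
# Placeholder host/port: run from a notebook %sh cell or an init script.
# Success means the broker port is reachable; a timeout suggests a
# firewall / missing IP allowlisting between the cluster and Kafka.
nc -zv -w 5 my-kafka-broker.example.com 9092
```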