balajij8
Contributor III

The configuration is correct; the issue is most likely upstream. The Parquet sink can only write files when it receives data from upstream. You can validate the two key settings below:

  • startingOffsets = latest - The code skips all historical Kafka data and only processes messages that arrive after the stream starts. You can set it to earliest and validate.

  • WHALE_THRESHOLD_USD = 50000 - Typical values can be around 5 to 10, so a threshold of 50000 may filter out almost every message. You can lower the threshold temporarily to validate, then set it back to 50000 later.

Even if Kafka has messages, the pipeline can skip or filter them out because of these two settings.
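To make the two checks concrete, here is a rough sketch in plain Python (it does not need a Spark session to run). It is not your pipeline's code: the broker address, topic name, sample amounts, helper names, and the `>=` comparison are all assumptions for illustration. The first helper only builds the option dictionary you could pass to `spark.readStream.format("kafka").options(**opts)`; the second shows how many sample values would survive a given threshold.

```python
# Debug sketch: plain Python, no Spark session required.
# Broker, topic, sample amounts, and helper names are hypothetical.

def kafka_source_options(bootstrap_servers, topic, replay_history):
    """Build options for spark.readStream.format("kafka").options(**opts)."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        # "earliest" replays retained messages; "latest" only sees new ones.
        "startingOffsets": "earliest" if replay_history else "latest",
    }


def count_passing(amounts_usd, threshold_usd):
    """Count amounts that would pass an assumed `amount >= threshold` filter."""
    return sum(1 for amount in amounts_usd if amount >= threshold_usd)


opts = kafka_source_options("localhost:9092", "trades", replay_history=True)
sample_amounts = [5.0, 7.5, 10.0, 120.0]  # made-up values in USD

print(opts["startingOffsets"])
print(count_passing(sample_amounts, 50000))  # nothing reaches the sink
print(count_passing(sample_amounts, 5))      # everything reaches the sink
```

One thing to keep in mind when testing: startingOffsets only applies when a new streaming query starts. If the query resumes from an existing checkpoint, it continues from the stored offsets, so you may need a fresh checkpoint location for the earliest setting to take effect.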