Data source V2 streaming is not supported on table acl or credential passthrough clusters

Anonymous
Not applicable

Using:

(hostname is hidden)

kafka = spark.readStream\
    .format("kafka")\
    .option("kafka.sasl.mechanism", "SCRAM-SHA-512")\
    .option("kafka.security.protocol", "SASL_SSL")\
    .option("kafka.sasl.jaas.config", f'org.apache.kafka.common.security.scram.ScramLoginModule required username="{user_stg}" password="{pass_stg}"')\
    .option("kafka.bootstrap.servers", "b-1.data...amazonaws.com:9096")\
    .option("subscribe", "app-events")\
    .option("startingOffsets", "earliest")

I'm getting this error:

java.lang.SecurityException: Data source V2 streaming is not supported on table acl or credential passthrough clusters. StreamingRelationV2 org.apache.spark.sql.kafka010.KafkaSourceProvider@11002bae, kafka, org.apache.spark.sql.kafka010.KafkaSourceProvider$KafkaTable@35ae434, [kafka.sasl.mechanism=SCRAM-SHA-512, subscribe=app--ddpg--evaluation--events, kafka.sasl.jaas.config=*********(redacted), kafka.bootstrap.servers=b-1.dataservices-msk-st.sydr4w.c1.kafka.eu-central-1.amazonaws.com:9096, startingOffsets=earliest, kafka.security.protocol=SASL_SSL], [key#137, value#138, topic#139, partition#140, offset#141L, timestamp#142, timestampType#143], StreamingRelation DataSource(org.apache.spark.sql.SparkSession@65e0bcfb,kafka,List(),None,List(),None,Map(kafka.sasl.mechanism -> SCRAM-SHA-512, subscribe -> app--ddpg--evaluation--events, kafka.sasl.jaas.config -> org.apache.kafka.common.security.scram.ScramLoginModule required  username="[REDACTED]" password="[REDACTED]", kafka.bootstrap.servers -> b-1.data....amazonaws.com:9096, startingOffsets -> earliest, kafka.security.protocol -> SASL_SSL),None), kafka, [key#130, value#131, topic#132, partition#133, offset#134L, timestamp#135, timestampType#136]

Am I doing something wrong, or is the problem somewhere else?

Has anybody run into this and found a solution?

Hubert-Dudek
Databricks MVP
  • A cluster with table ACLs (TACL) enabled comes with many restrictions, so streaming will not work on it. Generally, you can read only objects registered in the metastore; please disable table ACLs for your use case.
  • Additionally, remember that Unity Catalog doesn't support streaming on clusters using shared access mode.
  • Shouldn't it be sasl.token.mechanism? (But maybe sasl.mechanism is an alias.)
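
To confirm which restriction applies before changing anything, you can inspect the cluster's Spark configuration. The sketch below is only an illustration: it assumes the two restricted modes are switched on through the Spark conf keys spark.databricks.acl.dfAclsEnabled (table ACLs) and spark.databricks.passthrough.enabled (credential passthrough), and the helper name restricted_modes is made up for this example. Your workspace may set these differently, or may not let a notebook read them at all.

```python
# Spark conf keys assumed to mark the two restricted cluster modes.
TABLE_ACL_KEY = "spark.databricks.acl.dfAclsEnabled"
PASSTHROUGH_KEY = "spark.databricks.passthrough.enabled"


def restricted_modes(conf):
    """Return the names of the restricted modes enabled in `conf`.

    `conf` is a plain dict mapping Spark conf keys to their string values.
    Missing keys are treated as "false".
    """
    checks = {
        "table ACL": TABLE_ACL_KEY,
        "credential passthrough": PASSTHROUGH_KEY,
    }
    return [
        name
        for name, key in checks.items()
        if str(conf.get(key, "false")).strip().lower() == "true"
    ]


# In a notebook you could build the dict from the live session, for example:
#   conf = {k: spark.conf.get(k, "false") for k in (TABLE_ACL_KEY, PASSTHROUGH_KEY)}
# Here a hand-written dict stands in for the cluster configuration.
example_conf = {TABLE_ACL_KEY: "true"}
print(restricted_modes(example_conf))
```

If the returned list is non-empty, the SecurityException above is expected on that cluster, and the stream has to run on a cluster without those settings.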


My blog: https://databrickster.medium.com/

Anonymous
Not applicable

OK, so we have locked ourselves out. The least-privilege policy has gotten out of control.

As for sasl.token.mechanism: yes, it could be wrong, but that is the least of our problems. So we need Unity Catalog to manage the permissions, and we have to run the code on a cluster that is not in shared access mode. That is far, far away from what we have.

Would using Scala make anything easier?