I am looking for a simple way to build a Structured Streaming pipeline that automatically registers a schema with Azure Schema Registry when converting a DataFrame column to Avro, and that can deserialize an Avro column based on the schema registry URL.
E.g. when using Scala and Confluent Schema Registry, there is a great Structured Streaming integration library from Absa: https://github.com/AbsaOSS/ABRiS
Ideally I am looking for something similar on Azure that could be used from Python with Azure Schema Registry.
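To be concrete about what I mean by registering/deserializing: the registry side on its own looks roughly like this with the plain Azure SDK (a minimal sketch, assuming the azure-schemaregistry and azure-identity packages; the namespace, group and schema below are placeholders of mine). What I am missing is an ABRiS-like integration that wires this into Spark's to_avro / from_avro:

```python
from azure.identity import DefaultAzureCredential
from azure.schemaregistry import SchemaRegistryClient

# Placeholders for my own namespace / schema group / Avro schema
FQNS = "my-namespace.servicebus.windows.net"
GROUP = "my-schema-group"
AVRO_SCHEMA = """
{"type": "record", "name": "Event", "namespace": "demo",
 "fields": [{"name": "id", "type": "string"}]}
"""

client = SchemaRegistryClient(fully_qualified_namespace=FQNS,
                              credential=DefaultAzureCredential())

# Register (or look up) the schema and get its id back
props = client.register_schema(GROUP, "Event", AVRO_SCHEMA, "Avro")
print(props.id)

# Fetch a schema definition by id - what a from_avro equivalent would need to do
schema = client.get_schema(props.id)
print(schema.definition)
```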
I found e.g. this Databricks guide, https://docs.databricks.com/structured-streaming/avro-dataframe.html#language-python, which looks pretty close, but it only covers integration with Confluent Schema Registry, not how to use and authenticate against an Azure one.
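For context, the Confluent-registry usage in that guide looks roughly like this (paraphrased from the guide, so the exact DBR-specific signature may differ; the Kafka servers, topic and subject names are placeholders) - I would like essentially the same thing, but pointed at and authenticated against an Azure Schema Registry:

```python
from pyspark.sql.functions import col, lit
from pyspark.sql.avro.functions import from_avro, to_avro

schema_registry_addr = "https://my-confluent-registry:8081"  # placeholder

# Read: decode Kafka key/value via registry subjects (DBR-specific, registry-aware signature)
df = (spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "kafka:9092")  # placeholder
      .option("subscribe", "t")
      .load()
      .select(
          from_avro(col("key"), "t-key", schema_registry_addr).alias("key"),
          from_avro(col("value"), "t-value", schema_registry_addr).alias("value")))

# Write: encode back to Avro, registering/looking up the subject in the registry
out = df.select(
    to_avro(col("key"), lit("t-key"), schema_registry_addr).alias("key"),
    to_avro(col("value"), lit("t-value"), schema_registry_addr).alias("value"))
```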
On a semi-related note: I have trouble importing the correct to_avro / from_avro functions on Azure Databricks. Trying to pass a schemaRegistryUrl to them raises
TypeError: to_avro() takes from 1 to 2 positional arguments but 3 were given
so these seem to be the vanilla Spark Avro functions, not the Databricks ones? (Minimal repro sketch below.)
My environment is Azure Databricks - DBR 11.3 LTS
I am importing them via
from pyspark.sql.avro.functions import from_avro, to_avro
Are they on some different path / in a shaded JAR?
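For reference, here is the kind of minimal repro I mean; the behavior matches the open-source pyspark.sql.avro.functions signatures, from_avro(data, jsonFormatSchema, options) and to_avro(data, jsonFormatSchema), rather than the registry-aware Databricks ones (schema_json and the registry URL below are placeholders):

```python
from pyspark.sql.functions import col, struct
from pyspark.sql.avro.functions import from_avro, to_avro

# Placeholder Avro schema matching a struct<id: string> column
schema_json = """
{"type": "record", "name": "Event",
 "fields": [{"name": "id", "type": ["string", "null"]}]}
"""

# `spark` is the predefined SparkSession in a Databricks notebook
df = spark.createDataFrame([("1",)], ["id"])

# Open-source signatures accept a JSON schema string, not a registry URL/subject
encoded = df.select(to_avro(struct("id")).alias("avro"))
decoded = encoded.select(from_avro(col("avro"), schema_json).alias("event"))

# Registry-style call like the Databricks guide - this is what fails on DBR 11.3:
# df.select(to_avro(col("id"), "t-key", "https://my-registry:8081"))
# TypeError: to_avro() takes from 1 to 2 positional arguments but 3 were given
```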
Thanks!