09-11-2024 03:14 AM - edited 09-11-2024 03:23 AM
In the snippet below you're passing a query variable to the read_sql_query function, but I don't see anywhere in your code where that variable is defined.
# Error - configuration query not available
parameters = {}
df = pyspark.pandas.read_sql_query(sql=query, con=conn)
print("pyspark dataframe: ", df)
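One way to fix this is to define query before the call. This is only a minimal sketch: the SQL text, the jdbc_url value, and the helper name load_with_pandas_api are placeholders I made up, not something from your code. Also note that, as far as I know, pyspark.pandas.read_sql_query expects con to be a JDBC URI string rather than a connection object.

```python
# Placeholder values (assumptions): replace with your own table and JDBC URL.
query = "SELECT * FROM <table-name>"
jdbc_url = "<jdbc-url>"


def load_with_pandas_api(sql: str, con: str):
    """Run `sql` through the pandas API on Spark and return the result."""
    # Imported inside the function so the definitions above can be checked
    # even where pyspark is not installed.
    import pyspark.pandas as ps

    # `con` is passed as a JDBC URI string here, not a connection object.
    return ps.read_sql_query(sql, con)
```

With that in place, the call becomes df = load_with_pandas_api(query, jdbc_url).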
And you don't need pandas for this. You can try Lakehouse Federation, or use PySpark to read from a JDBC source directly:
table = (
    spark.read
    .format("jdbc")
    .option("url", "<jdbc-url>")
    .option("dbtable", "<table-name>")
    .option("user", "<username>")
    .option("password", "<password>")
    .load()
)
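If you want to run a SQL query instead of loading a whole table, the JDBC reader also accepts a query option in place of dbtable (the two can't be combined). A rough sketch, assuming a spark session is already available as in a Databricks notebook; the helper names build_jdbc_options and read_query are hypothetical:

```python
def build_jdbc_options(url: str, query: str, user: str, password: str) -> dict:
    """Collect the JDBC reader options, using `query` rather than `dbtable`."""
    return {"url": url, "query": query, "user": user, "password": password}


def read_query(spark, options: dict):
    """Load the result of the query described by `options` as a Spark DataFrame."""
    return spark.read.format("jdbc").options(**options).load()


# Placeholder values (assumptions): fill in your own connection details.
jdbc_options = build_jdbc_options(
    url="<jdbc-url>",
    query="SELECT * FROM <table-name>",
    user="<username>",
    password="<password>",
)
```

Then df = read_query(spark, jdbc_options) gives you a Spark DataFrame with no pandas involved.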
Query databases using JDBC - Azure Databricks | Microsoft Learn