I created a separate pipeline notebook that generates the table via DLT, and a separate notebook that writes the entire output to Redshift at the end. The table created via DLT is read with spark.read.table("{schema}.{table}"). This way, I can import [MATERIALI...
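A minimal sketch of what that second (Redshift-writing) notebook boils down to, assuming Databricks' built-in redshift connector; the schema/table names, JDBC URL, staging path, and credential option below are placeholders:

# Read the table materialized by the DLT pipeline and push it to Redshift.
df = spark.read.table("my_schema.my_dlt_table")  # hypothetical schema/table

(df.write
   .format("redshift")
   .option("url", "jdbc:redshift://<host>:5439/<db>?user=<user>&password=<pw>")
   .option("dbtable", "public.my_target_table")            # hypothetical target table
   .option("tempdir", "s3a://<bucket>/redshift-staging/")  # staging area for UNLOAD/COPY
   .option("forward_spark_s3_credentials", "true")
   .mode("overwrite")
   .save())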
If anyone has example code for building a live streaming pipeline over the CDC data generated by AWS DMS using import dlt, I'd love to see it. I'm currently able to see the parquet file starting with LOAD from the first full load to S3, and the CDC parquet files after ...
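For context, a rough sketch of the kind of pipeline I have in mind, assuming the DMS task writes the full-load (LOAD*) and CDC parquet files to a single S3 prefix, that rows carry DMS's Op column (I/U/D), and that a transaction-timestamp column (transact_ts here) was added via the DMS TimestampColumnName setting; the path, key, and column names are placeholders:

import dlt
from pyspark.sql.functions import expr

@dlt.view
def dms_raw():
    # Incrementally ingest the DMS parquet files (full load + CDC) with Auto Loader
    return (spark.readStream.format("cloudFiles")
            .option("cloudFiles.format", "parquet")
            .load("s3://my-bucket/dms/my_schema/my_table/"))  # hypothetical path

dlt.create_streaming_table("my_table_silver")

dlt.apply_changes(
    target="my_table_silver",
    source="dms_raw",
    keys=["id"],                          # hypothetical primary key
    sequence_by="transact_ts",            # ordering column (assumption)
    apply_as_deletes=expr("Op = 'D'"),    # DMS marks deletes with Op = 'D'
    except_column_list=["Op", "transact_ts"],
)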
from databricks.connect import DatabricksSession
from data.dbx_conn_info import DbxConnInfo

class SparkSessionManager:
    # Singleton holding one Databricks Connect Spark session for the whole job
    _instance = None
    _spark = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._spark = DatabricksSession.builder.getOrCreate()  # connection config from DbxConnInfo / environment
        return cls._instance

    def get_spark(self):
        return self._spark
When connecting to Spark from a shared cluster in Databricks and writing to Redshift, the data suddenly decreases even though there is no apparent abnormality in the data itself. What causes should I check? Also, is there any way to check the widget variables or the code used on each execution?
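On the second question, a minimal sketch of one way I could log this per run, assuming the job runs in a Databricks notebook where spark and dbutils are available; the widget names and table name are placeholders:

# Print the widget values and row count at the start of every run, so a sudden
# drop in Redshift can be traced back to a specific execution and its parameters.
run_date = dbutils.widgets.get("run_date")           # hypothetical widget
target_table = dbutils.widgets.get("target_table")   # hypothetical widget
print(f"run_date={run_date}, target_table={target_table}")

df = spark.read.table(target_table)
print(f"rows about to be written to Redshift: {df.count()}")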
@Kaniz Enable Materialized Views:
- Ensure that materialized view features are enabled for your workspace.
- Consider using DBSQL Serverless (recommended) or a Pro warehouse for materialized views.

Can you point me to the documentation for this workaround?
@Kaniz If you don't mind me asking, can I see the code or documentation for this? I searched for DeltaTableBuilder and only found Scala-related material.
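The closest Python equivalent I could piece together so far is the builder exposed under delta.tables — a minimal sketch, assuming a SparkSession named spark is available; the schema, table, and column names are placeholders:

from delta.tables import DeltaTable

# Python entry point to the Delta table builder (createIfNotExists / createOrReplace)
(DeltaTable.createIfNotExists(spark)
    .tableName("my_schema.my_table")   # hypothetical table name
    .addColumn("id", "BIGINT")
    .addColumn("name", "STRING")
    .execute())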