Downstream Delta Live Table is unable to read DataFrame from upstream table

arw1070
New Contributor II

I am trying to add Delta Live Tables to a pre-existing workflow. I am currently creating two tables, appointments_raw and notes_raw, where notes_raw is "downstream" of appointments_raw. Inside notes_raw I load the appointments_raw table with dlt.read, but the result of dlt.read("appointments_raw") appears to be an empty DataFrame. The appointments_raw DataFrame itself does seem to be loaded correctly, according to pipeline storage and the Hive metastore. We are following this example: https://docs.databricks.com/_extras/notebooks/source/dlt-wikipedia-python.html

Specifically, we are following the "top referring pages" code, which references dlt.read("clickstream_prepared"). We are trying to do the same but are getting an error.


Anonymous
Not applicable

@Anna Wuest: Could you please post the code snippet here? Thanks.

arw1070
New Contributor II

Do you mean this?

@dlt.table(
    comment="Raw table of appointments from EDW",
)
def appointments_raw():
    return fetch_data.fetch_appointments(spark=spark, secret_handler=SECRET_HANDLER)


@dlt.table(
    comment="Raw table of notes from SOLR",
)
def notes_raw():
    appointments = dlt.read("appointments_raw")
    print(type(appointments))
    print(appointments.head())
    appointments = appointments.pandas_api()
    mrns = fetch_data.select_mrns(
        appointments, today=TIMESTAMP, days_ahead=APPOINTMENTS_DAYS_AHEAD
    )
    notes = fetch_data.fetch_notes(
        mrns, cohort_id=COHORT_ID, secret_handler=SECRET_HANDLER, spark=spark
    )
    return notes
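
A possible explanation, offered as an assumption rather than a confirmed diagnosis: Delta Live Tables evaluates table functions while it builds the dataflow graph, and the Databricks documentation advises against eager operations such as collect(), count(), and toPandas() inside them. Calls like head() or a pandas conversion inside notes_raw may therefore see no rows at that stage. The sketch below keeps the downstream logic in lazy Spark DataFrame transformations. The mrn column, the select_distinct_mrns helper, and the appointment_mrns table are hypothetical names used only for illustration; they are not taken from the original pipeline.

```python
# Sketch only: `mrn`, `select_distinct_mrns`, and `appointment_mrns` are
# hypothetical names, not part of the original pipeline.

try:
    import dlt  # available only inside a Databricks Delta Live Tables pipeline
except ImportError:
    dlt = None  # lets the helper below be imported and tested elsewhere


def select_distinct_mrns(appointments):
    """Return distinct MRNs using only lazy DataFrame transformations.

    No head(), collect(), or pandas conversion happens here, so nothing is
    evaluated while the pipeline graph is still being built.
    """
    return appointments.select("mrn").distinct()


if hasattr(dlt, "table"):  # skipped when the DLT module is unavailable

    @dlt.table(comment="Distinct MRNs derived from appointments_raw")
    def appointment_mrns():
        # dlt.read declares the dependency on the upstream table.
        return select_distinct_mrns(dlt.read("appointments_raw"))
```

If the notes really have to be fetched from an external service per MRN, that step may need to live outside the table function or be expressed as a Spark transformation; which option fits depends on how fetch_data is implemented, which the post does not show.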