
Koalas dropna in DLT

Thefan
New Contributor II

Greetings!

I've been trying out DLT for a few days but I'm running into an unexpected issue when trying to use Koalas dropna in my pipeline.

My goal is to drop all columns that contain only null/NA values before the table is written.

My current code is:

  import dlt

  @dlt.table(name="silver_table")
  def silver():
    # Read the bronze table, convert it to Koalas, and drop all-null columns.
    df = (dlt
          .read("bronze_table")
          .to_koalas()
          .dropna(axis=1, how="all")
         )
    return df

When I run the pipeline, I get the following error message:

org.apache.spark.sql.AnalysisException: 
You are trying to create an external table [...]
from `dbfs:/pipelines/[...]` using Databricks Delta, but there is no transaction log present at
`dbfs:/pipelines/[...]/_delta_log`. Check the upstream job to make sure that it is writing using
format("delta") and that the path is the root of the table.

Am I missing something, or is the dropna function not usable in DLT for some reason?

Thanks a lot!
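
For reference, here is a minimal sketch of a Spark-only way to reach the same goal (dropping columns that contain only nulls) without converting to Koalas. It is an untested idea, not a confirmed fix for the error above. The helper names `all_null_columns` and `drop_all_null_columns` are hypothetical and not part of DLT or PySpark. The sketch assumes plain column names (no dots or special characters) and treats only nulls as missing, so float NaN values still count as present, unlike Koalas dropna. It also runs a Spark aggregation while the table function executes, which may or may not be acceptable in a DLT pipeline.

```python
from typing import Dict, List


def all_null_columns(non_null_counts: Dict[str, int]) -> List[str]:
    """Return the names of columns whose non-null count is zero."""
    return [name for name, count in non_null_counts.items() if count == 0]


def drop_all_null_columns(df):
    """Drop every column of a Spark DataFrame that holds only nulls.

    Hypothetical helper for illustration; not part of DLT or PySpark.
    """
    # Imported lazily so the pure helper above works without PySpark installed.
    from pyspark.sql import functions as F

    # F.count(column) counts only non-null values, so 0 means the column is all-null.
    counts_row = df.select(
        [F.count(F.col(c)).alias(c) for c in df.columns]
    ).first()

    # Drop the all-null columns by name and return a regular Spark DataFrame.
    return df.drop(*all_null_columns(counts_row.asDict()))
```

Inside the table function, this would be used as `return drop_all_null_columns(dlt.read("bronze_table"))`, so the function returns a Spark DataFrame rather than a Koalas one. Note that on an empty input every count is zero, so every column would be dropped.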
