08-11-2022 05:30 AM
After more googling on "withColumnRenamed", I was able to replace the spaces in all column names with "_" in one step by using select and alias instead:
import dlt
from pyspark.sql.functions import col

@dlt.view(
    comment=""
)
def vw_raw():
    # path_to_load is defined elsewhere in the notebook
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "csv")
        .options(header='true')
        .option("inferSchema", "true")
        .load(path_to_load)
    )

@dlt.table(
    comment=""
)
def table_raw():
    # Read the view once, then alias every column so spaces become underscores
    df = dlt.readStream("vw_raw")
    return df.select([col(c).alias(c.replace(" ", "_")) for c in df.columns])

It also works using "cloudFiles.inferColumnTypes" = "true" and "cloudFiles.schemaHints" in the view definition.
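In case it helps anyone else, the renaming step can be tried outside a pipeline with plain Python. This is only a minimal sketch: the helper name underscore_names and the sample column names are made up for illustration, and the duplicate check is an extra precaution that is not part of the code above.

```python
def underscore_names(columns):
    """Pair each column name with a copy that has spaces replaced by underscores."""
    renamed = [c.replace(" ", "_") for c in columns]
    # Two different source names can collapse to the same target
    # (for example "order id" and "order_id"), so fail early instead
    # of producing duplicate column names.
    if len(set(renamed)) != len(renamed):
        raise ValueError(f"renaming produces duplicate column names: {renamed}")
    return list(zip(columns, renamed))


# Hypothetical column names, as they might appear in a CSV header
pairs = underscore_names(["order id", "customer name", "total"])
for old, new in pairs:
    print(f"{old!r} -> {new!r}")
```

Inside the table definition, the same pairs could feed the select, e.g. df.select([col(old).alias(new) for old, new in pairs]).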