How can I read column mapping metadata for Delta tables?
12-19-2023 10:28 AM
I want to read the column mapping metadata described here:
https://github.com/delta-io/delta/blob/master/PROTOCOL.md#column-mapping
The link above shows a code block with JSON data, and I want to read that same data in PySpark. Is there any option to read this metadata from Databricks? Please let me know.
Below is the block I want to read for my Delta tables:
{
  "name" : "e",
  "type" : {
    "type" : "array",
    "elementType" : {
      "type" : "struct",
      "fields" : [ {
        "name" : "d",
        "type" : "integer",
        "nullable" : false,
        "metadata" : {
          "delta.columnMapping.id": 5,
          "delta.columnMapping.physicalName": "col-a7f4159c-53be-4cb0-b81a-f7e5240cfc49"
        }
      } ]
    },
    "containsNull" : true
  },
  "nullable" : true,
  "metadata" : {
    "delta.columnMapping.id": 4,
    "delta.columnMapping.physicalName": "col-5f422f40-de70-45b2-88ab-1d5c90e94db1"
  }
}
12-20-2023 03:12 PM
Hi,
Information about a Delta table, such as its history, can be found by running `DESCRIBE HISTORY table_name`. A column rename shows up in the `operation` column with the value `RENAME COLUMN`.
If you then look at the corresponding transaction JSON file in the table's `_delta_log` directory, you will find the details. For example, if the column rename occurred at Delta version 1, the details are in `<tableLocation>/_delta_log/00000000000000000001.json`.
You can read this JSON file with `spark.read.json(path)`. Most of the data you are interested in is in the `metaData.schemaString` column.
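To pull out just the `delta.columnMapping.*` entries, you can parse the `schemaString` (which is plain JSON) and walk the schema tree. Below is a minimal sketch: the recursive helper and the sample field are from the protocol snippet quoted in the question, while the commented-out lines showing how to obtain `schema_string` from the log assume a hypothetical table location.

```python
import json

# Hedged sketch: in Databricks, schema_string would come from the log, e.g.
#   log = spark.read.json("<tableLocation>/_delta_log/00000000000000000001.json")
#   schema_string = log.select("metaData.schemaString").dropna().first()[0]
# (path and version are placeholders; adjust for your table)

def collect_column_mapping(field, out=None):
    """Recursively collect (name, id, physicalName) from a schema field dict."""
    if out is None:
        out = []
    meta = field.get("metadata", {})
    if "delta.columnMapping.id" in meta:
        out.append({
            "name": field["name"],
            "id": meta["delta.columnMapping.id"],
            "physicalName": meta["delta.columnMapping.physicalName"],
        })
    t = field.get("type")
    # Descend through nested array/map wrappers until we reach a struct
    # (a dict with "fields") or a primitive type (a plain string).
    while isinstance(t, dict):
        if t.get("type") == "struct":
            for child in t.get("fields", []):
                collect_column_mapping(child, out)
            break
        t = t.get("elementType") or t.get("valueType")
    return out

# Sample field from the Delta protocol document quoted above.
schema_string = '''
{ "name": "e",
  "type": { "type": "array",
            "elementType": { "type": "struct",
                             "fields": [ { "name": "d", "type": "integer",
                                           "nullable": false,
                                           "metadata": {
                                             "delta.columnMapping.id": 5,
                                             "delta.columnMapping.physicalName": "col-a7f4159c-53be-4cb0-b81a-f7e5240cfc49" } } ] },
            "containsNull": true },
  "nullable": true,
  "metadata": {
    "delta.columnMapping.id": 4,
    "delta.columnMapping.physicalName": "col-5f422f40-de70-45b2-88ab-1d5c90e94db1" } }
'''

for m in collect_column_mapping(json.loads(schema_string)):
    print(m["name"], m["id"], m["physicalName"])
```

For a full table schema, the top-level `schemaString` is a struct, so you would loop over its `"fields"` list and call the helper on each one.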
I hope this helps.
01-13-2024 11:44 PM
Thanks!

