Hi @databicky, BLOCK_OFFSET_INSIDE_BLOCK and ROW_OFFSET_INSIDE_BLOCK are specific to Hive and are not supported in Spark SQL. To get similar functionality in Spark SQL, you can use the row_number() window function.
Here's an example of how to do that:
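SELECT *,
       -- zero-based position of each row within its (col1, col2) group
       row_number() OVER (
           PARTITION BY col1, col2
           ORDER BY col3 ASC
       ) - 1 AS BLOCK_OFFSET_INSIDE_BLOCK,
       -- zero-based position of each row within its (col1, col2, col3) group
       row_number() OVER (
           PARTITION BY col1, col2, col3
           ORDER BY col4 ASC
       ) - 1 AS ROW_OFFSET_INSIDE_BLOCK
FROM your_table;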
In this example, row_number() with the OVER clause partitions the data by the specified columns and numbers the rows in each partition in the given order. row_number() starts at 1, so subtracting 1 gives zero-based values like those of BLOCK_OFFSET_INSIDE_BLOCK and ROW_OFFSET_INSIDE_BLOCK. Note that these are logical positions within each partition rather than physical offsets, and if the ORDER BY columns are not unique within a partition, the numbering among tied rows is not deterministic. The partition and ordering columns will depend on your table and use case, but this approach can help you achieve similar results in Spark SQL.
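If you want to check the zero-based numbering on a small sample before running it on your data, here is a minimal sketch. It assumes a toy your_table with made-up values, and it uses Python's built-in sqlite3 module (SQLite 3.25 or later, for window functions) as a stand-in for Spark, so it only illustrates the row_number() - 1 pattern and not Spark itself. The aliases block_idx and row_idx, and the extra col4 tie-breaker in the first ORDER BY, are my own additions to keep the output deterministic.

```python
import sqlite3

# Hypothetical sample rows for your_table: (col1, col2, col3, col4).
ROWS = [
    ("a", "x", 10, 3),
    ("a", "x", 10, 1),
    ("a", "x", 20, 2),
    ("b", "y", 5, 9),
]

# Same pattern as the Spark SQL query: row_number() starts at 1,
# so subtracting 1 yields a zero-based position in each partition.
QUERY = """
SELECT col1, col2, col3, col4,
       row_number() OVER (
           PARTITION BY col1, col2
           ORDER BY col3 ASC, col4 ASC   -- col4 added as a tie-breaker
       ) - 1 AS block_idx,
       row_number() OVER (
           PARTITION BY col1, col2, col3
           ORDER BY col4 ASC
       ) - 1 AS row_idx
FROM your_table
ORDER BY col1, col2, col3, col4
"""


def run_demo():
    """Load the sample rows into an in-memory table and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE your_table "
        "(col1 TEXT, col2 TEXT, col3 INTEGER, col4 INTEGER)"
    )
    conn.executemany("INSERT INTO your_table VALUES (?, ?, ?, ?)", ROWS)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in run_demo():
        print(row)
```

Each (col1, col2) group restarts block_idx at 0, and each (col1, col2, col3) group restarts row_idx at 0, which is the behaviour the Spark SQL query above relies on.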