Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
I have a STRING column in a DLT table that was loaded using SQL Autoloader via a JSON file. When I use the "schema_of_json" function in a SQL statement, passing in the literal string from the STRING column, I get this output: ARRAY<STRUCT<firstFetchD...
I have a string column whose values are concatenations of elements separated by hyphens. Three values from that column look like this:
Row 1 - A-B-C-D-E-F
Row 2 - A-B-G-C-D-E-F
Row 3 - A-B-G-D-E-F
I want to compare 2 consecutive rows and create a column ...
Hi,
I think you can follow these steps:
1. Use a window function to create a new column by shifting; your df will then look like this:
id  value  lag
1  A-B-C-D-E-F  null
2  A-B-G-C-D-E-F  A-B-C-D-E-F
3  A-B-G-D-E-F  ...
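The shifting step can be sketched in plain Python, without a Spark session, to show what a window `lag` produces. This is only an illustration: the `rows` data is copied from the question, and the helper name `add_lag` is hypothetical. In PySpark, the same shift would presumably be written as `F.lag("value").over(Window.orderBy("id"))`.

```python
def add_lag(rows):
    """Return (id, value, lag) tuples, where lag is the previous row's value."""
    # Order by id first, as a window ordered by id would.
    ordered = sorted(rows, key=lambda r: r[0])
    values = [value for _, value in ordered]
    # Shift the column down by one row; the first row has no predecessor.
    lagged = [None] + values[:-1]
    return [(row_id, value, prev) for (row_id, value), prev in zip(ordered, lagged)]


# Sample rows taken from the question.
rows = [(1, "A-B-C-D-E-F"), (2, "A-B-G-C-D-E-F"), (3, "A-B-G-D-E-F")]
shifted = add_lag(rows)
for row in shifted:
    print(row)
```

With the previous value sitting next to the current one in each row, the two strings can be compared row by row, for example after splitting each on the hyphen.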
So I have two partitions defined for this Delta table. One is year ('GJHAR'), which contains year values, and the other is a string column ('BUKS') with around 124 unique values. However, there is one problem with the second partition column ('BUKS'). The values ...
@nafri A, so to make sure I understand correctly: if you partition the table with only numeric data in BUKS, new incoming data cannot be added if it contains a string, but the other way around it does work? Could it be that Spark has inferred the co...
I have a nested struct, where one of the fields is a string. It looks something like this: string = "[{\"to_loc\":\"6183\",\"to_loc_type\":\"S\",\"qty_allocated\":\"18\"},{\"to_loc\":\"6137\",\"to_loc_type\":\"S\",\"qty_allocated\":\"9\"},{\"to_lo...
The user is trying to cast a string to a decimal when encountering zeros. The cast function displays '0' as '0E-16'. Could you please let us know your thoughts on whether 0s can be displayed as 0s?
from pyspark.sql import functions as F
df = spark.s...
If the scale of the decimal type is greater than 6, scientific notation kicks in, hence the 0E-16. This behavior is described in the existing OSS Spark issue: https://issues.apache.org/jira/browse/SPARK-25177
Kindly cast the column to a decimal type les...
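The threshold described above can be reproduced outside Spark. As a rough analogy only, Python's standard `decimal` module appears to apply the same string-conversion rule, so a zero with a scale of 6 prints in plain notation while a zero with a scale of 7 or more prints with an exponent. The helper name `zero_with_scale` is hypothetical, and this sketch makes no claim about how Spark itself renders the value.

```python
from decimal import Decimal


def zero_with_scale(scale):
    """Build a decimal zero that carries the given number of fractional digits."""
    return Decimal(0).quantize(Decimal(f"1E-{scale}"))


for scale in (6, 7, 16):
    zero = zero_with_scale(scale)
    # str() switches to scientific notation once the scale exceeds 6;
    # the 'f' format always yields plain fixed-point digits.
    print(scale, str(zero), format(zero, "f"))
```

Choosing a smaller scale avoids the exponent entirely, while fixed-point formatting is an alternative when the value only needs to be displayed as a string.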