Data Engineering

Forum Posts

Ela
by New Contributor III
  • 621 Views
  • 1 replies
  • 1 kudos

Checking for availability of dynamic data masking functionality in SQL.

I am looking for functionality similar to Snowflake's, which allows attaching masking to an existing column. The documents I found relate to masking with encryption, but my use case is an existing table. Solutions using views along with Dynamic Vie...

Latest Reply
sivankumar86
New Contributor II
  • 1 kudos

Unity Catalog provides a similar feature: https://docs.databricks.com/en/data-governance/unity-catalog/row-and-column-filters.html

elgeo
by Valued Contributor II
  • 5365 Views
  • 6 replies
  • 4 kudos

Resolved! Data type length enforcement

Hello. Is there a way to enforce the length of a column in SQL? For example, that a column must be exactly 18 characters long? Thank you!

Latest Reply
databricks31
New Contributor II
  • 4 kudos

We are facing similar issues when writing to an ADLS location in Delta format, after which we created Unity Catalog tables on top of the Delta location. Is it possible to change the data type lengths below to ones that Spark SQL supports? Azure SQL Spark ...

5 More Replies
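
One possible approach to the length question in the thread above is a CHECK constraint. The sketch below is only an illustration under assumptions, not a recipe confirmed by the thread: it uses Python's built-in sqlite3 module as a stand-in engine so that it runs anywhere, and the table name `accounts`, the column `account_code`, and the constraint name `account_code_len` are hypothetical. Delta tables also support CHECK constraints, so the same predicate should carry over, but the exact statement should be verified against the Databricks documentation.

```python
import sqlite3

# Stand-in engine: SQLite from the standard library, used only so the sketch
# runs without a cluster. All table, column, and constraint names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " account_code TEXT NOT NULL,"
    " CONSTRAINT account_code_len CHECK (length(account_code) = 18))"
)


def try_insert(code: str) -> bool:
    """Return True if the row is accepted, False if the CHECK rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO accounts (account_code) VALUES (?)", (code,)
            )
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert("A" * 18)  # exactly 18 characters
rejected = try_insert("A" * 17)  # one character short

# Assumption: on a Delta table the analogous statement would look roughly
# like this (not executed here).
DELTA_CHECK_SQL = (
    "ALTER TABLE accounts ADD CONSTRAINT account_code_len "
    "CHECK (length(account_code) = 18)"
)
```

The constraint rejects any write that violates the predicate, so the rule is enforced at write time rather than checked after the fact.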
Ajay-Pandey
by Esteemed Contributor III
  • 662 Views
  • 3 replies
  • 7 kudos

docs.databricks.com

Rename and drop columns with Delta Lake column mapping. Hi all, Databricks now supports column rename and drop. Column mapping requires the following Delta protocols: Reader version 2 or above. Writer version 5 or above. Blog URL ## Available in D...

Latest Reply
Poovarasan
New Contributor II
  • 7 kudos

The above-mentioned feature does not work in a DLT pipeline if the script has more than 4 columns.

2 More Replies
numersoz
by New Contributor III
  • 2205 Views
  • 5 replies
  • 6 kudos

Resolved! Z-Ordering Timestamp Column

Hi, I have a large Delta table of IoT data for over 10K different sensors, with timestamp, sensor name, and value columns at 1-second precision. The query pattern is usually a random 5-100 sensors at a time, but it typically involves a specific year/month/day interval...

Latest Reply
Oliver_Angelil
Valued Contributor II
  • 6 kudos

@numersoz did you Z-order on the timestamp column or on less granular columns, like Year, Month, or Day? The timestamp column is very granular (high cardinality) since it also includes hour, minute, second...

4 More Replies
THIAM_HUATTAN
by Valued Contributor
  • 3067 Views
  • 3 replies
  • 0 kudos

Parquet column cannot be converted. Column: [Rainfall_Value], Expected: DoubleType, Found: INT64

df.printSchema()
root
 |-- Device_ID: string (nullable = true)
 |-- Location: string (nullable = true)
 |-- Latitude: double (nullable = true)
 |-- Longitude: double (nullable = true)
 |-- DateTime: timestamp (nullable = true)
 |-- Rainfall_Value: double (n...

Latest Reply
Lakshay
Esteemed Contributor
  • 0 kudos

Hi @THIAM HUAT TAN, the issue is that the schema defined for the column "Rainfall_Value" is DoubleType, while the values present in the data frame are of Integer type. This could be because of one or multiple values. Depending on the data, you ...

2 More Replies
Rubens
by New Contributor II
  • 1140 Views
  • 1 replies
  • 3 kudos

how to alter a column into an IDENTITY column

Here's my use case: I'm migrating out of an old DWH into Databricks. When moving dimension tables into Databricks, I'd like the old SKs (surrogate keys) to be maintained, while creating the SK column as an IDENTITY column, so new dimension values get a...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Ronen Levi​ Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.

guostong
by New Contributor III
  • 2601 Views
  • 1 replies
  • 1 kudos

How to update the items in array of struct column with sql

create table test.json_test_01 ( id int, description string, struct_address STRUCT<street_number: STRING, street_name: STRING, city: STRING, province: STRING>, arrary_phone ARRAY<STRUCT<phone_number: STRING, phone_type: STRING>> );   insert into ...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Richard Guo​ Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.

DB_795688_DB_44
by New Contributor II
  • 936 Views
  • 4 replies
  • 2 kudos

error: at least one column must be specified for the table.

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @anand R, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we ca...

3 More Replies
Jujiro
by New Contributor III
  • 4656 Views
  • 11 replies
  • 7 kudos

Random error: At least one column must be specified for the table?

I have the following code in a notebook. It randomly gives me the error "At least one column must be specified for the table." The error occurs (if it occurs at all) only on the first run after attaching to a cluster. Cluster details: Summary 5-1...

dbr-bug
Latest Reply
Harold
New Contributor II
  • 7 kudos

Please check whether this helps: spark.databricks.delta.catalog.update.enabled false

10 More Replies
Ajay-Pandey
by Esteemed Contributor III
  • 2396 Views
  • 2 replies
  • 3 kudos

Resolved! Column is accessible after dropping the same column

Hi, today I saw some very strange behavior in Databricks. I dropped one column from a dataframe and assigned the result to a new dataframe, but I am still able to use the dropped column in the filter command. In the general scenario I should get an error but ...

Latest Reply
Sandeep
Contributor III
  • 3 kudos

@Ajay Pandey, this is a known behavior. Please refer to this JIRA for details: https://issues.apache.org/jira/browse/SPARK-30421

1 More Replies
Leszek
by Contributor
  • 1301 Views
  • 1 replies
  • 1 kudos

IDENTITY column duplication when using BY DEFAULT parameter

Hi, I created a Delta table with an identity column using this syntax: Id BIGINT GENERATED BY DEFAULT AS IDENTITY. My steps: 1) Created the table with Id using the syntax above. 2) Added two rows with Id = 1 and Id = 2 (BY DEFAULT allows that). 3) Ran an insert (wit...

Latest Reply
dileep_vikram
New Contributor II
  • 1 kudos

Use the ALTER command below to sync the identity column: alter table table_name change column col_name sync identity

rbelidrv
by New Contributor II
  • 3821 Views
  • 3 replies
  • 1 kudos

How to apply a UDF to a property in an array of structs

I have a column that contains an array of structs as follows: "column" : [ { "struct_field1": "struct_value", "struct_field2": "struct_value" }, { "struct_field1": "struct_value", "struct_field2": "struct_value" } ] I want to apply a udf to each f...

Latest Reply
Kaniz
Community Manager
  • 1 kudos

Hi @Richard Belihomji, it looks like you are trying to apply a UDF to each field of the structs in an array column in a Spark DataFrame. However, it seems you are encountering an issue with the UDF not receiving the context. To nest a UDF inside a tr...

2 More Replies
Pawelski
by New Contributor
  • 681 Views
  • 2 replies
  • 1 kudos
Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Paweł Tomczyk, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so...

1 More Replies
QuicKick
by New Contributor
  • 1465 Views
  • 2 replies
  • 0 kudos

How do I search for all the columns/field names starting with "XYZ"

I would like to do a big search on all field/column names that contain "XYZ". I tried the SQL below, but it's giving me an error. SELECT table_name, column_name FROM information_schema.columns WHERE column_name like '%<account>%' order by table_name, column_na...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Ian Fox, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your ...

1 More Replies
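
The thread above mixes "starting with" (title) and "contain" (body), which correspond to different LIKE patterns. The sketch below illustrates the difference. It is a stand-in, not the poster's environment: it uses Python's built-in sqlite3, and the table `columns_catalog` with its rows is a hypothetical substitute for `information_schema.columns`. Note also that SQLite's LIKE ignores ASCII case by default, which may differ from the engine you run against.

```python
import sqlite3

# Hypothetical stand-in for information_schema.columns, so the sketch runs locally.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE columns_catalog (table_name TEXT, column_name TEXT)")
conn.executemany(
    "INSERT INTO columns_catalog VALUES (?, ?)",
    [
        ("orders", "XYZ_id"),
        ("orders", "amount"),
        ("customers", "old_XYZ_code"),
        ("customers", "XYZ_name"),
    ],
)


def find_columns(pattern: str) -> list[tuple[str, str]]:
    """Return (table_name, column_name) pairs whose column name matches the LIKE pattern."""
    return conn.execute(
        "SELECT table_name, column_name FROM columns_catalog "
        "WHERE column_name LIKE ? "
        "ORDER BY table_name, column_name",
        (pattern,),
    ).fetchall()


starts_with = find_columns("XYZ%")   # names that begin with XYZ
contains = find_columns("%XYZ%")     # names with XYZ anywhere
```

A trailing `%` alone gives a prefix match, while `%` on both sides gives a substring match; the prefix form returns fewer rows here because `old_XYZ_code` only contains the text.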
Istuti
by Contributor
  • 1153 Views
  • 1 replies
  • 2 kudos
Latest Reply
Anonymous
Not applicable
  • 2 kudos

@Istuti Gupta: There are several algorithms you can use to mask a column in Databricks in a way that is compatible with SQL Server. One commonly used algorithm is called pseudonymization or tokenization. Here's an example of how you can implement pse...
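
The reply's own example is cut off in this listing, so the following is a separate, minimal sketch of one common way to pseudonymize a value: a keyed hash (HMAC-SHA256) using only the Python standard library. The key, the function name `pseudonymize`, and the 16-character truncation are assumptions for illustration, and nothing here is claimed to match the reply's code or to guarantee compatibility with SQL Server.

```python
import hashlib
import hmac

# Hypothetical key for illustration only; a real key would come from a secret store.
SECRET_KEY = b"replace-with-a-managed-secret"


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Map a value to a stable token using HMAC-SHA256 (a keyed, one-way hash)."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]  # truncated for readability; shorter tokens collide more easily


token_a = pseudonymize("alice@example.com")
token_b = pseudonymize("alice@example.com")
token_c = pseudonymize("bob@example.com")
```

Because the same input and key always yield the same token, masked columns can still be joined or grouped, while the original value cannot be read back from the token without the key and a guess at the input.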
