Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Ela
by New Contributor III
  • 1318 Views
  • 1 replies
  • 1 kudos

Checking for availability of dynamic data masking functionality in SQL.

I am looking for functionality similar to Snowflake's, which allows attaching masking to an existing column. The documents I found cover masking with encryption, but my use case is an existing table. Solutions using views along with Dynamic Vie...

Latest Reply
sivankumar86
New Contributor II
  • 1 kudos

Unity Catalog provides a similar feature: https://docs.databricks.com/en/data-governance/unity-catalog/row-and-column-filters.html

  • 1 kudos
elgeo
by Valued Contributor II
  • 19370 Views
  • 3 replies
  • 2 kudos

Data type length enforcement

Hello. Is there a way to enforce the length of a column in SQL? For example, that a column has to be exactly 18 characters? Thank you!

Latest Reply
databricks31
New Contributor II
  • 2 kudos

We are facing a similar issue when writing to an ADLS location in Delta format; afterwards we created Unity Catalog tables on top of the Delta location. Is it possible to change the data type lengths below to ones that Spark SQL supports? Azure SQL Spark            ...

  • 2 kudos
2 More Replies
Ajay-Pandey
by Esteemed Contributor III
  • 1817 Views
  • 2 replies
  • 7 kudos

docs.databricks.com

Rename and drop columns with Delta Lake column mapping. Hi all, Databricks now supports column rename and drop. Column mapping requires the following Delta protocols: Reader version 2 or above. Writer version 5 or above. Blog URL ## Available in D...

Latest Reply
Poovarasan
New Contributor III
  • 7 kudos

The above-mentioned feature does not work in a DLT pipeline if the script has more than 4 columns.

  • 7 kudos
1 More Replies
numersoz
by New Contributor III
  • 4209 Views
  • 3 replies
  • 5 kudos

Resolved! Z-Ordering Timestamp Column

Hi, I have a large Delta table of IoT data for over 10K different sensors, with timestamp, sensor name and value columns at 1-second precision. The query pattern is usually a random 5-100 sensors at a time, but it typically involves a specific year/month/day interval...

Latest Reply
Oliver_Angelil
Valued Contributor II
  • 5 kudos

@numersoz did you z-order on the timestamp column or on less granular columns, like Year, Month, or Day? The timestamp column is very granular (high cardinality), since it also includes hour, minute, second...
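
To make the cardinality point concrete, here is a minimal pure-Python sketch (no Spark involved). The sample data, one reading per second for two hours, and the helper name `distinct_count` are made up for illustration and are not taken from the thread. It only counts how many distinct values the same readings produce at second, hour, and day granularity.

```python
from datetime import datetime, timedelta

# Hypothetical sample: one reading per second for two hours (not data from the thread).
start = datetime(2023, 1, 1, 0, 0, 0)
readings = [start + timedelta(seconds=i) for i in range(2 * 60 * 60)]


def distinct_count(values, key):
    """Count the distinct values left after applying `key` to each timestamp."""
    return len({key(v) for v in values})


# Full timestamp: every reading is its own distinct value.
per_second = distinct_count(readings, lambda ts: ts)
# Truncated to the hour: minutes and seconds are zeroed out.
per_hour = distinct_count(readings, lambda ts: ts.replace(minute=0, second=0))
# Truncated to the calendar day.
per_day = distinct_count(readings, lambda ts: ts.date())

print(per_second, per_hour, per_day)  # prints: 7200 2 1
```

The coarser the derived column, the fewer distinct values it has. Whether a coarser column actually helps a given Z-order layout depends on the query pattern and should be measured on the real table.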

  • 5 kudos
2 More Replies
THIAM_HUATTAN
by Valued Contributor
  • 6880 Views
  • 3 replies
  • 0 kudos

Parquet column cannot be converted. Column: [Rainfall_Value], Expected: DoubleType, Found: INT64

df.printSchema()
root
 |-- Device_ID: string (nullable = true)
 |-- Location: string (nullable = true)
 |-- Latitude: double (nullable = true)
 |-- Longitude: double (nullable = true)
 |-- DateTime: timestamp (nullable = true)
 |-- Rainfall_Value: double (n...

Latest Reply
Lakshay
Databricks Employee
  • 0 kudos

Hi @THIAM HUAT TAN, the issue occurs because the schema defines the column "Rainfall_Value" as DoubleType, while the values present in the data frame are of integer type. This could be caused by one or multiple values. Depending on the data, you ...

  • 0 kudos
2 More Replies
Rubens
by New Contributor II
  • 2294 Views
  • 1 replies
  • 3 kudos

how to alter a column into an IDENTITY column

Here's my use case: I'm migrating out of an old DWH into Databricks. When moving dimension tables into Databricks, I'd like the old SKs (surrogate keys) to be maintained, while creating the SK column as an IDENTITY column, so new dimension values get a...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Ronen Levi, great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.

  • 3 kudos
guostong
by New Contributor III
  • 4893 Views
  • 1 replies
  • 1 kudos

How to update the items in array of struct column with sql

create table test.json_test_01 (
  id int,
  description string,
  struct_address STRUCT<street_number: STRING, street_name: STRING, city: STRING, province: STRING>,
  arrary_phone ARRAY<STRUCT<phone_number: STRING, phone_type: STRING>>
);
insert into ...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Richard Guo, great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.

  • 1 kudos
DB_795688_DB_44
by New Contributor II
  • 2108 Views
  • 4 replies
  • 2 kudos

error: at least one column must be specified for the table.

error: at least one column must be specified for the table.

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @anand R, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we ca...

  • 2 kudos
3 More Replies
Jujiro
by New Contributor III
  • 9112 Views
  • 11 replies
  • 7 kudos

Random error: At least one column must be specified for the table?

I have the following code in a notebook. It randomly gives me the error "At least one column must be specified for the table." The error occurs (if it occurs at all) only on the first run after attaching to a cluster. Cluster details: Summary 5-1...

dbr-bug
Latest Reply
Harold
New Contributor II
  • 7 kudos

Please check whether this helps: spark.databricks.delta.catalog.update.enabled false

  • 7 kudos
10 More Replies
Ajay-Pandey
by Esteemed Contributor III
  • 3502 Views
  • 2 replies
  • 3 kudos

Resolved! Column is accessible after dropping the same column

Hi, today I saw some very strange behavior in Databricks. I dropped one column from a dataframe and assigned the result to a new dataframe, but I am still able to use the dropped column in the filter command. In the general scenario I should get an error but ...

Latest Reply
Sandeep
Contributor III
  • 3 kudos

@Ajay Pandey, this is a known behavior. Please refer to this JIRA for details: https://issues.apache.org/jira/browse/SPARK-30421

  • 3 kudos
1 More Replies
Leszek
by Contributor
  • 2568 Views
  • 1 replies
  • 1 kudos

IDENTITY column duplication when using BY DEFAULT parameter

Hi, I created a Delta table with an identity column using this syntax:
Id BIGINT GENERATED BY DEFAULT AS IDENTITY
My steps:
1) Created the table with Id using the syntax above.
2) Added two rows with Id = 1 and Id = 2 (BY DEFAULT allows that).
3) Run Insert (wit...

Latest Reply
dileep_vikram
New Contributor II
  • 1 kudos

Use the alter command below to sync the identity column:
alter table table_name change column col_name sync identity
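
A toy Python model may help show why syncing matters. It is a sketch under stated assumptions, not a description of how Delta Lake implements identity columns: the class `ToyIdentityTable` and its methods are hypothetical, and the generator is assumed to start at 1 with step 1 and to ignore explicitly inserted values until it is synced.

```python
class ToyIdentityTable:
    """Toy model only; not how Delta Lake is implemented internally."""

    def __init__(self):
        self.ids = []      # values stored in the Id column
        self.next_id = 1   # assumed generator state: start 1, step 1

    def insert_explicit(self, value):
        # BY DEFAULT lets the caller supply the value; the generator is not advanced.
        self.ids.append(value)

    def insert_generated(self):
        # The generator hands out its next value regardless of what is already stored.
        self.ids.append(self.next_id)
        self.next_id += 1

    def sync_identity(self):
        # Modelled on SYNC IDENTITY: move the generator past the largest stored value.
        if self.ids:
            self.next_id = max(self.next_id, max(self.ids) + 1)


table = ToyIdentityTable()
table.insert_explicit(1)
table.insert_explicit(2)
table.insert_generated()  # generator still at 1, so it collides with the explicit row
duplicates_before = len(table.ids) - len(set(table.ids))

table.sync_identity()
table.insert_generated()  # generator now continues after the largest stored Id

print(duplicates_before, table.ids)  # prints: 1 [1, 2, 1, 3]
```

In this model the sync only prevents future collisions; the duplicate that was already written stays in the table and would have to be cleaned up separately.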

  • 1 kudos
RichardDriven
by New Contributor III
  • 7719 Views
  • 2 replies
  • 1 kudos

How to apply a UDF to a property in an array of structs

I have a column that contains an array of structs as follows:
"column" : [ { "struct_field1": "struct_value", "struct_field2": "struct_value" }, { "struct_field1": "struct_value", "struct_field2": "struct_value" } ]
I want to apply a udf to each f...

1 More Replies
Pawelski
by New Contributor
  • 1346 Views
  • 1 replies
  • 0 kudos
Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Paweł Tomczyk, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so...

  • 0 kudos
QuicKick
by New Contributor
  • 6517 Views
  • 2 replies
  • 0 kudos

How do I search for all the columns/field names starting with "XYZ"

I would like to do a big search on all field/column names that contain "XYZ". I tried the SQL below, but it's giving me an error.
SELECT table_name, column_name
FROM information_schema.columns
WHERE column_name like '%<account>%'
order by table_name, column_na...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Ian Fox, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your ...

  • 0 kudos
1 More Replies
Istuti
by Contributor
  • 2412 Views
  • 1 replies
  • 2 kudos
Latest Reply
Anonymous
Not applicable
  • 2 kudos

@Istuti Gupta: There are several algorithms you can use to mask a column in Databricks in a way that is compatible with SQL Server. One commonly used algorithm is called pseudonymization or tokenization. Here's an example of how you can implement pse...

  • 2 kudos