Data Engineering

Forum Posts

by Constantine • Contributor III

03-30-2022 9:19:56 AM

7328 Views
2 replies
4 kudos

Resolved! How does merge schema work

Let's say I create a table like CREATE TABLE IF NOT EXISTS new_db.data_table ( key STRING, value STRING, last_updated_time TIMESTAMP ) USING DELTA LOCATION 's3://......'; Now when I insert into this table, I insert data which has, say, 20 columns a...

Latest Reply

timdriscoll22
New Contributor II

07-11-2023 12:51:24 PM

4 kudos

I tried running "REFRESH TABLE tablename;" but I still do not see the added columns in the Data Explorer column list, although I do see them in the sample data.
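As background for the thread title, here is a minimal sketch of how schema merging is typically switched on for an append, using Delta Lake's `mergeSchema` write option. The function name and `table_path` are hypothetical, and `df` is assumed to be a PySpark DataFrame whose extra columns should be added to the table.

```python
def append_with_merge_schema(df, table_path):
    """Append df to the Delta table at table_path, allowing new columns.

    df is assumed to be a PySpark DataFrame; table_path is a placeholder
    for the table location.
    """
    (
        df.write
        .format("delta")
        .mode("append")
        # Without this option, an append with extra columns is rejected
        # by Delta's schema enforcement.
        .option("mergeSchema", "true")
        .save(table_path)
    )
```

Whether the new columns then show up in a catalog UI is a separate question from whether they exist in the table schema, which is what the reply above ran into.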

1 More Reply

by sree1567 • New Contributor II

06-01-2023 6:25:29 AM

1185 Views
1 reply
1 kudos

Azure-EventHub Schema Registry with Spark-Scala

Hi all, is there a way to consume the schemas from the schema registry defined in Azure Event Hubs using Apache Spark and Scala?

Latest Reply

Anonymous
Not applicable

06-15-2023 11:53:36 PM

1 kudos

Hi @sreeranjani thevan, great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.

by js54123875 • New Contributor III

06-01-2023 5:45:02 PM

3944 Views
3 replies
3 kudos

Setup for Unity Catalog, autoloader, three-level namespace, SCD2

I am trying to set up Delta Live Tables pipelines to ingest data to bronze and silver tables. Bronze and Silver are separate schemas. This will be triggered by a daily job. It appears to run fine when set as continuous, but fails when triggered. Table...

Latest Reply

Anonymous
Not applicable

06-14-2023 12:17:18 AM

3 kudos

Hi @Jennette Shepard, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answ...

2 More Replies

by gg_047320_gg_94 • New Contributor II

05-27-2023 9:09:48 PM

8271 Views
1 reply
1 kudos

DLT Spark readstream fails on the source table which is overwritten

I am reading the source table which gets updated every day. It is usually append/merge with updates and is occasionally overwritten for other reasons. df = spark.readStream.schema(schema).format("delta").option("ignoreChanges", True).option('starting...

Latest Reply

Debayan
Databricks Employee

06-05-2023 12:31:43 AM

1 kudos

Hi, could you please confirm the DLT and DBR versions? Also, please tag @Debayan in your next response so that I am notified. Thank you!

by Dave_Nithio • Contributor

11-01-2022 2:03:11 PM

8432 Views
1 reply
3 kudos

Delta Live Table Schema Error

I'm using Delta Live Tables to load a set of CSV files in a directory. I am pre-defining the schema to avoid issues with schema inference. This works with Auto Loader on a regular Delta table, but is failing for Delta Live Tables. Below is an example ...

Latest Reply

shagun
New Contributor III

06-01-2023 6:39:19 AM

3 kudos

I was facing a similar issue loading JSON files through Auto Loader for Delta Live Tables. I was able to fix it with this option: .option("cloudFiles.inferColumnTypes", "True"). From the docs: "For formats that don’t encode data types (JSON and CSV), Auto Load...
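The option mentioned in this reply could be applied roughly as follows. This is a sketch rather than the poster's actual pipeline: the function name, `source_path`, and `schema_location` are hypothetical, and `spark` is assumed to be an active SparkSession on a runtime where Auto Loader (`cloudFiles`) is available. Outside Delta Live Tables, Auto Loader generally needs a `cloudFiles.schemaLocation` for inference; inside a DLT pipeline that location is normally managed for you, so the line can likely be dropped there.

```python
def read_json_with_inferred_types(spark, source_path, schema_location):
    """Return a streaming DataFrame that reads JSON files with Auto Loader.

    All three arguments are placeholders supplied by the caller.
    """
    return (
        spark.readStream
        .format("cloudFiles")
        .option("cloudFiles.format", "json")
        # Where Auto Loader stores the inferred schema between runs.
        .option("cloudFiles.schemaLocation", schema_location)
        # Infer real column types instead of treating every column as a string.
        .option("cloudFiles.inferColumnTypes", "true")
        .load(source_path)
    )
```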

by Abhradwip • New Contributor II

03-09-2023 2:29:34 AM

3699 Views
3 replies
0 kudos

How to create a Delta Live Table from JSON files using a custom schema? I am getting the error below for the attached code. Error: org.apache.spark.sql.AnalysisException: Table has a user-specified schema that is incompatible with the schema

Code:
# Import DataType
from pyspark.sql.types import StructType, StructField, TimestampType, IntegerType, StringType, FloatType, BooleanType, LongType
# Define Custom Schema
call_schema = StructType( [ StructField("RecordType", StringType(),...

Latest Reply

Anonymous
Not applicable

03-31-2023 5:22:23 PM

0 kudos

Hi @Abhradwip Mukherjee Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from yo...

2 More Replies

by Dave_Nithio • Contributor

10-14-2022 1:29:07 PM

4826 Views
4 replies
7 kudos

Resolved! Delta Live Table Schema Comment

I predefined my schema for a Delta Live Table Autoload. This included comments for some attributes. When performing a standard readStream, my comments appear, but when in Delta Live Tables I get no comments. Is there anything I need to do to get comment...

Latest Reply

Hubert-Dudek
Esteemed Contributor III

10-20-2022 6:24:52 AM

7 kudos

You need to add your schema to the dlt declaration: @dlt.table(name="test_bronze", comment="test account data incrementally ingested from S3 Raw landing zone", table_properties={"quality": "bronze"}, schema=schema)
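One way to produce a `schema` value that carries column comments is a SQL DDL string. The sketch below builds such a string in plain Python; it assumes that the `schema=` argument of `@dlt.table` accepts a DDL string as well as a `StructType`, and the helper name and column names are made up for illustration.

```python
def build_ddl_schema(columns):
    """Build a DDL schema string from (name, sql_type, comment) tuples.

    comment may be None for columns that have no comment.
    """
    parts = []
    for name, sql_type, comment in columns:
        part = f"{name} {sql_type}"
        if comment:
            # Backslash-escape single quotes so the comment stays one literal.
            escaped = comment.replace("'", "\\'")
            part += f" COMMENT '{escaped}'"
        parts.append(part)
    return ", ".join(parts)


# Hypothetical columns, for illustration only.
schema = build_ddl_schema([
    ("account_id", "STRING", "Unique account identifier"),
    ("balance", "DOUBLE", None),
])
print(schema)
# account_id STRING COMMENT 'Unique account identifier', balance DOUBLE
```

The resulting string would then be passed as `schema=schema` in the decorator shown in the reply.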

3 More Replies

by AndriusVitkausk • New Contributor III

12-07-2022 5:31:55 AM

1612 Views
1 reply
0 kudos

Reading multi-dimensional json files

So I've been having some issues reading a JSON file that's been provided to the business with another nesting layer, so instead of the JSON being an 'array of objects' -> [ {} ,{} ,{} ], it's an 'array of arrays of objects' -> [ [ {}, {} ,{} ], [ {} ,{}...

Latest Reply

ashish1
New Contributor III

01-30-2023 1:20:09 PM

0 kudos

You can use the explode function to flatten the array to rows. Can you post a simple example of your data?
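To make the shape of the problem concrete, here is a plain-Python analogue of the flattening (not Spark code). It turns an 'array of arrays of objects' into a flat list of objects, which is roughly what applying `explode` once per nesting level does to rows in Spark. The sample data is invented.

```python
import json


def flatten_nested_array(text):
    """Parse JSON shaped like [[{...}, ...], [...]] into a flat list of objects."""
    outer = json.loads(text)
    # One loop per nesting level: outer array, then each inner array.
    return [obj for inner in outer for obj in inner]


# Invented sample with the extra nesting layer described in the question.
sample = '[[{"id": 1}, {"id": 2}], [{"id": 3}]]'
print(flatten_nested_array(sample))
# [{'id': 1}, {'id': 2}, {'id': 3}]
```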

by SIRIGIRI • Contributor

12-31-2022 5:38:45 AM

1980 Views
3 replies
2 kudos

sharikrishna26.medium.com

Spark DataFrames Schema: Schema inference is not reliable. We have the following problems in schema inference: automatic inferring of schema is often incorrect; inferring schema is additional work for Spark, and it takes some extra time; schema inference is ...

Latest Reply

Varshith
New Contributor III

01-01-2023 7:05:25 PM

2 kudos

One other difference between those two approaches is that in the schema DDL string approach we use STRING, INT, etc., but in the StructType object approach we can only use Spark data types such as StringType(), IntegerType(), etc.
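The naming difference can be illustrated with a small lookup table. This is a plain-Python sketch, not a parser: the mapping covers only a handful of simple types, and the helper assumes each column is written as `name TYPE` with no nested or parameterized types.

```python
# Illustrative, non-exhaustive mapping from DDL type names to the
# constructor names used in the StructType approach.
DDL_TO_SPARK_TYPE = {
    "STRING": "StringType()",
    "INT": "IntegerType()",
    "BIGINT": "LongType()",
    "DOUBLE": "DoubleType()",
    "BOOLEAN": "BooleanType()",
    "TIMESTAMP": "TimestampType()",
}


def ddl_to_struct_fields(ddl):
    """Translate 'name TYPE, name TYPE' into (name, Spark type name) pairs."""
    fields = []
    for column in ddl.split(","):
        name, sql_type = column.split()
        fields.append((name, DDL_TO_SPARK_TYPE[sql_type.upper()]))
    return fields


print(ddl_to_struct_fields("key STRING, value INT"))
# [('key', 'StringType()'), ('value', 'IntegerType()')]
```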

2 More Replies

by hello_world • New Contributor III

12-26-2022 5:20:35 PM

2780 Views
1 reply
4 kudos

What is the purpose of the USAGE privilege?

I watched a couple of courses on Databricks Academy, none of which clearly explains or demonstrates the purpose of the USAGE privilege. USAGE: does not give any abilities, but is an additional requirement to perform any action on a schema object. I hav...

Latest Reply

Rishabh-Pandey
Esteemed Contributor

12-26-2022 10:59:56 PM

4 kudos

Hey @S L, I also had these questions. What I learned is that USAGE is the minimum, mandatory requirement that an individual must have to perform any action. That does not mean that you can perform any action with only the USAGE permission; USAGE is...
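In practice this means a read grant usually comes as a pair of statements. The sketch below only builds the SQL text; it assumes the legacy table access control model that the USAGE privilege belongs to, and the helper name, schema, table, and principal are hypothetical.

```python
def grant_read_access(schema, table, principal):
    """Return the statements needed for principal to read schema.table.

    SELECT alone is not enough: USAGE on the schema is also required.
    """
    return [
        f"GRANT USAGE ON SCHEMA {schema} TO `{principal}`",
        f"GRANT SELECT ON TABLE {schema}.{table} TO `{principal}`",
    ]


# Hypothetical names, for illustration only.
for statement in grant_read_access("sales", "orders", "analyst@example.com"):
    print(statement)
```

Each statement could then be run with `spark.sql(statement)` by a user who is allowed to grant privileges on those objects.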

by tassiodahora • New Contributor III

05-23-2022 6:00:35 AM

61600 Views
2 replies
7 kudos

Resolved! Failed to merge incompatible data types LongType and StringType

Guys, good morning! I am writing the results of a JSON to a Delta table, but the JSON structure is not always the same: if a field is not listed in the JSON, it generates a type incompatibility when I append (dfbrzagend.write .format("delta") .mode("ap...

Latest Reply

Anonymous
Not applicable

06-06-2022 5:39:36 AM

7 kudos

Hi @Tássio Santos, the Delta table performs schema validation of every column, and the source dataframe column data types must match the column data types in the target table. If they don’t match, an exception is raised. For reference: https://docs.dat...
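A quick way to find the offending column before writing is to compare the two sets of column types. The sketch below does that in plain Python on name-to-type dictionaries; in PySpark such a dictionary could be obtained with `dict(df.dtypes)`. The function name and the sample columns are invented.

```python
def find_type_mismatches(source_types, target_types):
    """Return {column: (source_type, target_type)} for columns whose types differ.

    Columns that exist on only one side are ignored here.
    """
    return {
        name: (source_types[name], target_types[name])
        for name in source_types
        if name in target_types and source_types[name] != target_types[name]
    }


# Invented example: the source inferred a number where the table stores a string.
source = {"id": "bigint", "status": "string"}
target = {"id": "string", "status": "string"}
print(find_type_mismatches(source, target))
# {'id': ('bigint', 'string')}
```

Any column reported here would need an explicit cast on the source side before the append can pass validation.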

1 More Reply

by Chris_Konsur • New Contributor III

11-10-2022 3:20:52 PM

1251 Views
0 replies
2 kudos

Schema supported by Autoloader

We do not want to use schema inference with schema evolution in Auto Loader. Instead, we want to apply our own schema and use the merge option. Our schema is very complex, with multiple nested levels. When I apply this schema to Auto Loader, it r...

by hari • Contributor

10-14-2022 6:16:00 AM

2594 Views
2 replies
5 kudos

Resolved! Best way to automatically update a delta table schema

We have multiple environments where the same tables are added so it's really hard to manually update the schema of the table across all the environments. We know that it's not ideal to update table schema a lot but our product is still evolving and s...

Latest Reply

hari
Contributor

11-01-2022 2:38:05 AM

5 kudos

Thanks for the reply, @Pat Sienkiewicz.

1 More Reply

by venkad • Contributor

10-19-2022 5:23:27 AM

1396 Views
0 replies
4 kudos

Default location for Schema/Database in Unity

Hello Bricksters, we organize the delta lake in multiple storage accounts. One storage account per data domain and one container per database. This helps us to isolate the resources and cost on the business domain level. Earlier, when a schema/database...

by nbakh • New Contributor II

09-26-2022 6:22:10 AM

3422 Views
1 reply
1 kudos

import of tables to a new metastore failed with schema mismatch "specified schema does not match the existing schema"

The error I get when importing certain Delta tables is: The specified schema does not match the existing schema at dbfs:/mnt/mart/tablename. However, when I check the metadata table in the old workspace and the exported file, they match. However, it seems...

Latest Reply

nbakh
New Contributor II

09-26-2022 6:39:09 AM

1 kudos

Below is an example error. However, in the existing metadata I still see varchar(100) as the type. Specified metadata for field Percentage is different from existing schema:\n Specified: {}\n Existing: {\"HIVE_TYPE_STRING\":\"varchar(100)\"}\n\nIf your inte...
