Data Engineering

Forum Posts

BenLambert
by Contributor
  • 1143 Views
  • 2 replies
  • 2 kudos

Resolved! Delta Live Tables not inferring table schema properly.

I have a Delta Live Tables pipeline that is loading and transforming data. Currently the schema inferred by DLT does not match the actual schema of the table. The table is generated via a groupBy.pivot operation as follows:...

Latest Reply
BenLambert
Contributor
  • 2 kudos

I was able to get around this by specifying the table schema in the table decorator.

1 More Replies
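The workaround above, declaring the schema in the table decorator, might look something like the following sketch. The table name, column names, and pivot values are hypothetical placeholders, not from the original post, and this fragment runs only inside a Databricks DLT pipeline:

```python
import dlt
from pyspark.sql.types import StructType, StructField, StringType, LongType

# An explicit schema stops DLT from inferring a wrong one for the pivoted table.
pivot_schema = StructType([
    StructField("customer_id", StringType()),
    StructField("status_a", LongType()),   # hypothetical pivoted columns
    StructField("status_b", LongType()),
])

@dlt.table(schema=pivot_schema)            # schema declared in the decorator
def pivoted_counts():
    df = dlt.read("orders")                # hypothetical upstream table
    return (df.groupBy("customer_id")
              .pivot("status", ["a", "b"])
              .count()
              .toDF("customer_id", "status_a", "status_b"))
```

The `schema` argument also accepts a DDL string (e.g. `"customer_id STRING, status_a BIGINT, status_b BIGINT"`) if building a `StructType` feels heavyweight.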
pmt
by New Contributor III
  • 1846 Views
  • 7 replies
  • 1 kudos

Handling Changing Schema in CDC DLT

We are building a DLT pipeline and Auto Loader is handling schema evolution fine. However, further down the pipeline we are trying to load that streamed data with the apply_changes() function into a new table and, from the looks of it, it doesn't see...

Latest Reply
Vidula
Honored Contributor
  • 1 kudos

Hey there @Palani Thangaraj​! Hope all is well. Just wanted to check in: were you able to resolve your issue? If so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear fro...

6 More Replies
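For context, a minimal apply_changes() stage looks roughly like the sketch below. Table names, key column, and sequencing column are hypothetical; this fragment runs only inside a Databricks DLT pipeline. Note that apply_changes derives the target's schema from its source, so when upstream columns change, the target typically needs a full refresh to pick them up:

```python
import dlt
from pyspark.sql import functions as F

# The streaming target must be declared before apply_changes can merge into it.
dlt.create_streaming_table("customers_silver")   # hypothetical target table

dlt.apply_changes(
    target="customers_silver",
    source="customers_bronze_cdc",               # hypothetical CDC source
    keys=["customer_id"],                        # primary key for the merge
    sequence_by=F.col("event_ts"),               # ordering column for out-of-order events
)
```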
Confused
by New Contributor III
  • 6266 Views
  • 7 replies
  • 2 kudos

Schema evolution issue

Hi All, I am loading some data using Auto Loader but am having trouble with schema evolution. A new column has been added to the data I am loading and I am getting the following error: StreamingQueryException: Encountered unknown field(s) during parsing:...

Latest Reply
rgrosskopf
New Contributor II
  • 2 kudos

I agree that hints are the way to go if you have the schema available, but the whole point of schema evolution is that you might not always know the schema in advance. I received a similar error with a similar streaming query configuration. The issue w...

6 More Replies
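A typical Auto Loader setup for this scenario is sketched below; paths are hypothetical and it assumes a Databricks Spark session. With `addNewColumns` (the default evolution mode), encountering a new column stops the stream once, and on restart the tracked schema includes it:

```python
# Auto Loader with schema tracking: the schemaLocation persists the inferred
# schema; new columns cause one stream restart, then flow through.
df = (spark.readStream.format("cloudFiles")
      .option("cloudFiles.format", "json")
      .option("cloudFiles.schemaLocation", "/tmp/schema")         # hypothetical path
      .option("cloudFiles.schemaEvolutionMode", "addNewColumns")  # default mode
      .load("/tmp/landing"))                                      # hypothetical path
```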
LukaszJ
by Contributor III
  • 2523 Views
  • 6 replies
  • 4 kudos

Resolved! Terraform: get metastore id without creating new metastore

Hello, I want to create a database (schema) and tables in my Databricks workspace using Terraform. I found this resource: databricks_schema. It requires databricks_catalog, which requires metastore_id. However, I have databricks_workspace and I did not cre...

Latest Reply
Kaniz
Community Manager
  • 4 kudos

Hi @Łukasz Jaremek​, just a friendly follow-up. Do you still need help, or did @Hubert Dudek (Customer)​'s and @Atanu Sarkar​'s responses help you find the solution? Please let us know.

5 More Replies
Bency
by New Contributor III
  • 1032 Views
  • 3 replies
  • 2 kudos

Invalid field schema option provided-DatabricksDeltaLakeSinkConnector

I have configured a Delta Lake Sink connector which reads from an AVRO topic and writes to the Delta lake . I have followed the docs and my config looks like below .  { "name": "dev_test_delta_connector", "config": {  "topics": "dl_test_avro",  "inp...

Latest Reply
Bency
New Contributor III
  • 2 kudos

@Hubert Dudek​, should I be configuring anything with respect to the schema in the connector config? I did successfully stage some data from another topic of a different format (JSON_SR) into a Delta Lake table, but it's with the AVRO topic that I ge...

2 More Replies
shrikant_kulkar
by New Contributor II
  • 535 Views
  • 0 replies
  • 0 kudos

autoloader schema inference date column

I have hire_date and term_date columns in the "MM/dd/YYYY" format in the underlying CSV files. The schema hint "cloudFiles.schemaHints": "Hire_Date Date, Term_Date Date" pushes the data into the _rescued_data column due to conversion failure. I am looking for a solution to c...

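One common workaround, sketched below with hypothetical paths and assuming a Databricks Spark session, is to let the columns land as strings (the default) and convert them explicitly with to_date. Note also that Spark's datetime patterns use lowercase "yyyy" for the year; uppercase "YYYY" is week-year and is a frequent source of exactly this kind of conversion failure:

```python
from pyspark.sql import functions as F

# Read date columns as plain strings, then convert with an explicit pattern,
# instead of hinting them as Date and losing rows to _rescued_data.
df = (spark.readStream.format("cloudFiles")
      .option("cloudFiles.format", "csv")
      .option("cloudFiles.schemaLocation", "/tmp/schema")   # hypothetical path
      .load("/tmp/hr_files")                                # hypothetical path
      .withColumn("Hire_Date", F.to_date("Hire_Date", "MM/dd/yyyy"))
      .withColumn("Term_Date", F.to_date("Term_Date", "MM/dd/yyyy")))
```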
SailajaB
by Valued Contributor III
  • 2996 Views
  • 4 replies
  • 5 kudos

Resolved! Ways to validate final Dataframe schema against JSON schema config file

Hi Team, we have to validate the transformed dataframe's output schema against a JSON schema config file. Here is the scenario: our input JSON schema and target JSON schema are different. Using Databricks we are doing the required schema changes. Now, we need to v...

Latest Reply
Anonymous
Not applicable
  • 5 kudos

@Sailaja B​ - Hi! My name is Piper, and I'm a moderator for the community. Thanks for your question. Please let us know how things go. If @welder martins​' response answers your question, would you be happy to come back and mark their answer as best?...

3 More Replies
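One plain-Python way to do this kind of validation, assuming the dataframe schema is available as (name, type) pairs (e.g. from PySpark's df.dtypes) and the config file uses a hypothetical {"fields": [{"name": ..., "type": ...}]} layout:

```python
import json

def validate_schema(actual_fields, expected_json):
    """Compare (name, type) pairs from df.dtypes against a JSON schema config.

    Returns a report of missing columns, unexpected columns, and type conflicts.
    """
    expected = {f["name"]: f["type"] for f in json.loads(expected_json)["fields"]}
    actual = dict(actual_fields)
    return {
        "missing": sorted(set(expected) - set(actual)),
        "unexpected": sorted(set(actual) - set(expected)),
        "type_mismatch": sorted(
            name for name in set(expected) & set(actual)
            if expected[name] != actual[name]
        ),
    }

# Example with a hypothetical config and a schema shaped like df.dtypes output:
config = '{"fields": [{"name": "id", "type": "bigint"}, {"name": "amount", "type": "double"}]}'
report = validate_schema([("id", "bigint"), ("amount", "string")], config)
# report["type_mismatch"] == ["amount"]
```

An empty report in all three lists means the transformed dataframe matches the config.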
irfanaziz
by Contributor II
  • 5467 Views
  • 3 replies
  • 2 kudos

Resolved! Issue in reading parquet file in pyspark databricks.

One of the source systems from time to time generates a parquet file which is only 220 KB in size, but reading it fails: "java.io.IOException: Could not read or convert schema for file: 1-2022-00-51-56.parquet" Caused by: org.apache.spark.sql.AnalysisExce...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

@nafri A​ - Howdy! My name is Piper, and I'm a community moderator for Databricks. Would you be happy to mark @Hubert Dudek​'s answer as best if it solved the problem? That will help other members find the answer more quickly. Thanks

2 More Replies
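When schema inference fails on one odd file like this, a common workaround is to supply the schema explicitly so Spark skips the failing inference step. The column names below are hypothetical, and this assumes a Spark session:

```python
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

# Hypothetical schema; in practice, copy it from a file that reads cleanly:
#   good_schema = spark.read.parquet("good_file.parquet").schema
schema = StructType([
    StructField("id", StringType()),
    StructField("value", DoubleType()),
])

# With an explicit schema, Spark does not need to read the file's own metadata
# to infer one, which can get past "Could not read or convert schema" errors.
df = spark.read.schema(schema).parquet("1-2022-00-51-56.parquet")
```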
irfanaziz
by Contributor II
  • 1824 Views
  • 3 replies
  • 1 kudos

Resolved! What is the difference between passing the schema in the options and using the .schema() function in PySpark for a CSV file?

I have observed very strange behavior with some of our integration pipelines. This week one of the CSV files was getting broken when read with the read function given below: def ReadCSV(files, schema_struct, header, delimiter, timestampformat, encode="utf8...

Latest Reply
jose_gonzalez
Moderator
  • 1 kudos

Hi @nafri A​, what is the error you are getting? Can you share it, please? Like @Hubert Dudek​ mentioned, both will call the same APIs.

2 More Replies
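To illustrate the equivalence mentioned above: in PySpark, .schema() accepts either a StructType or a DDL string, and both resolve to the same schema handed to the same reader. File name is hypothetical, and StructType.fromDDL is available only in recent PySpark versions:

```python
from pyspark.sql.types import StructType

# Equivalent ways of giving the reader the same schema:
ddl = "name STRING, age INT"
struct = StructType.fromDDL(ddl)   # recent PySpark versions

df1 = spark.read.schema(ddl).csv("data.csv", header=True)
df2 = spark.read.schema(struct).csv("data.csv", header=True)
# df1.schema == df2.schema
```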
lprevost
by New Contributor II
  • 1613 Views
  • 1 replies
  • 1 kudos

Resolved! Schema inferrence CSV picks up \r carriage returns

I'm using: frame = spark.read.csv(path=bucket+folder, inferSchema=True, header=True, multiLine=True) to read in a series of CSV ...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 1 kudos

Files saved on the Windows operating system contain a carriage return and a line feed at the end of every line. Please add the following option; it can help: .option("ignoreTrailingWhiteSpace", true)

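The underlying problem can be reproduced in plain Python, independent of Spark. A Windows-saved CSV ends each line with "\r\n"; a parser that treats only "\n" as the line terminator leaves a stray carriage return glued to the last field of every row, which then leaks into the inferred schema's last column name or values:

```python
# Windows line endings: each row ends with "\r\n", not just "\n".
raw = "name,city\r\nalice,london\r\nbob,paris\r\n"

# Splitting on "\n" alone leaves "\r" attached to the last field of every row.
naive_rows = [line.split(",") for line in raw.split("\n") if line]

# Stripping the carriage return before splitting restores the expected values.
clean_rows = [line.rstrip("\r").split(",") for line in raw.split("\n") if line]
```

Spark's ignoreTrailingWhiteSpace option has the same effect as the rstrip above: the trailing "\r" is discarded before the value is typed.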
StephanieRivera
by Valued Contributor II
  • 653 Views
  • 1 replies
  • 0 kudos

Is the delta schema enforcement flexible?

In the sense that: is it possible to check only column names or only column data types, or will it always be both?

Latest Reply
StephanieRivera
Valued Contributor II
  • 0 kudos

No, I do not believe that is possible. However, I would be interested in understanding a use case where that is the ideal behavior. How does schema enforcement work? Delta Lake uses schema validation on write, which means that all new writes to a table ar...

Jasam
by New Contributor
  • 7987 Views
  • 3 replies
  • 0 kudos

How to make CSV schema inference default all columns to string using spark-csv?

I am using the spark-csv utility, but I need all columns to be treated as string columns by default when it infers the schema. Thanks in advance.

Latest Reply
jhoop2002
New Contributor II
  • 0 kudos

@peyman what if I don't want to manually specify the schema? For example, I have a vendor that can't build a valid .csv file. I just need to import it somewhere so I can explore the data and find the errors. Just like the original author's question?...

2 More Replies
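It may be worth noting that all-string columns are already Spark's default for CSV: inferSchema defaults to false, and without it every column comes back as StringType. Sketch below with a hypothetical file name, assuming a Spark session:

```python
# With inferSchema left at its default of False, every column is StringType,
# so even a malformed vendor file can be loaded and explored as raw text.
df = spark.read.csv("vendor_file.csv", header=True)

# Equivalent explicit form:
df = (spark.read
      .option("header", True)
      .option("inferSchema", False)
      .csv("vendor_file.csv"))
```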