- 1143 Views
- 2 replies
- 2 kudos
I have a delta live tables pipeline that is loading and transforming data. Currently I am having a problem that the schema inferred by DLT does not match the actual schema of the table. The table is generated via a groupby.pivot operation as follows:...
Latest Reply
I was able to get around this by specifying the table schema in the table decorator.
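For anyone hitting the same mismatch, a minimal sketch of pinning the schema in the decorator, as the reply describes. The table name, upstream table, and columns are hypothetical, and this only runs inside a Databricks DLT pipeline:

```python
import dlt
from pyspark.sql import functions as F

# Pin the schema in the decorator so DLT does not infer it from the
# pivot output. All names below are placeholders for illustration.
@dlt.table(schema="customer_id STRING, jan DOUBLE, feb DOUBLE, mar DOUBLE")
def monthly_pivot():
    df = dlt.read("orders_clean")  # hypothetical upstream table
    return (df.groupBy("customer_id")
              .pivot("month", ["jan", "feb", "mar"])  # explicit values keep
              .agg(F.sum("amount")))                  # output columns stable
```

Listing the pivot values explicitly also keeps the output columns deterministic, so the declared schema and the actual result cannot drift apart.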
1 More Replies
by
pmt
• New Contributor III
- 1846 Views
- 7 replies
- 1 kudos
We are building a DLT pipeline and the autoloader is handling schema evolution fine. However, further down the pipeline we are trying to load that streamed data with the apply_changes() function into a new table and, from the looks of it, it doesn't see...
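A minimal sketch of the apply_changes() step in question, assuming hypothetical table names, a customer_id key, and an ingest_ts ordering column; it only runs inside a DLT pipeline:

```python
import dlt

# The target streaming table must be declared before apply_changes()
# can merge CDC rows into it. All names and keys are assumptions.
dlt.create_streaming_table("customers_silver")

dlt.apply_changes(
    target="customers_silver",
    source="customers_bronze",   # the Auto Loader-fed table upstream
    keys=["customer_id"],
    sequence_by="ingest_ts",     # ordering column for out-of-order rows
    stored_as_scd_type=1,
)
```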
Latest Reply
Hey there @Palani Thangaraj Hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear fro...
6 More Replies
by
Swann
• New Contributor
- 502 Views
- 0 replies
- 0 kudos
We would like to have a robust reader that ensures that the data we read and write using the autoloader respects the schema which is provided to the autoloader reader. We also provide the option "badRecordsPath" (refer to https://docs.databricks.com/spa...
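A sketch of the kind of strict reader described above, assuming JSON input; the paths and columns are placeholders. The schema is fixed, evolution is disabled, and rows that fail to parse are quarantined via badRecordsPath:

```python
def strict_autoloader(spark, source_path, bad_records_path):
    # Enforce a fixed schema (DDL string; columns are placeholders) and
    # send unparseable rows to bad_records_path instead of letting the
    # inferred schema drift.
    return (spark.readStream
            .format("cloudFiles")
            .option("cloudFiles.format", "json")
            .option("cloudFiles.schemaEvolutionMode", "none")  # reject new columns
            .option("badRecordsPath", bad_records_path)
            .schema("id INT, name STRING")
            .load(source_path))
```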
- 6266 Views
- 7 replies
- 2 kudos
Hi All, I am loading some data using Auto Loader but am having trouble with schema evolution. A new column has been added to the data I am loading and I am getting the following error: StreamingQueryException: Encountered unknown field(s) during parsing:...
Latest Reply
I agree that hints are the way to go if you have the schema available, but the whole point of schema evolution is that you might not always know the schema in advance. I received a similar error with a similar streaming query configuration. The issue w...
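For context: with the default addNewColumns evolution mode, this error is expected once per schema change — the stream stops, the tracked schema is updated, and the query succeeds on restart. A sketch with placeholder paths:

```python
def evolving_autoloader(spark, source_path, schema_path):
    # With addNewColumns the stream intentionally fails on an unknown
    # field; the next (re)start reads the updated schema from
    # cloudFiles.schemaLocation and carries on.
    return (spark.readStream
            .format("cloudFiles")
            .option("cloudFiles.format", "json")
            .option("cloudFiles.schemaLocation", schema_path)
            .option("cloudFiles.schemaEvolutionMode", "addNewColumns")
            .load(source_path))
```

The practical consequence is that such jobs should be run with retries (e.g. a Databricks job with retry on failure) so the restart happens automatically.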
6 More Replies
- 2523 Views
- 6 replies
- 4 kudos
Hello, I want to create a database (schema) and tables in my Databricks workspace using Terraform. I found this resource: databricks_schema. It requires databricks_catalog, which requires metastore_id. However, I have databricks_workspace and I did not cre...
Latest Reply
Hi @Łukasz Jaremek , Just a friendly follow-up. Do you still need help, or did @Hubert Dudek (Customer) and @Atanu Sarkar 's responses help you find the solution? Please let us know.
5 More Replies
by
Bency
• New Contributor III
- 1032 Views
- 3 replies
- 2 kudos
I have configured a Delta Lake Sink connector which reads from an AVRO topic and writes to the Delta lake. I have followed the docs and my config looks like below. { "name": "dev_test_delta_connector", "config": { "topics": "dl_test_avro", "inp...
Latest Reply
Bency
New Contributor III
@Hubert Dudek , should I be configuring anything with respect to schema in the connector config? I did successfully stage some data from another topic of a different format (JSON_SR) into a Delta Lake table, but it's with the AVRO topic that I ge...
2 More Replies
- 535 Views
- 0 replies
- 0 kudos
I have hire_date and term_dates in the "MM/dd/YYYY" format in the underlying CSV files. The schema hint "cloudFiles.schemaHints" : "Hire_Date Date,Term_Date Date" pushes the data into the _rescued_data column due to conversion failure. I am looking for a solution to c...
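One likely cause, hedged: Spark's default date parser expects yyyy-MM-dd, and in Spark datetime patterns uppercase "YYYY" is the week-based year, so "MM/dd/YYYY" values fail the DATE hint and land in _rescued_data. Passing the CSV dateFormat option through Auto Loader may help; the paths are placeholders:

```python
def loader_with_date_format(spark, source_path, schema_path):
    # Keep the DATE hints but tell the CSV parser how the dates are
    # written -- note lowercase "yyyy" in the pattern, not "YYYY".
    return (spark.readStream
            .format("cloudFiles")
            .option("cloudFiles.format", "csv")
            .option("cloudFiles.schemaLocation", schema_path)
            .option("cloudFiles.schemaHints", "Hire_Date DATE, Term_Date DATE")
            .option("dateFormat", "MM/dd/yyyy")
            .load(source_path))
```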
- 2996 Views
- 4 replies
- 5 kudos
Hi Team, we have to validate the transformed dataframe's output schema against a JSON schema config file. Here is the scenario: our input JSON schema and target JSON schema are different. Using Databricks, we are doing the required schema changes. Now, we need to v...
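One way to do this validation without any Spark dependency is to compare df.schema.jsonValue() against the parsed JSON config file. A sketch, assuming both sides use Spark's standard schema-JSON shape (a dict with a "fields" list):

```python
def schema_mismatches(actual: dict, expected: dict) -> list:
    """Compare two Spark schemas in JSON form (e.g. df.schema.jsonValue()
    vs. json.load() of the config file). An empty result means they match."""
    actual_types = {f["name"]: f["type"] for f in actual["fields"]}
    expected_types = {f["name"]: f["type"] for f in expected["fields"]}
    diffs = []
    for name, typ in expected_types.items():
        if name not in actual_types:
            diffs.append(f"missing column: {name}")
        elif actual_types[name] != typ:
            diffs.append(f"type mismatch on {name}: {actual_types[name]} != {typ}")
    for name in actual_types.keys() - expected_types.keys():
        diffs.append(f"unexpected column: {name}")
    return diffs
```

Returning a list of differences rather than a boolean makes the pipeline's failure message actionable when the check is wired into a test or an expectation.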
Latest Reply
@Sailaja B - Hi! My name is Piper, and I'm a moderator for the community. Thanks for your question. Please let us know how things go. If @welder martins' response answers your question, would you be happy to come back and mark their answer as best?...
3 More Replies
- 5467 Views
- 3 replies
- 2 kudos
One of the source systems generates, from time to time, a parquet file which is only 220 KB in size. But reading it fails: "java.io.IOException: Could not read or convert schema for file: 1-2022-00-51-56.parquet Caused by: org.apache.spark.sql.AnalysisExce...
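A common culprit for a tiny file like this is a corrupt or partially written parquet file. A hedged sketch of a defensive read that skips unreadable files and optionally bypasses the schema inference that raised the error:

```python
def read_parquet_defensively(spark, path, schema=None):
    # ignoreCorruptFiles (a generic Spark file-source option) skips
    # files Spark cannot read instead of failing the whole job; an
    # explicit schema avoids inferring one from the broken file.
    reader = spark.read.option("ignoreCorruptFiles", "true")
    if schema is not None:
        reader = reader.schema(schema)
    return reader.parquet(path)
```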
Latest Reply
@nafri A - Howdy! My name is Piper, and I'm a community moderator for Databricks. Would you be happy to mark @Hubert Dudek's answer as best if it solved the problem? That will help other members find the answer more quickly. Thanks
2 More Replies
- 1824 Views
- 3 replies
- 1 kudos
I have observed a very strange behavior with some of our integration pipelines. This week one of the csv files was getting broken when read with the read function given below: def ReadCSV(files, schema_struct, header, delimiter, timestampformat, encode="utf8...
Latest Reply
Hi @nafri A , what is the error you are getting? Can you share it, please? As @Hubert Dudek mentioned, both will call the same APIs.
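To surface what is actually wrong with the broken file, a variant of the reader that keeps malformed rows in a _corrupt_record column may help. This is a sketch that mirrors the ReadCSV signature, not the original helper:

```python
def read_csv_with_diagnostics(spark, files, schema_struct, delimiter=","):
    from pyspark.sql.types import StructField, StringType

    # PERMISSIVE mode keeps malformed rows and exposes the raw line in
    # _corrupt_record (which must exist in the schema) for inspection.
    schema = schema_struct.add(StructField("_corrupt_record", StringType(), True))
    return (spark.read
            .option("header", "true")
            .option("sep", delimiter)
            .option("mode", "PERMISSIVE")
            .option("columnNameOfCorruptRecord", "_corrupt_record")
            .schema(schema)
            .csv(files))
```

Filtering on `_corrupt_record IS NOT NULL` then shows exactly which rows failed to parse and why.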
2 More Replies
- 1613 Views
- 1 reply
- 1 kudos
I'm using:
frame = spark.read.csv(path=bucket+folder,
inferSchema = True,
header = True,
multiLine=True
)
to read in a series of CSV ...
Latest Reply
Files saved on the Windows operating system contain a carriage return and line feed at the end of every line. Please add the following option; it can help: .option("ignoreTrailingWhiteSpace", true)
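Putting that suggestion together with the original reader, a sketch:

```python
def read_windows_csv(spark, path):
    # CRLF line endings leave a trailing '\r' on the last column of
    # every row; ignoreTrailingWhiteSpace strips it while parsing.
    return spark.read.csv(path,
                          inferSchema=True,
                          header=True,
                          multiLine=True,
                          ignoreTrailingWhiteSpace=True)
```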
- 653 Views
- 1 reply
- 0 kudos
In the sense that, is it possible to only check for column names or column data types or will it always be both?
Latest Reply
No, I do not believe that is possible. However, I would be interested in understanding a use case where that is ideal behavior. How Does Schema Enforcement Work? Delta Lake uses schema validation on write, which means that all new writes to a table ar...
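For illustration, a hedged sketch of the write paths involved: a plain append is validated against both column names and types, while mergeSchema relaxes the check for additive changes only.

```python
def append_rows(df, target_path, allow_new_columns=False):
    # Without mergeSchema, Delta rejects writes whose column names or
    # types do not match the target table; with it, *new* columns are
    # added to the table, but type conflicts on existing columns still fail.
    writer = df.write.format("delta").mode("append")
    if allow_new_columns:
        writer = writer.option("mergeSchema", "true")
    writer.save(target_path)
```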
by
Jasam
• New Contributor
- 7987 Views
- 3 replies
- 0 kudos
I am using the spark-csv utility, but when it infers the schema I need all columns to be treated as string columns by default.
Thanks in advance.
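A minimal sketch: leaving inferSchema at its default (false) makes the CSV reader return every column as a string, which suits exploring files that will not parse cleanly.

```python
def read_csv_all_strings(spark, path):
    # With inferSchema disabled (the default), every column comes back
    # as StringType, so no row is rejected on type grounds.
    return spark.read.csv(path, header=True, inferSchema=False)
```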
Latest Reply
@peyman what if I don't want to manually specify the schema?
For example, I have a vendor that can't build a valid .csv file. I just need to import it somewhere so I can explore the data and find the errors.
Just like the original author's question?...
2 More Replies