Data Engineering

Forum Posts

by sage5616 • Valued Contributor

10-05-2022 12:45:31 PM

13220 Views
11 replies
10 kudos

Error in SQL statement: AnalysisException: Cannot up cast documents from array

Hi Everyone, I am getting the following error when running a SQL query and do not understand what it means or what can be done to resolve it. Any recommendations? View DDL: CREATE VIEW myschema.table ( accountId, agreementType, capture_file_name, ...

Latest Reply

Anonymous
Not applicable

11-13-2022 11:04:55 PM

10 kudos

Hi @Michael Okulik, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Tha...

by sage5616 • Valued Contributor

08-03-2022 3:06:05 PM

24100 Views
3 replies
2 kudos

Resolved! Choosing the optimal cluster size/specs.

Hello everyone, I am trying to determine the appropriate cluster specifications/sizing for my workload: Run a PySpark task to transform a batch of input avro files to parquet files and create or re-create persistent views on these parquet files. This t...

Latest Reply

Anonymous
Not applicable

08-07-2022 1:25:11 PM

2 kudos

If the data is 100MB, then I'd try a single node cluster, which will be the smallest and least expensive. You'll have more than enough memory to store it all. You can automate this and use a jobs cluster.
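
For anyone who wants to script this, here is a rough sketch of what a single-node jobs cluster spec might look like when built in Python. The `single_node_job_cluster` helper and the placeholder runtime and node type values are hypothetical, and the `spark_conf` and `custom_tags` entries reflect my understanding of how the Databricks Clusters API marks a cluster as single node, so please check them against the current documentation before relying on them.

```python
import json


def single_node_job_cluster(spark_version: str, node_type_id: str) -> dict:
    """Build a `new_cluster` style spec for a single-node jobs cluster (sketch)."""
    return {
        "spark_version": spark_version,  # placeholder, pick a real runtime version
        "node_type_id": node_type_id,    # placeholder, pick a real node type
        "num_workers": 0,                # driver only, no worker nodes
        # Assumed single-node settings; verify against the Clusters API docs.
        "spark_conf": {
            "spark.databricks.cluster.profile": "singleNode",
            "spark.master": "local[*]",
        },
        "custom_tags": {"ResourceClass": "SingleNode"},
    }


spec = single_node_job_cluster("<runtime-version>", "<node-type>")
print(json.dumps(spec, indent=2))
```

The resulting dictionary is meant to be dropped into the cluster section of a job definition, so the cluster exists only for the duration of the run.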

by sage5616 • Valued Contributor

07-08-2022 8:39:55 AM

6755 Views
3 replies
4 kudos

Resolved! Spark persistent view on a partition parquet file

In Spark, is it possible to create a persistent view on a partitioned parquet file in Azure BLOB? The view must be available after the cluster is restarted, without having to re-create that view, hence it cannot be a temp view. I can create a temp view, b...

Latest Reply

sage5616
Valued Contributor

07-08-2022 10:06:20 AM

4 kudos

Here is what worked for me. Hope this helps someone else: https://stackoverflow.com/questions/72913913/spark-persistent-view-on-a-partition-parquet-file/72914245#72914245

CREATE VIEW test as select * from parquet.`/mnt/folder-with-parquet-file(s)/`

@Hu...
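
If the view needs to be created from a PySpark task rather than a SQL cell, the same statement can be assembled in Python. This is only a minimal sketch: `build_parquet_view_sql` is a hypothetical helper, not part of Spark, and the final `spark.sql(...)` call is shown as a comment because it assumes an active SparkSession named `spark` on the cluster.

```python
def build_parquet_view_sql(view_name: str, parquet_path: str) -> str:
    """Return a CREATE VIEW statement that selects from a parquet folder."""
    # The path is wrapped in backticks (parquet.`<path>`), so a path that
    # itself contains a backtick cannot be embedded safely.
    if "`" in parquet_path:
        raise ValueError("parquet_path must not contain a backtick")
    return f"CREATE VIEW {view_name} AS SELECT * FROM parquet.`{parquet_path}`"


stmt = build_parquet_view_sql("test", "/mnt/folder-with-parquet-file(s)/")
print(stmt)

# On a cluster with an active SparkSession named `spark` (an assumption here),
# the statement could then be run with:
# spark.sql(stmt)
```

Because this is a plain CREATE VIEW rather than a temp view, the definition is stored in the metastore, which is what lets it survive a cluster restart.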
