<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic data_centers_q2_q3.snappy in Warehousing &amp; Analytics</title>
    <link>https://community.databricks.com/t5/warehousing-analytics/data-centers-q2-q3-snappy/m-p/24993#M641</link>
    <description>&lt;P&gt;I am writing to inquire about the following error:&lt;/P&gt;
&lt;P&gt;com.databricks.backend.common.rpc.DatabricksExceptions$SQLExecutionException: org.apache.spark.sql.AnalysisException: Unable to infer schema for Parquet. It must be specified manually.&lt;/P&gt;
&lt;P&gt;at org.apache.spark.sql.execution.datasources.DataSource.$anonfun$getOrInferFileFormatSchema$13(DataSource.scala:234)&lt;/P&gt;
&lt;P&gt;at scala.Option.getOrElse(Option.scala:189)&lt;/P&gt;
&lt;P&gt;at org.apache.spark.sql.execution.datasources.DataSource.getOrInferFileFormatSchema(DataSource.scala:234)&lt;/P&gt;
&lt;P&gt;at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:468)&lt;/P&gt;
&lt;P&gt;at org.apache.spark.sql.execution.command.CreateDataSourceTableCommand.run(createDataSourceTables.scala:89)&lt;/P&gt;
&lt;P&gt;at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:75)&lt;/P&gt;
&lt;P&gt;at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:73)&lt;/P&gt;
&lt;P&gt;at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:84)&lt;/P&gt;
&lt;P&gt;at org.apache.spark.sql.Dataset.$anonfun$logicalPlan$1(Dataset.scala:235)&lt;/P&gt;
&lt;P&gt;at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3825)&lt;/P&gt;
&lt;P&gt;at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withCustomExecutionEnv$5(SQLExecution.scala:130)&lt;/P&gt;
&lt;P&gt;at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:273)&lt;/P&gt;
&lt;P&gt;at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withCustomExecutionEnv$1(SQLExecution.scala:104)&lt;/P&gt;
&lt;P&gt;at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:854)&lt;/P&gt;
&lt;P&gt;at org.apache.spark.sql.execution.SQLExecution$.withCustomExecutionEnv(SQLExecution.scala:77)&lt;/P&gt;
&lt;P&gt;at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:223)&lt;/P&gt;
&lt;P&gt;at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3823)&lt;/P&gt;
&lt;P&gt;at org.apache.spark.sql.Dataset.&amp;lt;init&amp;gt;(Dataset.scala:235)&lt;/P&gt;
&lt;P&gt;at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:104)&lt;/P&gt;
&lt;P&gt;at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:854)&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;It is generated by the following code (copied directly from the Databricks Academy SQL training):&lt;/P&gt;
&lt;P&gt;DROP TABLE IF EXISTS dc_data_raw;&lt;/P&gt;
&lt;P&gt;CREATE TABLE dc_data_raw&lt;/P&gt;
&lt;P&gt;USING parquet&lt;/P&gt;
&lt;P&gt;OPTIONS (&lt;/P&gt;
&lt;P&gt;&amp;nbsp;PATH "/FileStore/Tables/data_centers_q2_q3.snappy.parquet"&lt;/P&gt;
&lt;P&gt;);&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I have investigated this on my own but have found nothing that solves it. I have attached the downloaded copy of the Parquet file provided with the course.&lt;/P&gt;</description>
    <pubDate>Fri, 21 Mar 2025 11:42:25 GMT</pubDate>
    <dc:creator>William</dc:creator>
    <dc:date>2025-03-21T11:42:25Z</dc:date>
    <item>
      <title>data_centers_q2_q3.snappy</title>
      <link>https://community.databricks.com/t5/warehousing-analytics/data-centers-q2-q3-snappy/m-p/24993#M641</link>
      <description>&lt;P&gt;I am writing to inquire about the following error:&lt;/P&gt;
&lt;P&gt;com.databricks.backend.common.rpc.DatabricksExceptions$SQLExecutionException: org.apache.spark.sql.AnalysisException: Unable to infer schema for Parquet. It must be specified manually.&lt;/P&gt;
&lt;P&gt;at org.apache.spark.sql.execution.datasources.DataSource.$anonfun$getOrInferFileFormatSchema$13(DataSource.scala:234)&lt;/P&gt;
&lt;P&gt;at scala.Option.getOrElse(Option.scala:189)&lt;/P&gt;
&lt;P&gt;at org.apache.spark.sql.execution.datasources.DataSource.getOrInferFileFormatSchema(DataSource.scala:234)&lt;/P&gt;
&lt;P&gt;at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:468)&lt;/P&gt;
&lt;P&gt;at org.apache.spark.sql.execution.command.CreateDataSourceTableCommand.run(createDataSourceTables.scala:89)&lt;/P&gt;
&lt;P&gt;at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:75)&lt;/P&gt;
&lt;P&gt;at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:73)&lt;/P&gt;
&lt;P&gt;at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:84)&lt;/P&gt;
&lt;P&gt;at org.apache.spark.sql.Dataset.$anonfun$logicalPlan$1(Dataset.scala:235)&lt;/P&gt;
&lt;P&gt;at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3825)&lt;/P&gt;
&lt;P&gt;at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withCustomExecutionEnv$5(SQLExecution.scala:130)&lt;/P&gt;
&lt;P&gt;at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:273)&lt;/P&gt;
&lt;P&gt;at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withCustomExecutionEnv$1(SQLExecution.scala:104)&lt;/P&gt;
&lt;P&gt;at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:854)&lt;/P&gt;
&lt;P&gt;at org.apache.spark.sql.execution.SQLExecution$.withCustomExecutionEnv(SQLExecution.scala:77)&lt;/P&gt;
&lt;P&gt;at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:223)&lt;/P&gt;
&lt;P&gt;at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3823)&lt;/P&gt;
&lt;P&gt;at org.apache.spark.sql.Dataset.&amp;lt;init&amp;gt;(Dataset.scala:235)&lt;/P&gt;
&lt;P&gt;at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:104)&lt;/P&gt;
&lt;P&gt;at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:854)&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;It is generated by the following code (copied directly from the Databricks Academy SQL training):&lt;/P&gt;
&lt;P&gt;DROP TABLE IF EXISTS dc_data_raw;&lt;/P&gt;
&lt;P&gt;CREATE TABLE dc_data_raw&lt;/P&gt;
&lt;P&gt;USING parquet&lt;/P&gt;
&lt;P&gt;OPTIONS (&lt;/P&gt;
&lt;P&gt;&amp;nbsp;PATH "/FileStore/Tables/data_centers_q2_q3.snappy.parquet"&lt;/P&gt;
&lt;P&gt;);&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I have investigated this on my own but have found nothing that solves it. I have attached the downloaded copy of the Parquet file provided with the course.&lt;/P&gt;</description>
      <pubDate>Fri, 21 Mar 2025 11:42:25 GMT</pubDate>
      <guid>https://community.databricks.com/t5/warehousing-analytics/data-centers-q2-q3-snappy/m-p/24993#M641</guid>
      <dc:creator>William</dc:creator>
      <dc:date>2025-03-21T11:42:25Z</dc:date>
    </item>
  </channel>
</rss>

