<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Hi All, I am trying to read a csv file from datalake and loading data into sql table using Copyinto. am facing an issue   Here i created one table wit... in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/hi-all-i-am-trying-to-read-a-csv-file-from-datalake-and-loading/m-p/11268#M6274</link>
    <description>&lt;P&gt;Thanks, Werners, for your reply.&lt;/P&gt;&lt;P&gt;How do I pass a schema (column names and types) to the CSV file?&lt;/P&gt;</description>
    <pubDate>Thu, 11 Nov 2021 05:58:41 GMT</pubDate>
    <dc:creator>dataEngineer3</dc:creator>
    <dc:date>2021-11-11T05:58:41Z</dc:date>
    <item>
      <title>Hi All, I am trying to read a csv file from datalake and loading data into sql table using Copyinto. am facing an issue   Here i created one table wit...</title>
      <link>https://community.databricks.com/t5/data-engineering/hi-all-i-am-trying-to-read-a-csv-file-from-datalake-and-loading/m-p/11263#M6269</link>
      <description>&lt;P&gt;Hi all,&lt;/P&gt;&lt;P&gt;I am trying to read a CSV file from the data lake and load the data into a SQL table using COPY INTO, but I am facing an issue:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="image"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/2344iF3A00B624032D39A/image-size/large?v=v2&amp;amp;px=999" role="button" title="image" alt="image" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;I created a table with 6 columns, the same as the data in the CSV file, but I am unable to load the data.&lt;/P&gt;&lt;P&gt;Can anyone help me with this?&lt;/P&gt;</description>
      <pubDate>Tue, 09 Nov 2021 12:55:21 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/hi-all-i-am-trying-to-read-a-csv-file-from-datalake-and-loading/m-p/11263#M6269</guid>
      <dc:creator>dataEngineer3</dc:creator>
      <dc:date>2021-11-09T12:55:21Z</dc:date>
    </item>
    <item>
      <title>Re: Hi All, I am trying to read a csv file from datalake and loading data into sql table using Copyinto. am facing an issue   Here i created one table wit...</title>
      <link>https://community.databricks.com/t5/data-engineering/hi-all-i-am-trying-to-read-a-csv-file-from-datalake-and-loading/m-p/11265#M6271</link>
      <description>&lt;P&gt;One option is to delete the underlying Delta files; another is to set the mergeSchema option to true when you write the Delta table.&lt;/P&gt;</description>
      <pubDate>Tue, 09 Nov 2021 13:57:48 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/hi-all-i-am-trying-to-read-a-csv-file-from-datalake-and-loading/m-p/11265#M6271</guid>
      <dc:creator>Sebastian</dc:creator>
      <dc:date>2021-11-09T13:57:48Z</dc:date>
    </item>
    <item>
      <title>Re: Hi All, I am trying to read a csv file from datalake and loading data into sql table using Copyinto. am facing an issue   Here i created one table wit...</title>
      <link>https://community.databricks.com/t5/data-engineering/hi-all-i-am-trying-to-read-a-csv-file-from-datalake-and-loading/m-p/11266#M6272</link>
      <description>&lt;P&gt;Thanks for the quick response. I added the mergeSchema option set to true, but I am still unable to load the data.&lt;/P&gt;</description>
      <pubDate>Tue, 09 Nov 2021 14:01:52 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/hi-all-i-am-trying-to-read-a-csv-file-from-datalake-and-loading/m-p/11266#M6272</guid>
      <dc:creator>dataEngineer3</dc:creator>
      <dc:date>2021-11-09T14:01:52Z</dc:date>
    </item>
    <item>
      <title>Re: Hi All, I am trying to read a csv file from datalake and loading data into sql table using Copyinto. am facing an issue   Here i created one table wit...</title>
      <link>https://community.databricks.com/t5/data-engineering/hi-all-i-am-trying-to-read-a-csv-file-from-datalake-and-loading/m-p/11267#M6273</link>
      <description>&lt;P&gt;It looks like mergeSchema is not active.&lt;/P&gt;&lt;P&gt;You can try enabling it for the session with spark.databricks.delta.schema.autoMerge.enabled. That setting applies to the whole SparkSession.&lt;/P&gt;&lt;P&gt;A better solution is to pass a schema for your CSV file (column names and types).&lt;/P&gt;&lt;P&gt;mergeSchema is pretty interesting, but it does not always work or give the results you want.&lt;/P&gt;</description>
      <pubDate>Wed, 10 Nov 2021 09:37:06 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/hi-all-i-am-trying-to-read-a-csv-file-from-datalake-and-loading/m-p/11267#M6273</guid>
      <dc:creator>-werners-</dc:creator>
      <dc:date>2021-11-10T09:37:06Z</dc:date>
    </item>
    <item>
      <title>Re: Hi All, I am trying to read a csv file from datalake and loading data into sql table using Copyinto. am facing an issue   Here i created one table wit...</title>
      <link>https://community.databricks.com/t5/data-engineering/hi-all-i-am-trying-to-read-a-csv-file-from-datalake-and-loading/m-p/11268#M6274</link>
      <description>&lt;P&gt;Thanks, Werners, for your reply.&lt;/P&gt;&lt;P&gt;How do I pass a schema (column names and types) to the CSV file?&lt;/P&gt;</description>
      <pubDate>Thu, 11 Nov 2021 05:58:41 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/hi-all-i-am-trying-to-read-a-csv-file-from-datalake-and-loading/m-p/11268#M6274</guid>
      <dc:creator>dataEngineer3</dc:creator>
      <dc:date>2021-11-11T05:58:41Z</dc:date>
    </item>
    <item>
      <title>Re: Hi All, I am trying to read a csv file from datalake and loading data into sql table using Copyinto. am facing an issue   Here i created one table wit...</title>
      <link>https://community.databricks.com/t5/data-engineering/hi-all-i-am-trying-to-read-a-csv-file-from-datalake-and-loading/m-p/11269#M6275</link>
      <description>&lt;P&gt;That can be done directly in SQL (with the COPY INTO command) or by using DataFrames (the classic way).&lt;/P&gt;&lt;P&gt;As you started out with SQL:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;-- The example below loads CSV files without headers on ADLS Gen2 using COPY INTO.
-- By casting the data and renaming the columns, you can put the data in the schema you want
COPY INTO delta.`abfss://container@storageAccount.dfs.core.windows.net/deltaTables/target`
FROM (
  SELECT _c0::bigint key, _c1::int index, _c2 textData
  FROM 'abfss://container@storageAccount.dfs.core.windows.net/base/path'
)
FILEFORMAT = CSV
PATTERN = 'folder1/file_[a-g].csv'&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;(Snippet from the &lt;A href="https://docs.microsoft.com/en-us/azure/databricks/spark/latest/spark-sql/language-manual/delta-copy-into#load-csv-files" alt="https://docs.microsoft.com/en-us/azure/databricks/spark/latest/spark-sql/language-manual/delta-copy-into#load-csv-files" target="_blank"&gt;Azure Docs&lt;/A&gt;)&lt;/P&gt;&lt;P&gt;There is also an 'inferSchema' option, which tries to determine the schema itself by reading the data twice. The quality of the result varies, though: it might not give you the types you expected (DoubleType instead of DecimalType, etc.), and it is slower because the data is read twice.&lt;/P&gt;&lt;P&gt;Using DataFrames is similar, but you read the CSV using PySpark/Scala with a manually defined schema or inferSchema.&lt;/P&gt;&lt;P&gt;&lt;A href="https://sparkbyexamples.com/spark/spark-read-csv-file-into-dataframe/" alt="https://sparkbyexamples.com/spark/spark-read-csv-file-into-dataframe/" target="_blank"&gt;(https://sparkbyexamples.com/spark/spark-read-csv-file-into-dataframe/)&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 11 Nov 2021 07:47:00 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/hi-all-i-am-trying-to-read-a-csv-file-from-datalake-and-loading/m-p/11269#M6275</guid>
      <dc:creator>-werners-</dc:creator>
      <dc:date>2021-11-11T07:47:00Z</dc:date>
    </item>
    <item>
      <title>Re: Hi All, I am trying to read a csv file from datalake and loading data into sql table using Copyinto. am facing an issue   Here i created one table wit...</title>
      <link>https://community.databricks.com/t5/data-engineering/hi-all-i-am-trying-to-read-a-csv-file-from-datalake-and-loading/m-p/11270#M6276</link>
      <description>&lt;P&gt;If it is a full load and you don't have much time to research the options, the next thing you can try is to go to the underlying storage, delete the physical Delta table, and do a full reload.&lt;/P&gt;</description>
      <pubDate>Thu, 11 Nov 2021 12:10:12 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/hi-all-i-am-trying-to-read-a-csv-file-from-datalake-and-loading/m-p/11270#M6276</guid>
      <dc:creator>Sebastian</dc:creator>
      <dc:date>2021-11-11T12:10:12Z</dc:date>
    </item>
    <item>
      <title>Re: Hi All, I am trying to read a csv file from datalake and loading data into sql table using Copyinto. am facing an issue   Here i created one table wit...</title>
      <link>https://community.databricks.com/t5/data-engineering/hi-all-i-am-trying-to-read-a-csv-file-from-datalake-and-loading/m-p/11271#M6277</link>
      <description>&lt;P&gt;Thanks, werners, for your quick response.&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;I have a table with 6 columns, but how do I pass these 6 column names in the SELECT command: SELECT _c0::bigint key, _c1::int index, _c2 textData?&lt;/LI&gt;&lt;LI&gt;Do you mean _c0, _c1 are the column names?&lt;/LI&gt;&lt;/OL&gt;</description>
      <pubDate>Thu, 11 Nov 2021 12:55:04 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/hi-all-i-am-trying-to-read-a-csv-file-from-datalake-and-loading/m-p/11271#M6277</guid>
      <dc:creator>dataEngineer3</dc:creator>
      <dc:date>2021-11-11T12:55:04Z</dc:date>
    </item>
    <item>
      <title>Re: Hi All, I am trying to read a csv file from datalake and loading data into sql table using Copyinto. am facing an issue   Here i created one table wit...</title>
      <link>https://community.databricks.com/t5/data-engineering/hi-all-i-am-trying-to-read-a-csv-file-from-datalake-and-loading/m-p/11272#M6278</link>
      <description>&lt;P&gt;_c0 etc. are indeed column names. If you read a CSV and do not define column names (or you read the file without a header), those are the default names Spark assigns. For a six-column file, they run from _c0 to _c5.&lt;/P&gt;</description>
      <pubDate>Sun, 14 Nov 2021 06:54:24 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/hi-all-i-am-trying-to-read-a-csv-file-from-datalake-and-loading/m-p/11272#M6278</guid>
      <dc:creator>-werners-</dc:creator>
      <dc:date>2021-11-14T06:54:24Z</dc:date>
    </item>
  </channel>
</rss>

