<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How do I read the contents of a hidden file in a Spark job? in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/how-do-i-read-the-contents-of-a-hidden-file-in-a-spark-job/m-p/28034#M19872</link>
    <description>&lt;P&gt;Hi @Jose Gonzalez, none of these answers helped me, unfortunately. I'm still hoping to find a good solution to this issue.&lt;/P&gt;</description>
    <pubDate>Tue, 12 Apr 2022 18:49:18 GMT</pubDate>
    <dc:creator>Lincoln_Bergeso</dc:creator>
    <dc:date>2022-04-12T18:49:18Z</dc:date>
    <item>
      <title>How do I read the contents of a hidden file in a Spark job?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-do-i-read-the-contents-of-a-hidden-file-in-a-spark-job/m-p/28026#M19864</link>
      <description>&lt;P&gt;I'm trying to read a file from a Google Cloud Storage bucket. The filename starts with a period, so Spark assumes the file is hidden and won't let me read it.&lt;/P&gt;&lt;P&gt;My code is similar to this:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df = spark.read.format("text").load("gs://&amp;lt;bucket&amp;gt;/.myfile", wholetext=True)
df.show()&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;The resulting DataFrame is empty (as in, it has no rows).&lt;/P&gt;&lt;P&gt;When I run this on my laptop, I get the following warning:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;22/02/15 16:40:58 WARN DataSource: All paths were ignored:
  gs://&amp;lt;bucket&amp;gt;/.myfile&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;I've noticed that this applies to files starting with an underscore as well.&lt;/P&gt;&lt;P&gt;How can I get around this?&lt;/P&gt;</description>
      <pubDate>Tue, 15 Feb 2022 23:24:33 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-do-i-read-the-contents-of-a-hidden-file-in-a-spark-job/m-p/28026#M19864</guid>
      <dc:creator>Lincoln_Bergeso</dc:creator>
      <dc:date>2022-02-15T23:24:33Z</dc:date>
    </item>
    <item>
      <title>Re: How do I read the contents of a hidden file in a Spark job?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-do-i-read-the-contents-of-a-hidden-file-in-a-spark-job/m-p/28027#M19865</link>
      <description>&lt;P&gt;Spark uses the Hadoop Input API to read files, which ignores every file whose name starts with an underscore or a period.&lt;/P&gt;&lt;P&gt;I did not find a solution for this, because the hiddenFileFilter is always active.&lt;/P&gt;</description>
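The filtering rule described in the reply above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this thread (names starting with an underscore or a period are skipped). The function name `is_skipped_as_hidden` and the bucket name `my-bucket` are hypothetical, and Spark's own file listing may apply additional special cases that are not modeled here.

```python
import posixpath


def is_skipped_as_hidden(path: str) -> bool:
    """Return True if the final path component starts with '_' or '.'.

    Hypothetical helper: it mirrors the rule described in this thread,
    not the actual Hadoop or Spark source code.
    """
    # Drop a trailing slash so "dir/_tmp/" is judged by "_tmp".
    name = posixpath.basename(path.rstrip("/"))
    return name.startswith(("_", "."))


if __name__ == "__main__":
    # "my-bucket" is a placeholder bucket name used only for illustration.
    for candidate in (
        "gs://my-bucket/.myfile",
        "gs://my-bucket/_SUCCESS",
        "gs://my-bucket/myfile",
    ):
        print(candidate, is_skipped_as_hidden(candidate))
```

A check like this can be run over a list of input paths before a job starts, to flag files that would likely be ignored without an error.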
      <pubDate>Wed, 16 Feb 2022 16:00:56 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-do-i-read-the-contents-of-a-hidden-file-in-a-spark-job/m-p/28027#M19865</guid>
      <dc:creator>-werners-</dc:creator>
      <dc:date>2022-02-16T16:00:56Z</dc:date>
    </item>
    <item>
      <title>Re: How do I read the contents of a hidden file in a Spark job?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-do-i-read-the-contents-of-a-hidden-file-in-a-spark-job/m-p/28028#M19866</link>
      <description>&lt;P&gt;Hi there, @Lincoln Bergeson! My name is Piper, and I'm a moderator for Databricks. Thank you for your question and welcome to the community. We'll give your peers a chance to respond and then we'll circle back if we need to.&lt;/P&gt;&lt;P&gt;Thanks in advance for your patience. &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 16 Feb 2022 16:18:53 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-do-i-read-the-contents-of-a-hidden-file-in-a-spark-job/m-p/28028#M19866</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2022-02-16T16:18:53Z</dc:date>
    </item>
    <item>
      <title>Re: How do I read the contents of a hidden file in a Spark job?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-do-i-read-the-contents-of-a-hidden-file-in-a-spark-job/m-p/28029#M19867</link>
      <description>&lt;P&gt;Is there any way to work around this?&lt;/P&gt;</description>
      <pubDate>Wed, 16 Feb 2022 17:13:06 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-do-i-read-the-contents-of-a-hidden-file-in-a-spark-job/m-p/28029#M19867</guid>
      <dc:creator>Lincoln_Bergeso</dc:creator>
      <dc:date>2022-02-16T17:13:06Z</dc:date>
    </item>
    <item>
      <title>Re: How do I read the contents of a hidden file in a Spark job?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-do-i-read-the-contents-of-a-hidden-file-in-a-spark-job/m-p/28030#M19868</link>
      <description>&lt;P&gt;Looking forward to the answers. From my research, this looks like something that needs a special configuration or workaround, which I'm hoping Databricks can provide.&lt;/P&gt;</description>
      <pubDate>Wed, 16 Feb 2022 17:13:40 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-do-i-read-the-contents-of-a-hidden-file-in-a-spark-job/m-p/28030#M19868</guid>
      <dc:creator>Lincoln_Bergeso</dc:creator>
      <dc:date>2022-02-16T17:13:40Z</dc:date>
    </item>
    <item>
      <title>Re: How do I read the contents of a hidden file in a Spark job?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-do-i-read-the-contents-of-a-hidden-file-in-a-spark-job/m-p/28032#M19870</link>
      <description>&lt;P&gt;@Lincoln Bergeson GCS object names are very liberal. Only \r and \n are invalid; everything else is valid, including the NUL character. I am not sure whether &lt;A href="https://stackoverflow.com/questions/19830264/which-files-are-ignored-as-input-by-mapper" alt="https://stackoverflow.com/questions/19830264/which-files-are-ignored-as-input-by-mapper" target="_blank"&gt;this&lt;/A&gt; will help you. This really does need to be worked around on the Spark side!&lt;/P&gt;</description>
      <pubDate>Wed, 16 Mar 2022 04:34:10 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-do-i-read-the-contents-of-a-hidden-file-in-a-spark-job/m-p/28032#M19870</guid>
      <dc:creator>Atanu</dc:creator>
      <dc:date>2022-03-16T04:34:10Z</dc:date>
    </item>
    <item>
      <title>Re: How do I read the contents of a hidden file in a Spark job?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-do-i-read-the-contents-of-a-hidden-file-in-a-spark-job/m-p/28033#M19871</link>
      <description>&lt;P&gt;Hi @Lincoln Bergeson,&lt;/P&gt;&lt;P&gt;Just a friendly follow-up. Did any of the previous responses help you to resolve your issue? Please let us know if you still need help.&lt;/P&gt;</description>
      <pubDate>Mon, 11 Apr 2022 18:51:20 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-do-i-read-the-contents-of-a-hidden-file-in-a-spark-job/m-p/28033#M19871</guid>
      <dc:creator>jose_gonzalez</dc:creator>
      <dc:date>2022-04-11T18:51:20Z</dc:date>
    </item>
    <item>
      <title>Re: How do I read the contents of a hidden file in a Spark job?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-do-i-read-the-contents-of-a-hidden-file-in-a-spark-job/m-p/28034#M19872</link>
      <description>&lt;P&gt;Hi @Jose Gonzalez, none of these answers helped me, unfortunately. I'm still hoping to find a good solution to this issue.&lt;/P&gt;</description>
      <pubDate>Tue, 12 Apr 2022 18:49:18 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-do-i-read-the-contents-of-a-hidden-file-in-a-spark-job/m-p/28034#M19872</guid>
      <dc:creator>Lincoln_Bergeso</dc:creator>
      <dc:date>2022-04-12T18:49:18Z</dc:date>
    </item>
    <item>
      <title>Re: How do I read the contents of a hidden file in a Spark job?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-do-i-read-the-contents-of-a-hidden-file-in-a-spark-job/m-p/28035#M19873</link>
      <description>&lt;P&gt;I don't think there is an easy way to do this. Getting around these constraints would also break very basic functionality, such as being able to read Delta tables. I suggest running a rename job first and then reading the renamed file.&lt;/P&gt;</description>
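The rename-then-read suggestion above could look roughly like the following sketch. It assumes the file sits on a local filesystem so that it can run with the standard library alone. The helper name `copy_to_visible_name` is hypothetical, and the Spark call appears only as a comment that reuses the reader from the original question.

```python
import shutil
import tempfile
from pathlib import Path


def copy_to_visible_name(src: Path, dst_dir: Path) -> Path:
    """Copy src into dst_dir under a name without leading '.' or '_'.

    Hypothetical helper for illustration; it works on a local filesystem only.
    """
    # Strip every leading period or underscore from the file name.
    visible_name = src.name.lstrip("._")
    if not visible_name:
        raise ValueError(f"cannot derive a visible name from {src.name!r}")
    dst_dir.mkdir(parents=True, exist_ok=True)
    dst = dst_dir / visible_name
    shutil.copyfile(src, dst)
    return dst


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        hidden = Path(tmp) / ".myfile"
        hidden.write_text("hello from a hidden file\n")
        visible = copy_to_visible_name(hidden, Path(tmp) / "visible")
        print(visible.name)
        # The renamed copy could then be passed to the reader from the question:
        # spark.read.format("text").load(str(visible), wholetext=True)
```

For an object in a bucket, the copy itself would be done with a storage client or command-line tool rather than `shutil`; that step is left out here because the thread does not specify one.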
      <pubDate>Wed, 04 May 2022 16:19:30 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-do-i-read-the-contents-of-a-hidden-file-in-a-spark-job/m-p/28035#M19873</guid>
      <dc:creator>Dan_Z</dc:creator>
      <dc:date>2022-05-04T16:19:30Z</dc:date>
    </item>
  </channel>
</rss>

