<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Can we read an Excel file with many sheets by their indexes? in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/can-we-read-an-excel-file-with-many-sheets-with-there-indexes/m-p/34365#M25120</link>
    <description>&lt;P&gt;I am trying to read an Excel file that has 3 sheets with integers as their names:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;sheet 1 name = 21&lt;/P&gt;&lt;P&gt;sheet 2 name = 24&lt;/P&gt;&lt;P&gt;sheet 3 name = 224&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I got this data from a user, so I can't change the sheet names, but reading these sheets with Spark is an issue.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;code -&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;val sheetName = "provided by user"&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;val df = spark.read&lt;/P&gt;&lt;P&gt;      .format("com.crealytics.spark.excel")&lt;/P&gt;&lt;P&gt;      .option("header", "true")&lt;/P&gt;&lt;P&gt;      .option("inferSchema", "false")&lt;/P&gt;&lt;P&gt;      .option("dataAddress", sheetName)&lt;/P&gt;&lt;P&gt;      .load("/home/sarveshks/data/xl.xlsx")&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;df.show(5)&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;stack -&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Exception in thread "main" java.lang.IllegalStateException: Cannot get a STRING value from a NUMERIC cell&lt;/P&gt;&lt;P&gt;    at shadeio.poi.xssf.usermodel.XSSFCell.typeMismatch(XSSFCell.java:1035)&lt;/P&gt;&lt;P&gt;    at shadeio.poi.xssf.usermodel.XSSFCell.getRichStringCellValue(XSSFCell.java:390)&lt;/P&gt;&lt;P&gt;    at shadeio.poi.xssf.usermodel.XSSFCell.getStringCellValue(XSSFCell.java:342)&lt;/P&gt;&lt;P&gt;    at com.crealytics.spark.excel.ExcelRelation.colName$1(ExcelRelation.scala:125)&lt;/P&gt;&lt;P&gt;    at com.crealytics.spark.excel.ExcelRelation.$anonfun$headerColumns$11(ExcelRelation.scala:128)&lt;/P&gt;&lt;P&gt;    at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:285)&lt;/P&gt;&lt;P&gt;    at scala.collection.Iterator.foreach(Iterator.scala:943)&lt;/P&gt;&lt;P&gt;    at scala.collection.Iterator.foreach$(Iterator.scala:943)&lt;/P&gt;&lt;P&gt;    at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)&lt;/P&gt;
&lt;P&gt;    at scala.collection.IterableLike.foreach(IterableLike.scala:74)&lt;/P&gt;&lt;P&gt;    at scala.collection.IterableLike.foreach$(IterableLike.scala:73)&lt;/P&gt;&lt;P&gt;    at scala.collection.AbstractIterable.foreach(Iterable.scala:56)&lt;/P&gt;&lt;P&gt;    at scala.collection.TraversableLike.map(TraversableLike.scala:285)&lt;/P&gt;&lt;P&gt;    at scala.collection.TraversableLike.map$(TraversableLike.scala:278)&lt;/P&gt;&lt;P&gt;    at scala.collection.AbstractTraversable.map(Traversable.scala:108)&lt;/P&gt;&lt;P&gt;    at com.crealytics.spark.excel.ExcelRelation.$anonfun$headerColumns$1(ExcelRelation.scala:128)&lt;/P&gt;&lt;P&gt;    at scala.Option.getOrElse(Option.scala:189)&lt;/P&gt;&lt;P&gt;    at com.crealytics.spark.excel.ExcelRelation.headerColumns$lzycompute(ExcelRelation.scala:107)&lt;/P&gt;&lt;P&gt;    at com.crealytics.spark.excel.ExcelRelation.headerColumns(ExcelRelation.scala:103)&lt;/P&gt;&lt;P&gt;    at com.crealytics.spark.excel.ExcelRelation.$anonfun$inferSchema$1(ExcelRelation.scala:172)&lt;/P&gt;&lt;P&gt;    at scala.Option.getOrElse(Option.scala:189)&lt;/P&gt;&lt;P&gt;    at com.crealytics.spark.excel.ExcelRelation.inferSchema(ExcelRelation.scala:171)&lt;/P&gt;&lt;P&gt;    at com.crealytics.spark.excel.ExcelRelation.&amp;lt;init&amp;gt;(ExcelRelation.scala:36)&lt;/P&gt;&lt;P&gt;    at com.crealytics.spark.excel.DefaultSource.createRelation(DefaultSource.scala:36)&lt;/P&gt;&lt;P&gt;    at com.crealytics.spark.excel.DefaultSource.createRelation(DefaultSource.scala:13)&lt;/P&gt;&lt;P&gt;    at com.crealytics.spark.excel.DefaultSource.createRelation(DefaultSource.scala:8)&lt;/P&gt;&lt;P&gt;    at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:339)&lt;/P&gt;&lt;P&gt;    at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:279)&lt;/P&gt;&lt;P&gt;    at org.apache.spark.sql.DataFrameReader.$anonfun$load$2(DataFrameReader.scala:268)&lt;/P&gt;
&lt;P&gt;    at scala.Option.getOrElse(Option.scala:189)&lt;/P&gt;&lt;P&gt;    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:268)&lt;/P&gt;&lt;P&gt;    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:214)&lt;/P&gt;&lt;P&gt;    at com.sundogsoftware.spark.excel$.delayedEndpoint$com$sundogsoftware$spark$excel$1(excel.scala:35)&lt;/P&gt;&lt;P&gt;    at com.sundogsoftware.spark.excel$delayedInit$body.apply(excel.scala:10)&lt;/P&gt;&lt;P&gt;    at scala.Function0.apply$mcV$sp(Function0.scala:39)&lt;/P&gt;&lt;P&gt;    at scala.Function0.apply$mcV$sp$(Function0.scala:39)&lt;/P&gt;&lt;P&gt;    at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:17)&lt;/P&gt;&lt;P&gt;    at scala.App.$anonfun$main$1$adapted(App.scala:80)&lt;/P&gt;&lt;P&gt;    at scala.collection.immutable.List.foreach(List.scala:431)&lt;/P&gt;&lt;P&gt;    at scala.App.main(App.scala:80)&lt;/P&gt;&lt;P&gt;    at scala.App.main$(App.scala:78)&lt;/P&gt;&lt;P&gt;    at com.sundogsoftware.spark.excel$.main(excel.scala:10)&lt;/P&gt;&lt;P&gt;    at com.sundogsoftware.spark.excel.main(excel.scala)&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I know what this error is trying to say. What I want is to read the sheets by their indexes, that is:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;sheet name 21 has index 0&lt;/P&gt;&lt;P&gt;sheet name 24 has index 1&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I want to read the sheets by their index, not by their names.&lt;/P&gt;</description>
    <pubDate>Thu, 25 Nov 2021 11:02:48 GMT</pubDate>
    <dc:creator>sarvesh</dc:creator>
    <dc:date>2021-11-25T11:02:48Z</dc:date>
    <item>
      <title>Can we read an Excel file with many sheets by their indexes?</title>
      <link>https://community.databricks.com/t5/data-engineering/can-we-read-an-excel-file-with-many-sheets-with-there-indexes/m-p/34365#M25120</link>
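      <content:encoded><![CDATA[
<P>The index-to-name mapping listed in the post (index 0 is sheet 21, index 1 is sheet 24) can be sketched as a small lookup that builds a quoted sheet address. This is only a sketch under assumptions: the object SheetAddress, the helper dataAddressFor and the default start cell A1 are hypothetical names that do not appear in the post, the third entry (index 2 for sheet 224) is inferred from the sheet order given, and the 'Sheet Name'!Cell address form is assumed to be the one the spark-excel README describes for dataAddress. Whether this address form avoids the exception shown in the post is not confirmed here.</P>

```scala
object SheetAddress {
  // Sheet names in workbook order, copied from the post:
  // index 0 -> "21", index 1 -> "24"; index 2 -> "224" is inferred from the sheet order.
  // Assumption: in real code this list would have to be read from the workbook itself.
  val sheetNames: Vector[String] = Vector("21", "24", "224")

  // Hypothetical helper: turn a zero-based sheet index into an address string
  // of the form 'Sheet Name'!Cell. The sheet name is always single-quoted.
  def dataAddressFor(index: Int, startCell: String = "A1"): String = {
    require(index >= 0 && index < sheetNames.length, s"no sheet at index $index")
    s"'${sheetNames(index)}'!$startCell"
  }
}
```

<P>With such a helper, the reader call from the post would pass SheetAddress.dataAddressFor(0) to option("dataAddress", ...) in place of the raw sheet name.</P>
      ]]></content:encoded>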
      <pubDate>Thu, 25 Nov 2021 11:02:48 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/can-we-read-an-excel-file-with-many-sheets-with-there-indexes/m-p/34365#M25120</guid>
      <dc:creator>sarvesh</dc:creator>
      <dc:date>2021-11-25T11:02:48Z</dc:date>
    </item>
  </channel>
</rss>

