<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Retrieve data from multiple .mdb files using Python. in Machine Learning</title>
    <link>https://community.databricks.com/t5/machine-learning/retrieve-data-from-multiple-mdb-files-using-python/m-p/67836#M3237</link>
    <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;I'm interested in accessing several .mdb Access files stored in either Azure Data Lake Storage (ADLS) or the Databricks File System using Python. Could you provide guidance on how to accomplish this? It would be immensely helpful if you could also share some code snippets for reference.&lt;/P&gt;</description>
    <pubDate>Wed, 01 May 2024 17:25:42 GMT</pubDate>
    <dc:creator>JamesBrown54</dc:creator>
    <dc:date>2024-05-01T17:25:42Z</dc:date>
    <item>
      <title>Retrieve data from multiple .mdb files using Python.</title>
      <link>https://community.databricks.com/t5/machine-learning/retrieve-data-from-multiple-mdb-files-using-python/m-p/67836#M3237</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;I'm interested in accessing several .mdb Access files stored in either Azure Data Lake Storage (ADLS) or the Databricks File System using Python. Could you provide guidance on how to accomplish this? It would be immensely helpful if you could also share some code snippets for reference.&lt;/P&gt;</description>
      <pubDate>Wed, 01 May 2024 17:25:42 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/retrieve-data-from-multiple-mdb-files-using-python/m-p/67836#M3237</guid>
      <dc:creator>JamesBrown54</dc:creator>
      <dc:date>2024-05-01T17:25:42Z</dc:date>
    </item>
    <item>
      <title>Re: Retrieve data from multiple .mdb files using Python.</title>
      <link>https://community.databricks.com/t5/machine-learning/retrieve-data-from-multiple-mdb-files-using-python/m-p/104906#M3892</link>
      <description>&lt;P class="_1t7bu9h1 paragraph"&gt;&lt;SPAN&gt;To read .mdb (Microsoft Access) files stored in Azure Data Lake Storage (ADLS) or the Databricks File System (DBFS) with Python, you can use the &lt;CODE&gt;pandas_access&lt;/CODE&gt; library. It is a thin wrapper around the &lt;CODE&gt;mdbtools&lt;/CODE&gt; command-line utilities, so &lt;CODE&gt;mdbtools&lt;/CODE&gt; must also be installed on the cluster (for example through an init script). The steps and code snippets below show how:&lt;/SPAN&gt;&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;
&lt;P class="_1t7bu9h1 paragraph"&gt;&lt;STRONG&gt;Install the &lt;CODE&gt;pandas_access&lt;/CODE&gt; library&lt;/STRONG&gt;:&lt;/P&gt;
&lt;DIV class="gb5fhw2"&gt;
&lt;PRE&gt;&lt;CODE class="markdown-code-python _1t7bu9hb hljs language-python gb5fhw3"&gt;%pip install pandas_access&lt;/CODE&gt;&lt;/PRE&gt;
&lt;/DIV&gt;
&lt;/LI&gt;
&lt;LI&gt;
&lt;P class="_1t7bu9h1 paragraph"&gt;&lt;SPAN&gt;&lt;STRONG&gt;Import the necessary libraries and read the .mdb file&lt;/STRONG&gt;:&lt;/SPAN&gt;&lt;/P&gt;
&lt;DIV class="gb5fhw2"&gt;
&lt;PRE&gt;&lt;CODE class="markdown-code-python _1t7bu9hb hljs language-python gb5fhw3"&gt;&lt;SPAN class="hljs-keyword"&gt;import&lt;/SPAN&gt; pandas_access &lt;SPAN class="hljs-keyword"&gt;as&lt;/SPAN&gt; mdb

&lt;SPAN class="hljs-comment"&gt;# Path to your .mdb file in DBFS&lt;/SPAN&gt;
db_filename = &lt;SPAN class="hljs-string"&gt;'/dbfs/FileStore/Campaign_Template.mdb'&lt;/SPAN&gt;

&lt;SPAN class="hljs-comment"&gt;# Listing the tables in the .mdb file&lt;/SPAN&gt;
&lt;SPAN class="hljs-keyword"&gt;for&lt;/SPAN&gt; tbl &lt;SPAN class="hljs-keyword"&gt;in&lt;/SPAN&gt; mdb.list_tables(db_filename):
    &lt;SPAN class="hljs-built_in"&gt;print&lt;/SPAN&gt;(tbl)

&lt;SPAN class="hljs-comment"&gt;# Reading a specific table into a DataFrame&lt;/SPAN&gt;
df = mdb.read_table(db_filename, &lt;SPAN class="hljs-string"&gt;"Campaign_Table"&lt;/SPAN&gt;)
df.head()&lt;/CODE&gt;&lt;/PRE&gt;
&lt;/DIV&gt;
&lt;/LI&gt;
&lt;LI&gt;
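&lt;P class="_1t7bu9h1 paragraph"&gt;&lt;STRONG&gt;Handling paths in ADLS&lt;/STRONG&gt;: If your .mdb files are stored in ADLS, one option is to mount the ADLS container to DBFS first. The example below mounts a container with a service principal (OAuth). In practice, read the client secret from a Databricks secret scope with &lt;CODE&gt;dbutils.secrets.get&lt;/CODE&gt; instead of hard-coding it:&lt;/P&gt;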
&lt;DIV class="gb5fhw2"&gt;
&lt;PRE&gt;&lt;CODE class="markdown-code-python _1t7bu9hb hljs language-python gb5fhw3"&gt;configs = {
    &lt;SPAN class="hljs-string"&gt;"fs.azure.account.auth.type"&lt;/SPAN&gt;: &lt;SPAN class="hljs-string"&gt;"OAuth"&lt;/SPAN&gt;,
    &lt;SPAN class="hljs-string"&gt;"fs.azure.account.oauth.provider.type"&lt;/SPAN&gt;: &lt;SPAN class="hljs-string"&gt;"org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider"&lt;/SPAN&gt;,
    &lt;SPAN class="hljs-string"&gt;"fs.azure.account.oauth2.client.id"&lt;/SPAN&gt;: &lt;SPAN class="hljs-string"&gt;"&amp;lt;client-id&amp;gt;"&lt;/SPAN&gt;,
    &lt;SPAN class="hljs-string"&gt;"fs.azure.account.oauth2.client.secret"&lt;/SPAN&gt;: &lt;SPAN class="hljs-string"&gt;"&amp;lt;client-secret&amp;gt;"&lt;/SPAN&gt;,
    &lt;SPAN class="hljs-string"&gt;"fs.azure.account.oauth2.client.endpoint"&lt;/SPAN&gt;: &lt;SPAN class="hljs-string"&gt;"https://login.microsoftonline.com/&amp;lt;tenant-id&amp;gt;/oauth2/token"&lt;/SPAN&gt;
}

dbutils.fs.mount(
    source = &lt;SPAN class="hljs-string"&gt;"abfss://&amp;lt;container-name&amp;gt;@&amp;lt;storage-account-name&amp;gt;.dfs.core.windows.net/"&lt;/SPAN&gt;,
    mount_point = &lt;SPAN class="hljs-string"&gt;"/mnt/&amp;lt;mount-name&amp;gt;"&lt;/SPAN&gt;,
    extra_configs = configs
)&lt;/CODE&gt;&lt;/PRE&gt;
&lt;/DIV&gt;
&lt;P class="_1t7bu9h1 paragraph"&gt;&lt;SPAN&gt;After mounting, the files are reachable through local file APIs, which &lt;CODE&gt;pandas_access&lt;/CODE&gt; relies on, under the &lt;CODE&gt;/dbfs&lt;/CODE&gt; prefix:&lt;/SPAN&gt;&lt;/P&gt;
&lt;DIV class="gb5fhw2"&gt;
&lt;PRE&gt;&lt;CODE class="markdown-code-python _1t7bu9hb hljs language-python gb5fhw3"&gt;db_filename = &lt;SPAN class="hljs-string"&gt;'/dbfs/mnt/&amp;lt;mount-name&amp;gt;/path/to/your/file.mdb'&lt;/SPAN&gt;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;/DIV&gt;
&lt;/LI&gt;
&lt;LI&gt;
&lt;P class="_1t7bu9h1 paragraph"&gt;&lt;SPAN&gt;&lt;STRONG&gt;Reading the .mdb file from the mounted path&lt;/STRONG&gt;:&lt;/SPAN&gt;&lt;/P&gt;
&lt;DIV class="gb5fhw2"&gt;
&lt;PRE&gt;&lt;CODE class="markdown-code-python _1t7bu9hb hljs language-python gb5fhw3"&gt;&lt;SPAN class="hljs-keyword"&gt;import&lt;/SPAN&gt; pandas_access &lt;SPAN class="hljs-keyword"&gt;as&lt;/SPAN&gt; mdb

&lt;SPAN class="hljs-comment"&gt;# Path to your .mdb file in the mounted ADLS container (note the /dbfs prefix for local file access)&lt;/SPAN&gt;
db_filename = &lt;SPAN class="hljs-string"&gt;'/dbfs/mnt/&amp;lt;mount-name&amp;gt;/path/to/your/file.mdb'&lt;/SPAN&gt;

&lt;SPAN class="hljs-comment"&gt;# Listing the tables in the .mdb file&lt;/SPAN&gt;
&lt;SPAN class="hljs-keyword"&gt;for&lt;/SPAN&gt; tbl &lt;SPAN class="hljs-keyword"&gt;in&lt;/SPAN&gt; mdb.list_tables(db_filename):
    &lt;SPAN class="hljs-built_in"&gt;print&lt;/SPAN&gt;(tbl)

&lt;SPAN class="hljs-comment"&gt;# Reading a specific table into a DataFrame&lt;/SPAN&gt;
df = mdb.read_table(db_filename, &lt;SPAN class="hljs-string"&gt;"Campaign_Table"&lt;/SPAN&gt;)
df.head()&lt;/CODE&gt;&lt;/PRE&gt;
&lt;/DIV&gt;
&lt;/LI&gt;
&lt;/OL&gt;
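&lt;P class="_1t7bu9h1 paragraph"&gt;&lt;SPAN&gt;Since the question is about several .mdb files, the single-file steps above can be extended to a whole folder. The following is only a minimal sketch, not a tested recipe: &lt;CODE&gt;find_mdb_files&lt;/CODE&gt; and &lt;CODE&gt;read_all_tables&lt;/CODE&gt; are hypothetical helper names, the folder path is a placeholder, and it assumes &lt;CODE&gt;mdb.list_tables&lt;/CODE&gt; and &lt;CODE&gt;mdb.read_table&lt;/CODE&gt; behave as in the snippets above. The reader functions are passed in as arguments so the helpers themselves do not depend on &lt;CODE&gt;pandas_access&lt;/CODE&gt;.&lt;/SPAN&gt;&lt;/P&gt;
&lt;DIV class="gb5fhw2"&gt;
&lt;PRE&gt;&lt;CODE class="markdown-code-python _1t7bu9hb hljs language-python gb5fhw3"&gt;from pathlib import Path


def find_mdb_files(root):
    """Return every .mdb file under root (subfolders included), sorted by path."""
    # The pattern is case-sensitive on Linux, so files ending in .MDB are not matched.
    return sorted(p for p in Path(root).rglob("*.mdb") if p.is_file())


def read_all_tables(root, list_tables, read_table):
    """Read each table of each .mdb file into a dict keyed by (file path, table name)."""
    frames = {}
    for path in find_mdb_files(root):
        for table in list_tables(str(path)):
            frames[(str(path), table)] = read_table(str(path), table)
    return frames


# Usage on Databricks (placeholder path; requires pandas_access and mdbtools):
# import pandas_access as mdb
# frames = read_all_tables("/dbfs/mnt/&amp;lt;mount-name&amp;gt;/path/to/your/folder",
#                          mdb.list_tables, mdb.read_table)
# for (file_path, table), df in frames.items():
#     print(file_path, table, df.shape)&lt;/CODE&gt;&lt;/PRE&gt;
&lt;/DIV&gt;
&lt;P class="_1t7bu9h1 paragraph"&gt;&lt;SPAN&gt;Keying the result by file path and table name keeps tables from different files apart. If the same table exists in every file with the same columns, the matching DataFrames could then be combined with &lt;CODE&gt;pandas.concat&lt;/CODE&gt;.&lt;/SPAN&gt;&lt;/P&gt;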
&lt;P class="_1t7bu9h1 paragraph"&gt;&lt;SPAN&gt;These steps should let you read .mdb files stored in ADLS or DBFS from Python in Databricks. If you run into errors, check that the file path uses the &lt;CODE&gt;/dbfs&lt;/CODE&gt; prefix and that both &lt;CODE&gt;pandas_access&lt;/CODE&gt; and &lt;CODE&gt;mdbtools&lt;/CODE&gt; are installed on the cluster.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 09 Jan 2025 11:41:18 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/retrieve-data-from-multiple-mdb-files-using-python/m-p/104906#M3892</guid>
      <dc:creator>NandiniN</dc:creator>
      <dc:date>2025-01-09T11:41:18Z</dc:date>
    </item>
    <item>
      <title>Re: Retrieve data from multiple .mdb files using Python.</title>
      <link>https://community.databricks.com/t5/machine-learning/retrieve-data-from-multiple-mdb-files-using-python/m-p/104907#M3893</link>
      <description>&lt;P&gt;This documentation page on connecting to Azure storage may also help:&lt;/P&gt;
&lt;P&gt;&lt;A href="https://docs.databricks.com/en/connect/storage/azure-storage.html" target="_blank"&gt;https://docs.databricks.com/en/connect/storage/azure-storage.html&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 09 Jan 2025 11:42:47 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/retrieve-data-from-multiple-mdb-files-using-python/m-p/104907#M3893</guid>
      <dc:creator>NandiniN</dc:creator>
      <dc:date>2025-01-09T11:42:47Z</dc:date>
    </item>
  </channel>
</rss>

