<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Reading a table from a catalog that is in a different/external workspace in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/reading-a-table-from-a-catalog-that-is-in-a-different-external/m-p/62916#M32136</link>
    <description>&lt;P&gt;Hi Allia,&lt;BR /&gt;&lt;BR /&gt;Thanks for the helpful links. Unfortunately, the client I am working with has a hard requirement that we use the token. My first instinct was also to set up a Delta Share, but that is not an option here. I am looking for a more efficient way to read the data while still using the token. As my code shows, I fetch all the rows first and then create a DataFrame from them. Ideally I would ingest the data directly into a DataFrame.&lt;/P&gt;</description>
    <pubDate>Thu, 07 Mar 2024 16:48:19 GMT</pubDate>
    <dc:creator>addy</dc:creator>
    <dc:date>2024-03-07T16:48:19Z</dc:date>
    <item>
      <title>Reading a table from a catalog that is in a different/external workspace</title>
      <link>https://community.databricks.com/t5/data-engineering/reading-a-table-from-a-catalog-that-is-in-a-different-external/m-p/62611#M32012</link>
      <description>&lt;P&gt;I am trying to read a table that is hosted on a different workspace. We have been told to establish a connection to that workspace using a token and consume the table.&lt;BR /&gt;&lt;BR /&gt;The code I am using is:&lt;/P&gt;&lt;P&gt;from databricks import sql&lt;/P&gt;&lt;P&gt;connection = sql.connect(&lt;BR /&gt;server_hostname="adb-123.azuredatabricks.net",&lt;BR /&gt;http_path="/sql/1.0/warehouses/hahaha",&lt;BR /&gt;access_token="pass"&lt;BR /&gt;)&lt;/P&gt;&lt;P&gt;cursor = connection.cursor()&lt;/P&gt;&lt;P&gt;cursor.execute("SELECT * FROM table")&lt;/P&gt;&lt;P&gt;# Fetch all rows into a list&lt;BR /&gt;rows = cursor.fetchall()&lt;/P&gt;&lt;P&gt;# Create a PySpark DataFrame from the list of rows&lt;BR /&gt;df = spark.createDataFrame(rows, schema=["A", "B", "C"])&lt;/P&gt;&lt;P&gt;# Close the cursor and connection when done&lt;BR /&gt;cursor.close()&lt;BR /&gt;connection.close()&lt;BR /&gt;&lt;BR /&gt;Is there a better way of doing this? Is there a way to run a Spark read directly and load the data into a PySpark DataFrame? This method does not seem very efficient, since every row is fetched into a Python list before the DataFrame is created. Please note that we must use the token and consume the table directly.&lt;/P&gt;</description>
      <pubDate>Mon, 04 Mar 2024 21:33:27 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/reading-a-table-from-a-catalog-that-is-in-a-different-external/m-p/62611#M32012</guid>
      <dc:creator>addy</dc:creator>
      <dc:date>2024-03-04T21:33:27Z</dc:date>
    </item>
    <item>
      <title>Re: Reading a table from a catalog that is in a different/external workspace</title>
      <link>https://community.databricks.com/t5/data-engineering/reading-a-table-from-a-catalog-that-is-in-a-different-external/m-p/62627#M32016</link>
      <description>&lt;P&gt;Hi Addy,&lt;/P&gt;
&lt;P&gt;Greetings!&lt;/P&gt;
&lt;P&gt;You can also use Delta Sharing to share data across multiple workspaces. Since you want to read tables from another workspace, you can use Databricks-to-Databricks Delta Sharing.&lt;/P&gt;
&lt;P&gt;&lt;A href="https://docs.databricks.com/en/data-sharing/read-data-databricks.html" target="_blank"&gt;https://docs.databricks.com/en/data-sharing/read-data-databricks.html&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&lt;A href="https://docs.databricks.com/en/data-sharing/index.html" target="_blank"&gt;https://docs.databricks.com/en/data-sharing/index.html&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 05 Mar 2024 08:33:41 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/reading-a-table-from-a-catalog-that-is-in-a-different-external/m-p/62627#M32016</guid>
      <dc:creator>Allia</dc:creator>
      <dc:date>2024-03-05T08:33:41Z</dc:date>
    </item>
    <item>
      <title>Re: Reading a table from a catalog that is in a different/external workspace</title>
      <link>https://community.databricks.com/t5/data-engineering/reading-a-table-from-a-catalog-that-is-in-a-different-external/m-p/62916#M32136</link>
      <description>&lt;P&gt;Hi Allia,&lt;BR /&gt;&lt;BR /&gt;Thanks for the helpful links. Unfortunately, the client I am working with has a hard requirement that we use the token. My first instinct was also to set up a Delta Share, but that is not an option here. I am looking for a more efficient way to read the data while still using the token. As my code shows, I fetch all the rows first and then create a DataFrame from them. Ideally I would ingest the data directly into a DataFrame.&lt;/P&gt;</description>
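      <content:encoded><![CDATA[
One possible way to get "directly into a DataFrame" while still authenticating with the token is sketched below. It is not confirmed anywhere in this thread. It assumes the notebook runs on a Databricks cluster whose runtime ships the Databricks Spark connector (the "databricks" data source), and that the reader option names host, httpPath, personalAccessToken and dbtable match the connector documentation for your runtime, so please verify them before relying on this. The helper names remote_read_options and read_remote_table are made up for illustration, and the host, warehouse path, token and table values are the placeholders from the original post.

```python
from typing import Dict


def remote_read_options(host: str, http_path: str, token: str, table: str) -> Dict[str, str]:
    """Collect reader options for the remote workspace.

    The option names are an assumption based on the Databricks Spark
    connector; check them against the docs for your runtime version.
    """
    return {
        "host": host,
        "httpPath": http_path,
        "personalAccessToken": token,
        "dbtable": table,
    }


def read_remote_table(spark, host: str, http_path: str, token: str, table: str):
    """Return a Spark DataFrame backed by the remote SQL warehouse.

    Nothing is collected into a Python list here: Spark reads from the
    remote warehouse when an action runs on the returned DataFrame.
    """
    options = remote_read_options(host, http_path, token, table)
    return spark.read.format("databricks").options(**options).load()


# Usage in a notebook, where `spark` already exists (placeholder values):
# df = read_remote_table(
#     spark,
#     host="adb-123.azuredatabricks.net",
#     http_path="/sql/1.0/warehouses/hahaha",
#     token="pass",
#     table="table",
# )
```

If that data source is not available on your cluster, a smaller change to the original code is to fetch Arrow data instead of Python rows, assuming your version of databricks-sql-connector provides cursor.fetchall_arrow(). That still routes everything through the driver, but it avoids building one Python object per row.
]]></content:encoded>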
      <pubDate>Thu, 07 Mar 2024 16:48:19 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/reading-a-table-from-a-catalog-that-is-in-a-different-external/m-p/62916#M32136</guid>
      <dc:creator>addy</dc:creator>
      <dc:date>2024-03-07T16:48:19Z</dc:date>
    </item>
  </channel>
</rss>

