<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: History of code executed on Data Science &amp; Engineering service clusters in Machine Learning</title>
    <link>https://community.databricks.com/t5/machine-learning/history-of-code-executed-on-data-science-engineering-service/m-p/6482#M313</link>
    <description>&lt;P&gt;@Debayan Mukherjee&lt;/P&gt;&lt;P&gt;Correct - some kind of API access would be good for this, e.g. the code below.&lt;/P&gt;&lt;P&gt;With it, I would be able to construct a dataframe of all queries made against a specified cluster, or at least determine which cells / notebooks were attached to and executed on the cluster at specific dates and times.&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;import json

from databricks_cli.sdk.api_client import ApiClient
from databricks_cli.clusters.api import ClusterApi
# Placeholder import: this module / class does not exist, it is the API being requested
from databricks_cli.&amp;lt;&amp;lt;module&amp;gt;&amp;gt;.api import &amp;lt;&amp;lt;ClusterHistoryApi&amp;gt;&amp;gt;

api_client = ApiClient(
    host=DATABRICKS_HOST,
    token=DATABRICKS_TOKEN,
)
clusters_api = ClusterApi(api_client)
cluster_history_api = ClusterHistoryApi(api_client)  # ie: the requested API that provides history access to DS&amp;amp;E clusters

cluster_id = clusters_api.get_cluster_by_name('DataSciEng_Service_ClusterName').get('cluster_id')

# unix_start / unix_end are placeholders for the bounds of the time window of interest
cluster_code_exec_history = cluster_history_api.get_events(cluster_id, unix_start, unix_end, 'ASC', '', 0, 500).get('code_execution_history')  # ie: history of all code segments / cells / notebooks executed on the specified DS&amp;amp;E cluster

# assumes the history comes back as a list of dicts; spark.read.json needs JSON strings
df = spark.read.json(sc.parallelize([json.dumps(e) for e in cluster_code_exec_history]))  # profit&lt;/CODE&gt;&lt;/PRE&gt;</description>
    <pubDate>Tue, 04 Apr 2023 05:22:19 GMT</pubDate>
    <dc:creator>rendorHaevyn</dc:creator>
    <dc:date>2023-04-04T05:22:19Z</dc:date>
    <item>
      <title>History of code executed on Data Science &amp; Engineering service clusters</title>
      <link>https://community.databricks.com/t5/machine-learning/history-of-code-executed-on-data-science-engineering-service/m-p/6480#M311</link>
      <description>&lt;P&gt;I want to be able to view a listing of any or all of the following:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;When notebooks were attached to and detached from a DS&amp;amp;E cluster&lt;/LI&gt;&lt;LI&gt;When notebook code was executed on a DS&amp;amp;E cluster&lt;/LI&gt;&lt;LI&gt;Which specific notebook cell code was executed on a DS&amp;amp;E cluster&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;Is this currently possible?&lt;/P&gt;&lt;P&gt;I have explored the Cluster and Jobs/Runs APIs; however, these appear to cover only jobs/workflows, not code executed ad hoc from notebooks.&lt;/P&gt;&lt;P&gt;The functionality I'm after appears to be available on Databricks SQL service warehouses, but I need the same functionality for DS&amp;amp;E clusters.&lt;/P&gt;&lt;P&gt;The reason for this requirement is to determine which code and notebook ran immediately before resize and expanded disk size events on a specific DS&amp;amp;E cluster.&lt;/P&gt;&lt;P&gt;Thanks.&lt;/P&gt;</description>
      <pubDate>Tue, 04 Apr 2023 03:04:58 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/history-of-code-executed-on-data-science-engineering-service/m-p/6480#M311</guid>
      <dc:creator>rendorHaevyn</dc:creator>
      <dc:date>2023-04-04T03:04:58Z</dc:date>
    </item>
    <item>
      <title>Re: History of code executed on Data Science &amp; Engineering service clusters</title>
      <link>https://community.databricks.com/t5/machine-learning/history-of-code-executed-on-data-science-engineering-service/m-p/6482#M313</link>
      <description>&lt;P&gt;@Debayan Mukherjee&lt;/P&gt;&lt;P&gt;Correct - some kind of API access would be good for this, e.g. the code below.&lt;/P&gt;&lt;P&gt;With it, I would be able to construct a dataframe of all queries made against a specified cluster, or at least determine which cells / notebooks were attached to and executed on the cluster at specific dates and times.&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;import json

from databricks_cli.sdk.api_client import ApiClient
from databricks_cli.clusters.api import ClusterApi
# Placeholder import: this module / class does not exist, it is the API being requested
from databricks_cli.&amp;lt;&amp;lt;module&amp;gt;&amp;gt;.api import &amp;lt;&amp;lt;ClusterHistoryApi&amp;gt;&amp;gt;

api_client = ApiClient(
    host=DATABRICKS_HOST,
    token=DATABRICKS_TOKEN,
)
clusters_api = ClusterApi(api_client)
cluster_history_api = ClusterHistoryApi(api_client)  # ie: the requested API that provides history access to DS&amp;amp;E clusters

cluster_id = clusters_api.get_cluster_by_name('DataSciEng_Service_ClusterName').get('cluster_id')

# unix_start / unix_end are placeholders for the bounds of the time window of interest
cluster_code_exec_history = cluster_history_api.get_events(cluster_id, unix_start, unix_end, 'ASC', '', 0, 500).get('code_execution_history')  # ie: history of all code segments / cells / notebooks executed on the specified DS&amp;amp;E cluster

# assumes the history comes back as a list of dicts; spark.read.json needs JSON strings
df = spark.read.json(sc.parallelize([json.dumps(e) for e in cluster_code_exec_history]))  # profit&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 04 Apr 2023 05:22:19 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/history-of-code-executed-on-data-science-engineering-service/m-p/6482#M313</guid>
      <dc:creator>rendorHaevyn</dc:creator>
      <dc:date>2023-04-04T05:22:19Z</dc:date>
    </item>
    <item>
      <title>Re: History of code executed on Data Science &amp; Engineering service clusters</title>
      <link>https://community.databricks.com/t5/machine-learning/history-of-code-executed-on-data-science-engineering-service/m-p/6483#M314</link>
      <description>&lt;P&gt;From the UI, the best way to check is version control: &lt;A href="https://docs.databricks.com/notebooks/notebooks-code.html#version-control" target="test_blank"&gt;https://docs.databricks.com/notebooks/notebooks-code.html#version-control&lt;/A&gt;&lt;/P&gt;&lt;P&gt;BTW, does this help? &lt;A href="https://www.databricks.com/blog/2022/11/02/monitoring-notebook-command-logs-static-analysis-tools.html" target="test_blank"&gt;https://www.databricks.com/blog/2022/11/02/monitoring-notebook-command-logs-static-analysis-tools.html&lt;/A&gt; @Cameron McPherson&lt;/P&gt;</description>
      <pubDate>Tue, 04 Apr 2023 18:02:07 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/history-of-code-executed-on-data-science-engineering-service/m-p/6483#M314</guid>
      <dc:creator>Atanu</dc:creator>
      <dc:date>2023-04-04T18:02:07Z</dc:date>
    </item>
    <item>
      <title>Re: History of code executed on Data Science &amp; Engineering service clusters</title>
      <link>https://community.databricks.com/t5/machine-learning/history-of-code-executed-on-data-science-engineering-service/m-p/6484#M315</link>
      <description>&lt;P&gt;@Atanu Sarkar Yes, your proposal will work - thank you.&lt;/P&gt;</description>
      <pubDate>Tue, 04 Apr 2023 23:33:37 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/history-of-code-executed-on-data-science-engineering-service/m-p/6484#M315</guid>
      <dc:creator>rendorHaevyn</dc:creator>
      <dc:date>2023-04-04T23:33:37Z</dc:date>
    </item>
    <item>
      <title>Re: History of code executed on Data Science &amp; Engineering service clusters</title>
      <link>https://community.databricks.com/t5/machine-learning/history-of-code-executed-on-data-science-engineering-service/m-p/6481#M312</link>
      <description>&lt;P&gt;Hi, are you saying that you want to list this through the UI? If so, that is not currently available.&lt;/P&gt;&lt;P&gt;Please tag &lt;A href="https://community.databricks.com/s/profile/0053f000000WWwvAAG" alt="https://community.databricks.com/s/profile/0053f000000WWwvAAG" target="_blank"&gt;@Debayan&lt;/A&gt; in your next response so that I am notified. Thank you!&lt;/P&gt;</description>
      <pubDate>Tue, 04 Apr 2023 05:06:45 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/history-of-code-executed-on-data-science-engineering-service/m-p/6481#M312</guid>
      <dc:creator>Debayan</dc:creator>
      <dc:date>2023-04-04T05:06:45Z</dc:date>
    </item>
  </channel>
</rss>

