<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How do I define &amp; run jobs that execute scripts that are copied inside a custom DataBricks container? in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/how-do-i-define-run-jobs-that-execute-scripts-that-are-copied/m-p/4479#M1192</link>
    <description>&lt;P&gt;Hi @Thijs van den Berg&lt;/P&gt;&lt;P&gt;Thank you for posting your question in our community! We are happy to assist you.&lt;/P&gt;&lt;P&gt;To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your question?&lt;/P&gt;&lt;P&gt;This will also help other community members who may have similar questions in the future. Thank you for your participation, and let us know if you need any further assistance!&lt;/P&gt;</description>
    <pubDate>Sun, 14 May 2023 01:01:55 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2023-05-14T01:01:55Z</dc:date>
    <item>
      <title>How do I define &amp; run jobs that execute scripts that are copied inside a custom DataBricks container?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-do-i-define-run-jobs-that-execute-scripts-that-are-copied/m-p/4477#M1190</link>
      <description>&lt;P&gt;Hi all, we are building custom Databricks containers (&lt;A href="https://docs.databricks.com/clusters/custom-containers.html" alt="https://docs.databricks.com/clusters/custom-containers.html" target="_blank"&gt;https://docs.databricks.com/clusters/custom-containers.html&lt;/A&gt;). During the container build process we install dependencies and also Python source code scripts. We now want to run some of these scripts as jobs, ideally also providing command-line arguments. However, when creating jobs, there doesn't seem to be a way to reference code that is inside the container. Any ideas?&lt;/P&gt;</description>
      <pubDate>Fri, 12 May 2023 10:35:05 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-do-i-define-run-jobs-that-execute-scripts-that-are-copied/m-p/4477#M1190</guid>
      <dc:creator>Thijs</dc:creator>
      <dc:date>2023-05-12T10:35:05Z</dc:date>
    </item>
    <item>
      <title>Re: How do I define &amp; run jobs that execute scripts that are copied inside a custom DataBricks container?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-do-i-define-run-jobs-that-execute-scripts-that-are-copied/m-p/4478#M1191</link>
      <description>&lt;P&gt;@Thijs van den Berg:&lt;/P&gt;&lt;P&gt;One way to run code that lives inside the container is from a notebook task, using the dbutils module. Here's an example for a Python file called myscript.py that is located in the /opt/myapp directory of the container:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;import subprocess

# dbutils is predefined in Databricks notebooks. Copy the script from the
# container's local file system to a DBFS mount point.
dbutils.fs.cp("file:/opt/myapp/myscript.py", "dbfs:/mnt/my-mount-point/myscript.py")

# Run the script with its command-line arguments; check=True raises an
# exception if the script exits with a non-zero status.
subprocess.run(["python", "/dbfs/mnt/my-mount-point/myscript.py", "arg1", "arg2", "arg3"], check=True)&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;In this example, we first copy the myscript.py file from the container file system to a DBFS mount point using dbutils.fs.cp(). Then we run the script with subprocess.run(), passing in any command-line arguments. You can also use the Databricks CLI to automate uploading files to DBFS and creating jobs. Here's an example, where YOUR_CLUSTER_ID is a placeholder for your cluster's ID and the JSON is assumed to follow the Jobs API 2.1 format:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;databricks fs cp /opt/myapp/myscript.py dbfs:/mnt/my-mount-point/myscript.py
databricks jobs create --json '{"name": "My Job", "tasks": [{"task_key": "myscript", "existing_cluster_id": "YOUR_CLUSTER_ID", "max_retries": 0, "spark_python_task": {"python_file": "dbfs:/mnt/my-mount-point/myscript.py", "parameters": ["arg1", "arg2", "arg3"]}}]}'&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;This example uses the Databricks CLI to copy myscript.py from the machine where the CLI runs to DBFS, and then creates a new job with a Python task that runs the script with command-line arguments.&lt;/P&gt;&lt;P&gt;I hope this helps! Let me know if you have any further questions.&lt;/P&gt;</description>
      <pubDate>Sat, 13 May 2023 16:06:41 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-do-i-define-run-jobs-that-execute-scripts-that-are-copied/m-p/4478#M1191</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2023-05-13T16:06:41Z</dc:date>
    </item>
    <item>
      <title>Re: How do I define &amp; run jobs that execute scripts that are copied inside a custom DataBricks container?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-do-i-define-run-jobs-that-execute-scripts-that-are-copied/m-p/4479#M1192</link>
      <description>&lt;P&gt;Hi @Thijs van den Berg&lt;/P&gt;&lt;P&gt;Thank you for posting your question in our community! We are happy to assist you.&lt;/P&gt;&lt;P&gt;To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your question?&lt;/P&gt;&lt;P&gt;This will also help other community members who may have similar questions in the future. Thank you for your participation, and let us know if you need any further assistance!&lt;/P&gt;</description>
      <pubDate>Sun, 14 May 2023 01:01:55 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-do-i-define-run-jobs-that-execute-scripts-that-are-copied/m-p/4479#M1192</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2023-05-14T01:01:55Z</dc:date>
    </item>
    <item>
      <title>Re: How do I define &amp; run jobs that execute scripts that are copied inside a custom DataBricks container?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-do-i-define-run-jobs-that-execute-scripts-that-are-copied/m-p/4480#M1193</link>
      <description>&lt;P&gt;Thanks @Suteja Kanuri for answering. The question I asked was about scheduling/running "job" scripts that reside inside the container &lt;U&gt;through the Web Interface&lt;/U&gt;: Workflows &amp;gt; Jobs &amp;gt; Create Job.&lt;/P&gt;&lt;P&gt;What we ended up doing was packaging our job scripts into a Python module and pip-installing that module into the container. That allowed us to create a job of type "Python Wheel", and then use the package name and entry point to point to the job code we stored in our module inside the container.&lt;/P&gt;</description>
      <pubDate>Mon, 15 May 2023 09:19:46 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-do-i-define-run-jobs-that-execute-scripts-that-are-copied/m-p/4480#M1193</guid>
      <dc:creator>Thijs</dc:creator>
      <dc:date>2023-05-15T09:19:46Z</dc:date>
    </item>
  </channel>
</rss>

