<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Generative Certification notebook error in Generative AI</title>
    <link>https://community.databricks.com/t5/generative-ai/generative-certification-notebook-error/m-p/97743#M630</link>
    <description>&lt;P&gt;Define all the helper functions from the include folder (_helper_functions.py) directly in your main notebook, then run the following:&lt;/P&gt;&lt;DIV&gt;&lt;SPAN&gt;import&lt;/SPAN&gt;&lt;SPAN&gt; nltk&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;nltk.&lt;/SPAN&gt;&lt;SPAN&gt;download&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;'punkt'&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;nltk.&lt;/SPAN&gt;&lt;SPAN&gt;download&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;'wordnet'&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;nltk.&lt;/SPAN&gt;&lt;SPAN&gt;download&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;'omw-1.4'&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;nltk.&lt;/SPAN&gt;&lt;SPAN&gt;download&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;'punkt_tab'&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;nltk.&lt;/SPAN&gt;&lt;SPAN&gt;download&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;'averaged_perceptron_tagger_eng'&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;</description>
    <pubDate>Tue, 05 Nov 2024 12:52:28 GMT</pubDate>
    <dc:creator>MohammedArif</dc:creator>
    <dc:date>2024-11-05T12:52:28Z</dc:date>
    <item>
      <title>Generative Certification notebook error</title>
      <link>https://community.databricks.com/t5/generative-ai/generative-certification-notebook-error/m-p/82673#M346</link>
      <description>&lt;P&gt;I am preparing for the Generative Certification and hit the error below (&lt;SPAN&gt;LookupError&lt;/SPAN&gt;&lt;SPAN&gt;:&lt;/SPAN&gt;) when running the notebook in the first module. These notebooks ran without any errors in the past.&lt;/P&gt;&lt;P&gt;Complete error stack trace:&lt;/P&gt;&lt;DIV&gt;&lt;DIV class=""&gt;********************************************************************** Resource averaged_perceptron_tagger_eng not found. Please use the NLTK Downloader to obtain the resource: &amp;gt;&amp;gt;&amp;gt; import nltk &amp;gt;&amp;gt;&amp;gt; nltk.download('averaged_perceptron_tagger_eng') For more information see: &lt;A class="" href="https://www.nltk.org/data.html" target="_blank" rel="noopener noreferrer"&gt;https://www.nltk.org/data.html&lt;/A&gt; Attempted to load taggers/averaged_perceptron_tagger_eng/ Searched in: - '/root/nltk_data' - '/local_disk0/.ephemeral_nfs/envs/pythonEnv-2534a48d-f9f7-4daf-934e-2d9e08931bf6/nltk_data' - '/local_disk0/.ephemeral_nfs/envs/pythonEnv-2534a48d-f9f7-4daf-934e-2d9e08931bf6/share/nltk_data' - '/local_disk0/.ephemeral_nfs/envs/pythonEnv-2534a48d-f9f7-4daf-934e-2d9e08931bf6/lib/nltk_data' - '/usr/share/nltk_data' - '/usr/local/share/nltk_data' - '/usr/lib/nltk_data' - '/usr/local/lib/nltk_data' **********************************************************************&lt;/DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;DIV class=""&gt;File &lt;SPAN class=""&gt;&lt;A target="_blank"&gt;&amp;lt;command-3201934447568746&amp;gt;&lt;/A&gt;, line 2&lt;/SPAN&gt; &lt;SPAN&gt;1&lt;/SPAN&gt; &lt;SPAN class=""&gt;with&lt;/SPAN&gt; &lt;SPAN&gt;open&lt;/SPAN&gt;(&lt;SPAN&gt;f&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN 
class=""&gt;{&lt;/SPAN&gt;articles_path&lt;SPAN&gt;.&lt;/SPAN&gt;replace(&lt;SPAN&gt;'&lt;/SPAN&gt;&lt;SPAN&gt;dbfs:&lt;/SPAN&gt;&lt;SPAN&gt;'&lt;/SPAN&gt;,&lt;SPAN&gt;'&lt;/SPAN&gt;&lt;SPAN&gt;/dbfs/&lt;/SPAN&gt;&lt;SPAN&gt;'&lt;/SPAN&gt;)&lt;SPAN class=""&gt;}&lt;/SPAN&gt;&lt;SPAN&gt;2302.06476.pdf&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt;, mode&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN&gt;rb&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt;) &lt;SPAN class=""&gt;as&lt;/SPAN&gt; pdf: &lt;SPAN class=""&gt;----&amp;gt; 2&lt;/SPAN&gt; doc &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN class=""&gt;extract_doc_text&lt;/SPAN&gt;&lt;SPAN class=""&gt;(&lt;/SPAN&gt;&lt;SPAN class=""&gt;pdf&lt;/SPAN&gt;&lt;SPAN class=""&gt;.&lt;/SPAN&gt;&lt;SPAN class=""&gt;read&lt;/SPAN&gt;&lt;SPAN class=""&gt;(&lt;/SPAN&gt;&lt;SPAN class=""&gt;)&lt;/SPAN&gt;&lt;SPAN class=""&gt;)&lt;/SPAN&gt; &lt;SPAN&gt;3&lt;/SPAN&gt; &lt;SPAN&gt;print&lt;/SPAN&gt;(doc)&lt;/DIV&gt;&lt;DIV class=""&gt;&lt;HR /&gt;&lt;/DIV&gt;&lt;DIV class=""&gt;File &lt;SPAN class=""&gt;&lt;A target="_blank"&gt;&amp;lt;command-3201934447568602&amp;gt;&lt;/A&gt;, line 8&lt;/SPAN&gt;, in &lt;SPAN class=""&gt;extract_doc_text&lt;/SPAN&gt;&lt;SPAN class=""&gt;(x)&lt;/SPAN&gt; &lt;SPAN&gt;6&lt;/SPAN&gt; &lt;SPAN class=""&gt;def&lt;/SPAN&gt; &lt;SPAN&gt;extract_doc_text&lt;/SPAN&gt;(x : &lt;SPAN&gt;bytes&lt;/SPAN&gt;) &lt;SPAN&gt;-&lt;/SPAN&gt;&lt;SPAN&gt;&amp;gt;&lt;/SPAN&gt; &lt;SPAN&gt;str&lt;/SPAN&gt;: &lt;SPAN&gt;7&lt;/SPAN&gt; &lt;SPAN&gt;# Read files and extract the values with unstructured&lt;/SPAN&gt; &lt;SPAN class=""&gt;----&amp;gt; 8&lt;/SPAN&gt; sections &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN class=""&gt;partition&lt;/SPAN&gt;&lt;SPAN class=""&gt;(&lt;/SPAN&gt;&lt;SPAN class=""&gt;file&lt;/SPAN&gt;&lt;SPAN class=""&gt;=&lt;/SPAN&gt;&lt;SPAN class=""&gt;io&lt;/SPAN&gt;&lt;SPAN class=""&gt;.&lt;/SPAN&gt;&lt;SPAN class=""&gt;BytesIO&lt;/SPAN&gt;&lt;SPAN class=""&gt;(&lt;/SPAN&gt;&lt;SPAN class=""&gt;x&lt;/SPAN&gt;&lt;SPAN 
class=""&gt;)&lt;/SPAN&gt;&lt;SPAN class=""&gt;)&lt;/SPAN&gt; &lt;SPAN&gt;9&lt;/SPAN&gt; &lt;SPAN class=""&gt;def&lt;/SPAN&gt; &lt;SPAN&gt;clean_section&lt;/SPAN&gt;(txt): &lt;SPAN&gt;10&lt;/SPAN&gt; txt &lt;SPAN&gt;=&lt;/SPAN&gt; re&lt;SPAN&gt;.&lt;/SPAN&gt;sub(&lt;SPAN&gt;r&lt;/SPAN&gt;&lt;SPAN&gt;'&lt;/SPAN&gt;&lt;SPAN&gt;\&lt;/SPAN&gt;&lt;SPAN&gt;n&lt;/SPAN&gt;&lt;SPAN&gt;'&lt;/SPAN&gt;, &lt;SPAN&gt;'&lt;/SPAN&gt;&lt;SPAN&gt;'&lt;/SPAN&gt;, txt)&lt;/DIV&gt;&lt;DIV class=""&gt;File &lt;SPAN class=""&gt;/local_disk0/.ephemeral_nfs/envs/pythonEnv-2534a48d-f9f7-4daf-934e-2d9e08931bf6/lib/python3.10/site-packages/unstructured/partition/auto.py:383&lt;/SPAN&gt;, in &lt;SPAN class=""&gt;partition&lt;/SPAN&gt;&lt;SPAN class=""&gt;(filename, content_type, file, file_filename, url, include_page_breaks, strategy, encoding, paragraph_grouper, headers, skip_infer_table_types, ssl_verify, ocr_languages, languages, detect_language_per_element, pdf_infer_table_structure, pdf_extract_images, pdf_image_output_dir_path, xml_keep_tags, data_source_metadata, metadata_filename, request_timeout, **kwargs)&lt;/SPAN&gt; &lt;SPAN&gt;381&lt;/SPAN&gt; &lt;SPAN class=""&gt;elif&lt;/SPAN&gt; filetype &lt;SPAN&gt;==&lt;/SPAN&gt; FileType&lt;SPAN&gt;.&lt;/SPAN&gt;PDF: &lt;SPAN&gt;382&lt;/SPAN&gt; _partition_pdf &lt;SPAN&gt;=&lt;/SPAN&gt; _get_partition_with_extras(&lt;SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN&gt;pdf&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt;) &lt;SPAN class=""&gt;--&amp;gt; 383&lt;/SPAN&gt; elements &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN class=""&gt;_partition_pdf&lt;/SPAN&gt;&lt;SPAN class=""&gt;(&lt;/SPAN&gt; &lt;SPAN&gt;384&lt;/SPAN&gt; &lt;SPAN class=""&gt;filename&lt;/SPAN&gt;&lt;SPAN class=""&gt;=&lt;/SPAN&gt;&lt;SPAN class=""&gt;filename&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt; &lt;SPAN class=""&gt;# type: ignore&lt;/SPAN&gt; &lt;SPAN&gt;385&lt;/SPAN&gt; &lt;SPAN class=""&gt;file&lt;/SPAN&gt;&lt;SPAN class=""&gt;=&lt;/SPAN&gt;&lt;SPAN class=""&gt;file&lt;/SPAN&gt;&lt;SPAN 
class=""&gt;,&lt;/SPAN&gt; &lt;SPAN class=""&gt;# type: ignore&lt;/SPAN&gt; &lt;SPAN&gt;386&lt;/SPAN&gt; &lt;SPAN class=""&gt;url&lt;/SPAN&gt;&lt;SPAN class=""&gt;=&lt;/SPAN&gt;&lt;SPAN&gt;None&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt; &lt;SPAN&gt;387&lt;/SPAN&gt; &lt;SPAN class=""&gt;include_page_breaks&lt;/SPAN&gt;&lt;SPAN class=""&gt;=&lt;/SPAN&gt;&lt;SPAN class=""&gt;include_page_breaks&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt; &lt;SPAN&gt;388&lt;/SPAN&gt; &lt;SPAN class=""&gt;infer_table_structure&lt;/SPAN&gt;&lt;SPAN class=""&gt;=&lt;/SPAN&gt;&lt;SPAN class=""&gt;infer_table_structure&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt; &lt;SPAN&gt;389&lt;/SPAN&gt; &lt;SPAN class=""&gt;strategy&lt;/SPAN&gt;&lt;SPAN class=""&gt;=&lt;/SPAN&gt;&lt;SPAN class=""&gt;strategy&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt; &lt;SPAN&gt;390&lt;/SPAN&gt; &lt;SPAN class=""&gt;languages&lt;/SPAN&gt;&lt;SPAN class=""&gt;=&lt;/SPAN&gt;&lt;SPAN class=""&gt;languages&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt; &lt;SPAN&gt;391&lt;/SPAN&gt; &lt;SPAN class=""&gt;extract_images_in_pdf&lt;/SPAN&gt;&lt;SPAN class=""&gt;=&lt;/SPAN&gt;&lt;SPAN class=""&gt;pdf_extract_images&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt; &lt;SPAN&gt;392&lt;/SPAN&gt; &lt;SPAN class=""&gt;image_output_dir_path&lt;/SPAN&gt;&lt;SPAN class=""&gt;=&lt;/SPAN&gt;&lt;SPAN class=""&gt;pdf_image_output_dir_path&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt; &lt;SPAN&gt;393&lt;/SPAN&gt; &lt;SPAN class=""&gt;*&lt;/SPAN&gt;&lt;SPAN class=""&gt;*&lt;/SPAN&gt;&lt;SPAN class=""&gt;kwargs&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt; &lt;SPAN&gt;394&lt;/SPAN&gt; &lt;SPAN class=""&gt;)&lt;/SPAN&gt; &lt;SPAN&gt;395&lt;/SPAN&gt; &lt;SPAN class=""&gt;elif&lt;/SPAN&gt; (filetype &lt;SPAN&gt;==&lt;/SPAN&gt; FileType&lt;SPAN&gt;.&lt;/SPAN&gt;PNG) &lt;SPAN class=""&gt;or&lt;/SPAN&gt; (filetype &lt;SPAN&gt;==&lt;/SPAN&gt; FileType&lt;SPAN&gt;.&lt;/SPAN&gt;JPG) &lt;SPAN class=""&gt;or&lt;/SPAN&gt; (filetype 
&lt;SPAN&gt;==&lt;/SPAN&gt; FileType&lt;SPAN&gt;.&lt;/SPAN&gt;TIFF): &lt;SPAN&gt;396&lt;/SPAN&gt; elements &lt;SPAN&gt;=&lt;/SPAN&gt; partition_image( &lt;SPAN&gt;397&lt;/SPAN&gt; filename&lt;SPAN&gt;=&lt;/SPAN&gt;filename, &lt;SPAN&gt;# type: ignore&lt;/SPAN&gt; &lt;SPAN&gt;398&lt;/SPAN&gt; file&lt;SPAN&gt;=&lt;/SPAN&gt;file, &lt;SPAN&gt;# type: ignore&lt;/SPAN&gt; &lt;SPAN class=""&gt;(...)&lt;/SPAN&gt; &lt;SPAN&gt;404&lt;/SPAN&gt; &lt;SPAN&gt;*&lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt;kwargs, &lt;SPAN&gt;405&lt;/SPAN&gt; )&lt;/DIV&gt;&lt;DIV class=""&gt;File &lt;SPAN class=""&gt;/local_disk0/.ephemeral_nfs/envs/pythonEnv-2534a48d-f9f7-4daf-934e-2d9e08931bf6/lib/python3.10/site-packages/unstructured/documents/elements.py:371&lt;/SPAN&gt;, in &lt;SPAN class=""&gt;process_metadata.&amp;lt;locals&amp;gt;.decorator.&amp;lt;locals&amp;gt;.wrapper&lt;/SPAN&gt;&lt;SPAN class=""&gt;(*args, **kwargs)&lt;/SPAN&gt; &lt;SPAN&gt;369&lt;/SPAN&gt; &lt;SPAN&gt;@functools&lt;/SPAN&gt;&lt;SPAN&gt;.&lt;/SPAN&gt;wraps(func) &lt;SPAN&gt;370&lt;/SPAN&gt; &lt;SPAN class=""&gt;def&lt;/SPAN&gt; &lt;SPAN&gt;wrapper&lt;/SPAN&gt;(&lt;SPAN&gt;*&lt;/SPAN&gt;args: _P&lt;SPAN&gt;.&lt;/SPAN&gt;args, &lt;SPAN&gt;*&lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt;kwargs: _P&lt;SPAN&gt;.&lt;/SPAN&gt;kwargs) &lt;SPAN&gt;-&lt;/SPAN&gt;&lt;SPAN&gt;&amp;gt;&lt;/SPAN&gt; List[Element]: &lt;SPAN class=""&gt;--&amp;gt; 371&lt;/SPAN&gt; elements &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN class=""&gt;func&lt;/SPAN&gt;&lt;SPAN class=""&gt;(&lt;/SPAN&gt;&lt;SPAN class=""&gt;*&lt;/SPAN&gt;&lt;SPAN class=""&gt;args&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt; &lt;SPAN class=""&gt;*&lt;/SPAN&gt;&lt;SPAN class=""&gt;*&lt;/SPAN&gt;&lt;SPAN class=""&gt;kwargs&lt;/SPAN&gt;&lt;SPAN class=""&gt;)&lt;/SPAN&gt; &lt;SPAN&gt;372&lt;/SPAN&gt; sig &lt;SPAN&gt;=&lt;/SPAN&gt; inspect&lt;SPAN&gt;.&lt;/SPAN&gt;signature(func) &lt;SPAN&gt;373&lt;/SPAN&gt; params: Dict[&lt;SPAN&gt;str&lt;/SPAN&gt;, Any] &lt;SPAN&gt;=&lt;/SPAN&gt; 
&lt;SPAN&gt;dict&lt;/SPAN&gt;(&lt;SPAN&gt;*&lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt;&lt;SPAN&gt;dict&lt;/SPAN&gt;(&lt;SPAN&gt;zip&lt;/SPAN&gt;(sig&lt;SPAN&gt;.&lt;/SPAN&gt;parameters, args)), &lt;SPAN&gt;*&lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt;kwargs)&lt;/DIV&gt;&lt;DIV class=""&gt;File &lt;SPAN class=""&gt;/local_disk0/.ephemeral_nfs/envs/pythonEnv-2534a48d-f9f7-4daf-934e-2d9e08931bf6/lib/python3.10/site-packages/unstructured/file_utils/filetype.py:591&lt;/SPAN&gt;, in &lt;SPAN class=""&gt;add_filetype.&amp;lt;locals&amp;gt;.decorator.&amp;lt;locals&amp;gt;.wrapper&lt;/SPAN&gt;&lt;SPAN class=""&gt;(*args, **kwargs)&lt;/SPAN&gt; &lt;SPAN&gt;589&lt;/SPAN&gt; &lt;SPAN&gt;@functools&lt;/SPAN&gt;&lt;SPAN&gt;.&lt;/SPAN&gt;wraps(func) &lt;SPAN&gt;590&lt;/SPAN&gt; &lt;SPAN class=""&gt;def&lt;/SPAN&gt; &lt;SPAN&gt;wrapper&lt;/SPAN&gt;(&lt;SPAN&gt;*&lt;/SPAN&gt;args: _P&lt;SPAN&gt;.&lt;/SPAN&gt;args, &lt;SPAN&gt;*&lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt;kwargs: _P&lt;SPAN&gt;.&lt;/SPAN&gt;kwargs) &lt;SPAN&gt;-&lt;/SPAN&gt;&lt;SPAN&gt;&amp;gt;&lt;/SPAN&gt; List[Element]: &lt;SPAN class=""&gt;--&amp;gt; 591&lt;/SPAN&gt; elements &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN class=""&gt;func&lt;/SPAN&gt;&lt;SPAN class=""&gt;(&lt;/SPAN&gt;&lt;SPAN class=""&gt;*&lt;/SPAN&gt;&lt;SPAN class=""&gt;args&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt; &lt;SPAN class=""&gt;*&lt;/SPAN&gt;&lt;SPAN class=""&gt;*&lt;/SPAN&gt;&lt;SPAN class=""&gt;kwargs&lt;/SPAN&gt;&lt;SPAN class=""&gt;)&lt;/SPAN&gt; &lt;SPAN&gt;592&lt;/SPAN&gt; sig &lt;SPAN&gt;=&lt;/SPAN&gt; inspect&lt;SPAN&gt;.&lt;/SPAN&gt;signature(func) &lt;SPAN&gt;593&lt;/SPAN&gt; params: Dict[&lt;SPAN&gt;str&lt;/SPAN&gt;, Any] &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;dict&lt;/SPAN&gt;(&lt;SPAN&gt;*&lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt;&lt;SPAN&gt;dict&lt;/SPAN&gt;(&lt;SPAN&gt;zip&lt;/SPAN&gt;(sig&lt;SPAN&gt;.&lt;/SPAN&gt;parameters, args)), &lt;SPAN&gt;*&lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt;kwargs)&lt;/DIV&gt;&lt;DIV class=""&gt;File &lt;SPAN 
class=""&gt;/local_disk0/.ephemeral_nfs/envs/pythonEnv-2534a48d-f9f7-4daf-934e-2d9e08931bf6/lib/python3.10/site-packages/unstructured/file_utils/filetype.py:546&lt;/SPAN&gt;, in &lt;SPAN class=""&gt;add_metadata.&amp;lt;locals&amp;gt;.wrapper&lt;/SPAN&gt;&lt;SPAN class=""&gt;(*args, **kwargs)&lt;/SPAN&gt; &lt;SPAN&gt;544&lt;/SPAN&gt; &lt;SPAN&gt;@functools&lt;/SPAN&gt;&lt;SPAN&gt;.&lt;/SPAN&gt;wraps(func) &lt;SPAN&gt;545&lt;/SPAN&gt; &lt;SPAN class=""&gt;def&lt;/SPAN&gt; &lt;SPAN&gt;wrapper&lt;/SPAN&gt;(&lt;SPAN&gt;*&lt;/SPAN&gt;args: _P&lt;SPAN&gt;.&lt;/SPAN&gt;args, &lt;SPAN&gt;*&lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt;kwargs: _P&lt;SPAN&gt;.&lt;/SPAN&gt;kwargs) &lt;SPAN&gt;-&lt;/SPAN&gt;&lt;SPAN&gt;&amp;gt;&lt;/SPAN&gt; List[Element]: &lt;SPAN class=""&gt;--&amp;gt; 546&lt;/SPAN&gt; elements &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN class=""&gt;func&lt;/SPAN&gt;&lt;SPAN class=""&gt;(&lt;/SPAN&gt;&lt;SPAN class=""&gt;*&lt;/SPAN&gt;&lt;SPAN class=""&gt;args&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt; &lt;SPAN class=""&gt;*&lt;/SPAN&gt;&lt;SPAN class=""&gt;*&lt;/SPAN&gt;&lt;SPAN class=""&gt;kwargs&lt;/SPAN&gt;&lt;SPAN class=""&gt;)&lt;/SPAN&gt; &lt;SPAN&gt;547&lt;/SPAN&gt; sig &lt;SPAN&gt;=&lt;/SPAN&gt; inspect&lt;SPAN&gt;.&lt;/SPAN&gt;signature(func) &lt;SPAN&gt;548&lt;/SPAN&gt; params: Dict[&lt;SPAN&gt;str&lt;/SPAN&gt;, Any] &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;dict&lt;/SPAN&gt;(&lt;SPAN&gt;*&lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt;&lt;SPAN&gt;dict&lt;/SPAN&gt;(&lt;SPAN&gt;zip&lt;/SPAN&gt;(sig&lt;SPAN&gt;.&lt;/SPAN&gt;parameters, args)), &lt;SPAN&gt;*&lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt;kwargs)&lt;/DIV&gt;&lt;DIV class=""&gt;File &lt;SPAN class=""&gt;/local_disk0/.ephemeral_nfs/envs/pythonEnv-2534a48d-f9f7-4daf-934e-2d9e08931bf6/lib/python3.10/site-packages/unstructured/chunking/title.py:297&lt;/SPAN&gt;, in &lt;SPAN class=""&gt;add_chunking_strategy.&amp;lt;locals&amp;gt;.decorator.&amp;lt;locals&amp;gt;.wrapper&lt;/SPAN&gt;&lt;SPAN class=""&gt;(*args, 
**kwargs)&lt;/SPAN&gt; &lt;SPAN&gt;295&lt;/SPAN&gt; &lt;SPAN&gt;@functools&lt;/SPAN&gt;&lt;SPAN&gt;.&lt;/SPAN&gt;wraps(func) &lt;SPAN&gt;296&lt;/SPAN&gt; &lt;SPAN class=""&gt;def&lt;/SPAN&gt; &lt;SPAN&gt;wrapper&lt;/SPAN&gt;(&lt;SPAN&gt;*&lt;/SPAN&gt;args: _P&lt;SPAN&gt;.&lt;/SPAN&gt;args, &lt;SPAN&gt;*&lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt;kwargs: _P&lt;SPAN&gt;.&lt;/SPAN&gt;kwargs) &lt;SPAN&gt;-&lt;/SPAN&gt;&lt;SPAN&gt;&amp;gt;&lt;/SPAN&gt; List[Element]: &lt;SPAN class=""&gt;--&amp;gt; 297&lt;/SPAN&gt; elements &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN class=""&gt;func&lt;/SPAN&gt;&lt;SPAN class=""&gt;(&lt;/SPAN&gt;&lt;SPAN class=""&gt;*&lt;/SPAN&gt;&lt;SPAN class=""&gt;args&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt; &lt;SPAN class=""&gt;*&lt;/SPAN&gt;&lt;SPAN class=""&gt;*&lt;/SPAN&gt;&lt;SPAN class=""&gt;kwargs&lt;/SPAN&gt;&lt;SPAN class=""&gt;)&lt;/SPAN&gt; &lt;SPAN&gt;298&lt;/SPAN&gt; sig &lt;SPAN&gt;=&lt;/SPAN&gt; inspect&lt;SPAN&gt;.&lt;/SPAN&gt;signature(func) &lt;SPAN&gt;299&lt;/SPAN&gt; params: Dict[&lt;SPAN&gt;str&lt;/SPAN&gt;, Any] &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;dict&lt;/SPAN&gt;(&lt;SPAN&gt;*&lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt;&lt;SPAN&gt;dict&lt;/SPAN&gt;(&lt;SPAN&gt;zip&lt;/SPAN&gt;(sig&lt;SPAN&gt;.&lt;/SPAN&gt;parameters, args)), &lt;SPAN&gt;*&lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt;kwargs)&lt;/DIV&gt;&lt;DIV class=""&gt;File &lt;SPAN class=""&gt;/local_disk0/.ephemeral_nfs/envs/pythonEnv-2534a48d-f9f7-4daf-934e-2d9e08931bf6/lib/python3.10/site-packages/unstructured/partition/pdf.py:183&lt;/SPAN&gt;, in &lt;SPAN class=""&gt;partition_pdf&lt;/SPAN&gt;&lt;SPAN class=""&gt;(filename, file, include_page_breaks, strategy, infer_table_structure, ocr_languages, languages, include_metadata, metadata_filename, metadata_last_modified, chunking_strategy, links, extract_images_in_pdf, image_output_dir_path, **kwargs)&lt;/SPAN&gt; &lt;SPAN&gt;177&lt;/SPAN&gt; languages &lt;SPAN&gt;=&lt;/SPAN&gt; convert_old_ocr_languages_to_languages(ocr_languages) 
&lt;SPAN&gt;178&lt;/SPAN&gt; logger&lt;SPAN&gt;.&lt;/SPAN&gt;warning( &lt;SPAN&gt;179&lt;/SPAN&gt; &lt;SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN&gt;The ocr_languages kwarg will be deprecated in a future version of unstructured. &lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt; &lt;SPAN&gt;180&lt;/SPAN&gt; &lt;SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN&gt;Please use languages instead.&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt;, &lt;SPAN&gt;181&lt;/SPAN&gt; ) &lt;SPAN class=""&gt;--&amp;gt; 183&lt;/SPAN&gt; &lt;SPAN class=""&gt;return&lt;/SPAN&gt; &lt;SPAN class=""&gt;partition_pdf_or_image&lt;/SPAN&gt;&lt;SPAN class=""&gt;(&lt;/SPAN&gt; &lt;SPAN&gt;184&lt;/SPAN&gt; &lt;SPAN class=""&gt;filename&lt;/SPAN&gt;&lt;SPAN class=""&gt;=&lt;/SPAN&gt;&lt;SPAN class=""&gt;filename&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt; &lt;SPAN&gt;185&lt;/SPAN&gt; &lt;SPAN class=""&gt;file&lt;/SPAN&gt;&lt;SPAN class=""&gt;=&lt;/SPAN&gt;&lt;SPAN class=""&gt;file&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt; &lt;SPAN&gt;186&lt;/SPAN&gt; &lt;SPAN class=""&gt;include_page_breaks&lt;/SPAN&gt;&lt;SPAN class=""&gt;=&lt;/SPAN&gt;&lt;SPAN class=""&gt;include_page_breaks&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt; &lt;SPAN&gt;187&lt;/SPAN&gt; &lt;SPAN class=""&gt;strategy&lt;/SPAN&gt;&lt;SPAN class=""&gt;=&lt;/SPAN&gt;&lt;SPAN class=""&gt;strategy&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt; &lt;SPAN&gt;188&lt;/SPAN&gt; &lt;SPAN class=""&gt;infer_table_structure&lt;/SPAN&gt;&lt;SPAN class=""&gt;=&lt;/SPAN&gt;&lt;SPAN class=""&gt;infer_table_structure&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt; &lt;SPAN&gt;189&lt;/SPAN&gt; &lt;SPAN class=""&gt;languages&lt;/SPAN&gt;&lt;SPAN class=""&gt;=&lt;/SPAN&gt;&lt;SPAN class=""&gt;languages&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt; &lt;SPAN&gt;190&lt;/SPAN&gt; &lt;SPAN class=""&gt;metadata_last_modified&lt;/SPAN&gt;&lt;SPAN class=""&gt;=&lt;/SPAN&gt;&lt;SPAN class=""&gt;metadata_last_modified&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt; &lt;SPAN&gt;191&lt;/SPAN&gt; 
&lt;SPAN class=""&gt;extract_images_in_pdf&lt;/SPAN&gt;&lt;SPAN class=""&gt;=&lt;/SPAN&gt;&lt;SPAN class=""&gt;extract_images_in_pdf&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt; &lt;SPAN&gt;192&lt;/SPAN&gt; &lt;SPAN class=""&gt;image_output_dir_path&lt;/SPAN&gt;&lt;SPAN class=""&gt;=&lt;/SPAN&gt;&lt;SPAN class=""&gt;image_output_dir_path&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt; &lt;SPAN&gt;193&lt;/SPAN&gt; &lt;SPAN class=""&gt;*&lt;/SPAN&gt;&lt;SPAN class=""&gt;*&lt;/SPAN&gt;&lt;SPAN class=""&gt;kwargs&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt; &lt;SPAN&gt;194&lt;/SPAN&gt; &lt;SPAN class=""&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV class=""&gt;File &lt;SPAN class=""&gt;/local_disk0/.ephemeral_nfs/envs/pythonEnv-2534a48d-f9f7-4daf-934e-2d9e08931bf6/lib/python3.10/site-packages/unstructured/partition/pdf.py:288&lt;/SPAN&gt;, in &lt;SPAN class=""&gt;partition_pdf_or_image&lt;/SPAN&gt;&lt;SPAN class=""&gt;(filename, file, is_image, include_page_breaks, strategy, infer_table_structure, ocr_languages, languages, metadata_last_modified, extract_images_in_pdf, image_output_dir_path, **kwargs)&lt;/SPAN&gt; &lt;SPAN&gt;271&lt;/SPAN&gt; last_modification_date &lt;SPAN&gt;=&lt;/SPAN&gt; get_the_last_modification_date_pdf_or_img( &lt;SPAN&gt;272&lt;/SPAN&gt; file&lt;SPAN&gt;=&lt;/SPAN&gt;file, &lt;SPAN&gt;273&lt;/SPAN&gt; filename&lt;SPAN&gt;=&lt;/SPAN&gt;filename, &lt;SPAN&gt;274&lt;/SPAN&gt; ) &lt;SPAN&gt;276&lt;/SPAN&gt; &lt;SPAN class=""&gt;if&lt;/SPAN&gt; ( &lt;SPAN&gt;277&lt;/SPAN&gt; &lt;SPAN class=""&gt;not&lt;/SPAN&gt; is_image &lt;SPAN&gt;278&lt;/SPAN&gt; &lt;SPAN class=""&gt;and&lt;/SPAN&gt; determine_pdf_or_image_strategy( &lt;SPAN class=""&gt;(...)&lt;/SPAN&gt; &lt;SPAN&gt;286&lt;/SPAN&gt; &lt;SPAN&gt;!=&lt;/SPAN&gt; &lt;SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN&gt;ocr_only&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt; &lt;SPAN&gt;287&lt;/SPAN&gt; ): &lt;SPAN class=""&gt;--&amp;gt; 
288&lt;/SPAN&gt; extracted_elements &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN class=""&gt;extractable_elements&lt;/SPAN&gt;&lt;SPAN class=""&gt;(&lt;/SPAN&gt; &lt;SPAN&gt;289&lt;/SPAN&gt; &lt;SPAN class=""&gt;filename&lt;/SPAN&gt;&lt;SPAN class=""&gt;=&lt;/SPAN&gt;&lt;SPAN class=""&gt;filename&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt; &lt;SPAN&gt;290&lt;/SPAN&gt; &lt;SPAN class=""&gt;file&lt;/SPAN&gt;&lt;SPAN class=""&gt;=&lt;/SPAN&gt;&lt;SPAN class=""&gt;spooled_to_bytes_io_if_needed&lt;/SPAN&gt;&lt;SPAN class=""&gt;(&lt;/SPAN&gt;&lt;SPAN class=""&gt;file&lt;/SPAN&gt;&lt;SPAN class=""&gt;)&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt; &lt;SPAN&gt;291&lt;/SPAN&gt; &lt;SPAN class=""&gt;include_page_breaks&lt;/SPAN&gt;&lt;SPAN class=""&gt;=&lt;/SPAN&gt;&lt;SPAN class=""&gt;include_page_breaks&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt; &lt;SPAN&gt;292&lt;/SPAN&gt; &lt;SPAN class=""&gt;metadata_last_modified&lt;/SPAN&gt;&lt;SPAN class=""&gt;=&lt;/SPAN&gt;&lt;SPAN class=""&gt;metadata_last_modified&lt;/SPAN&gt; &lt;SPAN&gt;or&lt;/SPAN&gt; &lt;SPAN class=""&gt;last_modification_date&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt; &lt;SPAN&gt;293&lt;/SPAN&gt; &lt;SPAN class=""&gt;*&lt;/SPAN&gt;&lt;SPAN class=""&gt;*&lt;/SPAN&gt;&lt;SPAN class=""&gt;kwargs&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt; &lt;SPAN&gt;294&lt;/SPAN&gt; &lt;SPAN class=""&gt;)&lt;/SPAN&gt; &lt;SPAN&gt;295&lt;/SPAN&gt; pdf_text_extractable &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;any&lt;/SPAN&gt;( &lt;SPAN&gt;296&lt;/SPAN&gt; &lt;SPAN&gt;isinstance&lt;/SPAN&gt;(el, Text) &lt;SPAN class=""&gt;and&lt;/SPAN&gt; el&lt;SPAN&gt;.&lt;/SPAN&gt;text&lt;SPAN&gt;.&lt;/SPAN&gt;strip() &lt;SPAN class=""&gt;for&lt;/SPAN&gt; el &lt;SPAN class=""&gt;in&lt;/SPAN&gt; extracted_elements &lt;SPAN&gt;297&lt;/SPAN&gt; ) &lt;SPAN&gt;298&lt;/SPAN&gt; &lt;SPAN class=""&gt;else&lt;/SPAN&gt;:&lt;/DIV&gt;&lt;DIV class=""&gt;File &lt;SPAN 
class=""&gt;/local_disk0/.ephemeral_nfs/envs/pythonEnv-2534a48d-f9f7-4daf-934e-2d9e08931bf6/lib/python3.10/site-packages/unstructured/partition/pdf.py:206&lt;/SPAN&gt;, in &lt;SPAN class=""&gt;extractable_elements&lt;/SPAN&gt;&lt;SPAN class=""&gt;(filename, file, include_page_breaks, metadata_last_modified, **kwargs)&lt;/SPAN&gt; &lt;SPAN&gt;204&lt;/SPAN&gt; &lt;SPAN class=""&gt;if&lt;/SPAN&gt; &lt;SPAN&gt;isinstance&lt;/SPAN&gt;(file, &lt;SPAN&gt;bytes&lt;/SPAN&gt;): &lt;SPAN&gt;205&lt;/SPAN&gt; file &lt;SPAN&gt;=&lt;/SPAN&gt; io&lt;SPAN&gt;.&lt;/SPAN&gt;BytesIO(file) &lt;SPAN class=""&gt;--&amp;gt; 206&lt;/SPAN&gt; &lt;SPAN class=""&gt;return&lt;/SPAN&gt; &lt;SPAN class=""&gt;_partition_pdf_with_pdfminer&lt;/SPAN&gt;&lt;SPAN class=""&gt;(&lt;/SPAN&gt; &lt;SPAN&gt;207&lt;/SPAN&gt; &lt;SPAN class=""&gt;filename&lt;/SPAN&gt;&lt;SPAN class=""&gt;=&lt;/SPAN&gt;&lt;SPAN class=""&gt;filename&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt; &lt;SPAN&gt;208&lt;/SPAN&gt; &lt;SPAN class=""&gt;file&lt;/SPAN&gt;&lt;SPAN class=""&gt;=&lt;/SPAN&gt;&lt;SPAN class=""&gt;file&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt; &lt;SPAN&gt;209&lt;/SPAN&gt; &lt;SPAN class=""&gt;include_page_breaks&lt;/SPAN&gt;&lt;SPAN class=""&gt;=&lt;/SPAN&gt;&lt;SPAN class=""&gt;include_page_breaks&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt; &lt;SPAN&gt;210&lt;/SPAN&gt; &lt;SPAN class=""&gt;metadata_last_modified&lt;/SPAN&gt;&lt;SPAN class=""&gt;=&lt;/SPAN&gt;&lt;SPAN class=""&gt;metadata_last_modified&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt; &lt;SPAN&gt;211&lt;/SPAN&gt; &lt;SPAN class=""&gt;*&lt;/SPAN&gt;&lt;SPAN class=""&gt;*&lt;/SPAN&gt;&lt;SPAN class=""&gt;kwargs&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt; &lt;SPAN&gt;212&lt;/SPAN&gt; &lt;SPAN class=""&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV class=""&gt;File &lt;SPAN 
class=""&gt;/local_disk0/.ephemeral_nfs/envs/pythonEnv-2534a48d-f9f7-4daf-934e-2d9e08931bf6/lib/python3.10/site-packages/unstructured/utils.py:179&lt;/SPAN&gt;, in &lt;SPAN class=""&gt;requires_dependencies.&amp;lt;locals&amp;gt;.decorator.&amp;lt;locals&amp;gt;.wrapper&lt;/SPAN&gt;&lt;SPAN class=""&gt;(*args, **kwargs)&lt;/SPAN&gt; &lt;SPAN&gt;170&lt;/SPAN&gt; &lt;SPAN class=""&gt;if&lt;/SPAN&gt; &lt;SPAN&gt;len&lt;/SPAN&gt;(missing_deps) &lt;SPAN&gt;&amp;gt;&lt;/SPAN&gt; &lt;SPAN&gt;0&lt;/SPAN&gt;: &lt;SPAN&gt;171&lt;/SPAN&gt; &lt;SPAN class=""&gt;raise&lt;/SPAN&gt; &lt;SPAN class=""&gt;ImportError&lt;/SPAN&gt;( &lt;SPAN&gt;172&lt;/SPAN&gt; &lt;SPAN&gt;f&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN&gt;Following dependencies are missing: &lt;/SPAN&gt;&lt;SPAN class=""&gt;{&lt;/SPAN&gt;&lt;SPAN&gt;'&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;'&lt;/SPAN&gt;&lt;SPAN&gt;.&lt;/SPAN&gt;join(missing_deps)&lt;SPAN class=""&gt;}&lt;/SPAN&gt;&lt;SPAN&gt;. &lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt; &lt;SPAN&gt;173&lt;/SPAN&gt; &lt;SPAN&gt;+&lt;/SPAN&gt; ( &lt;SPAN class=""&gt;(...)&lt;/SPAN&gt; &lt;SPAN&gt;177&lt;/SPAN&gt; ), &lt;SPAN&gt;178&lt;/SPAN&gt; ) &lt;SPAN class=""&gt;--&amp;gt; 179&lt;/SPAN&gt; &lt;SPAN class=""&gt;return&lt;/SPAN&gt; &lt;SPAN class=""&gt;func&lt;/SPAN&gt;&lt;SPAN class=""&gt;(&lt;/SPAN&gt;&lt;SPAN class=""&gt;*&lt;/SPAN&gt;&lt;SPAN class=""&gt;args&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt; &lt;SPAN class=""&gt;*&lt;/SPAN&gt;&lt;SPAN class=""&gt;*&lt;/SPAN&gt;&lt;SPAN class=""&gt;kwargs&lt;/SPAN&gt;&lt;SPAN class=""&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV class=""&gt;File &lt;SPAN class=""&gt;/local_disk0/.ephemeral_nfs/envs/pythonEnv-2534a48d-f9f7-4daf-934e-2d9e08931bf6/lib/python3.10/site-packages/unstructured/partition/pdf.py:525&lt;/SPAN&gt;, in &lt;SPAN class=""&gt;_partition_pdf_with_pdfminer&lt;/SPAN&gt;&lt;SPAN class=""&gt;(filename, file, include_page_breaks, metadata_last_modified, **kwargs)&lt;/SPAN&gt; 
&lt;SPAN&gt;523&lt;/SPAN&gt; &lt;SPAN class=""&gt;elif&lt;/SPAN&gt; file: &lt;SPAN&gt;524&lt;/SPAN&gt; fp &lt;SPAN&gt;=&lt;/SPAN&gt; cast(BinaryIO, file) &lt;SPAN class=""&gt;--&amp;gt; 525&lt;/SPAN&gt; elements &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN class=""&gt;_process_pdfminer_pages&lt;/SPAN&gt;&lt;SPAN class=""&gt;(&lt;/SPAN&gt; &lt;SPAN&gt;526&lt;/SPAN&gt; &lt;SPAN class=""&gt;fp&lt;/SPAN&gt;&lt;SPAN class=""&gt;=&lt;/SPAN&gt;&lt;SPAN class=""&gt;fp&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt; &lt;SPAN&gt;527&lt;/SPAN&gt; &lt;SPAN class=""&gt;filename&lt;/SPAN&gt;&lt;SPAN class=""&gt;=&lt;/SPAN&gt;&lt;SPAN class=""&gt;filename&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt; &lt;SPAN&gt;528&lt;/SPAN&gt; &lt;SPAN class=""&gt;include_page_breaks&lt;/SPAN&gt;&lt;SPAN class=""&gt;=&lt;/SPAN&gt;&lt;SPAN class=""&gt;include_page_breaks&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt; &lt;SPAN&gt;529&lt;/SPAN&gt; &lt;SPAN class=""&gt;metadata_last_modified&lt;/SPAN&gt;&lt;SPAN class=""&gt;=&lt;/SPAN&gt;&lt;SPAN class=""&gt;metadata_last_modified&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt; &lt;SPAN&gt;530&lt;/SPAN&gt; &lt;SPAN class=""&gt;*&lt;/SPAN&gt;&lt;SPAN class=""&gt;*&lt;/SPAN&gt;&lt;SPAN class=""&gt;kwargs&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt; &lt;SPAN&gt;531&lt;/SPAN&gt; &lt;SPAN class=""&gt;)&lt;/SPAN&gt; &lt;SPAN&gt;533&lt;/SPAN&gt; &lt;SPAN class=""&gt;return&lt;/SPAN&gt; elements&lt;/DIV&gt;&lt;DIV class=""&gt;File &lt;SPAN class=""&gt;/local_disk0/.ephemeral_nfs/envs/pythonEnv-2534a48d-f9f7-4daf-934e-2d9e08931bf6/lib/python3.10/site-packages/unstructured/partition/pdf.py:613&lt;/SPAN&gt;, in &lt;SPAN class=""&gt;_process_pdfminer_pages&lt;/SPAN&gt;&lt;SPAN class=""&gt;(fp, filename, include_page_breaks, metadata_last_modified, sort_mode, **kwargs)&lt;/SPAN&gt; &lt;SPAN&gt;611&lt;/SPAN&gt; &lt;SPAN class=""&gt;if&lt;/SPAN&gt; _text&lt;SPAN&gt;.&lt;/SPAN&gt;strip(): &lt;SPAN&gt;612&lt;/SPAN&gt; points &lt;SPAN&gt;=&lt;/SPAN&gt; ((x1, y1), (x1, 
y2), (x2, y2), (x2, y1)) &lt;SPAN class=""&gt;--&amp;gt; 613&lt;/SPAN&gt; element &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN class=""&gt;element_from_text&lt;/SPAN&gt;&lt;SPAN class=""&gt;(&lt;/SPAN&gt; &lt;SPAN&gt;614&lt;/SPAN&gt; &lt;SPAN class=""&gt;_text&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt; &lt;SPAN&gt;615&lt;/SPAN&gt; &lt;SPAN class=""&gt;coordinates&lt;/SPAN&gt;&lt;SPAN class=""&gt;=&lt;/SPAN&gt;&lt;SPAN class=""&gt;points&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt; &lt;SPAN&gt;616&lt;/SPAN&gt; &lt;SPAN class=""&gt;coordinate_system&lt;/SPAN&gt;&lt;SPAN class=""&gt;=&lt;/SPAN&gt;&lt;SPAN class=""&gt;coordinate_system&lt;/SPAN&gt;&lt;SPAN class=""&gt;,&lt;/SPAN&gt; &lt;SPAN&gt;617&lt;/SPAN&gt; &lt;SPAN class=""&gt;)&lt;/SPAN&gt; &lt;SPAN&gt;618&lt;/SPAN&gt; coordinates_metadata &lt;SPAN&gt;=&lt;/SPAN&gt; CoordinatesMetadata( &lt;SPAN&gt;619&lt;/SPAN&gt; points&lt;SPAN&gt;=&lt;/SPAN&gt;points, &lt;SPAN&gt;620&lt;/SPAN&gt; system&lt;SPAN&gt;=&lt;/SPAN&gt;coordinate_system, &lt;SPAN&gt;621&lt;/SPAN&gt; ) &lt;SPAN&gt;623&lt;/SPAN&gt; links: List[Link] &lt;SPAN&gt;=&lt;/SPAN&gt; []&lt;/DIV&gt;&lt;DIV class=""&gt;File &lt;SPAN class=""&gt;/local_disk0/.ephemeral_nfs/envs/pythonEnv-2534a48d-f9f7-4daf-934e-2d9e08931bf6/lib/python3.10/site-packages/unstructured/partition/text.py:235&lt;/SPAN&gt;, in &lt;SPAN class=""&gt;element_from_text&lt;/SPAN&gt;&lt;SPAN class=""&gt;(text, coordinates, coordinate_system)&lt;/SPAN&gt; &lt;SPAN&gt;229&lt;/SPAN&gt; &lt;SPAN class=""&gt;elif&lt;/SPAN&gt; is_possible_numbered_list(text): &lt;SPAN&gt;230&lt;/SPAN&gt; &lt;SPAN class=""&gt;return&lt;/SPAN&gt; ListItem( &lt;SPAN&gt;231&lt;/SPAN&gt; text&lt;SPAN&gt;=&lt;/SPAN&gt;text, &lt;SPAN&gt;232&lt;/SPAN&gt; coordinates&lt;SPAN&gt;=&lt;/SPAN&gt;coordinates, &lt;SPAN&gt;233&lt;/SPAN&gt; coordinate_system&lt;SPAN&gt;=&lt;/SPAN&gt;coordinate_system, &lt;SPAN&gt;234&lt;/SPAN&gt; ) &lt;SPAN class=""&gt;--&amp;gt; 235&lt;/SPAN&gt; &lt;SPAN class=""&gt;elif&lt;/SPAN&gt; &lt;SPAN 
class=""&gt;is_possible_narrative_text&lt;/SPAN&gt;&lt;SPAN class=""&gt;(&lt;/SPAN&gt;&lt;SPAN class=""&gt;text&lt;/SPAN&gt;&lt;SPAN class=""&gt;)&lt;/SPAN&gt;: &lt;SPAN&gt;236&lt;/SPAN&gt; &lt;SPAN class=""&gt;return&lt;/SPAN&gt; NarrativeText( &lt;SPAN&gt;237&lt;/SPAN&gt; text&lt;SPAN&gt;=&lt;/SPAN&gt;text, &lt;SPAN&gt;238&lt;/SPAN&gt; coordinates&lt;SPAN&gt;=&lt;/SPAN&gt;coordinates, &lt;SPAN&gt;239&lt;/SPAN&gt; coordinate_system&lt;SPAN&gt;=&lt;/SPAN&gt;coordinate_system, &lt;SPAN&gt;240&lt;/SPAN&gt; ) &lt;SPAN&gt;241&lt;/SPAN&gt; &lt;SPAN class=""&gt;elif&lt;/SPAN&gt; is_possible_title(text):&lt;/DIV&gt;&lt;DIV class=""&gt;File &lt;SPAN class=""&gt;/local_disk0/.ephemeral_nfs/envs/pythonEnv-2534a48d-f9f7-4daf-934e-2d9e08931bf6/lib/python3.10/site-packages/unstructured/partition/text_type.py:87&lt;/SPAN&gt;, in &lt;SPAN class=""&gt;is_possible_narrative_text&lt;/SPAN&gt;&lt;SPAN class=""&gt;(text, cap_threshold, non_alpha_threshold, languages, language_checks)&lt;/SPAN&gt; &lt;SPAN&gt;84&lt;/SPAN&gt; &lt;SPAN class=""&gt;if&lt;/SPAN&gt; under_non_alpha_ratio(text, threshold&lt;SPAN&gt;=&lt;/SPAN&gt;non_alpha_threshold): &lt;SPAN&gt;85&lt;/SPAN&gt; &lt;SPAN class=""&gt;return&lt;/SPAN&gt; &lt;SPAN class=""&gt;False&lt;/SPAN&gt; &lt;SPAN class=""&gt;---&amp;gt; 87&lt;/SPAN&gt; &lt;SPAN class=""&gt;if&lt;/SPAN&gt; &lt;SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN&gt;eng&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt; &lt;SPAN class=""&gt;in&lt;/SPAN&gt; languages &lt;SPAN class=""&gt;and&lt;/SPAN&gt; (sentence_count(text, &lt;SPAN&gt;3&lt;/SPAN&gt;) &lt;SPAN&gt;&amp;lt;&lt;/SPAN&gt; &lt;SPAN&gt;2&lt;/SPAN&gt;) &lt;SPAN class=""&gt;and&lt;/SPAN&gt; (&lt;SPAN class=""&gt;not&lt;/SPAN&gt; &lt;SPAN class=""&gt;contains_verb&lt;/SPAN&gt;&lt;SPAN class=""&gt;(&lt;/SPAN&gt;&lt;SPAN class=""&gt;text&lt;/SPAN&gt;&lt;SPAN class=""&gt;)&lt;/SPAN&gt;): &lt;SPAN&gt;88&lt;/SPAN&gt; 
trace_logger&lt;SPAN&gt;.&lt;/SPAN&gt;detail(&lt;SPAN&gt;f&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN&gt;Not narrative. Text does not contain a verb:&lt;/SPAN&gt;&lt;SPAN class=""&gt;\n&lt;/SPAN&gt;&lt;SPAN class=""&gt;\n&lt;/SPAN&gt;&lt;SPAN class=""&gt;{&lt;/SPAN&gt;text&lt;SPAN class=""&gt;}&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt;) &lt;SPAN&gt;# type: ignore # noqa: E501&lt;/SPAN&gt; &lt;SPAN&gt;89&lt;/SPAN&gt; &lt;SPAN class=""&gt;return&lt;/SPAN&gt; &lt;SPAN class=""&gt;False&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV class=""&gt;File &lt;SPAN class=""&gt;/local_disk0/.ephemeral_nfs/envs/pythonEnv-2534a48d-f9f7-4daf-934e-2d9e08931bf6/lib/python3.10/site-packages/unstructured/partition/text_type.py:189&lt;/SPAN&gt;, in &lt;SPAN class=""&gt;contains_verb&lt;/SPAN&gt;&lt;SPAN class=""&gt;(text)&lt;/SPAN&gt; &lt;SPAN&gt;186&lt;/SPAN&gt; &lt;SPAN class=""&gt;if&lt;/SPAN&gt; text&lt;SPAN&gt;.&lt;/SPAN&gt;isupper(): &lt;SPAN&gt;187&lt;/SPAN&gt; text &lt;SPAN&gt;=&lt;/SPAN&gt; text&lt;SPAN&gt;.&lt;/SPAN&gt;lower() &lt;SPAN class=""&gt;--&amp;gt; 189&lt;/SPAN&gt; pos_tags &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN class=""&gt;pos_tag&lt;/SPAN&gt;&lt;SPAN class=""&gt;(&lt;/SPAN&gt;&lt;SPAN class=""&gt;text&lt;/SPAN&gt;&lt;SPAN class=""&gt;)&lt;/SPAN&gt; &lt;SPAN&gt;190&lt;/SPAN&gt; &lt;SPAN class=""&gt;return&lt;/SPAN&gt; &lt;SPAN&gt;any&lt;/SPAN&gt;(tag &lt;SPAN class=""&gt;in&lt;/SPAN&gt; POS_VERB_TAGS &lt;SPAN class=""&gt;for&lt;/SPAN&gt; _, tag &lt;SPAN class=""&gt;in&lt;/SPAN&gt; pos_tags)&lt;/DIV&gt;&lt;DIV class=""&gt;File &lt;SPAN class=""&gt;/local_disk0/.ephemeral_nfs/envs/pythonEnv-2534a48d-f9f7-4daf-934e-2d9e08931bf6/lib/python3.10/site-packages/unstructured/nlp/tokenize.py:55&lt;/SPAN&gt;, in &lt;SPAN class=""&gt;pos_tag&lt;/SPAN&gt;&lt;SPAN class=""&gt;(text)&lt;/SPAN&gt; &lt;SPAN&gt;53&lt;/SPAN&gt; &lt;SPAN class=""&gt;for&lt;/SPAN&gt; sentence &lt;SPAN class=""&gt;in&lt;/SPAN&gt; sentences: &lt;SPAN&gt;54&lt;/SPAN&gt; tokens &lt;SPAN&gt;=&lt;/SPAN&gt; 
_word_tokenize(sentence) &lt;SPAN class=""&gt;---&amp;gt; 55&lt;/SPAN&gt; parts_of_speech&lt;SPAN&gt;.&lt;/SPAN&gt;extend(&lt;SPAN class=""&gt;_pos_tag&lt;/SPAN&gt;&lt;SPAN class=""&gt;(&lt;/SPAN&gt;&lt;SPAN class=""&gt;tokens&lt;/SPAN&gt;&lt;SPAN class=""&gt;)&lt;/SPAN&gt;) &lt;SPAN&gt;56&lt;/SPAN&gt; &lt;SPAN class=""&gt;return&lt;/SPAN&gt; parts_of_speech&lt;/DIV&gt;&lt;DIV class=""&gt;File &lt;SPAN class=""&gt;/local_disk0/.ephemeral_nfs/envs/pythonEnv-2534a48d-f9f7-4daf-934e-2d9e08931bf6/lib/python3.10/site-packages/nltk/tag/__init__.py:165&lt;/SPAN&gt;, in &lt;SPAN class=""&gt;pos_tag&lt;/SPAN&gt;&lt;SPAN class=""&gt;(tokens, tagset, lang)&lt;/SPAN&gt; &lt;SPAN&gt;140&lt;/SPAN&gt; &lt;SPAN class=""&gt;def&lt;/SPAN&gt; &lt;SPAN&gt;pos_tag&lt;/SPAN&gt;(tokens, tagset&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN class=""&gt;None&lt;/SPAN&gt;, lang&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN&gt;eng&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt;): &lt;SPAN&gt;141&lt;/SPAN&gt; &lt;SPAN&gt;"""&lt;/SPAN&gt; &lt;SPAN&gt;142&lt;/SPAN&gt; &lt;SPAN&gt;Use NLTK's currently recommended part of speech tagger to&lt;/SPAN&gt; &lt;SPAN&gt;143&lt;/SPAN&gt; &lt;SPAN&gt;tag the given list of tokens.&lt;/SPAN&gt; &lt;SPAN class=""&gt;(...)&lt;/SPAN&gt; &lt;SPAN&gt;163&lt;/SPAN&gt; &lt;SPAN&gt;:rtype: list(tuple(str, str))&lt;/SPAN&gt; &lt;SPAN&gt;164&lt;/SPAN&gt; &lt;SPAN&gt;"""&lt;/SPAN&gt; &lt;SPAN class=""&gt;--&amp;gt; 165&lt;/SPAN&gt; tagger &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN class=""&gt;_get_tagger&lt;/SPAN&gt;&lt;SPAN class=""&gt;(&lt;/SPAN&gt;&lt;SPAN class=""&gt;lang&lt;/SPAN&gt;&lt;SPAN class=""&gt;)&lt;/SPAN&gt; &lt;SPAN&gt;166&lt;/SPAN&gt; &lt;SPAN class=""&gt;return&lt;/SPAN&gt; _pos_tag(tokens, tagset, tagger, lang)&lt;/DIV&gt;&lt;DIV class=""&gt;File &lt;SPAN 
class=""&gt;/local_disk0/.ephemeral_nfs/envs/pythonEnv-2534a48d-f9f7-4daf-934e-2d9e08931bf6/lib/python3.10/site-packages/nltk/tag/__init__.py:107&lt;/SPAN&gt;, in &lt;SPAN class=""&gt;_get_tagger&lt;/SPAN&gt;&lt;SPAN class=""&gt;(lang)&lt;/SPAN&gt; &lt;SPAN&gt;105&lt;/SPAN&gt; tagger &lt;SPAN&gt;=&lt;/SPAN&gt; PerceptronTagger(lang&lt;SPAN&gt;=&lt;/SPAN&gt;lang) &lt;SPAN&gt;106&lt;/SPAN&gt; &lt;SPAN class=""&gt;else&lt;/SPAN&gt;: &lt;SPAN class=""&gt;--&amp;gt; 107&lt;/SPAN&gt; tagger &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN class=""&gt;PerceptronTagger&lt;/SPAN&gt;&lt;SPAN class=""&gt;(&lt;/SPAN&gt;&lt;SPAN class=""&gt;)&lt;/SPAN&gt; &lt;SPAN&gt;108&lt;/SPAN&gt; &lt;SPAN class=""&gt;return&lt;/SPAN&gt; tagger&lt;/DIV&gt;&lt;DIV class=""&gt;File &lt;SPAN class=""&gt;/local_disk0/.ephemeral_nfs/envs/pythonEnv-2534a48d-f9f7-4daf-934e-2d9e08931bf6/lib/python3.10/site-packages/nltk/tag/perceptron.py:183&lt;/SPAN&gt;, in &lt;SPAN class=""&gt;PerceptronTagger.__init__&lt;/SPAN&gt;&lt;SPAN class=""&gt;(self, load, lang)&lt;/SPAN&gt; &lt;SPAN&gt;181&lt;/SPAN&gt; &lt;SPAN&gt;self&lt;/SPAN&gt;&lt;SPAN&gt;.&lt;/SPAN&gt;classes &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;set&lt;/SPAN&gt;() &lt;SPAN&gt;182&lt;/SPAN&gt; &lt;SPAN class=""&gt;if&lt;/SPAN&gt; load: &lt;SPAN class=""&gt;--&amp;gt; 183&lt;/SPAN&gt; &lt;SPAN class=""&gt;self&lt;/SPAN&gt;&lt;SPAN class=""&gt;.&lt;/SPAN&gt;&lt;SPAN class=""&gt;load_from_json&lt;/SPAN&gt;&lt;SPAN class=""&gt;(&lt;/SPAN&gt;&lt;SPAN class=""&gt;lang&lt;/SPAN&gt;&lt;SPAN class=""&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV class=""&gt;File &lt;SPAN class=""&gt;/local_disk0/.ephemeral_nfs/envs/pythonEnv-2534a48d-f9f7-4daf-934e-2d9e08931bf6/lib/python3.10/site-packages/nltk/tag/perceptron.py:273&lt;/SPAN&gt;, in &lt;SPAN class=""&gt;PerceptronTagger.load_from_json&lt;/SPAN&gt;&lt;SPAN class=""&gt;(self, lang)&lt;/SPAN&gt; &lt;SPAN&gt;271&lt;/SPAN&gt; &lt;SPAN class=""&gt;def&lt;/SPAN&gt; &lt;SPAN&gt;load_from_json&lt;/SPAN&gt;(&lt;SPAN&gt;self&lt;/SPAN&gt;, 
lang&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN&gt;eng&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt;): &lt;SPAN&gt;272&lt;/SPAN&gt; &lt;SPAN&gt;# Automatically find path to the tagger if location is not specified.&lt;/SPAN&gt; &lt;SPAN class=""&gt;--&amp;gt; 273&lt;/SPAN&gt; loc &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN class=""&gt;find&lt;/SPAN&gt;&lt;SPAN class=""&gt;(&lt;/SPAN&gt;&lt;SPAN class=""&gt;f&lt;/SPAN&gt;&lt;SPAN class=""&gt;"&lt;/SPAN&gt;&lt;SPAN class=""&gt;taggers/averaged_perceptron_tagger_&lt;/SPAN&gt;&lt;SPAN&gt;{&lt;/SPAN&gt;&lt;SPAN class=""&gt;lang&lt;/SPAN&gt;&lt;SPAN&gt;}&lt;/SPAN&gt;&lt;SPAN class=""&gt;/&lt;/SPAN&gt;&lt;SPAN class=""&gt;"&lt;/SPAN&gt;&lt;SPAN class=""&gt;)&lt;/SPAN&gt; &lt;SPAN&gt;274&lt;/SPAN&gt; &lt;SPAN class=""&gt;with&lt;/SPAN&gt; &lt;SPAN&gt;open&lt;/SPAN&gt;(loc &lt;SPAN&gt;+&lt;/SPAN&gt; TAGGER_JSONS[lang][&lt;SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN&gt;weights&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt;]) &lt;SPAN class=""&gt;as&lt;/SPAN&gt; fin: &lt;SPAN&gt;275&lt;/SPAN&gt; &lt;SPAN&gt;self&lt;/SPAN&gt;&lt;SPAN&gt;.&lt;/SPAN&gt;model&lt;SPAN&gt;.&lt;/SPAN&gt;weights &lt;SPAN&gt;=&lt;/SPAN&gt; json&lt;SPAN&gt;.&lt;/SPAN&gt;load(fin)&lt;/DIV&gt;&lt;DIV class=""&gt;File &lt;SPAN class=""&gt;/local_disk0/.ephemeral_nfs/envs/pythonEnv-2534a48d-f9f7-4daf-934e-2d9e08931bf6/lib/python3.10/site-packages/nltk/data.py:582&lt;/SPAN&gt;, in &lt;SPAN class=""&gt;find&lt;/SPAN&gt;&lt;SPAN class=""&gt;(resource_name, paths)&lt;/SPAN&gt; &lt;SPAN&gt;580&lt;/SPAN&gt; sep &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN&gt;*&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt; &lt;SPAN&gt;*&lt;/SPAN&gt; &lt;SPAN&gt;70&lt;/SPAN&gt; &lt;SPAN&gt;581&lt;/SPAN&gt; resource_not_found &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;f&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN class=""&gt;\n&lt;/SPAN&gt;&lt;SPAN class=""&gt;{&lt;/SPAN&gt;sep&lt;SPAN 
class=""&gt;}&lt;/SPAN&gt;&lt;SPAN class=""&gt;\n&lt;/SPAN&gt;&lt;SPAN class=""&gt;{&lt;/SPAN&gt;msg&lt;SPAN class=""&gt;}&lt;/SPAN&gt;&lt;SPAN class=""&gt;\n&lt;/SPAN&gt;&lt;SPAN class=""&gt;{&lt;/SPAN&gt;sep&lt;SPAN class=""&gt;}&lt;/SPAN&gt;&lt;SPAN class=""&gt;\n&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt; &lt;SPAN class=""&gt;--&amp;gt; 582&lt;/SPAN&gt; &lt;SPAN class=""&gt;raise&lt;/SPAN&gt; &lt;SPAN class=""&gt;LookupError&lt;/SPAN&gt;(resource_not_found)&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Sun, 11 Aug 2024 18:38:52 GMT</pubDate>
      <guid>https://community.databricks.com/t5/generative-ai/generative-certification-notebook-error/m-p/82673#M346</guid>
      <dc:creator>cleversuresh</dc:creator>
      <dc:date>2024-08-11T18:38:52Z</dc:date>
    </item>
    <item>
      <title>Re: Generative Certification notebook error</title>
      <link>https://community.databricks.com/t5/generative-ai/generative-certification-notebook-error/m-p/82753#M351</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/114928"&gt;@cleversuresh&lt;/a&gt;,&amp;nbsp;&lt;SPAN&gt;You're encountering a LookupError because the averaged_perceptron_tagger_eng resource is missing. &lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;To fix it, run this in a new cell: &lt;/SPAN&gt;&lt;/P&gt;
&lt;LI-CODE lang="python"&gt;import nltk
nltk.download('averaged_perceptron_tagger_eng')&lt;/LI-CODE&gt;
&lt;P&gt;&lt;SPAN&gt;This will download the necessary resources. If you still face issues, ensure your NLTK data directory is set up correctly and accessible.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 12 Aug 2024 14:04:50 GMT</pubDate>
      <guid>https://community.databricks.com/t5/generative-ai/generative-certification-notebook-error/m-p/82753#M351</guid>
      <dc:creator>Retired_mod</dc:creator>
      <dc:date>2024-08-12T14:04:50Z</dc:date>
    </item>
    <item>
      <title>Re: Generative Certification notebook error</title>
      <link>https://community.databricks.com/t5/generative-ai/generative-certification-notebook-error/m-p/95274#M618</link>
      <description>&lt;P&gt;I have the same problem.&lt;/P&gt;&lt;P&gt;It looks like the &lt;SPAN&gt;extract_doc_text() function is no longer available in the imported libraries.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 21 Oct 2024 13:26:50 GMT</pubDate>
      <guid>https://community.databricks.com/t5/generative-ai/generative-certification-notebook-error/m-p/95274#M618</guid>
      <dc:creator>Sven-Rel</dc:creator>
      <dc:date>2024-10-21T13:26:50Z</dc:date>
    </item>
    <item>
      <title>Re: Generative Certification notebook error</title>
      <link>https://community.databricks.com/t5/generative-ai/generative-certification-notebook-error/m-p/97743#M630</link>
      <description>&lt;P&gt;Define all the helper functions from the include folder (_helper_functions.py) directly in your main notebook, then run the following:&lt;/P&gt;
&lt;LI-CODE lang="python"&gt;import nltk
nltk.download('punkt')
nltk.download('wordnet')
nltk.download('omw-1.4')
nltk.download('punkt_tab')
nltk.download('averaged_perceptron_tagger_eng')&lt;/LI-CODE&gt;</description>
      <pubDate>Tue, 05 Nov 2024 12:52:28 GMT</pubDate>
      <guid>https://community.databricks.com/t5/generative-ai/generative-certification-notebook-error/m-p/97743#M630</guid>
      <dc:creator>MohammedArif</dc:creator>
      <dc:date>2024-11-05T12:52:28Z</dc:date>
    </item>
  </channel>
</rss>

