cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

"ai_parse_document()" is not a full OCR engine ? It's not extracting text from high quality image

radha_krishna
Visitor

 I used "ai_parse_document()" to parse a PNG file that contains cat images and text. From the image, I wanted to extract all the cat names, but the response returned nothing. It seems that "ai_parse_document()" does not support rich image extraction. Am i right?

%sql
WITH parsed_documents AS (
    SELECT
      path,
      ai_parse_document(
        content,
        map(
          'imageOutputPath', '/Volumes/vector_search1/00_landing/volume1/',
          'descriptionElementTypes', '*'
        )
      ) AS parsed
    FROM READ_FILES('/Volumes/vector_search1/00_landing/volume1/cat names.png', format => 'binaryFile')
  ),
  parsed_text AS (
    SELECT
      path,
      concat_ws(
        '\n\n',
        transform(
          try_cast(parsed:document:elements AS ARRAY<STRING>),
          element -> try_cast(element:content AS STRING)
        )
      ) AS text
    FROM parsed_documents
    WHERE try_cast(parsed:error_status AS STRING) IS NULL
  )
  SELECT
    path,
    text,
    ai_query(
      'databricks-meta-llama-3-3-70b-instruct',
      concat(
        'Extract the following information from the document  ',
        text
      ),
      returnType => 'STRING'
    ) AS structured_data
  FROM parsed_text
  WHERE text IS NOT NULL;

 

0 REPLIES 0