<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Extracting PDFs and using AI queries  | best practices in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/extracting-pdfs-and-using-ai-queries-best-practices/m-p/131486#M49103</link>
    <description>&lt;P&gt;I am continuing work on the prototype today and will post an update on the second half later today. Thank you for the proactive check.&lt;/P&gt;</description>
    <pubDate>Wed, 10 Sep 2025 06:50:48 GMT</pubDate>
    <dc:creator>ManojkMohan</dc:creator>
    <dc:date>2025-09-10T06:50:48Z</dc:date>
    <item>
      <title>Extracting PDFs and using AI queries  | best practices</title>
      <link>https://community.databricks.com/t5/data-engineering/extracting-pdfs-and-using-ai-queries-best-practices/m-p/131298#M49031</link>
      <description>&lt;P&gt;&lt;STRONG&gt;Problem I am solving:&lt;/STRONG&gt;&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;Upload a PDF → available in /Volumes/&amp;lt;catalog&amp;gt;/&amp;lt;schema&amp;gt;/&amp;lt;volume&amp;gt;/.&lt;/LI&gt;&lt;LI&gt;Extract text with pdfplumber (or OCR if scanned).&lt;/LI&gt;&lt;LI&gt;Store in a Delta table for governance.&lt;/LI&gt;&lt;LI&gt;Parse intelligently using:&lt;OL&gt;&lt;LI&gt;ai_query() with Databricks LLMs for flexible JSON output.&lt;/LI&gt;&lt;/OL&gt;&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Solution Approach:&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="ManojkMohan_0-1757363520515.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/19842iC00B0CB07D9FDE43/image-size/medium?v=v2&amp;amp;px=400" role="button" title="ManojkMohan_0-1757363520515.png" alt="ManojkMohan_0-1757363520515.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Successfully extracted data from the PDF.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="ManojkMohan_1-1757363556810.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/19843iEC69E1FCF8953F31/image-size/medium?v=v2&amp;amp;px=400" role="button" title="ManojkMohan_1-1757363556810.png" alt="ManojkMohan_1-1757363556810.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Successfully saved to a Delta table.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="ManojkMohan_2-1757363593667.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/19844i48F8CD3492F02CE7/image-size/medium?v=v2&amp;amp;px=400" role="button" title="ManojkMohan_2-1757363593667.png" alt="ManojkMohan_2-1757363593667.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;When I use ai_query with the code 
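below&lt;/P&gt;&lt;P&gt;from pyspark.sql.functions import expr&lt;/P&gt;&lt;P&gt;# --- Configuration ---&lt;BR /&gt;source_table = "pdf.default.raw_pdf_text"&lt;BR /&gt;target_table = "pdf.default.parsed_pdf_data"&lt;BR /&gt;# Using a powerful model available on Databricks for this task&lt;BR /&gt;model_name = 'databricks-meta-llama-3-70b-instruct'&lt;/P&gt;&lt;P&gt;# --- Main Logic ---&lt;BR /&gt;try:&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;# 1. Read the raw text data from the source table&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print(f"Reading data from source table: {source_table}")&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;df_raw = spark.table(source_table)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;# 2. Use ai_query to extract structured information&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;# NOTE: The prompt has been adapted to extract information relevant to a phone bill.&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print(f"Applying ai_query with model '{model_name}' to extract structured data...")&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;df_structured = df_raw.withColumn(&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;"structured_info",&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;expr(&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;f"""&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;ai_query(&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;'{model_name}',&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;concat(&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;'From the following text of a phone bill, extract these fields as a valid JSON object: customer_name, invoice_number, billing_date, due_date, and total_amount. ',&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;'If a field is not present, use null. Ensure total_amount is a numeric value without currency symbols.\\n',&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;content&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;)&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;)&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;"""&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;)&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;# 3. Save the new DataFrame with structured data to a target table&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print(f"Saving structured data to target table: {target_table}")&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;df_structured.write.mode("overwrite").saveAsTable(target_table)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print(f"\nSuccessfully extracted structured data and saved it to the table: {target_table}")&lt;BR /&gt;&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;# Optional: Display the results to verify&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print("\n--- Preview of the structured data ---")&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;spark.table(target_table).select("filename", "structured_info").show(truncate=False)&lt;/P&gt;&lt;P&gt;except Exception as e:&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print(f"An unexpected error occurred: {e}")&lt;/P&gt;&lt;P&gt;I am running into the following error:&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Reading data from source table: pdf.default.raw_pdf_text&lt;BR /&gt;Applying ai_query with model 'databricks-meta-llama-3-70b-instruct' to extract structured data...&lt;BR /&gt;Saving structured data to target table: pdf.default.parsed_pdf_data&lt;BR /&gt;An unexpected error occurred: [REMOTE_FUNCTION_HTTP_FAILED_ERROR] The remote HTTP request failed with code 404, and error message 'HTTP request failed with status: {"error_code":"RESOURCE_DOES_NOT_EXIST","message":"Endpoint with name \'databricks-meta-llama-3-70b-instruct\' does not exist.","details":[{"@type":"type.googleapis.com/google.rpc.RequestInfo","request_id":"2110c93f-da76-4315-ad9e-482b1a96d65d","serving_data":""}]}' SQLSTATE: 57012 JVM stacktrace: org.apache.spark.SparkException at org.apache.spark.sql.errors.QueryExecutionErrors$.remoteHttpFailedError(QueryExecutionErrors.scala:2059) at com.databricks.sql.util.RemoteHttpClient$.postRawResponse(RemoteHttpClient.scala:124) at com.databricks.sql.catalyst.expressions.ai.AIFunctionsUtils$.getEndpointAPI(AIFunctionsUtils.scala:780) at com.databricks.sql.catalyst.expressions.ai.AIFunctionsUtils$.getModelServingEndpoint(AIFunctionsUtils.scala:805) at com.databricks.sql.catalyst.expressions.ai.UnresolvedAIQuery.modelProvider$lzycompute(AIQuery.scala:1111) at 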
com.databricks.sql.catalyst.expressions.ai.UnresolvedAIQuery.modelProvider(AIQuery.scala:1099) at com.databricks.sql.catalyst.expressions.ai.AIQueryBase.endpointUrl(AIQuery.scala:270) at com.databricks.sql.catalyst.expressions.ai.AIQueryBase.endpointUrl$(AIQuery.scala:270) at com.databricks.sql.catalyst.expressions.ai.UnresolvedAIQuery.endpointUrl$lzycompute(AIQuery.scala:1084) at com.databricks.sql.catalyst.expressions.ai.UnresolvedAIQuery.endpointUrl(AIQuery.scala:1084) at com.databricks.sql.catalyst.expressions.ai.AIQueryBase.checkInputDataTypes(AIQuery.scala:325) at com.databricks.sql.catalyst.expressions.ai.AIQueryBase.checkInputDataTypes$(AIQuery.scala:282) at com.databricks.sql.catalyst.expressions.ai.UnresolvedAIQuery.checkInputDataTypes(AIQuery.scala:1084) at org.apache.spark.sql.catalyst.expressions.Expression.resolved$lzycompute(Expression.scala:336) at org.apache.spark.sql.catalyst.expressions.Expression.resolved(Expression.scala:336) at org.apache.spark.sql.catalyst.expressions.Expression.$anonfun$childrenResolved$1(Expression.scala:348) at org.apache.spark.sql.catalyst.expressions.Expression.$anonfun$childrenResolved$1$adapted(Expression.scala:348) at scala.collection.IterableOnceOps.forall(IterableOnce.scala:633) at scala.collection.IterableOnceOps.forall$(IterableOnce.scala:630) at scala.collection.AbstractIterable.forall(Iterable.scala:935) at org.apache.spark.sql.catalyst.expressions.Expression.childrenResolved(Expression.scala:348) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$20$$anonfun$applyOrElse$182.applyOrElse(Analyzer.scala:3054) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$20$$anonfun$applyOrElse$182.applyOrElse(Analyzer.scala:3036) at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformUpWithPruning$4(TreeNode.scala:586) at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(origin.scala:121) at 
org.apache.spark.sql.catalyst.trees.TreeNode.transformUpWithPruning(TreeNode.scala:586) at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$transformExpressionsUpWithPruning$1(QueryPlan.scala:220) at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$1(QueryPlan.scala:232) at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(origin.scala:121) at org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpression$1(QueryPlan.scala:232) at org.apache.spark.sql.catalyst.plans.QueryPlan.recursiveTransform$1(QueryPlan.scala:244) at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$4(QueryPlan.scala:250) at scala.collection.immutable.List.map(List.scala:251) at scala.collection.immutable.List.map(List.scala:79) at org.apache.spark.sql.catalyst.plans.QueryPlan.recursiveTransform$1(QueryPlan.scala:250) at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$5(QueryPlan.scala:255) at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:371) at org.apache.spark.sql.catalyst.plans.QueryPlan.mapExpressions(QueryPlan.scala:255) at org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpressionsUpWithPruning(QueryPlan.scala:220) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$20.applyOrElse(Analyzer.scala:3036) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$20.applyOrElse(Analyzer.scala:2868) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$3(AnalysisHelper.scala:141) at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(origin.scala:121) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$1(AnalysisHelper.scala:141) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:418) at 
org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning(AnalysisHelper.scala:137) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning$(AnalysisHelper.scala:133) at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUpWithPruning(LogicalPlan.scala:42) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$.apply(Analyzer.scala:2868) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$.apply(Analyzer.scala:2863) at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$17(RuleExecutor.scala:503) at org.apache.spark.sql.catalyst.rules.RecoverableRuleExecutionHelper.processRule(RuleExecutor.scala:657) at org.apache.spark.sql.catalyst.rules.RecoverableRuleExecutionHelper.processRule$(RuleExecutor.scala:641) at org.apache.spark.sql.catalyst.rules.RuleExecutor.processRule(RuleExecutor.scala:137) at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$16(RuleExecutor.scala:503) at com.databricks.spark.util.MemoryTracker$.withThreadAllocatedBytes(MemoryTracker.scala:51) at org.apache.spark.sql.catalyst.QueryPlanningTracker$.measureRule(QueryPlanningTracker.scala:338) at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$15(RuleExecutor.scala:501) at com.databricks.spark.util.FrameProfiler$.$anonfun$record$1(FrameProfiler.scala:113) at com.databricks.spark.util.FrameProfilerExporter$.maybeExportFrameProfiler(FrameProfilerExporter.scala:114) at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:104) at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$14(RuleExecutor.scala:500) at scala.collection.LinearSeqOps.foldLeft(LinearSeq.scala:183) at scala.collection.LinearSeqOps.foldLeft$(LinearSeq.scala:179) at scala.collection.immutable.List.foldLeft(List.scala:79) at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$13(RuleExecutor.scala:492) at 
scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18) at com.databricks.spark.util.FrameProfiler$.$anonfun$record$1(FrameProfiler.scala:113) at com.databricks.spark.util.FrameProfilerExporter$.maybeExportFrameProfiler(FrameProfilerExporter.scala:114) at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:104) at org.apache.spark.sql.catalyst.rules.RuleExecutor.executeBatch$1(RuleExecutor.scala:466) at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$23(RuleExecutor.scala:613) at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$23$adapted(RuleExecutor.scala:613) at scala.collection.immutable.List.foreach(List.scala:334) at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1(RuleExecutor.scala:613) at com.databricks.spark.util.FrameProfiler$.$anonfun$record$1(FrameProfiler.scala:113) at com.databricks.spark.util.FrameProfilerExporter$.maybeExportFrameProfiler(FrameProfilerExporter.scala:114) at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:104) at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:359) at org.apache.spark.sql.catalyst.analysis.Analyzer.super$execute(Analyzer.scala:577) at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$executeSameContext$1(Analyzer.scala:577) at com.databricks.sql.unity.SAMSnapshotHelper$.visitPlansDuringAnalysis(SAMSnapshotHelper.scala:41) at org.apache.spark.sql.catalyst.analysis.Analyzer.executeSameContext(Analyzer.scala:576) at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$execute$1(Analyzer.scala:568) at org.apache.spark.sql.catalyst.analysis.AnalysisContext$.withNewAnalysisContext(Analyzer.scala:402) at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:568) at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:499) at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$executeAndTrack$1(RuleExecutor.scala:347) at 
org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:253) at org.apache.spark.sql.catalyst.rules.RuleExecutor.executeAndTrack(RuleExecutor.scala:347) at org.apache.spark.sql.catalyst.analysis.resolver.HybridAnalyzer.resolveInFixedPoint(HybridAnalyzer.scala:388) at org.apache.spark.sql.catalyst.analysis.resolver.HybridAnalyzer.$anonfun$apply$1(HybridAnalyzer.scala:98) at org.apache.spark.sql.catalyst.analysis.resolver.HybridAnalyzer.withTrackedAnalyzerBridgeState(HybridAnalyzer.scala:135) at org.apache.spark.sql.catalyst.analysis.resolver.HybridAnalyzer.apply(HybridAnalyzer.scala:91) at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$executeAndCheck$2(Analyzer.scala:555) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.markInAnalyzer(AnalysisHelper.scala:425) at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$executeAndCheck$1(Analyzer.scala:555) at com.databricks.sql.unity.SAMSnapshotHelper$.visitPlansDuringAnalysis(SAMSnapshotHelper.scala:41) at org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:548) at org.apache.spark.sql.execution.QueryExecution.$anonfun$lazyAnalyzed$3(QueryExecution.scala:327) at com.databricks.spark.util.FrameProfiler$.$anonfun$record$1(FrameProfiler.scala:113) at com.databricks.spark.util.FrameProfilerExporter$.maybeExportFrameProfiler(FrameProfilerExporter.scala:114) at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:104) at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:655) at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$7(QueryExecution.scala:810) at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withExecutionPhase$1(SQLExecution.scala:155) at com.databricks.logging.AttributionContext$.$anonfun$withValue$1(AttributionContext.scala:291) at scala.util.DynamicVariable.withValue(DynamicVariable.scala:59) at 
com.databricks.logging.AttributionContext$.withValue(AttributionContext.scala:287) at com.databricks.util.TracingSpanUtils$.$anonfun$withTracing$4(TracingSpanUtils.scala:235) at com.databricks.util.TracingSpanUtils$.withTracing(TracingSpanUtils.scala:129) at com.databricks.util.TracingSpanUtils$.withTracing(TracingSpanUtils.scala:233) at com.databricks.tracing.TracingUtils$.withTracing(TracingUtils.scala:296) at com.databricks.spark.util.DatabricksTracingHelper.withSpan(DatabricksSparkTracingHelper.scala:115) at com.databricks.spark.util.DBRTracing$.withSpan(DBRTracing.scala:47) at org.apache.spark.sql.execution.SQLExecution$.withExecutionPhase(SQLExecution.scala:136) at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$6(QueryExecution.scala:810) at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:1449) at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$4(QueryExecution.scala:803) at com.databricks.util.LexicalThreadLocal$Handle.runWith(LexicalThreadLocal.scala:63) at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$3(QueryExecution.scala:800) at com.databricks.util.LexicalThreadLocal$Handle.runWith(LexicalThreadLocal.scala:63) at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$2(QueryExecution.scala:800) at org.apache.spark.sql.execution.QueryExecution.withQueryExecutionId(QueryExecution.scala:789) at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:799) at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:860) at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:798) at org.apache.spark.sql.execution.QueryExecution.$anonfun$lazyAnalyzed$2(QueryExecution.scala:309) at com.databricks.sql.util.MemoryTrackerHelper.withMemoryTracking(MemoryTrackerHelper.scala:111) at org.apache.spark.sql.execution.QueryExecution.$anonfun$lazyAnalyzed$1(QueryExecution.scala:308) at 
scala.util.Try$.apply(Try.scala:217) at org.apache.spark.util.Utils$.doTryWithCallerStacktrace(Utils.scala:1686) at org.apache.spark.util.Utils$.getTryWithCallerStacktrace(Utils.scala:1747) at org.apache.spark.util.LazyTry.get(LazyTry.scala:75) at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:369) at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:288) at org.apache.spark.sql.classic.Dataset$.$anonfun$ofRows$13(Dataset.scala:258) at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:860) at org.apache.spark.sql.classic.SparkSession.$anonfun$withActiveAndFrameProfiler$1(SparkSession.scala:1071) at com.databricks.spark.util.FrameProfiler$.$anonfun$record$1(FrameProfiler.scala:113) at com.databricks.spark.util.FrameProfilerExporter$.maybeExportFrameProfiler(FrameProfilerExporter.scala:114) at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:104) at org.apache.spark.sql.classic.SparkSession.withActiveAndFrameProfiler(SparkSession.scala:1071) at org.apache.spark.sql.classic.Dataset$.ofRows(Dataset.scala:254) at org.apache.spark.sql.connect.planner.SparkConnectPlanner.handleWriteOperation(SparkConnectPlanner.scala:3721) at org.apache.spark.sql.connect.planner.SparkConnectPlanner.process(SparkConnectPlanner.scala:3199) at org.apache.spark.sql.connect.execution.ExecuteThreadRunner.handleCommand(ExecuteThreadRunner.scala:385) at org.apache.spark.sql.connect.execution.ExecuteThreadRunner.$anonfun$executeInternal$1(ExecuteThreadRunner.scala:282) at org.apache.spark.sql.connect.execution.ExecuteThreadRunner.$anonfun$executeInternal$1$adapted(ExecuteThreadRunner.scala:238) at org.apache.spark.sql.connect.service.SessionHolder.$anonfun$withSession$2(SessionHolder.scala:466) at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:860) at org.apache.spark.sql.connect.service.SessionHolder.$anonfun$withSession$1(SessionHolder.scala:466) at 
org.apache.spark.JobArtifactSet$.withActiveJobArtifactState(JobArtifactSet.scala:97) at org.apache.spark.sql.artifact.ArtifactManager.$anonfun$withResources$1(ArtifactManager.scala:121) at org.apache.spark.sql.artifact.ArtifactManager.withClassLoaderIfNeeded(ArtifactManager.scala:115) at org.apache.spark.sql.artifact.ArtifactManager.withResources(ArtifactManager.scala:120) at org.apache.spark.sql.connect.service.SessionHolder.withSession(SessionHolder.scala:465) at org.apache.spark.sql.connect.execution.ExecuteThreadRunner.executeInternal(ExecuteThreadRunner.scala:238) at org.apache.spark.sql.connect.execution.ExecuteThreadRunner.$anonfun$execute$1(ExecuteThreadRunner.scala:141) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18) at com.databricks.spark.connect.service.UtilizationMetrics.recordActiveQueries(UtilizationMetrics.scala:43) at com.databricks.spark.connect.service.UtilizationMetrics.recordActiveQueries$(UtilizationMetrics.scala:40) at org.apache.spark.sql.connect.execution.ExecuteThreadRunner.recordActiveQueries(ExecuteThreadRunner.scala:53) at org.apache.spark.sql.connect.execution.ExecuteThreadRunner.org$apache$spark$sql$connect$execution$ExecuteThreadRunner$$execute(ExecuteThreadRunner.scala:139) at org.apache.spark.sql.connect.execution.ExecuteThreadRunner$ExecutionThread.$anonfun$run$2(ExecuteThreadRunner.scala:586) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18) at com.databricks.unity.UCSEphemeralState$Handle.runWith(UCSEphemeralState.scala:51) at com.databricks.unity.HandleImpl.runWith(UCSHandle.scala:104) at com.databricks.unity.HandleImpl.$anonfun$runWithAndClose$1(UCSHandle.scala:109) at scala.util.Using$.resource(Using.scala:296) at com.databricks.unity.HandleImpl.runWithAndClose(UCSHandle.scala:108) at org.apache.spark.sql.connect.execution.ExecuteThreadRunner$ExecutionThread.run(ExecuteThreadRunner.scala:586) Caused by: com.databricks.sql.util.UnexpectedHttpStatus: HTTP request failed with 
status: {"error_code":"RESOURCE_DOES_NOT_EXIST","message":"Endpoint with name 'databricks-meta-llama-3-70b-instruct' does not exist.","details":[{"@type":"type.googleapis.com/google.rpc.RequestInfo","request_id":"2110c93f-da76-4315-ad9e-482b1a96d65d","serving_data":""}]} at com.databricks.sql.util.RemoteHttpClient.$anonfun$sendRequestInternal$6(RemoteHttpClient.scala:411) at com.databricks.sql.util.RemoteHttpClient.$anonfun$sendRequestInternal$6$adapted(RemoteHttpClient.scala:395) at com.databricks.sql.util.RetryUtils$.$anonfun$runWithExponentialBackoffRetryWithCount$1(RetryUtils.scala:182) at com.databricks.sql.util.RetryUtils$.$anonfun$runWithExponentialBackoffRetryWithCount$1$adapted(RetryUtils.scala:181) at com.databricks.sql.util.RetryWorkerImpl.$anonfun$runWithExponentialBackoffRetry$2(RetryUtils.scala:90) at com.databricks.sql.util.RetryWorkerImpl.$anonfun$runWithExponentialBackoffRetry$2$adapted(RetryUtils.scala:89) at com.databricks.sql.util.RetryWorkerImpl.runWithExponentialBackoffRetryInternal(RetryUtils.scala:103) at com.databricks.sql.util.RetryWorkerImpl.runWithExponentialBackoffRetry(RetryUtils.scala:89) at com.databricks.sql.util.RetryUtils$.runWithExponentialBackoffRetryWithCount(RetryUtils.scala:181) at com.databricks.sql.util.RemoteHttpClient.sendRequestInternal(RemoteHttpClient.scala:395) at com.databricks.sql.util.RemoteHttpClient.sendRequest(RemoteHttpClient.scala:325) at com.databricks.sql.util.RemoteHttpClient$.postRawResponse(RemoteHttpClient.scala:112) at com.databricks.sql.catalyst.expressions.ai.AIFunctionsUtils$.getEndpointAPI(AIFunctionsUtils.scala:780) at com.databricks.sql.catalyst.expressions.ai.AIFunctionsUtils$.getModelServingEndpoint(AIFunctionsUtils.scala:805) at com.databricks.sql.catalyst.expressions.ai.UnresolvedAIQuery.modelProvider$lzycompute(AIQuery.scala:1111) at com.databricks.sql.catalyst.expressions.ai.UnresolvedAIQuery.modelProvider(AIQuery.scala:1099) at 
com.databricks.sql.catalyst.expressions.ai.AIQueryBase.endpointUrl(AIQuery.scala:270) at com.databricks.sql.catalyst.expressions.ai.AIQueryBase.endpointUrl$(AIQuery.scala:270) at com.databricks.sql.catalyst.expressions.ai.UnresolvedAIQuery.endpointUrl$lzycompute(AIQuery.scala:1084) at com.databricks.sql.catalyst.expressions.ai.UnresolvedAIQuery.endpointUrl(AIQuery.scala:1084) at com.databricks.sql.catalyst.expressions.ai.AIQueryBase.checkInputDataTypes(AIQuery.scala:325) at com.databricks.sql.catalyst.expressions.ai.AIQueryBase.checkInputDataTypes$(AIQuery.scala:282) at com.databricks.sql.catalyst.expressions.ai.UnresolvedAIQuery.checkInputDataTypes(AIQuery.scala:1084) at org.apache.spark.sql.catalyst.expressions.Expression.resolved$lzycompute(Expression.scala:336) at org.apache.spark.sql.catalyst.expressions.Expression.resolved(Expression.scala:336) at org.apache.spark.sql.catalyst.expressions.Expression.$anonfun$childrenResolved$1(Expression.scala:348) at org.apache.spark.sql.catalyst.expressions.Expression.$anonfun$childrenResolved$1$adapted(Expression.scala:348) at scala.collection.IterableOnceOps.forall(IterableOnce.scala:633) at scala.collection.IterableOnceOps.forall$(IterableOnce.scala:630) at scala.collection.AbstractIterable.forall(Iterable.scala:935) at org.apache.spark.sql.catalyst.expressions.Expression.childrenResolved(Expression.scala:348) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$20$$anonfun$applyOrElse$182.applyOrElse(Analyzer.scala:3054) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$20$$anonfun$applyOrElse$182.applyOrElse(Analyzer.scala:3036) at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformUpWithPruning$4(TreeNode.scala:586) at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(origin.scala:121) at org.apache.spark.sql.catalyst.trees.TreeNode.transformUpWithPruning(TreeNode.scala:586) at 
org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$transformExpressionsUpWithPruning$1(QueryPlan.scala:220) at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$1(QueryPlan.scala:232) at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(origin.scala:121) at org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpression$1(QueryPlan.scala:232) at org.apache.spark.sql.catalyst.plans.QueryPlan.recursiveTransform$1(QueryPlan.scala:244) at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$4(QueryPlan.scala:250) at scala.collection.immutable.List.map(List.scala:251) at scala.collection.immutable.List.map(List.scala:79) at org.apache.spark.sql.catalyst.plans.QueryPlan.recursiveTransform$1(QueryPlan.scala:250) at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$5(QueryPlan.scala:255) at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:371) at org.apache.spark.sql.catalyst.plans.QueryPlan.mapExpressions(QueryPlan.scala:255) at org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpressionsUpWithPruning(QueryPlan.scala:220) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$20.applyOrElse(Analyzer.scala:3036) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$20.applyOrElse(Analyzer.scala:2868) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$3(AnalysisHelper.scala:141) at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(origin.scala:121) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$1(AnalysisHelper.scala:141) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:418) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning(AnalysisHelper.scala:137) at 
org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning$(AnalysisHelper.scala:133) at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUpWithPruning(LogicalPlan.scala:42) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$.apply(Analyzer.scala:2868) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$.apply(Analyzer.scala:2863) at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$17(RuleExecutor.scala:503) at org.apache.spark.sql.catalyst.rules.RecoverableRuleExecutionHelper.processRule(RuleExecutor.scala:657) at org.apache.spark.sql.catalyst.rules.RecoverableRuleExecutionHelper.processRule$(RuleExecutor.scala:641) at org.apache.spark.sql.catalyst.rules.RuleExecutor.processRule(RuleExecutor.scala:137) at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$16(RuleExecutor.scala:503) at com.databricks.spark.util.MemoryTracker$.withThreadAllocatedBytes(MemoryTracker.scala:51) at org.apache.spark.sql.catalyst.QueryPlanningTracker$.measureRule(QueryPlanningTracker.scala:338) at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$15(RuleExecutor.scala:501) at com.databricks.spark.util.FrameProfiler$.$anonfun$record$1(FrameProfiler.scala:113) at com.databricks.spark.util.FrameProfilerExporter$.maybeExportFrameProfiler(FrameProfilerExporter.scala:114) at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:104) at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$14(RuleExecutor.scala:500) at scala.collection.LinearSeqOps.foldLeft(LinearSeq.scala:183) at scala.collection.LinearSeqOps.foldLeft$(LinearSeq.scala:179) at scala.collection.immutable.List.foldLeft(List.scala:79) at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$13(RuleExecutor.scala:492) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18) at 
com.databricks.spark.util.FrameProfiler$.$anonfun$record$1(FrameProfiler.scala:113) at com.databricks.spark.util.FrameProfilerExporter$.maybeExportFrameProfiler(FrameProfilerExporter.scala:114) at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:104) at org.apache.spark.sql.catalyst.rules.RuleExecutor.executeBatch$1(RuleExecutor.scala:466) at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$23(RuleExecutor.scala:613) at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$23$adapted(RuleExecutor.scala:613) at scala.collection.immutable.List.foreach(List.scala:334) at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1(RuleExecutor.scala:613) at com.databricks.spark.util.FrameProfiler$.$anonfun$record$1(FrameProfiler.scala:113) at com.databricks.spark.util.FrameProfilerExporter$.maybeExportFrameProfiler(FrameProfilerExporter.scala:114) at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:104) at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:359) at org.apache.spark.sql.catalyst.analysis.Analyzer.super$execute(Analyzer.scala:577) at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$executeSameContext$1(Analyzer.scala:577) at com.databricks.sql.unity.SAMSnapshotHelper$.visitPlansDuringAnalysis(SAMSnapshotHelper.scala:41) at org.apache.spark.sql.catalyst.analysis.Analyzer.executeSameContext(Analyzer.scala:576) at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$execute$1(Analyzer.scala:568) at org.apache.spark.sql.catalyst.analysis.AnalysisContext$.withNewAnalysisContext(Analyzer.scala:402) at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:568) at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:499) at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$executeAndTrack$1(RuleExecutor.scala:347) at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:253) 
at org.apache.spark.sql.catalyst.rules.RuleExecutor.executeAndTrack(RuleExecutor.scala:347) at org.apache.spark.sql.catalyst.analysis.resolver.HybridAnalyzer.resolveInFixedPoint(HybridAnalyzer.scala:388) at org.apache.spark.sql.catalyst.analysis.resolver.HybridAnalyzer.$anonfun$apply$1(HybridAnalyzer.scala:98) at org.apache.spark.sql.catalyst.analysis.resolver.HybridAnalyzer.withTrackedAnalyzerBridgeState(HybridAnalyzer.scala:135) at org.apache.spark.sql.catalyst.analysis.resolver.HybridAnalyzer.apply(HybridAnalyzer.scala:91) at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$executeAndCheck$2(Analyzer.scala:555) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.markInAnalyzer(AnalysisHelper.scala:425) at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$executeAndCheck$1(Analyzer.scala:555) at com.databricks.sql.unity.SAMSnapshotHelper$.visitPlansDuringAnalysis(SAMSnapshotHelper.scala:41) at org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:548) at org.apache.spark.sql.execution.QueryExecution.$anonfun$lazyAnalyzed$3(QueryExecution.scala:327) at com.databricks.spark.util.FrameProfiler$.$anonfun$record$1(FrameProfiler.scala:113) at com.databricks.spark.util.FrameProfilerExporter$.maybeExportFrameProfiler(FrameProfilerExporter.scala:114) at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:104) at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:655) at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$7(QueryExecution.scala:810) at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withExecutionPhase$1(SQLExecution.scala:155) at com.databricks.logging.AttributionContext$.$anonfun$withValue$1(AttributionContext.scala:291) at scala.util.DynamicVariable.withValue(DynamicVariable.scala:59) at com.databricks.logging.AttributionContext$.withValue(AttributionContext.scala:287) at 
com.databricks.util.TracingSpanUtils$.$anonfun$withTracing$4(TracingSpanUtils.scala:235) at com.databricks.util.TracingSpanUtils$.withTracing(TracingSpanUtils.scala:129) at com.databricks.util.TracingSpanUtils$.withTracing(TracingSpanUtils.scala:233) at com.databricks.tracing.TracingUtils$.withTracing(TracingUtils.scala:296) at com.databricks.spark.util.DatabricksTracingHelper.withSpan(DatabricksSparkTracingHelper.scala:115) at com.databricks.spark.util.DBRTracing$.withSpan(DBRTracing.scala:47) at org.apache.spark.sql.execution.SQLExecution$.withExecutionPhase(SQLExecution.scala:136) at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$6(QueryExecution.scala:810) at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:1449) at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$4(QueryExecution.scala:803) at com.databricks.util.LexicalThreadLocal$Handle.runWith(LexicalThreadLocal.scala:63) at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$3(QueryExecution.scala:800) at com.databricks.util.LexicalThreadLocal$Handle.runWith(LexicalThreadLocal.scala:63) at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$2(QueryExecution.scala:800) at org.apache.spark.sql.execution.QueryExecution.withQueryExecutionId(QueryExecution.scala:789) at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:799) at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:860) at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:798) at org.apache.spark.sql.execution.QueryExecution.$anonfun$lazyAnalyzed$2(QueryExecution.scala:309) at com.databricks.sql.util.MemoryTrackerHelper.withMemoryTracking(MemoryTrackerHelper.scala:111) at org.apache.spark.sql.execution.QueryExecution.$anonfun$lazyAnalyzed$1(QueryExecution.scala:308) at scala.util.Try$.apply(Try.scala:217) at 
org.apache.spark.util.Utils$.doTryWithCallerStacktrace(Utils.scala:1686) at org.apache.spark.util.LazyTry.tryT$lzycompute(LazyTry.scala:60) at org.apache.spark.util.LazyTry.tryT(LazyTry.scala:59) at org.apache.spark.util.LazyTry.get(LazyTry.scala:75) at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:369) at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:288) at org.apache.spark.sql.classic.Dataset$.$anonfun$ofRows$13(Dataset.scala:258) at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:860) at org.apache.spark.sql.classic.SparkSession.$anonfun$withActiveAndFrameProfiler$1(SparkSession.scala:1071) at com.databricks.spark.util.FrameProfiler$.$anonfun$record$1(FrameProfiler.scala:113) at com.databricks.spark.util.FrameProfilerExporter$.maybeExportFrameProfiler(FrameProfilerExporter.scala:114) at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:104) at org.apache.spark.sql.classic.SparkSession.withActiveAndFrameProfiler(SparkSession.scala:1071) at org.apache.spark.sql.classic.Dataset$.ofRows(Dataset.scala:254) at org.apache.spark.sql.connect.planner.SparkConnectPlanner.handleWriteOperation(SparkConnectPlanner.scala:3721) at org.apache.spark.sql.connect.planner.SparkConnectPlanner.process(SparkConnectPlanner.scala:3199) at org.apache.spark.sql.connect.execution.ExecuteThreadRunner.handleCommand(ExecuteThreadRunner.scala:385) at org.apache.spark.sql.connect.execution.ExecuteThreadRunner.$anonfun$executeInternal$1(ExecuteThreadRunner.scala:282) at org.apache.spark.sql.connect.execution.ExecuteThreadRunner.$anonfun$executeInternal$1$adapted(ExecuteThreadRunner.scala:238) at org.apache.spark.sql.connect.service.SessionHolder.$anonfun$withSession$2(SessionHolder.scala:466) at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:860) at org.apache.spark.sql.connect.service.SessionHolder.$anonfun$withSession$1(SessionHolder.scala:466) at 
org.apache.spark.JobArtifactSet$.withActiveJobArtifactState(JobArtifactSet.scala:97) at org.apache.spark.sql.artifact.ArtifactManager.$anonfun$withResources$1(ArtifactManager.scala:121) at org.apache.spark.sql.artifact.ArtifactManager.withClassLoaderIfNeeded(ArtifactManager.scala:115) at org.apache.spark.sql.artifact.ArtifactManager.withResources(ArtifactManager.scala:120) at org.apache.spark.sql.connect.service.SessionHolder.withSession(SessionHolder.scala:465) at org.apache.spark.sql.connect.execution.ExecuteThreadRunner.executeInternal(ExecuteThreadRunner.scala:238) at org.apache.spark.sql.connect.execution.ExecuteThreadRunner.$anonfun$execute$1(ExecuteThreadRunner.scala:141) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18) at com.databricks.spark.connect.service.UtilizationMetrics.recordActiveQueries(UtilizationMetrics.scala:43) at com.databricks.spark.connect.service.UtilizationMetrics.recordActiveQueries$(UtilizationMetrics.scala:40) at org.apache.spark.sql.connect.execution.ExecuteThreadRunner.recordActiveQueries(ExecuteThreadRunner.scala:53) at org.apache.spark.sql.connect.execution.ExecuteThreadRunner.org$apache$spark$sql$connect$execution$ExecuteThreadRunner$$execute(ExecuteThreadRunner.scala:139) at org.apache.spark.sql.connect.execution.ExecuteThreadRunner$ExecutionThread.$anonfun$run$2(ExecuteThreadRunner.scala:586) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18) at com.databricks.unity.UCSEphemeralState$Handle.runWith(UCSEphemeralState.scala:51) at com.databricks.unity.HandleImpl.runWith(UCSHandle.scala:104) at com.databricks.unity.HandleImpl.$anonfun$runWithAndClose$1(UCSHandle.scala:109) at scala.util.Using$.resource(Using.scala:296) at com.databricks.unity.HandleImpl.runWithAndClose(UCSHandle.scala:108) at org.apache.spark.sql.connect.execution.ExecuteThreadRunner$ExecutionThread.run(ExecuteThreadRunner.scala:586)&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;tagging my usual commenters for best practices 
advice&amp;nbsp;&amp;nbsp;&lt;A href="https://community.databricks.com/t5/user/viewprofilepage/user-id/171339" target="_blank" rel="noopener"&gt;@TheOC&lt;/A&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;A href="https://community.databricks.com/t5/user/viewprofilepage/user-id/110502" target="_blank" rel="noopener"&gt;@szymon_dybczak&lt;/A&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;A href="https://community.databricks.com/t5/user/viewprofilepage/user-id/146924" target="_blank" rel="noopener"&gt;@BS_THE_ANALYST&lt;/A&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;A href="https://community.databricks.com/t5/user/viewprofilepage/user-id/179536" target="_blank" rel="noopener"&gt;@Coffee77&lt;/A&gt;&amp;nbsp;&lt;STRONG&gt;&lt;A title="View profile" href="https://community.databricks.com/t5/user/viewprofilepage/user-id/179612" target="_blank" rel="author noopener"&gt;&lt;SPAN class=""&gt;WiliamRosa&lt;/SPAN&gt;&lt;/A&gt;&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/180185"&gt;@ck7007&lt;/a&gt;&amp;nbsp;&lt;/STRONG&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 08 Sep 2025 20:47:27 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/extracting-pdfs-and-using-ai-queries-best-practices/m-p/131298#M49031</guid>
      <dc:creator>ManojkMohan</dc:creator>
      <dc:date>2025-09-08T20:47:27Z</dc:date>
    </item>
    <item>
      <title>Re: Extracting PDFs and using AI queries  | best practices</title>
      <link>https://community.databricks.com/t5/data-engineering/extracting-pdfs-and-using-ai-queries-best-practices/m-p/131335#M49049</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/155141"&gt;@ManojkMohan&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;&lt;P&gt;Maybe you're using the wrong endpoint name. Try&amp;nbsp;&lt;STRONG&gt;databricks-meta-llama-3-3-70b-instruct&lt;BR /&gt;&lt;/STRONG&gt;&lt;STRONG&gt;&lt;BR /&gt;&lt;/STRONG&gt;In your case, you're trying to call an endpoint with the following name:&amp;nbsp;&lt;SPAN&gt;databricks-meta-llama-3-70b-instruct, which I guess has a small typo (3-70b instead of 3-3-70b).&lt;BR /&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 09 Sep 2025 06:44:45 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/extracting-pdfs-and-using-ai-queries-best-practices/m-p/131335#M49049</guid>
      <dc:creator>szymon_dybczak</dc:creator>
      <dc:date>2025-09-09T06:44:45Z</dc:date>
    </item>
    <item>
      <title>Re: Extracting PDFs and using AI queries  | best practices</title>
      <link>https://community.databricks.com/t5/data-engineering/extracting-pdfs-and-using-ai-queries-best-practices/m-p/131485#M49102</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/155141"&gt;@ManojkMohan&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;&lt;P&gt;Did it work?&lt;/P&gt;</description>
      <pubDate>Wed, 10 Sep 2025 06:48:07 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/extracting-pdfs-and-using-ai-queries-best-practices/m-p/131485#M49102</guid>
      <dc:creator>szymon_dybczak</dc:creator>
      <dc:date>2025-09-10T06:48:07Z</dc:date>
    </item>
    <item>
      <title>Re: Extracting PDFs and using AI queries  | best practices</title>
      <link>https://community.databricks.com/t5/data-engineering/extracting-pdfs-and-using-ai-queries-best-practices/m-p/131486#M49103</link>
      <description>&lt;P&gt;I'm continuing work on the prototype today and will post an update in the second half of the day. Thank you for the proactive check.&lt;/P&gt;</description>
      <pubDate>Wed, 10 Sep 2025 06:50:48 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/extracting-pdfs-and-using-ai-queries-best-practices/m-p/131486#M49103</guid>
      <dc:creator>ManojkMohan</dc:creator>
      <dc:date>2025-09-10T06:50:48Z</dc:date>
    </item>
    <item>
      <title>Re: Extracting PDFs and using AI queries  | best practices</title>
      <link>https://community.databricks.com/t5/data-engineering/extracting-pdfs-and-using-ai-queries-best-practices/m-p/131489#M49104</link>
      <description>&lt;P&gt;Thanks&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/155141"&gt;@ManojkMohan&lt;/a&gt;&amp;nbsp;, keep us updated &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 10 Sep 2025 06:52:54 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/extracting-pdfs-and-using-ai-queries-best-practices/m-p/131489#M49104</guid>
      <dc:creator>szymon_dybczak</dc:creator>
      <dc:date>2025-09-10T06:52:54Z</dc:date>
    </item>
    <item>
      <title>Re: Extracting PDFs and using AI queries  | best practices</title>
      <link>https://community.databricks.com/t5/data-engineering/extracting-pdfs-and-using-ai-queries-best-practices/m-p/131575#M49142</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/110502"&gt;@szymon_dybczak&lt;/a&gt;&amp;nbsp; it worked&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="ManojkMohan_0-1757527235895.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/19900i38F735A599F34FC5/image-size/medium?v=v2&amp;amp;px=400" role="button" title="ManojkMohan_0-1757527235895.png" alt="ManojkMohan_0-1757527235895.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="ManojkMohan_1-1757527257995.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/19901i0B91A6923652E879/image-size/medium?v=v2&amp;amp;px=400" role="button" title="ManojkMohan_1-1757527257995.png" alt="ManojkMohan_1-1757527257995.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;The code reads the raw text extracted from the PDF files, which is stored in the Delta table&amp;nbsp;pdf.default.raw_pdf_text.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;It then uses the Databricks ai_query function, which internally calls a large foundation model (databricks-meta-llama-3-3-70b-instruct), to process the raw text.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;&lt;BR /&gt;The transformed data, which now includes the extracted structured information alongside any original fields, is saved to a new Delta table, pdf.default.parsed_pdf_data.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 10 Sep 2025 18:02:50 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/extracting-pdfs-and-using-ai-queries-best-practices/m-p/131575#M49142</guid>
      <dc:creator>ManojkMohan</dc:creator>
      <dc:date>2025-09-10T18:02:50Z</dc:date>
    </item>
  </channel>
</rss>

