
Databricks Add-on for Splunk v1.2 - Error in 'databricksquery' command

hukel
Contributor

Is anyone else using the new v1.2 of the Databricks Add-on for Splunk? We upgraded to 1.2 and now get this error for all queries:

Running process: /opt/splunk/bin/nsjail-wrapper /opt/splunk/bin/python3.7 /opt/splunk/etc/apps/TA-Databricks/bin/databricksquery.py
Error in 'databricksquery' command: External search command exited unexpectedly with non-zero error code 1.

I've opened an issue here https://github.com/databrickslabs/splunk-integration/issues/42 but haven't gotten a follow-up.  

Is anyone else using this add-on successfully with v1.2?

1 ACCEPTED SOLUTION

hukel
Contributor

There is a new mandatory parameter for databricksquery called account_name. This breaking change is not documented in the Splunkbase release notes, but it does appear in the docs within the Splunk app.

 

databricksquery cluster="<cluster_name>" query="<SQL_query>" command_timeout=<timeout_in_seconds> account_name="<account_name>"
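If you have many saved searches to fix after the upgrade, a small script can flag the ones that still call databricksquery without account_name. The following is only a rough sketch: the function name `missing_account_name` is hypothetical and not part of the add-on, and it is a text heuristic that assumes quoted values use double quotes with backslash escapes, as in the search.log output in this thread.

```python
import re

# A double-quoted value, allowing backslash-escaped characters inside it.
_QUOTED = re.compile(r'"(?:\\.|[^"\\])*"', re.DOTALL)


def missing_account_name(spl: str) -> bool:
    """Return True if `spl` calls databricksquery without an account_name argument.

    Hypothetical helper for auditing searches; it only inspects the search text.
    """
    # Blank out quoted values so text inside the SQL (for example a column
    # named account_name) cannot be mistaken for a command argument.
    bare = _QUOTED.sub('""', spl)
    if not re.search(r'\|\s*databricksquery\b', bare):
        return False  # the search does not use the command at all
    return re.search(r'\baccount_name\s*=', bare) is None


if __name__ == "__main__":
    old_style = '| databricksquery command_timeout=1200 query="SELECT 1"'
    new_style = ('| databricksquery cluster="<cluster_name>" query="SELECT 1" '
                 'command_timeout=1200 account_name="<account_name>"')
    print(missing_account_name(old_style))  # old-style search, needs updating
    print(missing_account_name(new_style))  # already has account_name
```

Note that the check looks at the whole search string, so a pipeline where account_name belongs to a different command would be reported as fine; treat the result as a hint and review flagged searches by hand.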


5 REPLIES

shan_chandra
Databricks Employee

@hukel - could you please share the full error stack trace?

I can't see the Python stack trace because the TA doesn't output it to a Splunk-logged location (that I can find). The search.log output is all I can see (pasted below).

08-07-2023 16:03:05.046 INFO  SearchParser [994756 searchOrchestrator] - PARSING: | databricksquery command_timeout=1200 query="\n     \n     SELECT ImageFileName, _time, *\n     FROM silver.ProcessRollup2 \n\n     WHERE event_date BETWEEN '2023-08-07' AND '2023-08-07'\n       AND _time >= 1691409780.000 AND _time <= 1691424183.000\n       AND (\n          LOWER(ImageFileName) LIKE '\\\\\\\\device\\\\\\\\harddiskvolume%\\\\\\\\\agentexecutor.exe'\n       )\n     ORDER BY _time DESC \n\n     LIMIT 1 "
08-07-2023 16:03:05.047 INFO  ServerConfig [994756 searchOrchestrator] - Will add app jailing prefix /opt/splunk/bin/nsjail-wrapper for TA-Databricks
08-07-2023 16:03:05.047 INFO  ChunkedExternProcessor [994756 searchOrchestrator] - Running process: /opt/splunk/bin/nsjail-wrapper /opt/splunk/bin/python3.7 /opt/splunk/etc/apps/TA-Databricks/bin/databricksquery.py
08-07-2023 16:03:05.747 INFO  ChunkedExternProcessor [994756 searchOrchestrator] - Custom search command is a generating command.
08-07-2023 16:03:05.747 WARN  ChunkedExternProcessor [994756 searchOrchestrator] - Error adding inspector message: invalid level or message already exists
08-07-2023 16:03:05.747 INFO  SearchPipeline [994756 searchOrchestrator] - ReportSearch=0 AllowBatchMode=0
08-07-2023 16:03:05.747 INFO  SearchPhaseGenerator [994756 searchOrchestrator] - No need for RTWindowProcessor
08-07-2023 16:03:05.747 INFO  SearchPhaseGenerator [994756 searchOrchestrator] - Adding timeliner to final phase
08-07-2023 16:03:05.747 INFO  SearchParser [994756 searchOrchestrator] - PARSING: | timeliner remote=0 partial_commits=0 max_events_per_bucket=10000 fieldstats_update_maxperiod=60 bucket=0 extra_field=*
08-07-2023 16:03:05.747 INFO  TimelineCreator [994756 searchOrchestrator] - Creating timeline with remote=0 partialCommits=0 commitFreq=0 syncKSFreq=0 maxSyncKSPeriodTime=60000 bucket=0 latestTime=1691424183.000000 earliestTime=1691409780.000000
08-07-2023 16:03:05.747 INFO  SearchPhaseGenerator [994756 searchOrchestrator] - required fields list to add to different pipelines = *,_bkt,_cd,_si,host,index,linecount,source,sourcetype,splunk_server
08-07-2023 16:03:05.747 INFO  SearchPhaseGenerator [994756 searchOrchestrator] - Search Phases created.
08-07-2023 16:03:05.749 INFO  SearchOrchestrator [994756 searchOrchestrator] - Starting the status control thread.
08-07-2023 16:03:05.749 INFO  SearchOrchestrator [994756 searchOrchestrator] - Starting phase=1
08-07-2023 16:03:05.749 INFO  ReducePhaseExecutor [994794 phase_1] - Starting phase_1
08-07-2023 16:03:05.749 INFO  SearchStatusEnforcer [994787 StatusEnforcerThread] - Enforcing disk quota = 10485760000
08-07-2023 16:03:05.805 ERROR ChunkedExternProcessor [994794 phase_1] - EOF while attempting to read transport header read_size=0
08-07-2023 16:03:05.805 ERROR ChunkedExternProcessor [994794 phase_1] - Error in 'databricksquery' command: External search command exited unexpectedly with non-zero error code 1.
08-07-2023 16:03:05.805 INFO  ReducePhaseExecutor [994794 phase_1] - Ending phase_1

shan_chandra
Databricks Employee

@hukel - Does the query below run fine in an isolated notebook?

SELECT ImageFileName, _time, *
FROM silver.ProcessRollup2
WHERE event_date BETWEEN '2023-08-07' AND '2023-08-07'
  AND _time >= 1691409780.000 AND _time <= 1691424183.000
  AND (
     LOWER(ImageFileName) LIKE '\\\\\\\\device\\\\\\\\harddiskvolume%\\\\\\\\\agentexecutor.exe'
  )
ORDER BY _time DESC
LIMIT 1

 

Yes, this is a test query that I always use. It only stopped working after the 1.2 upgrade.

[Attached screenshot: hukel_0-1691425884567.png]

 

