Background: I've created a small function in a notebook that uses Splunk's splunk-sdk package. The original intention was to call Splunk to execute a search/query, but for the sake of simplicity while testing this issue, the function only prints pr...
I'm new to RANGE_JOIN, so this may be completely normal, but I'd like confirmation. Whenever I put a RANGE_JOIN hint in my query:

SELECT /*+ RANGE_JOIN(pr2, 3600) */
event.FirstIP4Record
FROM SCHEMA_NAME_HERE.dnsrequest event
INNER JOIN SC...
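For context on what the hint's second argument does: a range join matches each probe row to the interval rows it falls within, and the bin size in RANGE_JOIN(table, bin_size) tells the engine how coarsely to bucket intervals so it can avoid comparing every row pair. Here is a toy pure-Python sketch of that binning idea (illustrative only, not Spark's actual implementation; all table and key names are made up):

```python
from collections import defaultdict

def binned_range_join(points, intervals, bin_size):
    """Join each (key, t) point to the (key, start, end) intervals
    containing it, probing only one bin per point instead of scanning
    every interval -- the idea behind the RANGE_JOIN(table, bin_size) hint.
    """
    bins = defaultdict(list)
    for key, start, end in intervals:
        # An interval is indexed under every bin it overlaps.
        for b in range(int(start // bin_size), int(end // bin_size) + 1):
            bins[b].append((key, start, end))
    out = []
    for pkey, t in points:
        for ikey, start, end in bins.get(int(t // bin_size), []):
            if start <= t <= end:  # the exact predicate is still checked
                out.append((pkey, ikey, t))
    return out

# With a 3600-second (1 hour) bin size, like the hint in the query above:
events = [("dns1", 4000.0), ("dns2", 9999.0)]
ranges = [("pr2-a", 3600.0, 7200.0), ("pr2-b", 7200.0, 10800.0)]
print(binned_range_join(events, ranges, 3600))
# → [('dns1', 'pr2-a', 4000.0), ('dns2', 'pr2-b', 9999.0)]
```

A too-small bin size means each interval lands in many bins (more index entries); a too-large one means each probe scans many candidates, which is why the hint lets you tune it per query.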
I'm experimenting with liquid clustering and have some questions about compatible types (somewhat similar to Liquid clustering with boolean columns). The table was created as:

CREATE TABLE IF NOT EXISTS <TABLE>
(
_time DOUBLE
, timestamp TIMESTAMP_NT...
I have some CSV files that I upload to DBFS storage several times a day. From these CSVs, I have created SQL tables: CREATE TABLE IF NOT EXISTS masterdata.lookup_host
USING CSV
OPTIONS (header "true", inferSchema "true")
LOCATION '/mnt/masterdata/...
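A note on the OPTIONS above: header "true" makes the first line supply the column names, and inferSchema "true" asks Spark to sample the file and guess column types instead of treating everything as a string. A toy illustration of that inference idea in plain Python (not Spark's actual logic; sample data is made up):

```python
import csv
import io

def infer_csv_schema(text, sample_rows=100):
    """Guess a type (int, double, or string) for each column of a CSV
    with a header row -- a toy version of what OPTIONS (inferSchema
    "true") asks Spark to do by sampling the file.
    """
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    types = {name: "int" for name in header}  # start with the narrowest type
    for i, row in enumerate(reader):
        if i >= sample_rows:
            break
        for name, value in zip(header, row):
            # Widen the guess whenever a value doesn't fit the current type.
            if types[name] == "int":
                try:
                    int(value)
                    continue
                except ValueError:
                    types[name] = "double"
            if types[name] == "double":
                try:
                    float(value)
                    continue
                except ValueError:
                    types[name] = "string"
    return types

sample = "host,cpu_count,load\nweb01,8,0.75\nweb02,16,1.5\n"
print(infer_csv_schema(sample))
# → {'host': 'string', 'cpu_count': 'int', 'load': 'double'}
```

Because inference only samples, a file uploaded later with unexpected values can silently change or break the inferred schema, which is worth keeping in mind when the same table is re-read from refreshed CSVs several times a day.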
Another idea (if you need to do small lookups, not a bulk transfer): what about using Splunk's splunk-sdk to create a notebook function that hits Splunk via the REST API?
In my experience with the Splunk add-on, it is typically used to pull Databricks data into Splunk, not to push. If the data sets are small, then it could probably push as well, but I think you'd have to write some sort of Splunk map loop to issue I...
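To make the lookup idea concrete, here is a minimal sketch using only the Python standard library to build a request against Splunk's search export endpoint (/services/search/jobs/export). The hostname, credentials, and search string are all placeholders; in practice you would likely use splunk-sdk's client instead of raw urllib:

```python
import base64
import urllib.parse
import urllib.request

def build_splunk_search_request(host, port, user, password, query):
    """Build (but do not send) a POST request for Splunk's export
    endpoint, which streams search results back without a separate
    job-polling step. Host and credentials below are placeholders.
    """
    body = urllib.parse.urlencode({
        "search": f"search {query}",  # Splunk requires the leading 'search'
        "output_mode": "json",
    }).encode()
    req = urllib.request.Request(
        f"https://{host}:{port}/services/search/jobs/export",
        data=body,
        method="POST",
    )
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req.add_header("Authorization", f"Basic {token}")
    return req

# Hypothetical lookup from a notebook; 8089 is Splunk's default management port.
req = build_splunk_search_request(
    "splunk.example.com", 8089, "svc_user", "changeme",
    "index=dns sourcetype=dnsrequest | head 5",
)
# In a notebook you would then call urllib.request.urlopen(req) and parse
# the JSON lines in the response body.
print(req.full_url)
```

This suits small per-row or per-batch lookups; for bulk transfer the add-on's pull model is still the better fit, as noted above.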
I'm not sure if this is related, but I've hit another challenge with TIMESTAMP_NTZ columns. As soon as I compute statistics on a TIMESTAMP_NTZ column in a table, I can't use that column in a WHERE clause date range. This query -- set the variable ...
Running this populates the statistics for the columns:

ANALYZE TABLE <TABLE> COMPUTE STATISTICS FOR COLUMNS timestamp, aid, ContextProcessId

But I still get the error when I run OPTIMIZE:

Unsupported datatype 'TimestampNTZType'
com.databricks.backend.commo...