We have a library that allows .NET applications to talk to Databricks clusters (https://github.com/clearbank/SparkSqlClient). This communicates with the clusters over the Spark Thrift Server. Although this works great for clusters in the "data scienc...
We are planning on migrating to Unity Catalog but are unable to determine how we can segregate dev, staging, and production data from each other. Our plan was to separate catalogs by SDLC environment scopes (as per the description and diagram at https://doc...
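For concreteness, a minimal sketch of that per-environment plan, assuming Unity Catalog DDL run from a notebook; the catalog and group names (dev/staging/prod, dev_engineers, prod_services) are hypothetical:

    // One catalog per SDLC environment (names are illustrative)
    val environments = Seq("dev", "staging", "prod")
    environments.foreach { env =>
      spark.sql(s"CREATE CATALOG IF NOT EXISTS $env")
    }

    // Grant each environment's principals access only to their own catalog,
    // so prod data is never reachable from dev and vice versa.
    spark.sql("GRANT USE CATALOG, SELECT ON CATALOG dev TO `dev_engineers`")
    spark.sql("GRANT USE CATALOG, SELECT ON CATALOG prod TO `prod_services`")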
We have a jar method that takes in as a parameter "--date 2022-01-01" and it will process that date's worth of data. However, when invoked via a job, the date we want to pass in is the day before the job run was started. We could default this in the jar j...
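If a dynamic default is acceptable, the fallback can live in the jar itself; a minimal sketch, assuming the "--date" flag is optional (the parsing helper is hypothetical, not our actual job code):

    import java.time.LocalDate

    // Use the explicit "--date yyyy-MM-dd" argument if present,
    // otherwise default to the day before the run started.
    def resolveDate(args: Array[String]): LocalDate = {
      val idx = args.indexOf("--date")
      if (idx >= 0 && idx + 1 < args.length) LocalDate.parse(args(idx + 1))
      else LocalDate.now().minusDays(1)
    }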
We have a SQL workspace with a cluster running that services a number of self-service reports against a range of datasets. We want to be able to analyse and report on the queries our self-service users are executing so we can get better visibility of...
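One route we are considering is pulling the history out for offline analysis; a minimal sketch, assuming the Databricks SQL Query History REST API (GET /api/2.0/sql/history/queries); the workspace URL is a placeholder and the token comes from an environment variable:

    import java.net.URI
    import java.net.http.{HttpClient, HttpRequest, HttpResponse}

    val workspaceUrl = "https://adb-1234567890123456.7.azuredatabricks.net" // placeholder
    val token = sys.env("DATABRICKS_TOKEN")

    // List recent queries; the response JSON carries a "res" array of query records
    val request = HttpRequest.newBuilder()
      .uri(URI.create(s"$workspaceUrl/api/2.0/sql/history/queries?max_results=100"))
      .header("Authorization", s"Bearer $token")
      .GET()
      .build()

    val response = HttpClient.newHttpClient()
      .send(request, HttpResponse.BodyHandlers.ofString())
    println(response.body())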
We have created a table using the new generated column feature (https://docs.microsoft.com/en-us/azure/databricks/delta/delta-batch#deltausegeneratedcolumns):

CREATE TABLE ingest.MyEvent(
  data binary,
  topic string,
  timestamp timestamp,
  date dat...
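The truncated column above is the generated one; for reference, a minimal sketch of a generated date column following the linked docs (the table name here is hypothetical and the exact expression in our table may differ):

    spark.sql("""
      CREATE TABLE ingest.MyEventExample (
        data binary,
        topic string,
        timestamp timestamp,
        date date GENERATED ALWAYS AS (CAST(timestamp AS DATE))
      ) USING DELTA
    """)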
I have tried those connection details, however they give me 400 errors when trying to connect directly using the Hive Thrift Server contract (https://github.com/apache/hive/blob/master/service-rpc/if/TCLIService.thrift). I do not get the issues whe...
There is a lot of utility in being able to separate dev/qa/prod data. We do not (and in some cases cannot) have prod data accessible in dev environments/workspaces, or dev data available in prod environments/workspaces. As it is at the moment I do...
These features are amazing and we do use them to optimize individual queries. But I was looking for a way to calculate statistics over all the queries running on the platform, and answer questions like: Who is running the most queries? What is the ...
@Kaniz Fatma, @Atanu Sarkar, thanks for your response; for investigating an individual query the UI is great, and import could be a useful feature. But what we were after was a way to analyse the query history as an aggregate. For example, graphing...
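To make the aggregate idea concrete: a sketch that loads the history pulled by the earlier Query History API snippet (response.body()) into a DataFrame and groups it; the field names (res, user_name) follow that API's response shape, but treat them as assumptions to verify:

    import spark.implicits._

    // Parse the API response and flatten the "res" array of query records
    val history = spark.read.json(Seq(response.body()).toDS())
      .selectExpr("explode(res) AS q")

    // e.g. who is running the most queries
    history.groupBy($"q.user_name")
      .count()
      .orderBy($"count".desc)
      .show()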
That doesn't seem to solve the problem. We appear to be using Hive 0.13.0, but the docs mention we should be on 2.3.7. Is there something we have to do on our end to upgrade? Running the queries gives:

spark.conf.get("spark.sql.hive.metastore.jars") //builtin...
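If it is just a cluster setting, a sketch of what we think the change would be, based on the Databricks external Hive metastore docs (assuming a runtime where the 2.3.7 client ships as builtin): set these in the cluster's Spark config,

    spark.sql.hive.metastore.version 2.3.7
    spark.sql.hive.metastore.jars builtin

and verify from a notebook:

    spark.conf.get("spark.sql.hive.metastore.version") // expect "2.3.7"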