I have time-series data in k Kafka topics. I would like to read this data into windows of length 10 minutes. For each window, I want to run N SQL queries and materialize the results. The specific N queries to run depend on the Kafka topic name. How should...
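One way to picture the setup in the question is the sketch below. It only illustrates the two ingredients mentioned (10-minute tumbling windows and a per-topic list of queries) in plain Python. It is not a streaming implementation, and the topic names, the `window_data` table name, and the queries in `QUERIES_BY_TOPIC` are hypothetical placeholders.

```python
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(minutes=10)
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Hypothetical mapping from Kafka topic name to the SQL queries run per window.
QUERIES_BY_TOPIC = {
    "topic_a": ["SELECT count(*) FROM window_data"],
    "topic_b": [
        "SELECT count(*) FROM window_data",
        "SELECT max(value) FROM window_data",
    ],
}


def window_start(ts: datetime) -> datetime:
    """Floor a timezone-aware timestamp to the start of its 10-minute window."""
    windows_elapsed = (ts - EPOCH) // WINDOW  # whole windows since the epoch
    return EPOCH + windows_elapsed * WINDOW


def queries_for(topic: str) -> list:
    """Return the queries registered for a topic (empty list if none)."""
    return QUERIES_BY_TOPIC.get(topic, [])


event_time = datetime(2022, 5, 17, 16, 25, 35, tzinfo=timezone.utc)
start = window_start(event_time)
topic_b_queries = queries_for("topic_b")
```

Here an event at 16:25:35 falls into the window starting at 16:20:00, and each record would be routed to the queries listed for its topic.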
We've set up a premium workspace with a passthrough-credentials cluster. While it works and can access my ADLS Gen2 storage, I can't get it to install a library on the cluster from there, and I keep getting "Library installation attempted on the driver no...
Sorry, I can't figure this out. The link you've added is irrelevant for passthrough credentials; if we add it, the cluster won't be passthrough. Is there a way to add this just for a specific folder while keeping passthrough for the rest?
Databricks Office Hours: Register for Office Hours to participate in a live Q&A session with Databricks experts! Our next event is scheduled for June 22nd from 8:00 am - 9:00 am PT. This is your opportunity to connect directly with our experts to ask any...
In Databricks --> Workflows --> Job Runs we have a column "Run As". Where does this value come from? We are getting a user ID here but need to change it to a generic account. Any help would be appreciated. Thanks
Hello everyone, I've created by mistake a DBFS folder named: ${env] But when I run this command: dbutils.fs.rm("/mnt/${env]") it returns this error: java.net.URISyntaxException: Illegal character in path at index 12: /mnt/$%7Benv] How can I do please ...
I use Databricks and I am trying to connect to PostgreSQL via the following code:
jdbcHostname = "xxxxxxx"
jdbcDatabase = "xxxxxxxxxxxx"
jdbcPort = "5432"
username = "xxxxxxx"
password = "xxxxxxxx"
jdbcUrl = "jdbc:postgresql://{0}:{1}/{2}".format(jdbcHostname, jd...
Hi @Boumaza nadia, please check the Ganglia metrics for the cluster. This could be a scalability issue where the cluster is overloaded. This can happen when a large partition does not fit into the given executor's memory. To fix this we recommend bump...
My function in func.py:
def lower_events(df):
    return df.withColumn("event", f.lower(f.col("event")))
My main notebook:
import pyspark.sql.functions as f
from pyspark.sql.functions import udf, col, lower
import sys
sys.path.append("..")
from folder.func...
@Kaniz Fatma https://community.databricks.com/s/question/0D58Y00008ouo6xSAA/how-to-fetch-environmental-variables-saved-in-one-notebook-into-another-notebook-in-databricks-repos-and-notebooks Can you please look into this?
Hi all! Before we used Databricks Repos, we used the run magic to run various utility Python functions from one notebook inside other notebooks, for example reading from a JDBC connection. We now plan to switch to Repos to utilize the fantastic CI/CD pos...
That's... odd. I was sure I had tried that, but now it works somehow. I guess it has to be that this time I did it with double quotation marks. Thanks anyway! Works like a charm.
I'm having issues trying to read temporary views in the SQL Analytics module. I've managed to create temporary views based on a query, but I don't know how to read from them. Just using the name of the view returns "Table or view not found".
For a production workload containing around 15k gzip-compressed JSON files per hour, all in a YYYY/MM/DD/HH/id/timestamp.json.gz directory: what would be the better approach to ingesting this into a Delta table in terms of not only the incremental load...
@Kaniz Fatma So I've not found a fix for the small-file problem using Auto Loader. It seems to struggle really badly against large directories: I had a cluster running for 8h stuck on the "listing directory" part with no end, and the cluster seemed completely idle to...
Hi, I am not able to import the .dbc file into the Databricks workspace for "Databricks Developer Foundation Capstone". When I click import, an error message is displayed. Secondly, when I click the GitHub link in 'Download the capstone', a 404 error is displ...
Hi @Karthikeyan.r3@cognizant.com R, we haven’t heard from you on the last responses, and I was checking back to see if you have a resolution yet. If you have any solution, please share it with the community as it can be helpful to others. Otherwise...
I am reading data from a Kafka topic, say topic_a. I have an application, app_one, which uses Spark Streaming to read data from topic_a. I have a checkpoint location, loc_a, to store the checkpoint. Now, app_one has read data till offset 90. Can I creat...
Hi @John Constantine, it is not recommended to share a checkpoint between queries. Every streaming query should have its own checkpoint. If you want to start at offset 90 in another query, you can define it when starting your job. You can ...
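As a rough illustration of defining the starting offset: the Kafka source in Spark Structured Streaming accepts a `startingOffsets` option containing a JSON string that maps each topic to its partitions and offsets. The sketch below only builds that string. It assumes topic_a has a single partition numbered 0 and that 90 is the offset the new query should begin reading from, neither of which is stated in the thread. The helper name `build_starting_offsets` is hypothetical.

```python
import json


def build_starting_offsets(topic, partition_offsets):
    """Build the JSON string expected by the Kafka source's startingOffsets option.

    partition_offsets maps partition number -> offset to start reading from.
    """
    return json.dumps(
        {topic: {str(partition): offset for partition, offset in partition_offsets.items()}}
    )


# Assumption: one partition (0), and the new query should start at offset 90.
offsets_json = build_starting_offsets("topic_a", {0: 90})

# In a PySpark job the string would be passed roughly like this (not executed here):
#   spark.readStream.format("kafka")
#       .option("subscribe", "topic_a")
#       .option("startingOffsets", offsets_json)
# with the new query writing to its own checkpoint location rather than loc_a.
```

Note that `startingOffsets` only takes effect when a query starts without an existing checkpoint. Once the new query has its own checkpoint, it resumes from there.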
Hi Team, when I try to register a permanent function, I get the error below.
%sql
CREATE FUNCTION simple_udf AS 'SimpleUdf' USING JAR '/tmp/SimpleUdf.jar';
%sql
select simple_udf(2)
Error Details: com.databricks.backend.common.rpc.Databricks...
Hi @Werner Stinckens @Jose Gonzalez @Hubert Dudek @Kaniz Fatma, thanks for all the help, I appreciate it. I was able to create permanent functions and use Eclipse to create the runnable JAR. However, does anyone have any idea on how to deploy t...
Has anyone recently encountered the following error in the CloudFormation stack while attempting to create a Databricks quickstart workspace in AWS? [ERROR] 2022-05-17T16:25:35.920Z 6593c6c0-677c-4918-bcb2-0f5fc9a1c482 Exception: An error occurred (AccessDen...
Hi @Ravi Puranik, we haven’t heard from you on the last response from me, and I was checking back to see if you have a resolution yet. If you have any solution, please share it with the community as it can be helpful to others. Otherwise, we will ...