This is due to an issue reported here: https://github.com/tensorflow/profiler/issues/344. DBR 8.4 ML comes with TensorFlow 2.5, while the latest version of tensorboard-plugin-profile is 2.4. To work around the issue, you can add the option --load_fast=false...
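If you launch TensorBoard yourself, the flag goes on the command line; a minimal sketch, assuming you start it manually (the log directory path is a placeholder):

```shell
# Start TensorBoard with the slower but compatible data loader.
# /dbfs/tmp/tb_logs is a placeholder log directory.
tensorboard --logdir /dbfs/tmp/tb_logs --load_fast=false
```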
I have a requirement to replay ingestion from landing data and build a silver table. I am trying to write Delta files from raw Avro files located in the landing zone. The raw files are organized in folders by date. I am currently using streaming to read d...
How can I integrate Databricks clusters with Prometheus? I tried adding the following Spark property to my cluster but cannot find the Prometheus metrics endpoints. Any thoughts?
spark.ui.prometheus.enabled = true
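In Spark 3.x this property makes the driver UI serve executor metrics in Prometheus text format; a sketch of where the endpoint should then appear, assuming the default driver UI port:

```
spark.ui.prometheus.enabled true
# Executor metrics should then be scrapeable from the driver UI, e.g.:
#   http://<driver>:4040/metrics/executors/prometheus
```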
Hello,
Is it a good idea to host a schema data warehouse on the Azure Databricks database itself?
Usually we use Azure Databricks to prep the data and then host it on Azure SQL Database. However, the question is: can we not host the data on Azure Databricks i...
Hello! I am trying to forward fill a column in a Pandas dataframe based on a keyword. I have come up with:
pdf_df['EEName_TEST'] = pdf_df['EEName_TEST'].str.contains('Name:').ffill()
This gives me a boolean result but I still can't figure out what ...
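One way to get the actual values rather than booleans is to use the keyword mask to null out the non-matching rows and forward-fill those; a sketch with made-up data (the column name is from the question, the values are hypothetical):

```python
import pandas as pd

# Hypothetical data standing in for pdf_df: rows either start a new
# "Name:" block or belong to the block above them.
pdf_df = pd.DataFrame({"EEName_TEST": ["Name: Alice", "ID 1", "Name: Bob", "ID 2"]})

# str.contains gives the boolean mask; instead of forward-filling the
# booleans, blank out the non-keyword rows and forward-fill the values.
mask = pdf_df["EEName_TEST"].str.contains("Name:", na=False)
pdf_df["EEName_FILLED"] = pdf_df["EEName_TEST"].where(mask).ffill()
# EEName_FILLED: "Name: Alice", "Name: Alice", "Name: Bob", "Name: Bob"
```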
I have an Azure notebook which takes a widget input parameter and performs the necessary action. But this notebook should be called from another notebook using dbutils.notebook.run. How do I pass the parameter to the widget?
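dbutils.notebook.run takes a third argument: a dict of parameters whose keys match the widget names in the called notebook. A sketch that only does real work inside Databricks (the notebook path "ChildNotebook" and widget name "input_date" are placeholders):

```python
# dbutils is only predefined inside a Databricks notebook; guard so the
# sketch can also run elsewhere without failing.
try:
    # Third argument maps widget names to values; the child notebook
    # reads it with dbutils.widgets.get("input_date").
    result = dbutils.notebook.run("ChildNotebook", 600, {"input_date": "2021-01-01"})
except NameError:
    result = None  # not running on Databricks
```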
I was mounting Data Lake Gen1 to Databricks for accessing and processing files. The code below worked fine for the past year, but all of a sudden I'm getting an error: configs = {"df.adl.oauth2.access.token.provider.type": "ClientCredential"...
I'm operating on some data that looks like the image attached.
The command that I'm performing is:
library(magrittr)
# subsetting the data for macOS and sorting by event_timestamp
acDF <- eventsDF %>% SparkR::select("device", "event_timestamp...
I am trying to connect to PostgreSQL from Azure Databricks.
I am using the code below to connect.
jdbcHostname = "Test"
jdbcPort = 1234
jdbcDatabase = "Test1"
jdbcUrl = "jdbc:postgresql://{0}:{1}/{2}".format(jdbcHostname, jdbcPort, jdbcDatabase)
Conn...
@Javier De La Torre do you really need two-way SSL (verify-full)? In most cases one-way SSL (sslmode=require) should be enough. @akj2784 When you say "Connection was successful", what do you mean? Where did you establish a successful connection? You might...
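One-way SSL can be requested directly in the JDBC URL; a sketch reusing the placeholder connection values from the question above:

```python
# Placeholder connection details from the question; replace with real ones.
jdbcHostname = "Test"
jdbcPort = 1234
jdbcDatabase = "Test1"

# sslmode=require asks the PostgreSQL driver for one-way SSL.
jdbcUrl = "jdbc:postgresql://{0}:{1}/{2}?sslmode=require".format(
    jdbcHostname, jdbcPort, jdbcDatabase
)
# jdbcUrl == "jdbc:postgresql://Test:1234/Test1?sslmode=require"
```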
Say I have two notebooks A and B. Notebook A generates data for notebook B to process. However, I want multiple B to process the data concurrently. Is this possible?
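Yes — one common pattern is to launch several runs of notebook B from a driver notebook with a thread pool. A sketch where run_notebook_b is a stand-in for the real dbutils.notebook.run("B", ...) call, and the partition values are hypothetical:

```python
from concurrent.futures import ThreadPoolExecutor

def run_notebook_b(partition):
    # Stand-in for: dbutils.notebook.run("B", 600, {"partition": partition})
    return f"B processed {partition}"

partitions = ["2021-01-01", "2021-01-02", "2021-01-03"]

# Each thread blocks on one notebook run, so several B runs proceed concurrently.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(run_notebook_b, partitions))
```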
Those of you who use databricks-connect probably know that it's a great tool for using the power of Spark/Databricks while executing/debugging code (and having proper Git integration) from your favorite IDE.
However, when you want to test somethin...
Spark Standalone Cluster Configuration (Spark 3.0.0):
1 Master, 2 Workers (4 cores each)
I am using the Airflow SparkSubmitOperator to submit jobs to the Spark master in cluster mode. There are multiple (~20) DAGs on Airflow submitting jobs to Spark. These ...
I have created a custom transformer to be used in an ML pipeline. I was able to write the pipeline to storage by extending the transformer class with DefaultParamsWritable. Reading the pipeline back in, however, does not seem possible in Scala. I have...