08-08-2024 10:57 PM - edited 08-08-2024 11:04 PM
Hi all,
I'm setting up Overwatch in our Databricks workspace to monitor resources, and I've hit an issue during deployment. Overwatch deploys, but validation of the `Gold_jobRunCostPotentialFact` module fails with the following error:
> "Unsupported component type interface scala.collection.Seq in arrays."
Because of this error, a table is missing from the gold schema: `overwatch_global.jobRunCostPotentialFact`. This table is essential for our reporting needs.
What I've tried so far:
- I attempted using different versions of the Overwatch libraries installed on the compute cluster, but the same error persists.
- I deployed each layer of the Medallion architecture for Overwatch separately. While all deployments were successful, the mentioned error still appeared in the logs, and the table remains missing from the consumer database.
I am using Scala version 2.12, Spark version 3.5.0, and DBR 15.3 ML.
Has anyone else encountered this issue or have suggestions on how to resolve it? Any help would be greatly appreciated!
Thanks in advance!
08-08-2024 11:05 PM
I referred to the following docs and blogs when setting up Overwatch:
Overwatch -Databricks own Observability tool by @SriramMohanty
08-09-2024 03:50 AM
08-12-2024 04:04 AM
Hi @SriramMohanty ,
I've tried DBR 11.3 LTS as you suggested, but I'm facing issues. It seems that DBR 11.3 doesn't support Delta table features. I'm using Overwatch version 0.8.1.2.
When I ran validation using a config file in Delta format, I got a NoClassDefFoundError. When I used a Delta Table instead, I got a DeltaTableFeatureException, indicating that the version of Databricks I'm using doesn't support the required reader table features, specifically deletionVectors.
Let me know if you'd like any further changes!
If you could share some resources that might be helpful, that would be a great help.
Thank you!
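For anyone hitting the same DeltaTableFeatureException, one way to see what a path-based Delta table demands from its readers is to inspect its properties and protocol versions from a runtime that can still read it. The following is a minimal sketch that only builds the SQL string; `DeltaTableChecks` and the example path are hypothetical names, not part of Overwatch.

```scala
object DeltaTableChecks {
  // A Delta table stored at a path is addressed in SQL as delta.`<path>`.
  def pathRef(path: String): String = s"delta.`$path`"

  // DESCRIBE DETAIL returns one row whose columns include
  // properties, minReaderVersion and minWriterVersion.
  def describeDetail(path: String): String =
    s"DESCRIBE DETAIL ${pathRef(path)}"
}

// In a Databricks notebook, where `spark` is predefined (placeholder path):
//   spark.sql(DeltaTableChecks.describeDetail("/path/to/delta/table"))
//     .select("properties", "minReaderVersion", "minWriterVersion")
//     .show(truncate = false)
```

If the properties show deletion vectors enabled, a runtime that lacks that reader feature will most likely reject the table, which would match the error described above.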
08-12-2024 10:59 PM
Hi @jiteshraut20 ,
If you are using a CSV file, please use the code below to convert it to Delta.
// Read the CSV config (trimming stray whitespace) and write it out as a single-file Delta table
spark.read
  .option("header", "true")
  .option("ignoreLeadingWhiteSpace", true)
  .option("ignoreTrailingWhiteSpace", true)
  .csv("/path/to/config.csv")
  .coalesce(1)
  .write.format("delta")
  .save("/myPath/overwatch/configs/prod_config")
Please validate the data written to "/myPath/overwatch/configs/prod_config", then use the code below to run the pipeline:
import com.databricks.labs.overwatch.MultiWorkspaceDeployment
val configTable = "/myPath/overwatch/configs/prod_config" // Path to the config table
val tempLocation = "/tmp/overwatch/templocation"
MultiWorkspaceDeployment(configTable, tempLocation).deploy(1)
08-14-2024 11:20 AM
Hi @SriramMohanty, I have solved the issue and explained all the steps in a blog post. The official documentation is outdated and incomplete.
If anybody is facing the same issue, please refer to my blog post.
Hi, if anybody is facing problems with Overwatch deployment, please refer to my blog post or contact me at contact@jiteshraut.me
08-14-2024 09:51 PM
Hi @jiteshraut20 ,
Can you please explain what is outdated in the official documentation? We update it frequently, with every release.
08-14-2024 10:38 PM
Hi @SriramMohanty ,
I encountered a few issues while configuring Overwatch based on the official documentation, and I wanted to highlight some of them for your reference:
Overwatch Configuration Changes: The documentation does not reflect recent changes in the Overwatch config file. For example, the column previously named etl_storage_prefix is now required as storage_prefix, but this update is not mentioned.
Databricks Runtime (DBR) Version Requirements: The documentation does not specify the required DBR version for the latest Overwatch JAR. I found that Overwatch version 0.8.1.2 works without errors only on DBR 13.3 LTS. The document linked in your previous replies does not clearly state this compatibility. Using DBR 11.3, which is mentioned, leads to issues with the latest JAR. Thanks to the Databricks team, I was able to resolve this by using the correct DBR version.
SQL Query History Gold Deployment Error: There is an error in the deployment of sql_query_history_gold (Gold View: Create view failed, column name error_message cannot be resolved), which is resolved in Overwatch version 0.8.1.2. This information is not covered in the documentation.
Missing Libraries: The following two libraries are not mentioned in the documentation:
- org.scalaj:scalaj-http_2.12:2.4.2
- dataframe_rules_engine_2.12:0.2.0
These are some of the points I noticed were missing from the documentation. While the documentation has been a valuable resource, it could be even more effective with proper updates. Please correct me if I'm wrong.
The contacts you provided were extremely helpful, and thanks to you and them, I was able to figure things out.
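The column rename mentioned above (etl_storage_prefix to storage_prefix) can be checked before converting an older config file. This is a rough sketch rather than anything from Overwatch itself: `ConfigHeader` is a hypothetical helper, and the only rename it knows about is the one reported in this thread.

```scala
object ConfigHeader {
  // Old column name -> new column name. Only this one rename is
  // reported in the thread; extend the map if others turn up.
  val renames: Map[String, String] =
    Map("etl_storage_prefix" -> "storage_prefix")

  // Columns in the header that still use an old name.
  def outdated(columns: Seq[String]): Seq[String] =
    columns.filter(renames.contains)

  // The header with every old name replaced by its new name.
  def migrate(columns: Seq[String]): Seq[String] =
    columns.map(c => renames.getOrElse(c, c))
}

// With a Spark DataFrame `df` read from the config CSV, the same rename is:
//   df.withColumnRenamed("etl_storage_prefix", "storage_prefix")
```

Calling `ConfigHeader.outdated(df.columns)` on the loaded config gives a quick way to spot a file written against the older column name.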
08-14-2024 11:16 PM
Hi @jiteshraut20 ,
1) storage_prefix: This is updated in the documentation; please refer to config.
2) If system tables are in use, the recommended Databricks Runtime version is 13.3 LTS. For other cases, 11.3 LTS should work seamlessly. Please see the documentation on system table requirements. Doc: system table requirements
3) For fresh deployments on version 0.8.1.2, you should not encounter any exceptions. This is why we always recommend using the latest version of Overwatch.
4) The following two JARs are not required and are therefore not mentioned in the documentation: org.scalaj:scalaj-http_2.12:2.4.2 and dataframe_rules_engine_2.12:0.2.0.
Thanks,
Sriram
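Since the runtime recommendation above depends on whether system tables are used (13.3 LTS) or not (11.3 LTS), a small guard at the top of a deployment notebook can catch a mismatched cluster early. This is a minimal sketch: `RuntimeVersion` is a hypothetical helper, and it assumes the runtime string starts with `<major>.<minor>`, as in "13.3.x-scala2.12".

```scala
object RuntimeVersion {
  // Matches a leading "<major>.<minor>" and ignores the rest of the string.
  private val Pattern = """^(\d+)\.(\d+).*""".r

  // Extracts (major, minor), or None if the string has another shape.
  def parse(runtime: String): Option[(Int, Int)] = runtime match {
    case Pattern(major, minor) => Some((major.toInt, minor.toInt))
    case _                     => None
  }

  // True when the runtime is at least the given major.minor version.
  def atLeast(runtime: String, major: Int, minor: Int): Boolean =
    parse(runtime).exists { case (ma, mi) =>
      ma > major || (ma == major && mi >= minor)
    }
}

// In a Databricks notebook the runtime string is typically available via:
//   spark.conf.get("spark.databricks.clusterUsageTags.sparkVersion")
```

For a system-table deployment, for example, `require(RuntimeVersion.atLeast(runtime, 13, 3))` would stop the notebook before any Overwatch code runs on an older cluster.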