08-09-2022 07:47 AM
DBR 10.4 LTS is failing frequently due to GC overhead, roughly once every half hour.
Can anyone from the Databricks team let me know whether there are any existing tickets or known bugs for this?
Note: We have used the same configuration and the same DBR version for almost the last 3 months.
When checking the logs, we see issues writing SST files to RocksDB and a few issues connecting to Azure (timeout errors).
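For anyone trying to compare how often each of these symptoms shows up, here is a minimal sketch of a log tally in Python. The regex patterns and the sample lines are assumptions for illustration only, not messages copied from the actual cluster logs, so adjust them to whatever your driver and executor logs really contain.

```python
import re
from collections import Counter

# Hypothetical patterns: tune these to the exact wording in your own logs.
PATTERNS = {
    "gc_overhead": re.compile(r"GC overhead limit exceeded"),
    "rocksdb_sst": re.compile(r"rocksdb.*\.sst|\.sst.*rocksdb", re.IGNORECASE),
    "timeout": re.compile(r"timed?[ -]?out", re.IGNORECASE),
}


def tally_log_lines(lines):
    """Count how many lines match each symptom pattern."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample lines, only to show the shape of the input.
    sample = [
        "java.lang.OutOfMemoryError: GC overhead limit exceeded",
        "WARN RocksDB: failed to write file 000123.sst",
        "ERROR connection timed out after 30000 ms",
    ]
    print(dict(tally_log_lines(sample)))
```

Running this over logs from before and after the failures started can show whether the RocksDB errors or the timeouts track the GC failures more closely.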
08-09-2022 08:26 AM
If you are seeing timeout errors, was there any recent network change on your side? Do you use the default DNS or a custom DNS? You can create an init script to capture the network traffic:
https://docs.microsoft.com/en-us/azure/databricks/kb/dev-tools/use-tcpdump-create-pcap-files
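Before setting up a full packet capture, a quick check from a notebook can show whether DNS resolution and TCP connects to the endpoint are slow or failing. This is only a rough sketch using the Python standard library; the function name `check_endpoint` and the placeholder host in the comment are hypothetical, and it does not replace the tcpdump approach in the article above.

```python
import socket
import time


def check_endpoint(host, port=443, timeout=5.0):
    """Resolve host, then time a single TCP connect to host:port."""
    result = {"host": host, "addresses": [], "connect_seconds": None, "error": None}
    try:
        # DNS lookup: shows which addresses the configured resolver returns.
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
        result["addresses"] = sorted({info[4][0] for info in infos})
        # TCP connect with an explicit timeout, measured with a monotonic clock.
        start = time.monotonic()
        with socket.create_connection((host, port), timeout=timeout):
            result["connect_seconds"] = time.monotonic() - start
    except OSError as exc:  # covers DNS failures, refusals, and timeouts
        result["error"] = f"{type(exc).__name__}: {exc}"
    return result


# Example call (replace the placeholder with your own storage endpoint):
# print(check_endpoint("<storage-account>.blob.core.windows.net"))
```

If the resolved addresses differ between a default-DNS and a custom-DNS cluster, or the connect time is close to the timeout, that points toward the network rather than the runtime.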
08-09-2022 08:55 AM
Hi @Prabakar Ammeappin, the timeouts are very few.
What we do see is far more RocksDB copy operations than usual.
08-09-2022 09:03 AM
Hi @somanath Sankaran, could you please share the stack trace?
09-08-2022 03:14 AM
Hi @somanath Sankaran
Hope all is well! Just wanted to check in: were you able to resolve your issue? If so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help.
We'd love to hear from you.
Thanks!
09-26-2022 08:29 AM
Hi @Vidula Khanna, we have raised a support ticket with ADB from the client side.
We can close this. However, based on communication with the Databricks developer team, it seems that DBR 11.2 and above include some fixes for the RocksDB memory leak.