Hi @WiliamRosa,

Thanks for your response! 
I still need prompt caching, as my use case requires it.
Other Databricks endpoints seem to work, such as 'databricks-gpt-oss-120b' (using the same logic you shared in your message). However, I could not confirm that caching actually takes place, as I cannot access token usage for these queries.

Best regards!