We have successfully installed the Datadog Agent on our Databricks clusters via an init script, and that part is working fine. We are now instrumenting our jobs using the Agent's OpenTelemetry endpoint feature, which requires communicating with the Agent over HTTP (there is also a socket option, but we would prefer HTTP). This works when the process runs directly on the driver and worker nodes.
However, we run our jobs using Databricks Container Services, and processes inside the container seem unable to reach the host instance where the Agent is running.
Has anyone found a solution or workaround for this?
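
For anyone who wants to reproduce this, here is a minimal sketch of a reachability check that can be run both on the node and inside the container. It only tests whether a TCP connection to the Agent can be opened. Port 4318 is the conventional OTLP/HTTP port and is an assumption about the Agent configuration; the candidate addresses (loopback, the default Docker bridge gateway 172.17.0.1, and the value of `DD_AGENT_HOST` if set) are guesses for illustration, not addresses known to work under Databricks Container Services.

```python
import os
import socket

# Assumption: the Agent's OTLP HTTP receiver listens on the conventional port.
OTLP_HTTP_PORT = 4318


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def candidate_hosts() -> list[str]:
    """Hypothetical addresses at which the host's Agent might be visible."""
    hosts = ["127.0.0.1", "172.17.0.1"]  # loopback, default Docker bridge gateway
    env_host = os.environ.get("DD_AGENT_HOST")
    if env_host:
        hosts.insert(0, env_host)
    return hosts


if __name__ == "__main__":
    for host in candidate_hosts():
        status = "reachable" if can_connect(host, OTLP_HTTP_PORT) else "unreachable"
        print(f"{host}:{OTLP_HTTP_PORT} {status}")
```

Comparing the output on the node with the output inside the container should show whether the problem is the address the job uses or the Agent only listening on the host's loopback interface.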