Em Sharnoff authored
With upcoming compute pool changes, we're going to end up with a lot of
VMs where the informant is unable to start up - because the initial file
cache connection will fail until postgres is alive, which only happens
once the pooled VM is bound to a particular endpoint.

So on staging, we currently report a lot of "autoscaling stuck" VMs,
when in reality these are just part of the pool. Having a separate
value for the number of these stuck VMs that are actually running
something will ensure our metrics continue to be useful.

Also, since the endpoint ID has to be passed through in order to build
that metric, it's worth storing & logging it, so that the information
is more easily available (without having to cross-reference the
console DB).
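
As a rough illustration of the split described above (not the actual
implementation in this commit), the two values could be computed along
these lines. The `VMState` type, its `stuck` and `endpoint_id` fields,
and `count_stuck` are hypothetical names; the sketch assumes a pooled
VM that is not yet bound has no endpoint ID.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass
class VMState:
    # Hypothetical record for illustration; not a type from the real codebase.
    name: str
    stuck: bool                 # autoscaling is currently reported as stuck
    endpoint_id: Optional[str]  # None or "" while the VM is an unbound pool member


def count_stuck(vms: Iterable[VMState]) -> Tuple[int, int]:
    """Return (all stuck VMs, stuck VMs that have an endpoint bound)."""
    total = 0
    with_endpoint = 0
    for vm in vms:
        if not vm.stuck:
            continue
        total += 1
        # Only VMs bound to an endpoint are actually running something.
        if vm.endpoint_id:
            with_endpoint += 1
    return total, with_endpoint
```

Reporting both numbers keeps the existing "stuck" count intact while
letting dashboards and alerts focus on the second one.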
fb7a4506