We can get configuration details for all servers in a single API call with the Compute list-servers-detailed API, /servers/detail ( https://docs.openstack.org/api-ref/compute/?expanded=list-servers-detail,list-servers-detailed-detail#list-servers-detailed ).
However, it provides only configuration data, not runtime metrics.
For runtime metrics, the Compute diagnostics API, /servers/{server_id}/diagnostics ( https://docs.openstack.org/api-ref/compute/?expanded=show-server-diagnostics-detail#show-server-diagnostics ), can be used.
It requires a server_id, so a separate API call is needed for each instance.
It would be good to have an API that returns this diagnostics information for all instances in a single call, like /servers/detail.
Hello,
You've hit on a common challenge when monitoring OpenStack instances. While the /servers/detail API provides a wealth of configuration information, the lack of a single-call API for retrieving runtime diagnostics for all instances is inefficient for monitoring and troubleshooting at scale. Your suggestion of a /servers/diagnostics endpoint that mirrors /servers/detail but returns runtime metrics makes sense.
Unfortunately, based on the OpenStack Compute API reference you linked, there is no built-in endpoint that retrieves diagnostics for all servers in a single call. The /servers/{server_id}/diagnostics endpoint fetches diagnostics for a single server identified by its ID.
However, there are a few strategies and tools you might consider to work around this limitation:
1. Scripting and Automation:
You can write a script (for example, in Python with openstacksdk, or around the openstack CLI provided by python-openstackclient) that does the following:
Fetch the list of all server IDs using the /servers/detail API.
Iterate through the list of server IDs.
For each server ID, call the /servers/{server_id}/diagnostics API.
Collect and aggregate the diagnostic data.
This approach involves multiple API calls but can be automated to gather the information you need.
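The steps above can be sketched roughly as follows. This is a minimal illustration that talks to the Compute REST API directly using only the Python standard library, not an official client. The names collect_diagnostics and http_get_json are made up for this example, and compute_url and token are assumed to be a Compute endpoint and a valid Keystone token you have already obtained. It uses the lighter /servers listing because only the IDs are needed, ignores pagination, and records an error per server instead of stopping, since the diagnostics call may be rejected (for example, for instances that are not running or when the caller lacks the required role).

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def http_get_json(url, token):
    """GET a URL with a Keystone token and decode the JSON response body."""
    req = urllib.request.Request(
        url, headers={"X-Auth-Token": token, "Accept": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def collect_diagnostics(compute_url, token, get_json=http_get_json, workers=8):
    """Return {server_id: diagnostics} for every server in the listing.

    A failed diagnostics call is stored as {"error": "<message>"} so that
    one bad instance does not abort the whole run.
    Assumption: the server list fits in one page (no marker/limit handling).
    """
    servers = get_json(f"{compute_url}/servers", token)["servers"]
    server_ids = [server["id"] for server in servers]

    def fetch(server_id):
        url = f"{compute_url}/servers/{server_id}/diagnostics"
        try:
            return server_id, get_json(url, token)
        except Exception as exc:  # e.g. an HTTP error for this one server
            return server_id, {"error": str(exc)}

    # One request per server is still required; the thread pool only
    # overlaps the calls so the total wall-clock time is lower.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(fetch, server_ids))
```

Calling collect_diagnostics(compute_url, token) returns a single dictionary keyed by server ID, which is about as close to a one-call view as the current API allows. The get_json parameter exists only so the HTTP layer can be swapped out, for instance with a stub during testing.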
2. Leveraging OpenStack Telemetry (Ceilometer and Gnocchi):
Ceilometer: This OpenStack project is designed for collecting metering and monitoring data. It gathers various metrics from different OpenStack services, including Nova (Compute). You can configure Ceilometer to collect metrics like CPU utilization, memory usage, disk I/O, network traffic, etc., for your instances.
Gnocchi: This project provides a time-series database for metrics collected by Ceilometer. It allows you to query and retrieve historical and current metric data for your instances.
By setting up and utilizing Ceilometer and Gnocchi, you can have a centralized system for monitoring runtime metrics across all your instances, although it involves querying the Telemetry API rather than the Compute API directly.
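As a rough sketch of what querying Gnocchi for one instance could look like: the helpers below build the measures URL for a named metric on an instance resource and pick the newest value from the response, which Gnocchi returns as a list of [timestamp, granularity, value] entries. The function names are hypothetical, and the metric name used in the example (memory.usage) is an assumption; which metrics exist depends on how Ceilometer is configured in your deployment.

```python
from datetime import datetime


def measures_url(gnocchi_url, instance_id, metric_name):
    """Build the Gnocchi URL for the measures of one metric of one instance."""
    return (
        f"{gnocchi_url}/v1/resource/instance/{instance_id}"
        f"/metric/{metric_name}/measures"
    )


def latest_value(measures):
    """Return the value of the newest measure, or None if there are none.

    `measures` is the decoded JSON list of [timestamp, granularity, value]
    entries; timestamps are assumed to be ISO 8601 strings.
    """
    if not measures:
        return None
    newest = max(measures, key=lambda m: datetime.fromisoformat(m[0]))
    return newest[2]
```

For example, measures_url("http://gnocchi:8041", server_id, "memory.usage") gives the URL to request with your Keystone token, and latest_value() applied to the decoded response gives the most recent reading. The host and port here are placeholders.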
3. Infrastructure Monitoring Tools:
Various third-party infrastructure monitoring tools (e.g., Prometheus, Grafana, Zabbix, Datadog) can integrate with OpenStack to collect and visualize runtime metrics. These tools often provide agents that run on your instances or directly interact with the OpenStack APIs to gather data.
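If you go the Prometheus route without an off-the-shelf exporter, one possible approach is to convert collected diagnostics into the Prometheus text exposition format and serve or push that text yourself. The sketch below assumes a dictionary of {server_id: diagnostics}, emits only top-level numeric fields, and uses a made-up metric prefix (openstack_server_diagnostics); nested fields and string values are skipped, and server IDs are assumed to be plain UUIDs that need no label escaping.

```python
import re


def to_prometheus_text(diagnostics_by_server, prefix="openstack_server_diagnostics"):
    """Render top-level numeric diagnostics fields as Prometheus text lines."""
    lines = []
    for server_id, diag in sorted(diagnostics_by_server.items()):
        for key, value in sorted(diag.items()):
            # Skip booleans, strings, and nested structures; keep plain numbers.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                continue
            # Metric names may only contain letters, digits, and underscores here.
            name = re.sub(r"[^a-zA-Z0-9_]", "_", f"{prefix}_{key}")
            lines.append(f'{name}{{server_id="{server_id}"}} {value}')
    return "\n".join(lines) + "\n"
```

Each output line has the form metric_name{server_id="..."} value, which Prometheus can scrape from an HTTP endpoint that returns this text.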
4. Custom OpenStack API Extensions (Advanced):
For more advanced users and if you have the ability to modify your OpenStack deployment, you could potentially develop a custom API extension for the Nova service. This extension could expose an endpoint that retrieves aggregated diagnostic information for all servers. However, this is a significant development effort.