insight-agent Component Status Explanation
In the AI platform, Insight acts as a multi-cluster observability product. To collect data from multiple clusters in a unified way, users need to install the Helm App insight-agent in each cluster (by default it is installed in the insight-system namespace). Refer to How to Install insight-agent.
Status Explanation
In the "Observability" -> "Collection Management" section, you can view the installation status of insight-agent in each cluster.
- Not Installed: insight-agent is not installed in the insight-system namespace of the cluster.
- Running: insight-agent is successfully installed in the cluster, and all deployed components are running.
- Error: Either the Helm deployment failed, or some of the deployed components are not in a running state.
You can troubleshoot using the following steps:
- Check the status of the insight-agent Helm release. If the status is deployed, proceed to the next step. If it is failed, uninstall and reinstall insight-agent from Container Management -> Helm Apps, because a failed release may affect later application upgrades.
- Check the status of the deployed components, either from the command line or in Insight -> Data Collection. If any Pods are not in the Running state, restart the containers that are in an abnormal state.
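The two checks above can be sketched from the command line as follows. This is a minimal sketch, not necessarily the exact commands the platform documentation intends: it assumes that helm and kubectl are configured for the target cluster and that insight-agent uses the default insight-system namespace. The function names (next_step, show_release_status, show_component_pods) are hypothetical helpers introduced only for this illustration.

```shell
#!/usr/bin/env bash
# Assumption: insight-agent is installed in the default insight-system namespace.
NAMESPACE="insight-system"

# Map a Helm release status (the STATUS column of `helm list`) to the
# next troubleshooting step described in the text.
next_step() {
  case "$1" in
    deployed) echo "check-pods" ;;  # release is healthy, inspect the Pods next
    failed)   echo "reinstall" ;;   # uninstall and reinstall from Helm Apps
    *)        echo "unknown" ;;     # other statuses are not covered by the text
  esac
}

# Step 1: list the Helm releases in the namespace and read the STATUS column.
show_release_status() {
  helm list -n "$NAMESPACE"
}

# Step 2: list the Pods in the namespace and look for any whose STATUS
# is not Running.
show_component_pods() {
  kubectl get pods -n "$NAMESPACE"
}
```

After sourcing the script, run show_release_status first and, if the release is deployed, run show_component_pods to find the Pods that need a restart.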
Additional Notes
- The resource consumption of the Prometheus metric collection component in insight-agent is directly proportional to the number of Pods running in the cluster. Adjust the resources for Prometheus according to the cluster size. Refer to Prometheus Resource Planning.
- The storage capacity required by the vmstorage metric storage component in the global service cluster is directly proportional to the total number of Pods across all clusters.
    - Contact the platform administrator to adjust the disk capacity of vmstorage based on the cluster size. Refer to vmstorage Disk Capacity Planning.
    - Expand the vmstorage disk as the multi-cluster scale grows. Refer to vmstorage Disk Expansion.
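Because both sizing notes depend on the number of Pods, it can help to check the current Pod count before consulting the planning documents. The following is a minimal sketch, assuming kubectl is configured for the cluster in question. The helper names count_rows and count_cluster_pods are hypothetical. The count includes Pods in every namespace and in every phase, so treat it as a rough upper bound, not an exact sizing input.

```shell
#!/usr/bin/env bash
# Count the non-empty lines read from stdin; prints 0 for empty input.
count_rows() {
  awk 'NF { n++ } END { print n + 0 }'
}

# Count all Pods in the current cluster, across every namespace.
# --no-headers drops the column header so each output line is one Pod.
count_cluster_pods() {
  kubectl get pods --all-namespaces --no-headers | count_rows
}
```

Run count_cluster_pods against each cluster and sum the results to estimate the total that drives vmstorage capacity.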