AMD Radeon (ROCm SMI)
Description
This connector provides hardware information about AMD Radeon GPUs.
Target
Typical platform: AMD
Operating system: Linux
Prerequisites
Leverages: ROCm drivers with rocm-smi support.
Technology and protocols: Commands
Examples
CLI
metricshub HOSTNAME -t linux -c +AMDRadeon --ssh -u USER
metricshub.yaml
resourceGroups:
  <RESOURCE_GROUP>:
    resources:
      <HOSTNAME-ID>:
        attributes:
          host.name: <HOSTNAME> # Change with actual host name
          host.type: linux
        connectors: [ +AMDRadeon ] # Optional, to load only this connector
        protocols:
          ssh:
            username: <USERNAME> # Change with actual credentials
            password: <PASSWORD> # Encrypted using metricshub-encrypt
Connector Activation Criteria
The AMD Radeon (ROCm SMI) connector is automatically activated, and its status is reported as OK, if all the criteria below are met:
- The command below succeeds on the monitored host:
  - Command: rocm-smi
  - Output contains: ROCm System Management Interface (regex)
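To check this criterion by hand before enabling the connector, the test can be approximated in a shell on the monitored host. This is a minimal sketch, not the connector's actual implementation: the function name check_rocm_smi_banner is hypothetical, and the sketch assumes that rocm-smi is on the PATH and prints its banner when run without arguments. It checks only the output pattern, not the exit status of rocm-smi.

```shell
#!/bin/sh
# Hypothetical helper (not part of MetricsHub or ROCm): reads command
# output on stdin and succeeds when it matches the regular expression
# listed in the activation criteria.
check_rocm_smi_banner() {
    grep -Eq 'ROCm System Management Interface'
}

# Usage on the monitored host (assumes rocm-smi is on the PATH):
#   rocm-smi 2>&1 | check_rocm_smi_banner && echo "criterion met"
```

If the helper returns a non-zero status, the connector is unlikely to activate for that host, and the ROCm driver installation is the first thing to check.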
Metrics
| Type | Collected Metrics | Specific Attributes |
|---|---|---|
| enclosure | hw.status{hw.type="enclosure", state="present"} | - |
| fan | hw.fan.speed_ratio<br>hw.status{hw.type="fan", state="present"} | hw.parent.type<br>id<br>name<br>sensor_location |
| gpu | hw.energy{hw.type="gpu"}<br>hw.gpu.memory.bandwidth<br>hw.gpu.memory.utilization<br>hw.gpu.speed<br>hw.gpu.utilization<br>hw.power.limit{hw.type="gpu"}<br>hw.power{hw.type="gpu"}<br>hw.status{hw.type="gpu", state="present"} | hw.parent.type<br>id<br>info<br>model<br>name<br>performance_level<br>serial_number<br>vendor |
| temperature | hw.status{hw.type="temperature", state="present"}<br>hw.temperature | hw.parent.type<br>id<br>name<br>sensor_location |
| voltage | hw.status{hw.type="voltage", state="present"}<br>hw.voltage | hw.parent.type<br>id<br>name<br>sensor_location |