Nvidia-Smi

Description

This connector provides hardware information about most Nvidia GPUs (clocking).

Tags: hardware, linux, nvidia, windows

Target

Typical platforms: Nvidia, Microsoft Windows, Linux

Operating systems: Microsoft Windows, Linux

Prerequisites

Leverages: NVIDIA drivers with NVIDIA-SMI support.

Technology and protocols: Commands

Examples

CLI

metricshub HOSTNAME -t win -c +NvidiaSmi --wmi -u USER

metricshub.yaml

resourceGroups:
  <RESOURCE_GROUP>:
    resources:
      <HOSTNAME-ID>:
        attributes:
          host.name: <HOSTNAME> # Change with actual host name
          host.type: win
        connectors: [ +NvidiaSmi ] # Optional, to load only this connector
        protocols:
          wmi:
            username: <USERNAME> # Change with actual credentials
            password: <PASSWORD> # Encrypted using metricshub-encrypt

Connector Activation Criteria

The Nvidia-Smi connector is automatically activated, and its status is reported as OK, if all of the criteria below are met:

  • The command below succeeds on the monitored host:
    • Command: nvidia-smi
    • Output contains: Driver Version (regex)
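
The criterion above can be reproduced by hand when troubleshooting a host. The following is a minimal sketch in Python, not the connector's own implementation: it runs nvidia-smi on the local machine, whereas MetricsHub runs the command on the monitored host through the configured protocol. The function names and the timeout value are hypothetical and chosen for illustration only.

```python
import re
import subprocess

# Pattern taken from the activation criterion: "Driver Version", applied as a regex.
DRIVER_VERSION_PATTERN = re.compile(r"Driver Version")


def output_meets_criterion(output: str) -> bool:
    """Return True if the command output contains a match for the pattern."""
    return DRIVER_VERSION_PATTERN.search(output) is not None


def nvidia_smi_available(timeout_seconds: float = 30.0) -> bool:
    """Run nvidia-smi locally and apply the same check (hypothetical helper)."""
    try:
        result = subprocess.run(
            ["nvidia-smi"],
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        # The command is missing, not executable, or did not return in time.
        return False
    return result.returncode == 0 and output_meets_criterion(result.stdout)


if __name__ == "__main__":
    print("criterion met" if nvidia_smi_available() else "criterion not met")
```

If this check fails on a host where an NVIDIA GPU is installed, the likely cause is that the NVIDIA driver is missing or that nvidia-smi is not on the PATH of the account used for monitoring.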

Metrics

| Type        | Collected Metrics                                    | Specific Attributes |
|-------------|------------------------------------------------------|---------------------|
| enclosure   | hw.status{hw.type="enclosure", state="present"}      | -                   |
| fan         | hw.fan.speed_ratio                                   | hw.parent.type      |
|             | hw.fan.speed_ratio.limit{limit_type="low.critical"}  | id                  |
|             | hw.fan.speed_ratio.limit{limit_type="low.degraded"}  | name                |
|             | hw.status{hw.type="fan", state="present"}            | sensor_location     |
| gpu         | hw.energy{hw.type="gpu"}                             | driver_version      |
|             | hw.gpu.io{direction="receive"}                       | firmware_version    |
|             | hw.gpu.io{direction="transmit"}                      | hw.parent.type      |
|             | hw.gpu.memory.limit                                  | id                  |
|             | hw.gpu.memory.utilization                            | info                |
|             | hw.gpu.utilization{task="decoder"}                   | model               |
|             | hw.gpu.utilization{task="encoder"}                   | name                |
|             | hw.gpu.utilization{task="general"}                   | serial_number       |
|             | hw.power{hw.type="gpu"}                              | vendor              |
|             | hw.status{hw.type="gpu", state="present"}            |                     |
| temperature | hw.status{hw.type="temperature", state="present"}    | hw.parent.type      |
|             | hw.temperature                                       | id                  |
|             | hw.temperature.limit{limit_type="high.critical"}     | name                |
|             | hw.temperature.limit{limit_type="high.degraded"}     | sensor_location     |
| voltage     | hw.status{hw.type="voltage", state="present"}        | hw.parent.type      |
|             | hw.voltage                                           | id                  |
|             |                                                      | name                |
|             |                                                      | sensor_location     |
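
To see the kind of raw data nvidia-smi exposes for GPU metrics like those listed above, the query mode of the tool can be parsed directly. The sketch below is illustrative only: the selection of query fields, the function names, and the handling of unsupported values are assumptions, and the page does not state which nvidia-smi options the connector actually uses or how it maps values to the hw.* metrics.

```python
import csv
import io
import subprocess

# Illustrative selection of `nvidia-smi --query-gpu` fields (an assumption,
# not the connector's actual query).
QUERY_FIELDS = [
    "name",
    "driver_version",
    "temperature.gpu",
    "utilization.gpu",
    "memory.total",
    "power.draw",
]
NUMERIC_FIELDS = {"temperature.gpu", "utilization.gpu", "memory.total", "power.draw"}


def _to_float(value):
    """Convert a numeric field; unsupported values such as "[N/A]" become None."""
    try:
        return float(value)
    except ValueError:
        return None


def parse_query_output(text, fields=QUERY_FIELDS):
    """Parse CSV output (no header, no units) into one dict per GPU."""
    gpus = []
    for record in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not record:
            continue  # skip blank lines
        gpu = {}
        for field, value in zip(fields, record):
            value = value.strip()
            gpu[field] = _to_float(value) if field in NUMERIC_FIELDS else value
        gpus.append(gpu)
    return gpus


def query_gpus():
    """Run nvidia-smi locally in query mode and parse the result."""
    command = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader,nounits",
    ]
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return parse_query_output(result.stdout)


if __name__ == "__main__":
    for gpu in query_gpus():
        print(gpu)
```

With the nounits format option, nvidia-smi reports temperature in degrees Celsius, utilization in percent, memory in MiB, and power in watts, so any comparison with the values reported by MetricsHub needs to account for unit differences.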