pcp-atop(1)

Advanced System and Process Monitor

OUTPUT DESCRIPTION - SYSTEM LEVEL

The system level information consists of the following output lines:

PRC Process and thread level totals. This line contains the total cpu time consumed in system mode (`sys') and in user mode (`user'), the total number of processes present at this moment (`#proc'), the total number of threads present at this moment in state `running' (`#trun'), `sleeping interruptible' (`#tslpi') and `sleeping uninterruptible' (`#tslpu'), the number of zombie processes (`#zombie'), the number of clone system calls (`clones'), and the number of processes that ended during the interval (`#exit') when process accounting is used. Instead of `#exit' the last column may indicate that process accounting could not be activated (`no procacct'). If the screen-width does not allow all of these counters, only a relevant subset is shown.

CPU CPU utilization. At least one line is shown for the total occupation of all CPUs together. In case of a multi-processor system, an additional line is shown for every individual processor (with `cpu' in lower case), sorted on activity. Inactive CPUs will not be shown by default. The lines showing the per-cpu occupation contain the cpu number in the field combined with the wait percentage.

Every line contains the percentage of cpu time spent in kernel mode by all active processes (`sys'), the percentage of cpu time consumed in user mode (`user') for all active processes (including processes running with a nice value larger than zero), the percentage of cpu time spent for interrupt handling (`irq') including softirq, the percentage of unused cpu time while no processes were waiting for disk I/O (`idle'), and the percentage of unused cpu time while at least one process was waiting for disk I/O (`wait'). In case of per-cpu occupation, the line shows the cpu number and the wait percentage (`w') for that cpu. The number of lines showing the per-cpu occupation can be limited.

For virtual machines, the steal-percentage (`steal') shows the percentage of cpu time stolen by other virtual machines running on the same hardware. For physical machines hosting one or more virtual machines, the guest-percentage (`guest') shows the percentage of cpu time used by the virtual machines. Notice that this percentage overlaps the user-percentage!

When performance monitoring counters (PMC) are supported by the CPU and the kernel (and pmdaperfevent(1) runs with root privileges), the number of instructions per CPU cycle (`ipc') is shown. The first sample always shows the value `initial', because the counters are just activated at the moment that pcp-atop is started. When the CPU busy percentage is high and the IPC is less than 1.0, it is likely that the CPU is frequently waiting for memory access during instruction execution (larger CPU caches or faster memory might be helpful to improve performance). When the CPU busy percentage is high and the IPC is greater than 1.0, it is likely that the CPU is instruction-bound (more/faster cores might be helpful to improve performance). Furthermore, per CPU the effective number of cycles (`cycl') is shown. This value can reach the current CPU frequency if that CPU is 100% busy. When an idle CPU is halted, the number of effective cycles can be (considerably) lower than the current frequency. Notice that the average instructions per cycle and the number of cycles are shown in the CPU line for all CPUs. See also: http://www.brendangregg.com/blog/2017-05-09/cpu-utilization-is-wrong.html
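
As an illustration of the interpretation rule above, the following sketch classifies a CPU from its busy percentage and raw instruction/cycle counts. The counter values and the 80% "high busy" threshold are assumptions made for this example only; the real figures come from the hardware counters via pmdaperfevent(1).

    # Minimal sketch of the IPC interpretation described above; the input
    # values are hypothetical, real counts come from the PMC hardware.
    def classify_cpu(busy_pct, instructions, cycles):
        if cycles == 0:
            return "initial"          # counters not active yet (first sample)
        ipc = instructions / cycles
        if busy_pct < 80:             # "high busy" threshold chosen for this example
            return f"ipc={ipc:.2f} (CPU not saturated, IPC less meaningful)"
        if ipc < 1.0:
            return f"ipc={ipc:.2f}: likely waiting for memory access"
        return f"ipc={ipc:.2f}: likely instruction-bound"

    print(classify_cpu(95, 2_500_000_000, 3_000_000_000))   # ipc=0.83: likely waiting for memory access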

In case of frequency scaling, all previously mentioned CPU percentages are relative to the used scaling of the CPU during the interval. If a CPU has been active for e.g. 50% in user mode during the interval while the frequency scaling of that CPU was 40%, only 20% of the full capacity of the CPU has been used in user mode.
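
The example above works out as follows (a minimal sketch, using the numbers from the text):

    # 50% of the interval spent in user mode while the CPU ran at 40% of
    # its maximum frequency: only 20% of the full capacity was used.
    user_pct = 50
    scaling_pct = 40
    effective_pct = user_pct * scaling_pct / 100
    print(effective_pct)   # 20.0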

If the screen-width does not allow all of these counters, only a relevant subset is shown.

CPL CPU load information. This line contains the load average figures reflecting the number of threads that are available to run on a CPU (i.e. part of the runqueue) or that are waiting for disk I/O. These figures are averaged over 1 (`avg1'), 5 (`avg5') and 15 (`avg15') minutes. Furthermore the number of context switches (`csw'), the number of serviced interrupts (`intr') and the number of available CPUs are shown.
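
For reference, the same avg1/avg5/avg15 figures are exposed by the kernel in /proc/loadavg; pcp-atop itself obtains them through PCP metrics, so the snippet below only illustrates what the numbers represent.

    # Read the 1-, 5- and 15-minute load averages directly from the kernel.
    with open("/proc/loadavg") as f:
        avg1, avg5, avg15 = map(float, f.read().split()[:3])
    print(f"avg1={avg1} avg5={avg5} avg15={avg15}")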

If the screen-width does not allow all of these counters, only a relevant subset is shown.

GPU GPU utilization (Nvidia). Read the section GPU STATISTICS GATHERING in this document to find the details about the activation of the pmdanvidia daemon.

In the first column of every line, the bus-id (last nine characters) and the GPU number are shown. The subsequent columns show the percentage of time that one or more kernels were executing on the GPU (`gpubusy'), the percentage of time that global (device) memory was being read or written (`membusy'), the occupation percentage of memory (`memocc'), the total memory (`total'), the memory being in use at the moment of the sample (`used'), the average memory being in use during the sample time (`usavg'), the number of processes being active on the GPU at the moment of the sample (`#proc'), and the type of GPU.

If the screen-width does not allow all of these counters, only a relevant subset is shown. The number of lines showing the GPUs can be limited.

MEM Memory occupation. This line contains the total amount of physical memory (`tot'), the amount of memory which is currently free (`free'), the amount of memory in use as page cache including the total resident shared memory (`cache'), the amount of memory within the page cache that has to be flushed to disk (`dirty'), the amount of memory used for filesystem meta data (`buff'), the amount of memory being used for kernel mallocs (`slab'), the amount of slab memory that is reclaimable (`slrec'), the resident size of shared memory including tmpfs (`shmem'), the resident size of shared memory (`shrss'), the amount of shared memory that is currently swapped (`shswp'), the amount of memory that is currently claimed by VMware's balloon driver (`vmbal'), the amount of memory that is currently claimed by the ARC (cache) of ZFSonlinux (`zfarc'), the amount of memory that is claimed for huge pages (`hptot'), and the amount of huge page memory that is really in use (`hpuse').

If the screen-width does not allow all of these counters, only a relevant subset is shown.

SWP Swap occupation and overcommit info. This line contains the total amount of swap space on disk (`tot'), the amount of free swap space (`free'), the size of the swap cache (`swcac'), the total size of compressed storage in zswap (`zpool'), the total size of the compressed pages stored in zswap (`zstor'), the total size of the memory used for KSM (`ksuse', i.e. shared), and the total size of the memory saved (deduped) by KSM (`kssav', i.e. sharing). Furthermore the committed virtual memory space (`vmcom') and the maximum limit of the committed space (`vmlim', which is by default swap size plus 50% of memory size) are shown. The committed space is the reserved virtual space for all allocations of private memory space for processes. The kernel only verifies whether the committed space exceeds the limit if strict overcommit handling is configured (vm.overcommit_memory is 2).
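
As a worked example of the default `vmlim' rule mentioned above (swap size plus 50% of memory size, i.e. the kernel default vm.overcommit_ratio of 50), assume a machine with 16 GiB of memory and 8 GiB of swap:

    # Default committed-space limit: swap + 50% of physical memory.
    mem_gib = 16
    swap_gib = 8
    vmlim_gib = swap_gib + 0.50 * mem_gib
    print(vmlim_gib)   # 16.0 GiB; only enforced when vm.overcommit_memory is 2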

PAG Paging frequency. This line contains the number of scanned pages (`scan') due to the fact that free memory drops below a particular threshold, and the number of times that the kernel tries to reclaim pages due to an urgent need (`stall'). Also the number of memory pages the system read from swap space (`swin'), the number of memory pages the system wrote to swap space (`swout'), and the number of OOM (out-of-memory) kills (`oomkill') are shown.

PSI Pressure Stall Information. This line contains percentages about resource pressure related to CPU, memory and I/O. Certain percentages refer to 'some' meaning that some processes/threads were delayed due to resource overload. Other percentages refer to 'full' meaning a loss of overall throughput due to resource overload. The values `cpusome', `memsome', `memfull', `iosome' and `iofull' show the pressure percentage during the entire interval. The values `cs' (cpu some), `ms' (memory some), `mf' (memory full), `is' (I/O some) and `if' (I/O full) each show three percentages separated by slashes: pressure percentage over the last 10, 60 and 300 seconds.
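
The 10/60/300-second pressure percentages correspond to the avg10/avg60/avg300 fields the kernel exposes under /proc/pressure; pcp-atop reads them through PCP metrics, so the sketch below only illustrates the underlying data.

    # Parse the "some" and "full" pressure averages for one resource.
    def read_pressure(resource):          # resource: "cpu", "memory" or "io"
        result = {}
        with open(f"/proc/pressure/{resource}") as f:
            for line in f:                # lines start with "some" or "full"
                kind, *fields = line.split()
                values = dict(item.split("=") for item in fields)
                result[kind] = (values["avg10"], values["avg60"], values["avg300"])
        return result

    print(read_pressure("memory"))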

LVM/MDD/DSK Logical volume/multiple device/disk utilization. Per active unit one line is produced, sorted on unit activity. Such line shows the name (e.g. VolGroup00-lvtmp for a logical volume or sda for a hard disk), the busy percentage, i.e. the portion of time that the unit was busy handling requests (`busy'), the number of read requests issued (`read'), the number of write requests issued (`write'), the number of KiBytes per read (`KiB/r'), the number of KiBytes per write (`KiB/w'), the number of MiBytes per second throughput for reads (`MBr/s'), the number of MiBytes per second throughput for writes (`MBw/s'), the average queue depth (`avq') and the average number of milliseconds needed by a request (`avio') for seek, latency and data transfer. If the screen-width does not allow all of these counters, only a relevant subset is shown.
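
The derived figures in this line relate to the raw counters roughly as sketched below (illustrative numbers and variable names, not pcp-atop internals):

    # How KiB/r, MBr/s, avio and busy relate to the raw per-unit counters.
    interval_s = 10.0      # sample interval in seconds
    reads, writes = 1200, 300
    kib_read = 9600        # KiB read during the interval
    busy_ms = 2500         # time the unit spent handling requests, in ms

    kib_per_read = kib_read / reads                   # KiB/r -> 8.0
    mbr_per_sec = kib_read / 1024 / interval_s        # MBr/s -> ~0.94
    avio_ms = busy_ms / (reads + writes)              # avio  -> ~1.67 ms per request
    busy_pct = 100 * busy_ms / (interval_s * 1000)    # busy  -> 25.0%
    print(kib_per_read, mbr_per_sec, avio_ms, busy_pct)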

The number of lines showing the units can be limited per class (LVM, MDD or DSK) with the 'l' key or statically (see separate man-page of pcp-atoprc(5)). By specifying the value 0 for a particular class, no lines will be shown any more for that class.

NFM Network Filesystem (NFS) mount at the client side. For each NFS-mounted filesystem, a line is shown that contains the mounted server directory, the name of the server (`srv'), the total number of bytes physically read from the server (`read') and the total number of bytes physically written to the server (`write'). Data transfer is subdivided into the number of bytes read via normal read() system calls (`nread'), the number of bytes written via normal write() system calls (`nwrit'), the number of bytes read via direct I/O (`dread'), the number of bytes written via direct I/O (`dwrit'), the number of bytes read via memory mapped I/O pages (`mread'), and the number of bytes written via memory mapped I/O pages (`mwrit').

NFC Network Filesystem (NFS) client side counters. This line contains the number of RPC calls issued by local processes (`rpc'), the number of read RPC calls (`read') and write RPC calls (`rpwrite') issued to the NFS server, the number of RPC calls being retransmitted (`retxmit') and the number of authorization refreshes (`autref').

NFS Network Filesystem (NFS) server side counters. This line contains the number of RPC calls received from NFS clients (`rpc'), the number of read RPC calls received (`cread'), the number of write RPC calls received (`cwrit'), the number of Megabytes/second returned to read requests by clients (`MBcr/s'), the number of Megabytes/second passed in write requests by clients (`MBcw/s'), the number of network requests handled via TCP (`nettcp'), the number of network requests handled via UDP (`netudp'), the number of reply cache hits (`rchits'), the number of reply cache misses (`rcmiss') and the number of uncached requests (`rcnoca'). Furthermore, some error counters are shown indicating the number of requests with a bad format (`badfmt') or a bad authorization (`badaut'), and a counter indicating the number of bad clients (`badcln').

NET Network utilization (TCP/IP). One line is shown for activity of the transport layer (TCP and UDP), one line for the IP layer and one line per active interface. For the transport layer, counters are shown concerning the number of received TCP segments including those received in error (`tcpi'), the number of transmitted TCP segments excluding those containing only retransmitted octets (`tcpo'), the number of UDP datagrams received (`udpi'), the number of UDP datagrams transmitted (`udpo'), the number of active TCP opens (`tcpao'), the number of passive TCP opens (`tcppo'), the number of TCP output retransmissions (`tcprs'), the number of TCP input errors (`tcpie'), the number of TCP output resets (`tcpor'), the number of UDP no ports (`udpnp'), and the number of UDP input errors (`udpie'). If the screen-width does not allow all of these counters, only a relevant subset is shown. These counters are related to IPv4 and IPv6 combined.

For the IP layer, counters are shown concerning the number of IP datagrams received from interfaces, including those received in error (`ipi'), the number of IP datagrams that local higher-layer protocols offered for transmission (`ipo'), the number of received IP datagrams which were forwarded to other interfaces (`ipfrw'), the number of IP datagrams which were delivered to local higher-layer protocols (`deliv'), the number of received ICMP datagrams (`icmpi'), and the number of transmitted ICMP datagrams (`icmpo'). If the screen-width does not allow all of these counters, only a relevant subset is shown. These counters are related to IPv4 and IPv6 combined.

For every active network interface one line is shown, sorted on the interface activity. Such line shows the name of the interface and its busy percentage in the first column. The busy percentage for half duplex is determined by comparing the interface speed with the number of bits transmitted and received per second; for full duplex the interface speed is compared with the highest of either the transmitted or the received bits. When the interface speed can not be determined (e.g. for the loopback interface), `---' is shown instead of the percentage. Furthermore the number of received packets (`pcki'), the number of transmitted packets (`pcko'), the line speed of the interface (`sp'), the effective amount of bits received per second (`si'), the effective amount of bits transmitted per second (`so'), the number of collisions (`coll'), the number of received multicast packets (`mlti'), the number of errors while receiving a packet (`erri'), the number of errors while transmitting a packet (`erro'), the number of received packets dropped (`drpi'), and the number of transmitted packets dropped (`drpo'). If the screen-width does not allow all of these counters, only a relevant subset is shown. The number of lines showing the network interfaces can be limited.
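
A minimal sketch of the busy-percentage rule described above, using made-up figures for a 1 Gbit/s interface:

    # Interface busy percentage: half duplex compares rx+tx against the
    # line speed, full duplex compares the larger of the two against it.
    speed_mbps = 1000        # line speed (`sp')
    rx_mbps = 120            # effective bits received per second (`si'), in Mbit/s
    tx_mbps = 700            # effective bits transmitted per second (`so'), in Mbit/s
    full_duplex = True

    if full_duplex:
        busy_pct = 100 * max(rx_mbps, tx_mbps) / speed_mbps   # 70.0
    else:
        busy_pct = 100 * (rx_mbps + tx_mbps) / speed_mbps     # 82.0
    print(busy_pct)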

IFB Infiniband utilization. For every active Infiniband port one line is shown, sorted on activity. Such line shows the name of the port and its busy percentage in the first column. The busy percentage is determined by taking the highest of either the transmitted or the received bits during the interval, multiplying that value by the number of lanes and comparing it against the maximum port speed. Furthermore the number of received packets divided by the number of lanes (`pcki'), the number of transmitted packets divided by the number of lanes (`pcko'), the maximum line speed (`sp'), the effective amount of bits received per second (`si'), the effective amount of bits transmitted per second (`so'), and the number of lanes (`lanes'). If the screen-width does not allow all of these counters, only a relevant subset is shown. The number of lines showing the Infiniband ports can be limited.
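
The corresponding calculation for an Infiniband port, again with made-up figures (four lanes, a 100 Gbit/s maximum port speed):

    # Infiniband busy percentage: the larger of received/transmitted bits,
    # multiplied by the number of lanes, compared against the port speed.
    lanes = 4
    max_port_speed_gbps = 100
    rx_gbps = 10             # bits received per second (`si'), in Gbit/s
    tx_gbps = 18             # bits transmitted per second (`so'), in Gbit/s

    busy_pct = 100 * max(rx_gbps, tx_gbps) * lanes / max_port_speed_gbps
    print(busy_pct)          # 72.0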