Disk counters:
· PhysicalDisk\Avg. Disk sec/Read: This measures the average time, in seconds,
to read data from the disk. A value above 25 ms (0.025) means the disk system is
experiencing latency when reading from the disk. For mission-critical servers,
the acceptable threshold is much lower. The most logical solution is to replace
the current disk system with a faster one.
· PhysicalDisk\Avg. Disk sec/Write: This measures the average time, in seconds,
to write data to the disk. A value above 25 ms (0.025) means the disk system is
experiencing latency when writing to the disk. For mission-critical servers, the
acceptable threshold is much lower. The likely solution is to replace the disk
system with a faster one.
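As a rough illustration of the 25 ms check described above, here is a minimal Python sketch. The function name and the sample values are made up for the example; the only things taken from the text are the counter names, the fact that the counters report seconds, and the 25 ms threshold.

```python
# Threshold quoted in the text above; mission-critical servers would use a lower one.
LATENCY_THRESHOLD_MS = 25.0


def exceeds_latency_threshold(avg_seconds, threshold_ms=LATENCY_THRESHOLD_MS):
    """Return True if a perfmon latency value (reported in seconds) is above the threshold."""
    return avg_seconds * 1000.0 > threshold_ms


# Hypothetical readings, in seconds, as perfmon would report them.
samples = {
    "PhysicalDisk\\Avg. Disk sec/Read": 0.031,
    "PhysicalDisk\\Avg. Disk sec/Write": 0.008,
}

for counter, seconds in samples.items():
    status = "latency" if exceeds_latency_threshold(seconds) else "ok"
    print(f"{counter}: {seconds * 1000:.1f} ms ({status})")
```

Passing a smaller `threshold_ms` is one way to model the stricter limit a mission-critical server would need.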
Summary:
The performance monitoring metric people quote most often is disk queue length.
While this is an important counter, on SAN systems it is almost impossible to
use it as an accurate metric.
Why?
Because the usual “rule of thumb” for bad performance is a queue length greater
than 2 per disk drive. However, when you have a SAN with 100 drives, you have no
idea how many of them are actually serving your drive.
When you start focusing on response time, which is what really matters most, the
queue length becomes largely irrelevant.
1. The counters report seconds, so when you read the perfmon data and see a
number like “.010”, this means 10 milliseconds.
3. Note that performance problems are often tied to firmware revisions, HBA
configuration, and/or BIOS issues.
4. I use the following table to interpret the data:
<10ms excellent
<20ms reasonable
>20ms bad
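The conversion and the table above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and because the table does not say how to treat exactly 20 ms, the sketch assumes that 20 ms and above counts as bad.

```python
def rate_latency(avg_seconds):
    """Rate a perfmon disk latency value (in seconds) using the table above."""
    ms = avg_seconds * 1000.0  # e.g. .010 seconds is 10 milliseconds
    if ms < 10:
        return "excellent"
    if ms < 20:
        return "reasonable"
    # Assumption: the table leaves exactly 20 ms undefined; treat it as bad.
    return "bad"


# Hypothetical readings, in seconds.
for value in (0.004, 0.015, 0.030):
    print(f"{value:.3f} s = {value * 1000:.0f} ms -> {rate_latency(value)}")
```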
CPU counters:
· System\Processor Queue Length: The number of threads in the processor queue.
The server doesn't have enough processor power if the value is more than two
times the number of CPUs.
· Processor\% Processor Time: The percentage of elapsed time the processor
spends executing non-idle threads. If the percentage is greater than 85 percent,
the processor is overwhelmed and the server may require more processing power.
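The two CPU rules above can be combined into one small check. The following Python sketch is illustrative only; the function name, its parameters, and the sample numbers are assumptions made for the example, while the thresholds (two times the CPU count, 85 percent) come from the text.

```python
def cpu_warnings(queue_length, cpu_count, processor_time_pct):
    """Return a list of warnings based on the two CPU thresholds above."""
    warnings = []
    # Processor Queue Length: more than two threads waiting per CPU.
    if queue_length > 2 * cpu_count:
        warnings.append(
            f"Processor Queue Length {queue_length} exceeds {2 * cpu_count} "
            f"(2 x {cpu_count} CPUs)"
        )
    # % Processor Time: sustained use above 85 percent.
    if processor_time_pct > 85:
        warnings.append(f"% Processor Time {processor_time_pct}% exceeds 85%")
    return warnings


# Hypothetical reading from a 4-CPU server.
for warning in cpu_warnings(queue_length=10, cpu_count=4, processor_time_pct=92):
    print(warning)
```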
Memory counters:
· Memory\Cache Bytes: This indicates the amount of memory being used for the
file system cache. There may be a disk bottleneck if this value is greater than
300 MB.
· Memory\Available MBytes: At least 10% of memory should be free and available.
Less than that usually indicates insufficient memory, which can increase paging
activity. Consider adding more RAM if that happens.
· Memory\Pages/sec: Should not be higher than 1000. A higher number usually
indicates that a memory leak may be occurring.
· Paging File\% Usage: Should not be greater than 10%.
· Memory\% Committed Bytes In Use: This measures the amount of virtual memory in
use. A value greater than 80 percent indicates insufficient memory. The solution
is to add more memory.
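To show how the memory thresholds above fit together, here is a minimal Python sketch. The function, the dictionary keys, and the sample readings are hypothetical, and reading "300 MB" as 300 * 1024 * 1024 bytes is an assumption; the thresholds themselves are the ones listed above.

```python
MB = 1024 * 1024  # assumption: "300 MB" is read as 300 * 2**20 bytes


def memory_warnings(sample, total_memory_mb):
    """Compare one set of memory counter readings with the thresholds above.

    `sample` is a hypothetical dict keyed by counter name.
    """
    warnings = []
    if sample["Cache Bytes"] > 300 * MB:
        warnings.append("Cache Bytes above 300 MB: possible disk bottleneck")
    if sample["Available MBytes"] < 0.10 * total_memory_mb:
        warnings.append("less than 10% of memory available")
    if sample["Pages/sec"] > 1000:
        warnings.append("Pages/sec above 1000")
    if sample["Paging File % Usage"] > 10:
        warnings.append("paging file usage above 10%")
    if sample["% Committed Bytes In Use"] > 80:
        warnings.append("committed bytes in use above 80%")
    return warnings


# Hypothetical readings from a server with 16 GB of RAM.
reading = {
    "Cache Bytes": 120 * MB,
    "Available MBytes": 900,
    "Pages/sec": 250,
    "Paging File % Usage": 4,
    "% Committed Bytes In Use": 86,
}
for warning in memory_warnings(reading, total_memory_mb=16384):
    print(warning)
```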
Network counters:
· Network Interface\Output Queue Length: Measures the length of the output
packet queue, in packets.
healthy – 0
caution – 1-2
critical – >2
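The three bands above map directly onto a small classifier. This Python sketch is only an illustration with a hypothetical function name; it also assumes that an averaged, fractional reading between 0 and 2 belongs in the caution band, which the list above does not spell out.

```python
def classify_output_queue(length):
    """Classify a Network Interface\\Output Queue Length reading."""
    if length <= 0:
        return "healthy"
    # Assumption: fractional averages between 0 and 2 are treated as caution.
    if length <= 2:
        return "caution"
    return "critical"


# Hypothetical readings, in packets.
for packets in (0, 1, 2, 5):
    print(f"{packets} packets -> {classify_output_queue(packets)}")
```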