2006-10-15 22:53:46
System Monitoring.
A Unix systems administrator performs system monitoring in between other jobs. To stay well informed about the system at all times, the very first thing is to keep control of the system. If you work in a team, always tell your teammates about anything you do on the system that is going to affect performance. The system utilities and commands that collect and report vital statistics are:
dxi4 dxi4 3.2.0 V2.1.6 i386    01/02/98

00:00:01  runq-sz  %runocc  swpq-sz  %swpocc
01:00:01      2.0       81
02:00:01      2.0       81
03:00:01      2.0       82
04:00:00      2.0       82
05:00:01      2.2       84
06:00:01      2.1       82
07:00:03      2.7       99
08:00:01      2.2       75
08:20:01      2.3       80
08:40:01      4.1       98
09:00:03      5.2      100
09:20:00      4.4       98
09:40:01      3.4       95
10:00:01      3.5       97
10:20:01      3.6       97
10:40:00      3.5       93

Average       2.6       86
Cpu workload Management.
To find out the system load average, the most commonly used command is
uptime
3:04pm up 4 day(s), 10:37, 16 users, load average: 0.11, 0.10, 0.12
Here it tells us that the current time is 3:04 pm, the system has been up for 4 days, 10 hours and 37 minutes, there are 16 users logged in, and the load average was 0.11 over the last minute, 0.10 over the last five minutes and 0.12 over the last fifteen minutes.
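As a rough sketch, the three load averages can be pulled out of an uptime line with a few lines of Python. The function name parse_load_averages is hypothetical, and the pattern assumes the "load average: a, b, c" layout shown above; other systems may format the line differently.

```python
import re

def parse_load_averages(uptime_line):
    """Return the three load averages from an uptime line as floats."""
    # Accept both "load average:" and "load averages:" (an assumption
    # about common variants, not something the sample above shows).
    match = re.search(
        r"load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)", uptime_line
    )
    if match is None:
        raise ValueError("no load average found in: %r" % uptime_line)
    return tuple(float(value) for value in match.groups())

sample = "3:04pm up 4 day(s), 10:37, 16 users, load average: 0.11, 0.10, 0.12"
one_min, five_min, fifteen_min = parse_load_averages(sample)
print(one_min, five_min, fifteen_min)
```

A script like this could be run from cron to record the load over time and warn when the one-minute figure stays high.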
 F S UID  PID PPID C PRI NI  ADDR   SZ  WCHAN TTY     TIME COMD
 3 S   0    0    0 0   0 20     0    0 482f60 ?       3:11 schet
 3 S   0    2    0 0   0 20 30810 4096 4900e4 ?      10:08 vhanh
 1 S   0  159    1 0  28 20 3081c   56 fcbc5c ?       0:00 strer
 1 S   0 1408    1 0  28 20 30b21  104 fcbb14 syscon  0:00 gettl
 1 S   0  167    1 0  40 20 30893   52 47da28 ?       0:00 volid
 5 S   0  171    1 0  26 20 30835  528 48bc30 ?       0:17 vold
41 S   0  321    1 0  26 20 308b1  108 1b66b4 ?       0:00 ktlod
 1 S   0  517    1 0  40 20 30904  100 4831ec ?       0:00 x25nd
 1 S   0  560    1 0  40 20 30941   92 4831ec ?       0:00 netd
 1 S   0  459    1 0  26 20 3094e   72 48bc30 ?       0:02 trlod
accounting: The accton command enables system-wide accounting services. If you are not using accounting on your system, disabling it will improve the performance of your system.
biod: This daemon allows the system to access filesystems via NFS.
comsat: This program prints "you have new mail" on your screen.
lpd or lpsched: Printer daemons.
mountd: This daemon listens for remote mount requests.
nfsd: This daemon services NFS requests from remote systems.
nntpd: This daemon supports USENET network news services.
quotas: The quotaon command in /etc/rc enables disk quota checking.
rlogind: Services rlogin and rsh commands.
routed: This daemon routes network packets destined for other networks. If your local network has only one gateway to the outside world, you can disable routed. Then make sure that /etc/rc.local has a line such as route add default gateway 1
rwhod: This daemon provides information about users on other systems. The rwho and ruptime commands use this daemon.
sendmail: This provides e-mail services both internally and externally (between other systems). Sendmail uses a lot of memory.
talkd: This daemon supports the talk command.
timed: This daemon attempts to synchronize system clocks across a network. If you are working across different systems, this is necessary.
ypbind: This daemon lets the system look up information in the NIS database. At least one system must be running ypserv before you can run ypbind.
ypserv:
Memory management and performance issues are probably the most important in any system. If there is not enough physical memory installed in the system, most Unix systems use swapping and paging techniques to make sure that adequate memory is available at all times. When you set up a file system for the first time, a separate swap area also has to be set up. Swapping and paging can significantly reduce the performance of your system. The difference between them is that swapping occurs when a whole process is transferred to disk, while paging is when some part of a process is transferred to disk while the rest is still in physical memory. There are two utilities to monitor memory, called vmstat (for BSD, etc.) and sar (for System V, etc.). Page-ins and page-outs are pages moved in from disk to physical memory and out from physical memory to disk; swap-ins and swap-outs are whole processes moved in from and out to disk.
Estimating memory for a system (System V)
First find out the size of your application using the size command as follows; here the application name is a.out (a binary executable). If there are many applications, multiply the size of each by the number of times it will run on the system and add the results. The size command shows the text, data and bss (uninitialized data) segment sizes of the executable file, followed by their total.
size a.out
50884 + 9544 + 27604 = 88032
a.out: iAPX 386 executable not stripped
If this file had been a pure executable, we would have needed to account for the text and data/stack segments separately. For each invocation, we need to allow 88032 bytes * each invocation 2000 (or 1024 KB). Perform the same computation for each program to be run on the system.
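The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name estimated_memory and the invocation count of 10 are hypothetical, and the sketch simply multiplies the full size total by the number of invocations, so it does not model text segments shared between copies of a pure executable.

```python
def estimated_memory(program_sizes, invocations):
    """Sum each program's size(1) total times the number of copies run.

    program_sizes: dict mapping program name to its size total in bytes
    invocations:   dict mapping program name to expected concurrent copies
    """
    return sum(size * invocations[name] for name, size in program_sizes.items())

# Segment sizes reported by "size a.out" in the example above.
a_out_total = 50884 + 9544 + 27604          # 88032 bytes

# Hypothetical workload: ten concurrent copies of a.out.
total = estimated_memory({"a.out": a_out_total}, {"a.out": 10})
print(total, "bytes")
```

Adding further programs to the two dictionaries extends the estimate to the whole expected workload.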
The vmstat command is available on System V and BSD systems; it reports virtual memory statistics.
The syntax of vmstat is:
vmstat interval number
For example, to have vmstat report memory statistics every five seconds, three times in total, use
vmstat 5 3
procs     memory              page              disk       faults     cpu
r b w    swap   free re mf pi po fr de sr s0 s1 s2 s6  in  sy  cs us sy id
0 0 0  133136  13400  0 66 14  0  0  0  0  1  7  1  0 194 650 124  3  3 94
0 0 0 3557440 835184  0 22  1  0  0  0  0  0  1  0  0 164 155 151  0  2 98
0 0 0 3557440 835184  0 77  0  0  0  0  0  9  7  1  0 213 287  79  0  9 90
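As a minimal sketch, the free memory and idle CPU columns can be extracted from vmstat output like the sample above. The function name free_and_idle is hypothetical, and the column positions (free as the fifth field, id as the last) are assumptions taken from this particular layout; other vmstat versions order their columns differently.

```python
def free_and_idle(vmstat_lines):
    """Return (free, idle_percent) for every data line of vmstat output."""
    samples = []
    for line in vmstat_lines:
        fields = line.split()
        # Data lines are entirely numeric; the two header lines are skipped.
        if fields and all(field.isdigit() for field in fields):
            samples.append((int(fields[4]), int(fields[-1])))
    return samples

sample_output = [
    "procs memory page disk faults cpu",
    "r b w swap free re mf pi po fr de sr s0 s1 s2 s6 in sy cs us sy id",
    "0 0 0 133136 13400 0 66 14 0 0 0 0 1 7 1 0 194 650 124 3 3 94",
    "0 0 0 3557440 835184 0 22 1 0 0 0 0 0 1 0 0 164 155 151 0 2 98",
    "0 0 0 3557440 835184 0 77 0 0 0 0 0 9 7 1 0 213 287 79 0 9 90",
]
for free, idle in free_and_idle(sample_output):
    print("free=%d idle=%d%%" % (free, idle))
```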
dxi4 dxi4 3.2.0 V2.1.6 i386    01/02/98

00:00:01  freemem  freeswp
01:00:01    44571  1374268
02:00:01    43930  1367068
03:00:01    43224  1368316
04:00:00    43500  1374012
05:00:01    43831  1376500
06:00:01    44128  1373268
07:00:03    43349  1354548
08:00:01    43488  1364372
08:20:01    43078  1352500
08:40:01    42526  1350828
09:00:03    42261  1342652
09:20:00    42487  1349292
09:40:01    41296  1338516
10:00:01    41484  1331284
10:20:01    41368  1335316
10:40:00    40969  1326292
11:00:01    41208  1336340
11:20:01    41236  1347508
11:40:01    41439  1340748
12:00:00    40581  1332708
12:20:01    41221  1339964
12:40:01    41431  1338068

Average     42964  1350653
The freemem column reports how much free memory is available, in pages. The system starts paging when free memory drops below the configuration constant called GPGSLO; paging then continues until the number of free blocks passes GPGSHI. GPGSLO and GPGSHI default to 25 and 40 blocks. To look directly at swapping statistics, use sar -w
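The low/high threshold behaviour described above can be illustrated with a simplified model. This is a sketch of the idea only, not the kernel's actual algorithm; the function name paging_states and the sample free-memory values are hypothetical, while the defaults of 25 and 40 come from the text.

```python
def paging_states(freemem_samples, gpgslo=25, gpgshi=40):
    """Return, for each free-memory sample, whether paging would be active.

    Paging starts when free memory drops below gpgslo and keeps going
    until free memory rises above gpgshi.
    """
    paging = False
    states = []
    for free in freemem_samples:
        if not paging and free < gpgslo:
            paging = True       # dropped below the low-water mark
        elif paging and free > gpgshi:
            paging = False      # recovered past the high-water mark
        states.append(paging)
    return states

# Hypothetical sequence of free-memory readings.
readings = [50, 30, 24, 30, 39, 41, 45]
print(paging_states(readings))
```

Using two thresholds instead of one keeps the system from flipping in and out of paging when free memory hovers around a single limit.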
dxifour:/u0/ssb>sar -w

dxi4 dxi4 3.2.0 V2.1.6 i386    01/02/98

00:00:01  swpin/s  bswin/s  swpot/s  bswot/s  pswch/s
01:00:01     0.00      0.0     0.00      0.0      122
02:00:01     0.00      0.0     0.00      0.0      119
03:00:01     0.00      0.0     0.00      0.0      117
04:00:00     0.00      0.0     0.00      0.0      123
05:00:01     0.00      0.0     0.00      0.0      157
06:00:01     0.00      0.0     0.00      0.0      134
07:00:03     0.00      0.0     0.00      0.0      135
08:00:01     0.00      0.0     0.00      0.0      151
08:20:01     0.00      0.1     0.00      0.1      183
08:40:01     0.00      0.0     0.00      0.0      408
09:00:03     0.00      0.0     0.00      0.0      510
09:20:00     0.00      0.0     0.00      0.0      390
09:40:01     0.00      0.0     0.00      0.0      359
10:00:01     0.00      0.1     0.00      0.1      377
10:20:01     0.00      0.0     0.00      0.0      451
10:40:00     0.00      0.1     0.00      0.1      440
11:00:01     0.00      0.0     0.00      0.0      478
11:20:01     0.00      0.0     0.00      0.0      317
11:40:01     0.00      0.0     0.00      0.0      247
12:00:00     0.00      0.0     0.00      0.0      249
12:20:01     0.00      0.0     0.00      0.0      176
12:40:01     0.00      0.0     0.00      0.0      350

Average      0.00      0.0     0.00      0.0      213
# sar -p (System V.3)

dxi4 dxi4 3.2.0 V2.1.6 i386    01/02/98

00:00:01   vflt/s  pflt/s  pgfil/s  rclm/s
01:00:01   854.76    0.00     0.00    0.00
02:00:01   867.77    0.00     0.00    0.00
03:00:01   863.78    0.00     0.00    0.00
04:00:00   839.07    0.00     0.00    0.00
05:00:01   822.51    0.00     0.01    0.00
06:00:01   833.93    0.00     0.00    0.00
07:00:03   881.90    0.00     0.00    0.00
08:00:01   348.59    0.00     0.01    0.00
08:20:01   684.73    0.00     0.00    0.00
08:40:01  2727.91    0.00     0.03    0.00
09:00:03  1018.15    0.00     0.03    0.00
09:20:00   989.03    0.00     0.05    0.00
09:40:01   838.29    0.00     0.00    0.00
10:00:01   965.68    0.00     0.04    0.00
10:20:01   948.19    0.00     0.00    0.00
10:40:00   908.90    0.00     0.00    0.00
11:00:01   781.14    0.00     0.02    0.00
11:20:01   843.14    0.00     0.00    0.00
11:40:01   875.43    0.00     0.04    0.00
12:00:00   872.25    0.00     0.01    0.00
12:20:01   802.01    0.00     0.01    0.00
12:40:01  1192.86    0.00     0.01    0.00

Average    878.55    0.00     0.01    0.00
Per-process disk throughput: The speed at which a single process can read from or write to a disk. You can measure it by timing a cp or mv command.
Total disk throughput: The total speed at which all the processes together can transfer data to and from the disks.
Disk storage efficiency: How efficiently the space on the disk is used.
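The idea of timing a copy to estimate per-process throughput can be sketched as follows. The function name copy_throughput and the 4 MB scratch file size are hypothetical choices, and because the file has just been written, much of the copy is likely to be served from the buffer cache, so treat the result as an upper bound rather than a true disk figure.

```python
import os
import shutil
import tempfile
import time

def copy_throughput(size_bytes=4 * 1024 * 1024):
    """Copy a scratch file once and return the rate in bytes per second."""
    with tempfile.TemporaryDirectory() as workdir:
        source = os.path.join(workdir, "source.bin")
        target = os.path.join(workdir, "target.bin")
        with open(source, "wb") as handle:
            handle.write(os.urandom(size_bytes))
        start = time.perf_counter()
        shutil.copyfile(source, target)
        elapsed = time.perf_counter() - start
        copied = os.path.getsize(target)
    # Guard against a zero reading from a very fast copy.
    return copied / max(elapsed, 1e-9)

print("%.1f MB/s" % (copy_throughput() / 1e6))
```

For a more realistic number, use a file larger than physical memory and place it on the disk you actually want to measure.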
A rule of thumb is that a disk spends about 80% of its time seeking and only 20% reading and writing data. That means that the lower the seek time of a disk, the better its performance.
Other things to consider when buying a disk, such as rotational speed (most disks are 3600 RPM) and raw transfer rate, are not as important as seek time. Disk capacity depends on the users' needs. System V divides each disk into many partitions. You should always stick to your system's disk tools when partitioning or defining a disk.
The problem of fragmentation can be kept to a minimum by regularly running the fsck command. To do this, unmount the disk, run fsck diskname, and respond to the questions that come up on the console. You can also reorganize the free list with the fsck -S option to keep fragmentation to a minimum, since this free list is used to check for fragmentation when the system boots.
iostat is a BSD tool that is also found on many System V systems. It prints a number of I/O statistics that will help you balance the disk load. The syntax is
iostat drives interval count
drives are the disk drives to report on, interval is in seconds, and count is the number of samples.
For example, the following shows all disks at an interval of 2 seconds:
iostat 2
device   bps  sps  msps
c1t6d0     0  0.0   1.0
c1t3d0     0  0.0   1.0
c1t4d0     0  0.0   1.0
c1t5d0     0  0.0   1.0
c0t1d0     0  0.0   1.0
c0t1d1     0  0.0   1.0
c0t1d2     0  0.0   1.0
c0t1d3     0  0.0   1.0
c0t1d4     0  0.0   1.0
c0t1d5     0  0.0   1.0
c0t1d6     0  0.0   1.0
c0t1d7     0  0.0   1.0
For disk cache statistics, you can use the sar -b command.
Poor network performance can increase response time and frustrate users. To find out whether your network is slowing traffic down, try this. Say your system is named apple. First open a session to it from a regular terminal emulator and log in, counting the seconds. Then, from another system, say orange, rlogin to apple, counting the seconds it takes for the login prompt to appear. Compare the times; if rlogin is slower, your network is reaching its maximum capacity.
The basic network tool is called ping. If you want to check that a system named oranges is reachable from your system named apple, then from apple run
apple :> ping oranges
Pinging host oranges.com (oranges) : 38.152.119.3
oranges.com: is alive!
----oranges.com PING Statistics----
1 packets transmitted, 1 packets received, 0% packet loss
round-trip (ms) min/avg/max = 9.27/9.27/9.27
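As a small sketch, the packet loss and round-trip times can be extracted from a ping summary like the one above. The function name parse_ping_summary is hypothetical, and the patterns assume exactly this output layout; ping implementations on other systems word their summaries differently.

```python
import re

def parse_ping_summary(output):
    """Return packet loss (%) and min/avg/max round-trip times (ms)."""
    loss = re.search(r"(\d+(?:\.\d+)?)% packet loss", output)
    rtt = re.search(r"min/avg/max = ([\d.]+)/([\d.]+)/([\d.]+)", output)
    if loss is None or rtt is None:
        raise ValueError("unrecognized ping summary")
    return {
        "loss_percent": float(loss.group(1)),
        "min_ms": float(rtt.group(1)),
        "avg_ms": float(rtt.group(2)),
        "max_ms": float(rtt.group(3)),
    }

summary = (
    "----oranges.com PING Statistics----\n"
    "1 packets transmitted, 1 packets received, 0% packet loss\n"
    "round-trip (ms) min/avg/max = 9.27/9.27/9.27\n"
)
print(parse_ping_summary(summary))
```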
The command above tells us that the host oranges is reachable and that no packets were dropped. Another command for testing networking problems is netstat.
To diagnose a networking problem, netstat -i can be used in the following way:
netstat -i
Name  Mtu   Network       Address          Ipkts    Opkts    Odrop
eg1   1500  204.89.162    dxi4.dxi.com     2275517  3783974  0
eg0   1500  38.254.211    dxifour.dxi.com  4716968  2862227  0
loop  1536  loopback-net  localhost        0        0        0
Routing tables (10 entries)
Destination  Gateway         Flags  ttl   Use     Interface
default      200.89.161.216  UGP    PERM  537225  eg1
193.9.4.1    200.89.161.223  UGHP   PERM  61325   eg1
127.0.0.1    127.0.0.1       UHP    PERM  0       loop
191.99.8.40  200.89.161.245  UGHD   29    937     eg1
For more information about netstat, type man netstat at the prompt.
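The interface counters from netstat -i shown above can be checked with a short script. This is a sketch under the assumption that the output has the seven columns of this sample (Name, Mtu, Network, Address, Ipkts, Opkts, Odrop); the function names are hypothetical and other netstat versions print different columns.

```python
def parse_netstat_i(lines):
    """Turn netstat -i output (header line first) into a list of dicts."""
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        fields = line.split()
        # Skip blank or malformed lines that do not match the header.
        if len(fields) == len(header):
            rows.append(dict(zip(header, fields)))
    return rows

def interfaces_with_drops(rows):
    """Return the names of interfaces whose Odrop counter is non-zero."""
    return [row["Name"] for row in rows if int(row["Odrop"]) > 0]

netstat_output = [
    "Name Mtu Network Address Ipkts Opkts Odrop",
    "eg1 1500 204.89.162 dxi4.dxi.com 2275517 3783974 0",
    "eg0 1500 38.254.211 dxifour.dxi.com 4716968 2862227 0",
    "loop 1536 loopback-net localhost 0 0 0",
]
interfaces = parse_netstat_i(netstat_output)
print(interfaces_with_drops(interfaces))
```

With the sample data no interface has dropped packets, so the list printed is empty.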
Kernel Management.
The kernel is the heart of a Unix operating system. It manages memory, schedules processes, manages I/O, and does all of the other low-level jobs. Because it does all the important jobs, it is always resident in the physical memory of a Unix system. Other programs and processes can be swapped or paged, but the kernel is always in physical memory. That is why it should be as small as possible. To configure the kernel, log in as root on the system console and use the utility provided by your system: HP-UX uses sam, AIX uses smit, SCO uses scoadmin, and Dynix uses menu. Make sure that only the software and drivers that are absolutely needed are in the kernel; if any are not needed, remove them, then rebuild and replace the current kernel.
Your suggestions and comments are welcome. Please
reserved with Sandeep S Bajwa.