Wednesday, September 23, 2009

egrep - LOVE it

'egrep' is an alias for grep invoked with the '-e' switch to enable extended regular expressions.

The reason I especially love it is its ability to use inclusive "or's". In other words, give me all the lines with x, y or z.

So if you want to see a count of all the mails sent to your exim daemon from say, 5 to 8 pm on 9/22, you can simply execute the following:

egrep '(2009-09-22 17|2009-09-22 18|2009-09-22 19|2009-09-22 20)' mainlog.1 | grep dnslookup | wc -l

And voila! You will have your answer.

Monday, September 21, 2009

Reading Top on Multi-core Systems

Take this sample output:

top - 11:24:56 up 2 days, 11:19, 2 users, load average: 0.45, 0.34, 0.32
Tasks: 176 total, 1 running, 175 sleeping, 0 stopped, 0 zombie
Cpu(s): 3.1%us, 0.4%sy, 0.0%ni, 96.3%id, 0.1%wa, 0.0%hi, 0.1%si, 0.0%st
Mem: 8174036k total, 7693556k used, 480480k free, 260924k buffers
Swap: 2031608k total, 4k used, 2031604k free, 5043372k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
26229 root 25 0 2181m 1.5g 96m S 28 18.7 133:33.65 java
3388 root 18 0 67692 2240 1732 S 0 0.0 0:01.95 rotatelogs
16890 nsmc 15 0 12736 1144 816 R 0 0.0 0:00.51 top

What we’re seeing seems completely contradictory - the java process is consuming 28 percent of the cpu's resources, but the total user process consumption on the system is 3 percent?

Basically, a rollup of CPU time on a multicore system is going to be an over-simplification. It’s essentially the total cpu of all processes divided by the number of cores. (In this case, eight).

Using the run-time command of ‘1’ (that’s the number one), you get a different output:

top - 11:28:25 up 2 days, 11:23, 2 users, load average: 0.14, 0.21, 0.27
Tasks: 174 total, 1 running, 173 sleeping, 0 stopped, 0 zombie
Cpu0 : 13.3%us, 1.3%sy, 0.0%ni, 84.1%id, 0.7%wa, 0.0%hi, 0.7%si, 0.0%st
Cpu1 : 0.3%us, 0.0%sy, 0.0%ni, 99.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu2 : 1.0%us, 0.0%sy, 0.0%ni, 99.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu3 : 0.7%us, 0.0%sy, 0.0%ni, 99.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu4 : 1.0%us, 0.0%sy, 0.0%ni, 99.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu5 : 0.7%us, 0.0%sy, 0.0%ni, 99.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu6 : 0.7%us, 0.0%sy, 0.0%ni, 99.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu7 : 0.7%us, 0.3%sy, 0.0%ni, 99.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 8174036k total, 7754924k used, 419112k free, 261028k buffers
Swap: 2031608k total, 4k used, 2031604k free, 5104464k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
26229 root 25 0 2181m 1.5g 96m S 20 18.7 134:23.33 java

That makes a LOT more sense. So the default output is showing the cpu consumption of the java process at the individual core level, but the rollup as an average across all the cores. This new output shows how it’s getting distributed across the cores.

By the same token, if you go back to the default output and hit ‘I’ (that’s a capital i), you will get a cpu consumption number for the java process divided by the number of cores:

top - 11:31:31 up 2 days, 11:26, 2 users, load average: 0.14, 0.22, 0.26
Tasks: 174 total, 1 running, 173 sleeping, 0 stopped, 0 zombie
Cpu(s): 3.0%us, 0.5%sy, 0.0%ni, 96.0%id, 0.3%wa, 0.0%hi, 0.2%si, 0.0%st
Mem: 8174036k total, 7808624k used, 365412k free, 261136k buffers
Swap: 2031608k total, 4k used, 2031604k free, 5156848k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
26229 root 25 0 2181m 1.5g 96m S 3.5 18.7 135:04.92 java
16890 nsmc 15 0 12736 1144 816 R 0.0 0.0 0:01.32 top
1 root 15 0 10344 680 572 S 0.0 0.0 0:02.47 init

So now you’re consistent again.

I’m not sure why these two are aren’t viewed from the same perspective by default.