Tuesday, February 22, 2011

Syslog Messages, Take 2

Recently I have been helping a colleague analyse data from Sun Solaris Explorer (SUNWexplo) files, which required parsing a lot of the /var/adm/messages files generated by the systems. I search for certain keywords that may indicate a system fault:
perl -n -e 'print if /\b(bad|failed|critical|crash|down|offline|unavailable|fatal|panic)\b/i'
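
For anyone who prefers Python, here is a rough sketch of the same keyword filter that also tallies how often each keyword appears. It is only an illustration under my own assumptions: the function name scan_messages and the script name in the usage comment are made up for this example, and the keyword list is simply copied from the one-liner above.

```python
import re
import sys
from collections import Counter

# Same keyword list as the Perl one-liner above.
KEYWORDS = ("bad", "failed", "critical", "crash", "down", "offline",
            "unavailable", "fatal", "panic")

# Whole-word, case-insensitive match, mirroring /\b(...)\b/i in Perl.
PATTERN = re.compile(r"\b(" + "|".join(KEYWORDS) + r")\b", re.IGNORECASE)


def scan_messages(lines):
    """Return (matching_lines, Counter of keyword hits) for an iterable of lines.

    scan_messages is a hypothetical helper name used for this sketch only.
    """
    matches = []
    counts = Counter()
    for line in lines:
        found = PATTERN.findall(line)
        if found:
            matches.append(line.rstrip("\n"))
            counts.update(word.lower() for word in found)
    return matches, counts


if __name__ == "__main__":
    # Hypothetical usage: python scan_messages.py /var/adm/messages
    with open(sys.argv[1], errors="replace") as fh:
        matches, counts = scan_messages(fh)
    for line in matches:
        print(line)
    # Print the per-keyword tally to stderr so stdout stays a plain filter.
    for word, n in counts.most_common():
        print(f"{word}: {n}", file=sys.stderr)
```

The tally makes it easier to see at a glance which kind of fault dominates a given messages file before reading the matching lines one by one.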

I managed to locate a few more important syslog messages and updated my previous blog post on sample syslog messages accordingly.

While doing the final touch-up on my scripts, I found the Solaris Common Messages and Troubleshooting Guide and the Archive of Error Messages particularly useful.

It would be wonderful if all our field engineers could help consolidate the messages related to incidents. This information, plus a central syslog server with real-time monitoring and alerting capability, can help customers monitor all their servers effectively. I believe this will help differentiate us from our competitors.

