Thursday, August 23, 2007

File Logging in Solaris, the NFS way (almost) and the BSM way

One of my potential customers wanted to implement grid computing with their existing servers. Two issues they would like us to address.
  1. Each design team can only access their own design files
  2. All access of files from the central storage have to be captured to ensure design is kept within the team.

Item 1 is pretty easy to address with the standard ACL in Solaris, or any other flavours of UNIX

Item 2. Two possible solutions to this issue. First approach, NFS logging. I know Solaris NFS comes with the capability of logging all the NFS access. All we need is to share out the directory path with logging enable (share -o log=global /some/directory). The nfslog.conf will tell where the log file will go, default /var/nfs

# more /etc/nfs/nfslog.conf

#ident  "@(#)nfslog.conf        1.5     99/02/21 SMI"
# Copyright (c) 1999 by Sun Microsystems, Inc.
# All rights reserved.
# NFS server log configuration file.
# <tag> [ defaultdir=<dir_path> ] \
#       [ log=<logfile_path> ] [ fhtable=<table_path> ] \
#       [ buffer=<bufferfile_path> ] [ logformat=basic|extended ]

global  defaultdir=/var/nfs \
    log=nfslog fhtable=fhtable buffer=nfslog_workbuffer

There is a catch. In the man page of nfslogd, it say NFS logging is not supported for NFS version 4.

# man nfslogd

System Administration Commands                        nfslogd(1M)

 nfslogd - nfs logging daemon


 The nfslogd  daemon  provides  operational  logging  to  the
 Solaris  NFS  server. It is the nfslogd daemon's job to gen-
 erate the activity log by analyzing the RPC operations  pro-
 cessed  by  the  NFS server.  The log will only be generated
 for file systems exported  with  logging  enabled.  This  is
 specified  at  file  system  export  time  by  means  of the
 share_nfs(1M) command.

 NFS server logging is not supported on Solaris machines that
 are using NFS Version 4.

OK, I can still fall back to NFS version 3. However, when I enable nfs server (/etc/init.d/nfs.server or svcadm enable svc:/network/nfs/server:default) and have another server to mount using nfsv3 (mount -o vers=3 ...), I can see files in the /var/nfs directory are growing, but most of them are binary files not ascii
# ls -l /var/nfs
total 13502
-rw-r-----   1 root     root           0 Aug 23 11:25 fhtable.0000000000000000.dir
-rw-r-----   1 root     root        1024 Aug 23 11:25 fhtable.0000000000000000.pag
-rw-r-----   1 root     root        4096 Aug 23 11:57 fhtable.0154002000000002.dir
-rw-r-----   1 root     root     8368128 Aug 23 17:31 fhtable.0154002000000002.pag
-rw-r-----   1 root     root           0 Aug 23 11:25 nfslog
-rw-------   1 root     root     2309148 Aug 23 17:44 nfslog_workbuffer_log_in_process
drwxr-xr-x   2 daemon   daemon       512 Mar 21 10:21 v4_oldstate
drwxr-xr-x   2 daemon   daemon       512 Aug 23 11:30 v4_state

# file /var/nfs/*
/var/nfs/fhtable.0000000000000000.dir:  empty file
/var/nfs/fhtable.0000000000000000.pag:  data
/var/nfs/fhtable.0154002000000002.dir:  data
/var/nfs/fhtable.0154002000000002.pag:  data
/var/nfs/nfslog:        empty file
/var/nfs/nfslog_workbuffer_log_in_process:      data
/var/nfs/v4_oldstate:   directory
/var/nfs/v4_state:      directory
All these files seem to be binary in nature and very likely they are in Berkeley DB format. I tried to use perl dbmopen to read it and unpack the key-value, but all I got are garbage. I suppose I need to fully understand the struct of that entry. To my surprise, there isn't any utility in the OS that allow me to view the content of the file. Also, I had no luck at all finding a solution on the Internet. Anyway, I am hitting a dead end and all I can do is to post it to the Sun folk for a solution.

Second approach that I wanted to try is Basic Security Module (BSM). Although this is meant for conforming to the US C2 security audit requirements, it does log access of files (any files, including sharable libraries) in the system. Anyway, I just want to run it to see what it can offer.

cd /etc/security
Once bsmconv is activated, you need to reboot the server in oder for the bsm audit kernel module to be loaded in the next start up. The log file is located in /var/audit directory by default
$ ls  /var/audit
20070821022314.20070821022315.chihung  20070821023545.20070821023546.chihung
20070821022315.20070821023344.chihung  20070821023546.20070821024602.chihung

In BSM, it comes with a utility (praudit) to view the content. If you read up the man page, there are a couple of flags in this command. In particular, the '-l' flag and the '-x' flag are the one that I wanted to talk about. '-l' gives you one line per record so that you can use any of the standard Solaris utilities like sed/awk/cut to filter out the information you want.

This is a sample output of praudit -l with fm/fa/fc/fd (file modified/access/create/delete) audit turned on. See /etc/security/audit_class for a full list.

file,2007-08-21 10:35:46.111 +08:00,/var/audit/20070821023545.20070821023546.chihung
header,44,2,system booted,na,2007-08-21 10:34:49.704 +08:00,text,booting kernel
header,135,2,stat(2),,chihung,2007-08-21 10:39:22.700 +08:00,path,/usr/lib/pt_chmod,attribute,104511,root,bin,85,623,0,subject,chihung,root,staff,chihung,staff,704,2477344358,756 65558 ftpl_2_207,return,success,0
header,126,2,stat(2),fe,chihung,2007-08-21 10:39:22.727 +08:00,path,/platform/SUNW,UltraSPARC-IIi-cEngine/lib,subject,chihung,root,staff,chihung,staff,704,2477344358,756 65558 ftpl_2_207,return,failure: No such file or directory,-1
header,149,2,access(2),,chihung,2007-08-21 10:39:22.737 +08:00,path,/dev/pts/devices/pseudo/pts@0:1,attribute,20620,root,tty,335,12582918,103079215105,subject,chihung,root,staff,chihung,staff,704,2477344358,756 65558 ftpl_2_207,return,success,0
header,149,2,pathconf(2),,chihung,2007-08-21 10:39:22.737 +08:00,path,/dev/pts/devices/pseudo/pts@0:1,attribute,20620,root,root,335,12582918,103079215105,subject,chihung,root,staff,chihung,staff,704,2477344358,756 65558 ftpl_2_207,return,success,1
header,149,2,access(2),,chihung,2007-08-21 10:39:22.740 +08:00,path,/dev/pts/devices/pseudo/pts@0:1,attribute,20620,chihung,tty,335,12582918,103079215105,subject,chihung,chihung,staff,chihung,staff,703,2477344358,756
If you were to analyse the number of fields (FS) per record, you realise that they are not consistent. FS ranges from 3,8,19,21,22,28,29,30,31,32. Ability to extract the right field for analysis is going to be a nightmare.

How about the 'praudit -x' flag. Basically it will output the information in XML format. This is cool and XML is definitely my friend. Let's see the output:

<?xml version='1.0' encoding='UTF-8' ?>
<?xml-stylesheet type='text/xsl' href='file:///usr/share/lib/xml/style/adt_record.xsl.1' ?>

<!DOCTYPE audit PUBLIC '-//Sun Microsystems, Inc.//DTD Audit V1//EN' 'file:///usr/share/lib/xml/dtd/adt_record.dtd.1'>

<file iso8601="2007-08-21 10:35:46.111 +08:00">/var/audit/20070821023545.20070821023546.chihung</file>
<record version="2" event="system booted" modifier="na" iso8601="2007-08-21 10:34:49.704 +08:00">
<text>booting kernel</text>
<record version="2" event="stat(2)" host="chihung" iso8601="2007-08-21 10:39:22.700 +08:00">
<attribute mode="104511" uid="root" gid="bin" fsid="85" nodeid="623" device="0"/>
<subject audit-uid="chihung" uid="root" gid="staff" ruid="chihung" rgid="staff" pid="704" sid="2477344358" tid="756 65558 ftpl_2_207"/>
<return errval="success" retval="0"/>
<record version="2" event="stat(2)" modifier="fe" host="chihung" iso8601="2007-08-21 10:39:22.727 +08:00">
<subject audit-uid="chihung" uid="root" gid="staff" ruid="chihung" rgid="staff" pid="704" sid="2477344358" tid="756 65558 ftpl_2_207"/>
<return errval="failure: No such file or directory" retval="-1"/>
<record version="2" event="access(2)" host="chihung" iso8601="2007-08-21 10:39:22.737 +08:00">
<attribute mode="20620" uid="root" gid="tty" fsid="335" nodeid="12582918" device="103079215105"/>
<subject audit-uid="chihung" uid="root" gid="staff" ruid="chihung" rgid="staff" pid="704" sid="2477344358" tid="756 65558 ftpl_2_207"/>
<return errval="success" retval="0"/>
It even comes with a XML stylesheet that we can apply. Here is the screen dump of the html after running (xsltproc adt_record.xsl.1 audit.xml > audit-xml.html)

With XML, I can pick up a lot of interesting things from the data. Say I want to find out who access files that do not belong to them. With DOM implementation in Tcl (tDOM), I can script it like this

package require tdom
set doc [dom parse [tDOM::xmlReadFile audit.xml]]
set root [$doc documentElement]
foreach i [$root selectNodes {//attribute[@uid != string(../subject/@uid)]}] {
puts [[$i selectNodes ../path/text()] nodeValue]

Until I can resolve the NFS logging, it seems BSM is going to be the answer. Bear in mind that BSM generates tonnes of data. If I were to implement this approach, I will definitely allocate a lot of disk space (100+GB) and make sure this is in a separate disk to avoid too many IO activities in the OS disk (/var)

Labels: , , ,


Blogger blake said...

I'm also struggling to get NFS logging working. I've got the property set, but only get the 'nfslog_workbuffer_log_in_process' file as output, and it's binary, as you say in this entry. Have you made any headway?


7:45 AM  
Blogger chihungchan said...

I posted that quite some time back, that's before Solaris was open sourced. Now you can view the source code from and find out the data structure before the binary file.

Hope you have some luck on this. Appreciate if you can share with the rest.

10:16 PM  

Post a Comment

<< Home