Alan Hargreaves' Blog

The ramblings of an Australian SaND TSC* Principal Field Technologist

Archive for the ‘DTrace’ Category

ksh93 ENOEXEC errors solved

If you have been following my blogs on transitioning to an OpenSolaris SPARC based Sun Ray server (part 1 and part 2) you would have seen my frustration at the errors I was getting like:

make: sh: cannot execute [Exec format error]
make: sh: cannot execute [Exec format error]
make: sh: cannot execute [Exec format error]
sh: make: function not defined
*** Error code 126

I did some poking around today. I noticed that the stack in the writing of that error message looked like:

              libc.so.1`__write+0x8
              libast.so.1`_sfflsbuf+0x24c
              libast.so.1`sfwrite+0x3e0
              libast.so.1`errorv+0xc34
              libast.so.1`errormsg+0x1c
              libshell.so.1`sh_lex+0x408
              libshell.so.1`term+0x34
              libshell.so.1`list+0x8
              libshell.so.1`sh_cmd+0x18
              libshell.so.1`sh_parse+0x2c0
              libshell.so.1`sh_eval+0x1c8
              libshell.so.1`funload+0x188
              libshell.so.1`path_absolute+0x584
              libshell.so.1`path_search+0x2b4
              libshell.so.1`sh_exec+0xddc
              libshell.so.1`sh_subshell+0x5c8
              libshell.so.1`comsubst+0x998
              libshell.so.1`varsub+0x4f8
              libshell.so.1`copyto+0x8dc
              libshell.so.1`sh_mactrim+0x10c

It’s trying to treat make as if it were a shell function. Huh? of course that’s not going to work.

Inside path_absolute in libshell I found the following:

    767                 isfun = (pp->flags&PATH_FPATH);

Further down this is checked and if it is, the file is opened to be read as a shell script.

ding!

Heading to the man page we see:

     FPATH         The search path for function definitions.  The
                   directories  in  this  path are searched for a
                   file with the same name  as  the  function  or
                   command  when a function with the -u attribute
                   is referenced and when a command is not found.
                   If  an  executable  file with the name of that
                   command is found, then it is read and executed
                   in  the  current environment. Unlike PATH, the
                   current directory must be  represented  expli-
                   citly by dot (.) rather than by adjacent colon
                   (:) characters or a beginning or ending  colon
                   (:).

Now if I add in the fact that I construct my $PATH using $FPATH and $LPATH (among others) a light begins to dawn.

My $FPATH actually contained almost the entirety of my normal path. So, ksh93 (which is also /bin/sh on OpenSolaris) was assuming that anything in any of those directories was a script to be read and parsed, without having to check.

I changed the names of those variables and everything there is now working as it should.

Advertisements

Written by Alan

April 13, 2010 at 9:20 pm

Posted in DTrace, OpenSolaris, Work

DTrace helping to get to the bottom of load spikes

For a while now I have had a customer call that has been causing me a lot of pain to diagnose.

The symptom is that occasionally they will see the load average spike to 500ish and then decay back to normal. There is no other performance effect on the system.

The latest approach that I had been taking was to attempt to get DTrace to watch when a thread was put onto a dispatch queue and if that dispatch queue had more than a threshold number of threads on it, then to run mdb to get a list of the dispatch queues (echo ::cpulist -l | mdb -l).

Turns out that even this was too slow and the threads had been dealt with by the time I got the output.

OK this actually tells us something. These threads are being placed onto the dispatch queue incredibly quickly and also going to cpu and being dealt with incredibly quickly.

At this stage it occurred to me that it was too fast for the threads to be created. It had to be some kind of wakeup of a lot of threads, like a cv_broadcast(9F).

So I did up this script and asked the customer to run it.

#!/usr/sbin/dtrace -s

#pragma D option quiet

/*
 * This D script will notify when any one process wakes up more
 * than threshold threads currently sleeping on a synchronisation
 * object.
 */

int threshold;

BEGIN {
	threshold = 20;
	printf("Monitoring, ...\n");
}

fbt::sleepq_wakeall_chan:entry {
	self->wakeups = 0;
	self->interest = 1;
}

wakeup /self->interest/ {
	self->wakeups++;
}

fbt::sleepq_wakeall_chan:return /self->interest && self->wakeup > threshold/ {
	printf("\n%Y %s [%d]: woke up %d threads",
	    walltimestamp, execname, pid, self->wakeups);
	stack(20);
}

fbt::sleepq_wakeall_chan:return /self->interest/ {
	self->interest = self->wakeups = 0;
}

We got immediate results. An excerpt of the output was:

2010 Mar 29 14:10:20 oracle [3102]: woke up 472 threads
              genunix`cv_broadcast+0x44
              zfs`zil_commit+0x78
              zfs`zfs_write+0x5d0
              genunix`fop_write+0x20
              genunix`pwrite+0x22c
              unix`syscall_trap+0xac

2010 Mar 29 14:10:20 oracle [3100]: woke up 94 threads
              genunix`cv_broadcast+0x44
              zfs`zil_commit+0x78
              zfs`zfs_write+0x5d0
              genunix`fop_write+0x20
              genunix`pwrite+0x22c
              unix`syscall_trap+0xac

2010 Mar 29 14:10:20 oracle [3102]: woke up 112 threads
              genunix`cv_broadcast+0x44
              zfs`zil_commit+0x78
              zfs`zfs_write+0x5d0
              genunix`fop_write+0x20
              genunix`pwrite+0x22c
              unix`syscall_trap+0xac

2010 Mar 29 14:10:20 oracle [3102]: woke up 59 threads
              genunix`cv_broadcast+0x44
              zfs`zil_commit+0x78
              zfs`zfs_write+0x5d0
              genunix`fop_write+0x20
              genunix`pwrite+0x22c
              unix`syscall_trap+0xac

2010 Mar 29 14:10:20 oracle [3100]: woke up 571 threads
              genunix`cv_broadcast+0x44
              zfs`zil_commit+0x78
              zfs`zfs_write+0x5d0
              genunix`fop_write+0x20
              genunix`pwrite+0x22c
              unix`syscall_trap+0xac

What we have here is Oracle doing a LOT of parallel writes at one time. The system is actually behaving normally and processing things quickly, the only issue is that we occasionally see load average spike to large numbers then decay back to normal. Perhaps a better way to monitor the system activity might be to look at the ‘r’ column in vmstat which shows snapshots of how many threads are in dispatch queues.

So now we know what is going on, we can actually move on.

Written by Alan

April 10, 2010 at 12:36 am

Posted in DTrace, Solaris, Work