Alan Hargreaves' Blog

The ramblings of a former Australian SaND TSC* Principal Field Technologist

Solaris ufs bug in Month of Kernel Bugs

Just noticed that Solaris has an entry in the Month of Kernel Bugs.

While I agree that we have an issue that needs looking at, I also believe that the contributor is making much more of it than it really deserves.

To paraphrase the issue:

If I give you a specially massaged filesystem and can convince someone with the appropriate privilege to mount it, it will crash the system.

I’d hardly call this a “denial of service”, let alone exploitable.

First off, in order to perform a mount of a ufs filesystem, you need the sys_mount privilege. Solaris currently runs under the concept of “least privilege”: a process is given the least amount of privilege it needs to run. So, in order to exploit this, you need to convince someone with the appropriate level of privilege to mount your filesystem, which also involves a bit of social engineering that went unmentioned.
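
To make that concrete, here is a minimal sketch (mine, not from the original report) of checking for sys_mount with the Solaris privilege API; a process whose effective set lacks this privilege will simply get EPERM back from mount(2):

#include <stdio.h>
#include <priv.h>

int
main(void)
{
        /* priv_ineffect(3C) tests the calling process's effective set */
        if (priv_ineffect("sys_mount"))
                (void) printf("sys_mount is in effect; a mount may proceed\n");
        else
                (void) printf("no sys_mount; mount(2) would fail with EPERM\n");
        return (0);
}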

That being said, the system should not panic off this filesystem, and I will log a bug to this effect. It is a shame that the contributor did not make the crashdump files available, as that would certainly speed up any analysis.

One other thing that I should add is that anyone who tries to mount an unknown ufs filesystem without at least running "fsck -n" over it probably deserves what they get.

OK, I have copied it to a relatively current nevada system and attached it as /dev/lofi/1. On running "fsck -n" we see:

** /dev/rlofi/1 (NO WRITE)
BAD SUPERBLOCK AT BLOCK 16: BAD VALUES IN SUPER BLOCK
LOOK FOR ALTERNATE SUPERBLOCKS WITH MKFS?  no
LOOK FOR ALTERNATE SUPERBLOCKS WITH NEWFS?  no
SEARCH FOR ALTERNATE SUPERBLOCKS FAILED.
USE GENERIC SUPERBLOCK FROM MKFS?  no
USE GENERIC SUPERBLOCK FROM NEWFS?  no
SEARCH FOR ALTERNATE SUPERBLOCKS FAILED. YOU MUST USE THE -o b OPTION
TO FSCK TO SPECIFY THE LOCATION OF A VALID ALTERNATE SUPERBLOCK TO
SUPPLY NEEDED INFORMATION; SEE fsck(1M).

In the normal course of events, would you mount this filesystem? I certainly would not. This, however, is not the normal course of events, and I’m playing on a lab system.

v40z-c# uname -a
SunOS v40z-c 5.11 snv_46 i86pc i386 i86pc

Let’s try a read-only mount first.

v40z-c# mount -r /dev/lofi/1 /mnt
v40z-c# ls /mnt
lost+found
v40z-c# umount /mnt

OK, the read-only mount is fine. Now for the read/write… Bingo.

v40z-c# mount /dev/lofi/1 /mnt
panic[cpu3]/thread=ffffffff9a6974e0: BAD TRAP: type=e (#pf Page fault) rp=fffffe8000c7f2c0 addr=fffffe80fe39d6c4
mount: #pf Page fault
Bad kernel fault at addr=0xfffffe80fe39d6c4
pid=2170, pc=0xfffffffffbb70950, sp=0xfffffe8000c7f3b0, eflags=0x10286
cr0: 8005003b cr4: 6f8
cr2: fffffe80fe39d6c4 cr3: 1ff76b000 cr8: c
...
fffffe8000c7f1b0 unix:die+b1 ()
fffffe8000c7f2b0 unix:trap+1528 ()
fffffe8000c7f2c0 unix:_cmntrap+140 ()
fffffe8000c7f440 ufs:alloccgblk+42f ()
fffffe8000c7f4e0 ufs:alloccg+473 ()
fffffe8000c7f560 ufs:hashalloc+50 ()
fffffe8000c7f600 ufs:alloc+14f ()
fffffe8000c7f6c0 ufs:lufs_alloc+f3 ()
fffffe8000c7f770 ufs:lufs_enable+261 ()
fffffe8000c7f7e0 ufs:ufs_fiologenable+63 ()
fffffe8000c7fd60 ufs:ufs_ioctl+3e0 ()
fffffe8000c7fdc0 genunix:fop_ioctl+3b ()
fffffe8000c7fec0 genunix:ioctl+180 ()
fffffe8000c7ff10 unix:sys_syscall32+101 ()

OK, so we should now have a crashdump to look at.

While the machine is rebooting, it occurs to me that if we put this ufs image onto an external USB device, we might actually have an exploitable issue here once the new hal/rmvolmgr framework is in place (nv_51), if it tries to automatically mount ufs devices.

core file:      /var/crash/v40z-c/vmcore.0
release:        5.11 (64-bit)
version:        snv_46
machine:        i86pc
node name:      v40z-c
domain:         aus.cte.sun.com
system type:    i86pc
hostid:         69e47dae
dump_conflags:  0x10000 (DUMP_KERNEL) on /dev/dsk/c1t1d0s1(517M)
time of crash:  Sun Nov 12 13:08:18 EST 2006
age of system:  34 days 1 hours 42 minutes 34.95 seconds
panic CPU:      3 (4 CPUs, 7.56G memory)
panic string:   BAD TRAP: type=e (#pf Page fault) rp=fffffe8000c7f2c0 addr=fffffe80fe39d6c4
sanity checks: settings...vmem...sysent...clock...misc...done
-- panic trap data  type: 0xe (Page fault)
addr: 0xfffffe80fe39d6c4  rp: 0xfffffe8000c7f2c0
savfp 0xfffffe8000c7f440  savpc 0xfffffffffbb70950
%rbp  0xfffffe8000c7f440  %rsp  0xfffffe8000c7f3b0
%rip  0xfffffffffbb70950  (ufs:alloccgblk+0x42f)
0%rdi 0xffffffff8d60b000  1%rsi 0xffffffff8930c308  2%rdx               0xb5
3%rcx               0xb5  4%r8  0xfffffe80fe39d6c0  5%r9              0x12f0
%rax                 0x8  %rbx          0x361005a8
%r10                   0  %r11  0xfffffffffbcd9ff0  %r12               0x5a8
%r13  0xffffffff8930c000  %r14  0xffffffff8d60b000  %r15  0xffffffff99656c00
%cs       0x28 (KCS_SEL)
%ds       0x43 (UDS_SEL)
%es       0x43 (UDS_SEL)
%fs          0 (KFS_SEL)
%gs      0x1c3 (LWPGS_SEL)
%ss       0x30 (KDS_SEL)
trapno     0xe (Page fault)
err        0x2 (page not present,write,supervisor)
%rfl   0x10286 (parity|negative|interrupt enable|resume)
fsbase 0xffffffff80000000 gsbase 0xffffffff8901c800
ufs:alloccgblk+0x42f()
ufs:alloccg+0x473()
ufs:hashalloc+0x50()
ufs:alloc+0x14f()
ufs:lufs_alloc+0xf3()
ufs:lufs_enable+0x261()
ufs:ufs_fiologenable+0x63()
ufs:ufs_ioctl+0x3e0()
genunix:fop_ioctl+0x3b()
genunix:ioctl+0x180()
unix:_syscall32_save+0xbf()
-- switch to user thread's user stack --

The trap occurred at alloccgblk+0x42f in the ufs code.

ufs:alloccgblk+0x410            call   +0xe0a4  (ufs:clrblock+0x0)
ufs:alloccgblk+0x415            decl   0x1c(%r13)	; cgp->cg_cs.cs_nbfree--
ufs:alloccgblk+0x419            decl   0xc4(%r14)	; fs->fs_cstotal.cs_nbfree--
ufs:alloccgblk+0x420            movslq 0xc(%r13),%r8	; %r8 <- cgp->cg_cgx
ufs:alloccgblk+0x428            addq   0x2d8(%r14),%r8	; %r8 <- fs->fs_u.fs_csp[%r8]
ufs:alloccgblk+0x42f            decl   0x4(%r8) <-- panic here

We’ve just made a call to ufs:clrblock() and are decrementing something after a long chain of pointer dereferences. We only call clrblock() once in this routine, so that puts us at:

428  #define fs_cs(fs, indx) fs_u.fs_csp[(indx)]
1238          clrblock(fs, blksfree, (long)blkno);
1239          /*
1240           * the other cg/sb/si fields are TRANS'ed by the caller
1241           */
1242          cgp->cg_cs.cs_nbfree--;
1243          fs->fs_cstotal.cs_nbfree--;
1244          fs->fs_cs(fs, cgp->cg_cgx).cs_nbfree--;

Panicking on line 1244.

I should note at this point that the source I am quoting is from the same source tree as OpenSolaris.

After the macro expansion, it becomes

1244          fs->fs_u.fs_csp[cgp->cg_cgx].cs_nbfree--;
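
For reference, struct csum in the ufs headers is four 32-bit counters, which matches the disassembly: cs_nbfree sits at offset 0x4 of the struct, exactly the offset the faulting decl touches:

struct csum {
        int32_t cs_ndir;        /* offset 0x0: number of directories */
        int32_t cs_nbfree;      /* offset 0x4: number of free blocks */
        int32_t cs_nifree;      /* offset 0x8: number of free inodes */
        int32_t cs_nffree;      /* offset 0xc: number of free frags */
};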

So what is cgp->cg_cgx?

SolarisCAT(vmcore.0/11X)> sdump 0xffffffff8930c000 cg cg_cgx
cg_cgx = 0x6f0000

This is a trifle on the largish side (over seven million cylinder groups), which would explain how we ended up in unmapped memory.

The address we end up with for the (struct csum *) is 0xfffffe80fe39d6c0.

If we go back and look at fs->fs_ncg, we see that only two cylinder groups were allocated. We have an obvious inconsistency.

Also, interestingly, this is not dying in the mount(2) system call. It’s dying in a subsequent ioctl(2), which appears to be the one that enables ufs logging.
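
For context, enabling logging is just an ioctl against the freshly mounted filesystem, issued by the mount command when logging is requested. A rough user-space sketch of the call path that dies here (I am quoting _FIOLOGENABLE and fiolog_t from sys/filio.h from memory, so treat those details as an assumption):

#include <fcntl.h>
#include <unistd.h>
#include <sys/filio.h>

int
enable_ufs_logging(const char *mntpt)
{
        fiolog_t fl = { 0 };    /* zero nbytes_requested = default log size */
        int fd, ret;

        if ((fd = open(mntpt, O_RDONLY)) < 0)
                return (-1);
        /* this ioctl is what walks into lufs_enable() in the stack above */
        ret = ioctl(fd, _FIOLOGENABLE, &fl);
        (void) close(fd);
        return (ret);
}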

So how might we handle this?

Now, as the filesystem is already mounted and we are in a subsequent ioctl(), we can’t fail the mount. Could we fail in alloccgblk() and have the error propagate back up to the process making the ioctl()?

Walking back up the stack, we see that alloccg() handles alloccgblk() returning 0. If alloccg() returns 0 to hashalloc(), then in this instance we’ll first try a quadratic rehash, which will also fail, so we’ll fall through to the brute-force search. As this starts at cylinder group 2 and there are only two cylinder groups, it will call alloccg() once and fail, falling through to return 0 to alloc(). Note that no matter how many times we end up in alloccgblk(), it is called with the same arguments, so it will fail the same way.

In alloc(), we note that no block was returned and assume that some other thread grabbed the last one. alloc() then treats the whole thing as if we ran out of space and returns ENOSPC to lufs_alloc(). lufs_alloc() catches this, frees up everything, and returns the error (ENOSPC) to lufs_enable(), which in turn catches it, cleans up, and returns it to ufs_fiologenable(); the error is eventually passed back to user space. While not exactly the error we would have hoped for, the end result would be that logging is not turned on and the system does not panic because of this corrupted filesystem.
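
To illustrate the shape of such a fix (my sketch, not the change that was actually integrated), a range check on the on-disk cylinder group index just before the code quoted earlier would turn the panic into that ENOSPC path:

        /*
         * Hypothetical defensive check: if the cg index read from disk
         * is out of range for this filesystem, treat it as an allocation
         * failure; a zero return from alloccgblk() is already handled by
         * alloccg() and eventually surfaces as ENOSPC.
         */
        if ((uint32_t)cgp->cg_cgx >= (uint32_t)fs->fs_ncg)
                return (0);
        cgp->cg_cs.cs_nbfree--;
        fs->fs_cstotal.cs_nbfree--;
        fs->fs_cs(fs, cgp->cg_cgx).cs_nbfree--;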

I’ll log this as a bug against ufs for Solaris 10 and nevada.

Update

I have logged CR 6492771 against this issue. The link for the CR should work sometime in the next 24 hours, but the content as logged is pretty much a cut-and-paste from this blog entry.

Update 2

The bug I logged has been closed as a duplicate of

4732193 ufs will attempt to mount filesystem with blatantly-bad superblock
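
That title suggests the more robust fix belongs at mount time: refuse a superblock whose fields are obviously nonsense before ufs ever allocates from it. A hypothetical sketch of that kind of check, using struct fs field names (not the integrated fix):

        /* refuse to mount a blatantly-bad superblock */
        if (fs->fs_magic != FS_MAGIC ||
            fs->fs_ncg < 1 ||
            fs->fs_bsize < MINBSIZE || fs->fs_bsize > MAXBSIZE)
                return (EINVAL);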



Written by Alan

November 11, 2006 at 7:37 pm

Posted in Solaris

6 Responses


  1. An interesting point. It shows the server-centric heritage of Solaris. Perhaps fsck should be run by default on UFS mounts?

    Brian Utterback

    November 12, 2006 at 3:46 pm

  2. I can top this with a zfs bug where an ls panics the machine.
    Nov 13 00:15:12 virgo-admin unix: [ID 753105 kern.notice] #de Divide error
    Nov 13 00:15:12 virgo-admin unix: [ID 358286 kern.notice] addr=0xcdec1ffc
    Nov 13 00:15:12 virgo-admin unix: [ID 243837 kern.notice] pid=4632, pc=0xfe801982, sp=0x0, eflags=0x10286
    Nov 13 00:15:12 virgo-admin unix: [ID 211416 kern.notice] cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 698<xmme,fxsr,pge,pse,de>
    Nov 13 00:15:12 virgo-admin unix: [ID 936844 kern.notice] cr2: cdec1ffc cr3: 5c120000
    Nov 13 00:15:12 virgo-admin unix: [ID 537610 kern.notice] gs: fe8e01b0 fs: d68e0000 es: eb260160 ds: eb260160
    Nov 13 00:15:12 virgo-admin unix: [ID 537610 kern.notice] edi: 0 esi: 0 ebp: d4ae86e0 esp: d4ae864c
    Nov 13 00:15:12 virgo-admin unix: [ID 537610 kern.notice] ebx: 0 edx: 0 ecx: 0 eax: d249a1e8
    Nov 13 00:15:12 virgo-admin unix: [ID 537610 kern.notice] trp: 0 err: 0 eip: fe801982 cs: 158
    Nov 13 00:15:12 virgo-admin unix: [ID 717149 kern.notice] efl: 10286 usp: 0 ss: ffffffff
    Nov 13 00:15:12 virgo-admin unix: [ID 100000 kern.notice]
    Nov 13 00:15:12 virgo-admin genunix: [ID 353471 kern.notice] d4ae856c unix:die+ed (0, d4ae8614, cdec1f)
    Nov 13 00:15:12 virgo-admin genunix: [ID 353471 kern.notice] d4ae8600 unix:trap+1033 (d4ae8614, cdec1ffc,)
    Nov 13 00:15:12 virgo-admin genunix: [ID 353471 kern.notice] d4ae8614 unix:cmntrap+9b (fe8e01b0, d68e0000,)
    Nov 13 00:15:12 virgo-admin genunix: [ID 353471 kern.notice] d4ae86e0 unix:UDivRem+92 (0, 0)
    Nov 13 00:15:12 virgo-admin genunix: [ID 353471 kern.notice] d4ae8714 zfs:vdev_mirror_map_alloc+163 (e1f70040)
    Nov 13 00:15:12 virgo-admin genunix: [ID 353471 kern.notice] d4ae8744 zfs:vdev_mirror_io_start+16 (e1f70040)
    Nov 13 00:15:12 virgo-admin genunix: [ID 353471 kern.notice] d4ae8770 zfs:zio_vdev_io_start+14f (e1f70040)
    Nov 13 00:15:12 virgo-admin genunix: [ID 353471 kern.notice] d4ae8780 zfs:zio_next_stage+66 (e1f70040)
    Nov 13 00:15:12 virgo-admin genunix: [ID 353471 kern.notice] d4ae8790 zfs:zio_ready+124 (e1f70040)
    Nov 13 00:15:12 virgo-admin genunix: [ID 353471 kern.notice] d4ae87ac zfs:zio_next_stage+66 (e1f70040)
    Nov 13 00:15:12 virgo-admin genunix: [ID 353471 kern.notice] d4ae87cc zfs:zio_wait_for_children+46 (e1f70040, 1, e1f702)
    Nov 13 00:15:12 virgo-admin genunix: [ID 353471 kern.notice] d4ae87e0 zfs:zio_wait_children_ready+18 (e1f70040)
    Nov 13 00:15:12 virgo-admin genunix: [ID 353471 kern.notice] d4ae87f4 zfs:zio_next_stage_async+ac (e1f70040)
    Nov 13 00:15:12 virgo-admin genunix: [ID 353471 kern.notice] d4ae8804 zfs:zio_nowait+e (e1f70040)
    Nov 13 00:15:12 virgo-admin genunix: [ID 353471 kern.notice] d4ae8840 zfs:arc_read+3a1 (0, d69f6ac0, dba005)
    Nov 13 00:15:12 virgo-admin genunix: [ID 353471 kern.notice] d4ae88bc zfs:dbuf_prefetch+124 (d163ce28, a8b, 0)
    Nov 13 00:15:12 virgo-admin genunix: [ID 353471 kern.notice] d4ae88f4 zfs:dmu_zfetch_fetch+48 (d163ce28, a88, 0, 7)
    Nov 13 00:15:12 virgo-admin genunix: [ID 353471 kern.notice] d4ae8954 zfs:dmu_zfetch_dofetch+183 (d163cf84, e7fc48a0)
    Nov 13 00:15:12 virgo-admin genunix: [ID 353471 kern.notice] d4ae89a0 zfs:dmu_zfetch_find+530 (d163cf84, d4ae89c8,)
    Nov 13 00:15:12 virgo-admin genunix: [ID 353471 kern.notice] d4ae8a24 zfs:dmu_zfetch+bf (d163cf84, 3a44000, )
    Nov 13 00:15:12 virgo-admin genunix: [ID 353471 kern.notice] d4ae8a5c zfs:dbuf_read+123 (eb4b0010, 0, 2)
    Nov 13 00:15:12 virgo-admin genunix: [ID 353471 kern.notice] d4ae8aa0 zfs:dnode_hold_impl+c1 (d1645340, 1d230, 0,)
    Nov 13 00:15:12 virgo-admin genunix: [ID 353471 kern.notice] d4ae8ac4 zfs:dnode_hold+1c (d1645340, 1d230, 0,)
    Nov 13 00:15:12 virgo-admin genunix: [ID 353471 kern.notice] d4ae8b04 zfs:dmu_bonus_hold+26 (d1641b90, 1d230, 0,)
    Nov 13 00:15:12 virgo-admin genunix: [ID 353471 kern.notice] d4ae8b78 zfs:zfs_zget+4e (d109c100, 1d230, 0,)
    Nov 13 00:15:12 virgo-admin genunix: [ID 353471 kern.notice] d4ae8bc8 zfs:zfs_dirent_lock+292 (d4ae8bfc, eb4734d0,)
    Nov 13 00:15:12 virgo-admin genunix: [ID 353471 kern.notice] d4ae8c00 zfs:zfs_dirlook+92 (eb4734d0, d4ae8ca0,)
    Nov 13 00:15:12 virgo-admin genunix: [ID 353471 kern.notice] d4ae8c30 zfs:zfs_lookup+74 (eb46d900, d4ae8ca0,)
    Nov 13 00:15:12 virgo-admin genunix: [ID 353471 kern.notice] d4ae8c70 genunix:fop_lookup+32 (eb46d900, d4ae8ca0,)
    Nov 13 00:15:12 virgo-admin genunix: [ID 353471 kern.notice] d4ae8dfc genunix:lookuppnvp+2cd (d4ae8e70, 0, 0, 0, )
    Nov 13 00:15:12 virgo-admin genunix: [ID 353471 kern.notice] d4ae8e44 genunix:lookuppnat+ec (d4ae8e70, 0, 0, 0, )
    Nov 13 00:15:12 virgo-admin genunix: [ID 353471 kern.notice] d4ae8ec4 genunix:lookupnameat+54 (804305c, 0, 0, 0, d)
    Nov 13 00:15:12 virgo-admin genunix: [ID 353471 kern.notice] d4ae8f14 genunix:cstatat_getvp+149 (ffd19553, 804305c, )
    Nov 13 00:15:12 virgo-admin genunix: [ID 353471 kern.notice] d4ae8f60 genunix:cstatat64+6b (ffd19553, 804305c, )
    Nov 13 00:15:12 virgo-admin genunix: [ID 353471 kern.notice] d4ae8f84 genunix:lstat64+1f (804305c, 8042088, 1)

    Knut Grunwald

    November 13, 2006 at 7:50 am

  3. Oops!
    Sorry for the bad formatting.

    Knut Grunwald

    November 13, 2006 at 7:52 am

  4. The blog takes HTML. Try surrounding your stack dump with <pre> / </pre> markup.
    Alan.
    ps: has this been logged with Sun as a bug?

    Alan Hargreaves

    November 13, 2006 at 9:39 am

  5. I tried to enter this as a bug, but I failed before, so I’m not sure it is filed.

    Knut Grunwald

    November 14, 2006 at 2:23 am

  6. Please email the stack dump directly to me at Alan dot Hargreaves At Sun dot COM. A pointer to somewhere I can grab the crashdumps from would also be helpful.
    Once I have these, I’ll see if I can find out whether your submission succeeded. If it didn’t, I’ll log it myself.
    Alan.

    Alan Hargreaves

    November 14, 2006 at 5:02 am


