Alan Hargreaves' Blog

The ramblings of an Australian SaND TSC* Principal Field Technologist

A Solaris tmpfs uses real memory

That title may sound self-explanatory and obvious, but over the last two weeks I have had two customers tell me flat out that /tmp uses swap, and that I should therefore keep investigating where their memory is being used.

This is likely because when you define /tmp in /etc/vfstab, the device you list for it is swap.
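
For reference, the /tmp entry in /etc/vfstab typically looks something like the fragment below. This is a sketch of the usual default rather than a copy of any particular system's file. The commented-out second entry is an assumption added for illustration: it uses the tmpfs size mount option with an arbitrary figure of 2048m to cap how large /tmp can grow.

```
#device     device      mount     FS      fsck    mount     mount
#to mount   to fsck     point     type    pass    at boot   options
swap        -           /tmp      tmpfs   -       yes       -
# Illustrative variant (the size value is arbitrary):
#swap       -           /tmp      tmpfs   -       yes       size=2048m
```

Note that the word swap in the first column names the virtual memory pool, not a disk slice.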

In the context of a tmpfs, swap means physical memory + physical swap. A tmpfs uses pageable kernel memory. This means that it will use kernel memory, but if required these pages can be paged out to the swap device. Indeed, if you put more data onto a tmpfs than you have physical memory, paging to the swap device is pretty much guaranteed.

If you are still not convinced, try the following.

  1. In one window start up the command
    $ vmstat 2
  2. In another window make a 1 GB file in /tmp.
    $ mkfile 1g /tmp/testfile
  3. Watch what happens in the free memory column of the vmstat output.
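
If you would rather have a number than watch a column scroll past, the steps above can be roughly scripted. This is only a sketch: it assumes bash or ksh on Solaris, where free is the fifth column of vmstat output and is reported in kilobytes. The function names free_kb, consumed_kb and tmpfs_demo are made up for this example.

```shell
# Sketch only: assumes bash or ksh on Solaris, where "free" is the
# 5th column of vmstat output and is reported in kilobytes.

# Print the free-memory figure from the second vmstat sample.
# The first sample is a summary since boot, so it is skipped.
free_kb() {
    vmstat 1 2 | tail -1 | awk '{ print $5 }'
}

# Print how many kilobytes of free memory went away between two readings.
consumed_kb() {
    echo $(( $1 - $2 ))
}

# Hypothetical helper: create the file, report the drop, then clean up.
tmpfs_demo() {
    before=$(free_kb)
    mkfile 1g /tmp/testfile
    after=$(free_kb)
    echo "free memory dropped by $(consumed_kb "$before" "$after") KB"
    rm -f /tmp/testfile
}
```

Running tmpfs_demo on an otherwise quiet system should report a drop somewhere in the region of the file size, though the exact figure will vary with whatever else the system is doing at the time.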

There seems to be a misconception amongst some that a tmpfs is a way of stealing some of the disk we have allocated as swap to use as a filesystem without impacting memory. I’m sorry, this is not the case.

 

Written by Alan

February 28, 2013 at 9:38 am

Posted in Uncategorized

The Importance of Fully Specifying a Problem

I had a customer call this week where we were provided a forced crashdump and asked to determine why the system was hung.

Normally when you are looking at a hung system, you will find a lot of threads blocked on various locks, and most likely very little actually running on the system (unless it’s threads spinning on busy-wait locks).

This vmcore showed none of that. In fact we were seeing hundreds of threads actively on CPU in the second before the dump was forced.

This prompted the question back to the customer:

What exactly were you seeing that made you believe that the system was hung?

It took a few days to get a response. The customer reported that they were not able to ssh into the system, and that when they tried to log in on the console they got the login prompt, but after typing “root” and hitting return, the console was no longer responsive.

This description puts a whole new light on the “hang”. You immediately start thinking “name services”.

Looking at the crashdump, yes, the sshd processes are all in door calls to nscd, and nscd is idle waiting on responses from the network.

Looking at the connections, I see a lot of connections to the secure LDAP port in CLOSE_WAIT, but more interestingly I am seeing a few connections over the non-secure LDAP port to a different LDAP server just sitting open.

My feeling at this point is that we have either a non-responding LDAP server or one that is responding slowly, and that the resolution is to investigate that server.

Moral

When you log a service ticket for a “system hang”, it’s great to get the forced crashdump first up, but it’s even better to get a description of what you observed that made you believe the system was hung.

Written by Alan

June 3, 2012 at 9:19 am

Posted in Solaris, Uncategorized, Work

Supportfiles.sun.com has moved (and changed address)

Over the last couple of hours the physical location of the supportfiles.sun.com server changed. The benefit is that the machine is now in the same building as the machines that we use to analyse your uploads, so getting the data onto those machines is now substantially faster.

What do I have to do to take advantage of this?

If you are using the DNS to look it up, then nothing: the DNS has changed over to using the new address. However, if you are using the IP address, you need to start using the new one. We are still uploading from the old server for the moment, but it is a substantially slower link. The new address is 192.18.110.60.

Written by Alan

September 27, 2011 at 8:29 pm

Posted in Uncategorized
