Applies to:
Solaris SPARC Operating System - Version: 8.0 and later [Release: 8.0 and later]
Solaris x64/x86 Operating System - Version: 8 6/00 U1 and later [Release: 8.0 and later]
Oracle Solaris Express - Version: 2010.11 and later [Release: 11.0 and later]
Information in this document applies to any platform.

Goal
Shortages of memory and virtual swap can result in slow system performance, hangs, failure to start new processes (fork failures), cluster timeouts, and thus unplanned outages. Monitoring resource usage is critical for system availability.

Solution
Physical Memory Shortages
Memory shortages can be caused by excessive kernel or application memory allocation and by memory leaks. During a memory shortage, the page daemon wakes up and starts scanning and stealing pages to bring the value of freemem, a kernel global variable, back above the lotsfree kernel threshold. Systems with memory shortages slow down because memory pages may have to be read back from the swap disk before processes can continue executing.

High kernel memory allocation can be monitored with mdb's ::memstat command, which reports kernel, application and file system memory usage:
# echo "::memstat"|mdb -k
Page Summary              Pages      MB  %Tot
------------        -----------  ------  ----
Kernel                    18330     143    7%   < Kernel memory
ZFS File Data                 4       0    0%   < ZFS cache (see below)
Anon                      36405     284   14%   < Application memory: heap, stack, COW
Exec and libs              1747      13    1%   < Application libraries
Page cache                 3482      27    1%   < File system cache
Free (cachelist)           3241      25    1%   < Free memory with vnode info intact
Free (freelist)          195422    1526   76%   < Free memory
Total                    258627    2020
Physical                 254812    1990
ZFS ARC memory usage is reported by:
# kstat -n arcstats
Kernel page counters and paging thresholds are reported by:
# kstat -n system_pages
module: unix                            instance: 0
name:   system_pages                    class:    pages
...
        freemem                         8337355    < available free memory
..
        lotsfree                        257271     < paging starts when freemem drops below lotsfree
        minfree                         64317      < swapping starts when freemem drops below minfree
        pageslocked                     4424860    < pages locked, excluding pp_kernel (kernel pages)
        pagestotal                      16465378   < total pages configured
        physmem                         16487075   < total pages usable by Solaris
        pp_kernel                       4740398    < memory allocated in the kernel
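The comparison of freemem against lotsfree and minfree can be scripted. The following is a minimal sketch, not part of the original procedure. The function name paging_state is hypothetical. It assumes input in the parseable format printed by kstat -p (module:instance:name:statistic, then the value) and a POSIX awk; on Solaris, set AWK=nawk or AWK=/usr/xpg4/bin/awk.

```shell
# paging_state (hypothetical helper): classify memory pressure.
# Reads "kstat -p" style lines on stdin, for example:
#   unix:0:system_pages:freemem<TAB>8337355
# Prints one of: ok, below-lotsfree, below-minfree, unknown.
paging_state() {
    ${AWK:-awk} '
        NF >= 2 {
            # The statistic name is the last colon-separated part of field 1.
            n = split($1, key, ":")
            stat[key[n]] = $2
        }
        END {
            if (!("freemem" in stat) || !("lotsfree" in stat) || !("minfree" in stat)) {
                print "unknown"
                exit
            }
            free = stat["freemem"] + 0
            if (free < stat["minfree"] + 0)
                print "below-minfree"
            else if (free < stat["lotsfree"] + 0)
                print "below-lotsfree"
            else
                print "ok"
        }'
}
```

On a live system the function could be fed with: kstat -p -n system_pages | paging_state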
---
Kernel memory usage per kmem cache is reported by mdb's ::kmastat command:
# echo "::kmastat"|mdb -k
cache                        buf    buf    buf    memory     alloc alloc
name                        size in use  total    in use   succeed  fail
------------------------- ------ ------ ------ --------- --------- -----
..
kmem_slab_cache               56   2455   2465    139264      2571     0
kmem_bufctl_cache             24   5463   5763    139264      6400     0
kmem_bufctl_audit_cache      128      0      0         0         0     0
kmem_va_8192                8192     74     96    786432        74     0
kmem_va_16384              16384      2     16    262144         2     0
kmem_va_24576              24576      5     10    262144         5     0
kmem_va_32768              32768      1      8    262144         1     0
kmem_va_40960              40960      0      0         0         0     0
kmem_va_49152              49152      0      0         0         0     0
kmem_va_57344              57344      0      0         0         0     0
kmem_va_65536              65536      0      0         0         0     0
kmem_alloc_8                   8  97210  98649    794624   3884007     0
kmem_alloc_16                 16  29932  30988    499712   9786629     0
kmem_alloc_24                 24  43651  44409   1073152  69596060     0
kmem_alloc_32                 32  11512  12954    417792  71088529     0
...
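To spot the caches that hold the most kernel memory, the ::kmastat output can be sorted on its "memory in use" column. The sketch below is an illustration only. The function name top_kmem_caches is hypothetical, and it assumes the seven-column layout shown above with "memory in use" printed as a plain byte count; releases that print sizes with unit suffixes would need a different filter. A POSIX awk is assumed (on Solaris, set AWK=nawk or AWK=/usr/xpg4/bin/awk).

```shell
# top_kmem_caches (hypothetical helper): list the kmem caches using the
# most memory, from ::kmastat output on stdin.
# $1 = number of caches to print (default 5).
# Output lines: "<memory in use, bytes> <cache name>", largest first.
top_kmem_caches() {
    # Keep only data rows: 7 fields with numeric buf size and memory in use.
    # This skips the two header lines, the dashed rule and any "..." lines.
    ${AWK:-awk} 'NF == 7 && $2 ~ /^[0-9]+$/ && $5 ~ /^[0-9]+$/ { print $5, $1 }' |
        sort -k1,1nr |
        head -n "${1:-5}"
}
```

On a live system it could be used as: echo "::kmastat" | mdb -k | top_kmem_caches 10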
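If a kernel memory leak is suspected, kmem auditing can be enabled by adding the following line to /etc/system and rebooting:
set kmem_flags=0x1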
How Do I Force a Crash Dump When My Solaris OS Is Hung?
Sun SPARC(R) Enterprise Mx000 (OPL) Servers: How to deal with a hung or unresponsive domain?

The best way to avoid outages due to kernel memory leaks is to keep kernel patches up to date.
To monitor application memory usage, consider using:
$ prstat -s rss -can 100
$ ps -eo 'addr zone user s pri pid ppid pcpu pmem vsz rss stime time nlwp psr args'
$ pmap -xs <pid>
# dtrace -n 'pid$target::malloc:entry { @ = quantize(arg0); }' -p PID
# dtrace -n 'pid$target::malloc:entry { @[ustack()] = sum(arg0); }' -p PID

An ISM segment does not require swap reservation, because all of its pages are locked in memory by the kernel and are not candidates for swapping.
A DISM segment requires swap reservation, because its memory can be locked and unlocked by the process.
When a process uses DISM, it selectively increases the size of the SGA by locking ranges of the segment. Failing to lock the DISM region while continuing to use it as the SGA for DB block caching may result in slow Oracle DB performance, because accessing the unlocked pages causes page faults. See Doc: 1018855.1
When a process starts touching pages, anon structures are allocated, but no physical disk swap is allocated. In Solaris, disk swap is allocated only when memory is short and pages need to be migrated to the swap device to keep up with workload memory demand. That is why "swap -l", which reports physical disk swap allocation, shows the same value in the "blocks" and "free" columns under normal conditions.
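The "blocks" versus "free" comparison can be automated. The sketch below is an illustration under stated assumptions, not part of the original procedure. The function name disk_swap_in_use_kb is hypothetical, and it assumes the usual swap -l layout (a header line, then swapfile, dev, swaplo, blocks, free) with blocks counted in 512-byte units. A POSIX awk is assumed (on Solaris, set AWK=nawk or AWK=/usr/xpg4/bin/awk).

```shell
# disk_swap_in_use_kb (hypothetical helper): report how much physical disk
# swap is actually in use, in kilobytes, from "swap -l" output on stdin.
# A result of 0 means no pages have been pushed out to the swap devices.
disk_swap_in_use_kb() {
    ${AWK:-awk} '
        # Skip the header line; column 4 is blocks, column 5 is free.
        NR > 1 && NF >= 5 { used += $4 - $5 }
        # Convert 512-byte blocks to kilobytes.
        END { printf "%d\n", used * 512 / 1024 }'
}
```

On a live system it could be used as: swap -l | disk_swap_in_use_kb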
Solaris can run without physical disk swap because of the swapfs abstraction, which acts as if real swap space were backing each page. Solaris works with virtual swap, which is composed of physical memory and physical disk swap. When no physical disk swap is configured, swap reservations are made against physical memory. The drawback of reserving swap against memory is that the system cannot malloc() more than the physical memory configured. The advantage of running without physical disk swap is that a malicious program cannot perform huge mallocs and therefore cannot make the system crawl due to memory shortage.
Virtual swap = Physical memory + Physical Disk swap
Available virtual swap is reported by:
- vmstat: swap
- swap -s
Disk-backed swap is reported by:
- swap -l
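Virtual swap consumption can be tracked as a percentage from the summary line of swap -s. The following is a minimal sketch. The function name virtual_swap_pct_used is hypothetical, and it assumes the classic one-line format "total: Nk bytes allocated + Nk reserved = Nk used, Nk available"; other releases may print different units. A POSIX awk is assumed (on Solaris, set AWK=nawk or AWK=/usr/xpg4/bin/awk).

```shell
# virtual_swap_pct_used (hypothetical helper): print the percentage of
# virtual swap in use, truncated to an integer, from "swap -s" on stdin.
# Expected input (assumed format):
#   total: 286120k bytes allocated + 45964k reserved = 332084k used, 1516356k available
virtual_swap_pct_used() {
    ${AWK:-awk} '
        $1 == "total:" && NF >= 12 {
            # "+ 0" keeps the leading number and drops the trailing "k".
            used  = $9 + 0
            avail = $11 + 0
            if (used + avail > 0)
                printf "%d\n", int(used * 100 / (used + avail))
            else
                print "unknown"
        }'
}
```

On a live system it could be used as: swap -s | virtual_swap_pct_used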
Per-process virtual swap reservation can be displayed with:
- pmap -S <pid>
- prstat -s size -can 100 15
- prstat -s size -c -a -n 100 -p <pidlist> 15
You can dump the process address space, showing all segments, using:
- pmap -xs <pid>
The following script can be used to monitor kernel and application memory usage:

#!/bin/ksh
# Script monitors kernel and application memory usage.
# Run as root: mdb -k needs access to kernel memory.
PATH=/bin:/usr/bin:/usr/sbin; export PATH
PIDLIST=""

# Kill the background collectors started in the current interval.
restart()
{
    for PID in $PIDLIST
    do
        kill -9 $PID 2>/dev/null
    done
}

# Signal handler: kill the collectors and exit.
# Named cleanup rather than killall to avoid shadowing /usr/sbin/killall.
cleanup()
{
    restart
    exit
}

# KILL cannot be trapped, so it is not listed.
trap cleanup HUP INT QUIT TERM USR1 USR2

DIR=DATA.`date +%Y%m%d-%T`
mkdir $DIR || exit 1
cd $DIR || exit 1

while true
do
    TS=`date +%Y%m%d-%T`
    echo $TS >> mem.out
    echo "output of ::memstat" >> mem.out
    echo "::memstat" | mdb -k >> mem.out
    echo "output of kstat -n arcstats (ZFS ARC memory usage)" >> mem.out
    kstat -n arcstats >> mem.out
    echo "output of ::kmastat" >> mem.out
    echo "::kmastat" | mdb -k >> mem.out
    echo "output of swap -s and swap -l" >> mem.out
    echo "swap -s" >> mem.out
    swap -s >> mem.out
    echo "swap -l" >> mem.out
    swap -l >> mem.out
    echo "output of ps" >> mem.out
    /usr/bin/ps -eo 'addr zone user s pri pid ppid pcpu pmem vsz rss stime time nlwp psr args' >> mem.out
    #
    # start vmstat, mpstat and prstat in the background
    #
    PIDLIST=""
    echo $TS >> vmstat.out
    vmstat 5 >> vmstat.out &
    PIDLIST="$PIDLIST $!"
    echo $TS >> mpstat.out
    mpstat 5 >> mpstat.out &
    PIDLIST="$PIDLIST $!"
    echo $TS >> prstat.out
    prstat -s rss -can 100 >> prstat.out &
    PIDLIST="$PIDLIST $!"
    sleep 600 # every 10 minutes
    restart
done