Introduction
~~~~~~~~~~~~
This short article aims to explain how to get a stack trace from a
core dump produced by any of the Oracle products on Unix platforms.
By following the steps below you can provide Oracle Support with vital
information to help identify the cause of a problem.
Please note that it is important to include information about the
tool being used, any code involved, the operation being performed,
environment, etc., in addition to the details below.
What is a 'core dump' ?
~~~~~~~~~~~~~~~~~~~~~~~
A core dump is an image copy of a process's state at the instant
it 'aborted'. It is produced in the form of a file called 'core'
usually located in the current directory.
What causes a core dump ?
~~~~~~~~~~~~~~~~~~~~~~~~~
There are many situations which can cause a core dump to be produced,
but it is usually because the process has attempted to do something
which the operating system does not like. The most common causes
of this are:
The program tried to access memory outside its allowed range.
The program tried to obtain a resource which was either
exhausted or unavailable.
An attempt was made to execute illegal instructions.
An attempt was made to read unaligned data.
In Unix systems the offending process is sent one of a number of
signals which force a core dump to be produced. It is also possible
for a user to produce a core dump by sending one of these signals
to a process manually.
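As an illustration of sending such a signal by hand, here is a minimal
sketch for a Bourne-type shell. It assumes 'sleep' as a stand-in for the
real program; SIGABRT is one of the signals whose default action is to
dump core. Whether a 'core' file actually appears depends on the core
file size limit and on how the system is configured to write core files.

```shell
# Start a throwaway background process ('sleep' stands in for a real program).
sleep 60 &
pid=$!

# Send SIGABRT, one of the signals whose default action is a core dump.
kill -ABRT "$pid"

# Collect the exit status; a value above 128 means the process was
# terminated by a signal rather than exiting normally.
wait "$pid"
status=$?
echo "exit status of process $pid: $status"
```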
What should I do if I get a core dump ?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
As with any problem you should first note down the FULL version
numbers of the product, the RDBMS, PL/SQL (if used) and any
related products.
You should also note the EXACT command you were running when
the core dump occurred. For example, if it was a SQL*Forms problem and you were
using 'mrunform30', write this down. This command will be referred
to as 'program' below.
Now follow the instructions below in order:
1) Check whether you have a 'core' file. It should be in the directory where
the command was issued, or in CORE_DUMP_DEST/core_NNNNNN
if it is the 'oracle' executable. 'oracle' can also produce
core files in $ORACLE_HOME/dbs/core_NNNNN or $ORACLE_HOME/dbs.
2) Log in as ORACLE and cd to the directory containing the core file.
Then issue:
file core
This should identify the "program" name to use in the next step,
e.g.: oracle
3) Log in as ORACLE and change into the $ORACLE_HOME/bin
directory. Enter the command:
file program
and write the result down letter for letter. If the word 'dynamic'
or 'dynamically linked' appears in the output of this command
then please make a note of this as there are a few platforms on
which Oracle does NOT support dynamic linking and this may be
the cause of your problem.
4) Now enter:
chmod +r program
to add read permission to the program.
5) Log out, then log in as the user who encountered the error.
The next step will vary slightly depending on which version of
Unix you are using. One of the following commands should exist
on your machine - try each in order until you find one that exists.
In some cases you may be asked for stacks from all threads so
use the "thread" version of the command if it exists. If the core
file contains multiple threads see Note:118252.1.
An alternative to the commands below is to use the stackx.sh
script from Note:362791.1. That script will try to find a
suitable debugger and extract the stack trace for you.
Common Debuggers and commands to show a symbolic stack trace:
Command  NB                      Exit command     Stack Trace command
-------  --                      ------------     -------------------
dbx                              quit             where
xdb      (HPUX 10)               quit             t
gdb      (HPUX 11)               q                bt
dde      (HPUX 11)               q                bt
sdb                              q                t
adb                              $q (or Ctrl-D)   $c
debug    (PTX only)              quit             stack
gdb      (Linux)                 quit             bt
                                                  or thread apply all where
pstack   (HPUX, Linux, Solaris)
Change to the directory where the core dump is located and enter
the commands as in the relevant example below. If you are not
sure which program produced the 'core' file then on some Unix
platforms the command 'file core' will tell you the executable
name that the core file is from (this does not work on ALL
Unix platforms, see note below.)
Example commands:
DBX: $ script /tmp/mystack
$ dbx $ORACLE_HOME/bin/<program> core
(dbx) where
... << Stack should appear here
(dbx) quit
$ exit
XDB: $ script /tmp/mystack
$ xdb $ORACLE_HOME/bin/<program> core
(xdb) t
... << Stack should appear here
(xdb) quit
$ exit
SDB: $ script /tmp/mystack
$ sdb $ORACLE_HOME/bin/<program> core
(sdb) t
... << Stack should appear here
(sdb) q
$ exit
(NOTE: In the 'adb' commands below literally type the $c & $q)
ADB: $ script /tmp/mystack
$ adb $ORACLE_HOME/bin/<program> core
$c << NB: adb has no prompt so just enter $c
...
$q
$ exit
DEBUG: $ script /tmp/mystack
$ debug -c core $ORACLE_HOME/bin/<program>
debug> stack
... << Stack should appear here
debug> quit
$ exit
GDB: $ script /tmp/mystack
$ gdb $ORACLE_HOME/bin/<program> core
(gdb) bt
... << Stack should appear here
(gdb) thread apply all where
... << Stacks for all threads here
(gdb) quit
$ exit
DDE: $ script /tmp/mystack
$ dde -ui line core $ORACLE_HOME/bin/<program>
dde> bt
...
dde> q
$ exit
PSTACK: $ script /tmp/mystack
$ pstack core
$ exit
If this worked, the stack trace will have been captured in the
file '/tmp/mystack'. Upload this to Oracle Support.
6) If the debug command failed to give a stack trace then try using
a different debugger from the list above (if available).
If all debuggers fail then there is probably a problem with
either the permissions or the file type - see the section below
and then contact Oracle Support with all the details you have so far.
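Steps 5 and 6 amount to trying each debugger in turn until one is found.
A minimal sketch of that search for a Bourne-type shell follows. The
function name 'first_available' is hypothetical and is not part of any
Oracle script; the candidate names are taken from the table of common
debuggers above. Note that a command of the same name may exist on a
platform without being the debugger listed there, so check the result.

```shell
# first_available is a hypothetical helper for illustration only.
# It prints the first of its arguments that exists as a command on PATH
# and returns non-zero if none of them is found.
first_available() {
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

# Candidate list taken from the table of common debuggers.
debugger=$(first_available dbx xdb gdb dde sdb adb debug pstack) \
    || echo "no debugger from the list was found on PATH"
echo "debugger to try: ${debugger:-none}"
```

If the first debugger found does not give a usable stack, remove it from
the list and run the search again to find the next candidate.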
Common reasons for not getting a sensible stack
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Filesize Limits:
Note that on some machines there may be a kernel parameter or
user limit which controls the maximum size of core file that
can be produced - you can usually check this by typing:
limit in the C shell
OR ulimit -a in the Bourne / Korn shells.
If this limit is too small the core file will be useless -
raise the limit and reproduce the problem.
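For the Bourne / Korn family of shells, checking and raising the limit
might look like the sketch below. It assumes the shell's 'ulimit'
supports the '-c' (core file size) option together with '-S' and '-H',
which is common but not guaranteed on every platform. The change applies
only to the current shell and the processes started from it.

```shell
# Show the current soft and hard limits on core file size.
soft=$(ulimit -S -c)
hard=$(ulimit -H -c)
echo "core size limit: soft=$soft hard=$hard"

# Raise the soft limit as far as the hard limit allows.
# Raising the hard limit itself normally needs a privileged user.
if [ "$soft" != "$hard" ]; then
    ulimit -S -c "$hard"
fi
echo "core size limit now: $(ulimit -S -c)"
```

If the hard limit is itself too small, the system administrator will
need to raise it before the problem is reproduced.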
Stripped Executable
Some program executables are stripped of symbol information.
This makes the stack trace useless. If 'file program' shows
the word 'stripped' or 'nm program' shows no output then it
is likely that the executable is stripped of symbolic information.
In this case the problem tool must be relinked without being
stripped - on most Unix platforms this involves ensuring there is
no '-s' option on the link line. Contact Oracle Support with
details of the link line used to link the tool.
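When checking the 'file' output, be aware that on many platforms 'file'
prints 'not stripped' for an executable that still has its symbols, so
searching for the bare word 'stripped' can mislead. The sketch below
classifies a line of 'file' output accordingly. The function name
'classify_file_output' is hypothetical, and the exact wording printed by
'file' varies between platforms, so treat the patterns as assumptions.

```shell
# classify_file_output is a hypothetical helper for illustration only.
# It takes the text printed by 'file program' and reports whether the
# executable appears to be stripped of symbol information.
classify_file_output() {
    case "$1" in
        *"not stripped"*) echo "not stripped" ;;  # must be tested first
        *stripped*)       echo "stripped" ;;
        *)                echo "unknown" ;;
    esac
}

# Typical use (substitute the real program name for <program>):
#   classify_file_output "$(file "$ORACLE_HOME/bin/<program>")"
classify_file_output "ELF 64-bit LSB executable, dynamically linked, not stripped"
```

A result of 'unknown' means the 'file' output did not mention stripping
at all, in which case the 'nm program' check is the next thing to try.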
HP Unix
Some platforms such as HP Unix need a special object file to be linked
in at link time to ensure symbols in shared objects can be
reported by the debug tool. Typically this involves relinking the
tool including /usr/lib/end.o on the link line. The location of
this special file may be different depending on your HPUX
version. 'xdb' generally tells you the location of this file
if it was not linked into the executable.
If 'file core' does not return the executable name:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Try using the 'strings' command:
csh> setenv LANG C
or
ksh> LANG=C; export LANG ...to get rid of non-ASCII characters returned by 'strings'
> strings -a core | more
The first part of the output may reveal the executable name.
References
NOTE:118252.1 - How to Process an Express Core File Using dbx, dbg, dde, gdb or ladebug
NOTE:362791.1 - STACKX User Guide