Hints for Debugging Parallel Programs

Using ddd/gdb:

A debugging tool like gdb can save you a great deal of time and frustration in any debugging project, and it is especially useful for parallel programs. Whenever possible, avoid debugging by simply adding calls to printf(); use a debugging tool such as gdb instead. (Ordinarily I suggest using ddd, a GUI front end to gdb. With a parallel program this may be difficult, though, since you need one debugger session per process and GUIs take up a lot of space on one's monitor screen.) I have a writeup on the art of debugging, and an introduction to gdb, at my debugging-tutorial Web page.
Debugging of parallel programs is particularly difficult, both because there is "too much happening at once," and because debugging tools like gdb were not designed for parallel use. However, here is how you can use gdb with MPI, PVM, the various DSM packages, and so on (see important note on page-based DSMs later on):
First, when you compile your application source code, make sure to use the -g option, to retain the symbol table for gdb.
Now, get the program running, say on the partition
```
fajita.engr.ucdavis.edu
chimi.engr.ucdavis.edu
```
Say the name of the program is Prime. A copy of Prime will now be running on each machine. You will need to go to each machine and attach gdb to these invocations of the program. To do this, type
```
ps ax | grep Prime
```
(or ps -e or ps -ux, depending on the system), and find the process number for Prime at each machine. You might find several lines of output from this, such as "tcsh Prime ..." or "rsh chimi Prime...". Ignore these; you want the line which is for the execution of Prime itself.
(Note: One way around this would be to actually initiate the execution of the program at each node via gdb itself. However, this might be difficult to do with some parallel processing library packages.)
Then type
```
gdb Prime process_number
```
and then use gdb as usual from that point on.
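Note that Prime was ALREADY running at fajita and chimi! What we have done is attach gdb to two already-running processes. However, in order to keep those processes from running away from you, get them to wait for you, using the following method:
In your source code, define a global integer variable named something like "DebugWait", initialize it to 1 (declare it volatile if you compile with optimization, since otherwise the compiler may assume the variable never changes and the loop will not notice the value you set from gdb), and insert code like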
```
while (DebugWait) ;
```
at the very beginning of main(). When you attach gdb to the two Prime processes, both will be stuck at that "while" loop line -- which is exactly what you want. Then for both of them, give the gdb command
```
(gdb) set DebugWait = 0
```
to "liberate" them. Then use gdb as usual, setting breakpoints, single-stepping through the code, and so on.
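The wait loop above can be extended a little so that each process also reports where it is running, which saves the ps step. The following is a minimal sketch, not part of any parallel library: the helper name debug_wait_if_requested() and the environment variable PRIME_DEBUG_WAIT are hypothetical, chosen only for illustration. It assumes a POSIX system.

```c
/* Sketch: announce host and PID, then wait until gdb clears DebugWait.
 * PRIME_DEBUG_WAIT and debug_wait_if_requested() are hypothetical names,
 * not part of MPI, PVM, any DSM package, or gdb. */
#define _POSIX_C_SOURCE 200112L
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* volatile makes the compiler reread the flag on every iteration, so a
 * value stored from gdb is seen even in an optimized build. */
volatile int DebugWait = 1;

/* Returns 1 if waiting was requested, 0 for a normal run. */
int debug_wait_if_requested(void)
{
    char host[256];

    if (getenv("PRIME_DEBUG_WAIT") == NULL)
        return 0;                       /* normal run: do not block */

    if (gethostname(host, sizeof host) != 0)
        snprintf(host, sizeof host, "unknown-host");
    host[sizeof host - 1] = '\0';       /* name may be truncated unterminated */

    fprintf(stderr, "PID %ld on %s is waiting for gdb\n",
            (long)getpid(), host);
    fflush(stderr);

    while (DebugWait)
        sleep(1);                       /* sleep rather than spin the CPU */
    return 1;
}
```

Call the helper as the first statement of main(). Because the loop sleeps, gdb will typically stop inside sleep() when it attaches; DebugWait is global, so it can still be set to 0 from that frame, after which a breakpoint in your own code gets you back to familiar ground.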
If you are using a page-based DSM, you need to tell gdb to ignore seg faults, which comprise the central mechanism for page-based DSM. To do this, issue the command
```
handle 11 nostop noprint
```
to gdb. (Seg faults are signal number 11 on most UNIX systems; gdb also accepts the signal name, as in handle SIGSEGV nostop noprint.) Better yet, place such a line in your .gdbinit startup file during the times when you are debugging your DSM programs.
Other Debugging Hints:

Make sure that you do not have any "zombie" processes still hanging around from previous debugging runs. In our examples above, for instance, our program was named Prime; make sure there aren't any old Prime processes still running, since they may interfere with new Prime processes.
Use malloc() instead of declaring static arrays. Some message-passing packages, for instance, will just quit without an error message if you have declared large (or in some cases even medium-sized) arrays.
If you find that your program still does not accept large arrays, use the Unix limit command (ulimit -s in sh-family shells) to increase your maximum stack size.
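The last two hints can be illustrated with a short sketch, assuming a POSIX system. The helper names alloc_table() and print_stack_limit() are hypothetical. The first allocates a large array on the heap and checks the result; the second reports the soft stack limit, which is the value the limit and ulimit commands adjust.

```c
/* Sketch: heap-allocate a large array and report the stack size limit.
 * alloc_table() and print_stack_limit() are hypothetical helper names. */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/resource.h>

/* Allocate n doubles on the heap; returns NULL on failure. */
double *alloc_table(size_t n)
{
    double *t = NULL;

    if (n <= SIZE_MAX / sizeof *t)      /* guard against size overflow */
        t = malloc(n * sizeof *t);
    if (t == NULL)
        fprintf(stderr, "could not allocate %lu doubles\n",
                (unsigned long)n);
    return t;
}

/* Print the soft stack limit in bytes; returns 0 on success, -1 on error. */
int print_stack_limit(FILE *out)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0) {
        perror("getrlimit");
        return -1;
    }
    if (rl.rlim_cur == RLIM_INFINITY)
        fprintf(out, "stack limit: unlimited\n");
    else
        fprintf(out, "stack limit: %llu bytes\n",
                (unsigned long long)rl.rlim_cur);
    return 0;
}
```

Checking the pointer returned by malloc() gives you an explicit error message instead of a silent exit. Note that getrlimit() reports bytes, whereas shells commonly display the same limit in kilobytes.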
Original article: https://www.cnblogs.com/cy163/p/765658.html
