Sunday, June 15, 2008

IBM HTTP Server child process core dumps

IBM HTTP Server child processes crash with core dumps and fail to serve some or all requests. You might see Segmentation fault errors like the following in the server's error.log.

[Sun Feb 10 19:26:52 2008] [notice] IBM_HTTP_Server/6.1.0.9 Apache/2.0.47 (Unix) configured -- resuming normal operations
[Sun Feb 10 19:26:52 2008] [notice] CoreDumpDirectory not set; core dumps may not be written for child process crashes
[Sun Feb 10 19:27:20 2008] [notice] child pid 7755 exit signal Segmentation fault (11)
[Sun Feb 10 19:27:21 2008] [notice] child pid 7756 exit signal Segmentation fault (11)
[Sun Feb 10 19:27:23 2008] [notice] child pid 7758 exit signal Segmentation fault (11)
[Sun Feb 10 19:27:25 2008] [notice] child pid 7759 exit signal Segmentation fault (11)
[Sun Feb 10 19:27:26 2008] [notice] child pid 7760 exit signal Segmentation fault (11)
[Sun Feb 10 19:27:27 2008] [notice] child pid 7761 exit signal Segmentation fault (11)
[Sun Feb 10 19:27:37 2008] [notice] child pid 7762 exit signal Segmentation fault (11)
[Sun Feb 10 19:29:36 2008] [notice] child pid 7763 exit signal Segmentation fault (11)
[Sun Feb 10 19:29:49 2008] [notice] child pid 7771 exit signal Segmentation fault (11)
[Sun Feb 10 19:30:21 2008] [notice] child pid 7772 exit signal Segmentation fault (11)
[Sun Feb 10 19:30:27 2008] [notice] child pid 7773 exit signal Segmentation fault (11)
[Sun Feb 10 20:12:12 2008] [notice] child pid 8056 exit signal Segmentation fault (11)
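If the error.log is large, it can help to count how often child processes are crashing. The following is a minimal Python sketch that assumes the log lines have exactly the format shown above; the function name summarize_crashes and the sample list are made up for illustration and are not part of IHS or IHSDIAG.

```python
import re
from collections import Counter

# Matches lines such as:
# [Sun Feb 10 19:27:20 2008] [notice] child pid 7755 exit signal Segmentation fault (11)
CRASH_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] \[notice\] child pid (?P<pid>\d+) "
    r"exit signal (?P<signal>.+?) \((?P<num>\d+)\)"
)

def summarize_crashes(lines):
    """Return (total, per_minute) for 'child pid ... exit signal' lines.

    Hypothetical helper: assumes the timestamp layout seen in the excerpt,
    e.g. 'Sun Feb 10 19:27:20 2008'.
    """
    per_minute = Counter()
    total = 0
    for line in lines:
        m = CRASH_RE.match(line.strip())
        if not m:
            continue  # not a child crash line
        total += 1
        parts = m.group("ts").split()
        # Key on month, day and HH:MM so bursts of crashes stand out.
        minute = f"{parts[1]} {parts[2]} {parts[3][:5]}"
        per_minute[minute] += 1
    return total, per_minute

# A few lines copied from the excerpt above.
sample = [
    "[Sun Feb 10 19:26:52 2008] [notice] IBM_HTTP_Server/6.1.0.9 Apache/2.0.47 (Unix) configured -- resuming normal operations",
    "[Sun Feb 10 19:27:20 2008] [notice] child pid 7755 exit signal Segmentation fault (11)",
    "[Sun Feb 10 19:27:21 2008] [notice] child pid 7756 exit signal Segmentation fault (11)",
    "[Sun Feb 10 19:29:36 2008] [notice] child pid 7763 exit signal Segmentation fault (11)",
]

total, per_minute = summarize_crashes(sample)
print(total, dict(per_minute))
```

To run it against a real log, pass an open file instead of the sample list, for example summarize_crashes(open("error.log")), adjusting the path to your installation.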

In our case, the problem appears to have been that we had mistakenly set ResponseChunkSize to a value that was far too high (400000). This value is expressed in units of 1024 bytes, so it is multiplied by 1024, and the result exceeded what the server could handle. Setting it to a lower value, 4000, seems to have resolved the issue. It took several days of working with IBM before we figured this out. If your case is not related to ResponseChunkSize, the following steps might help you debug the problem.
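To make the arithmetic concrete, here is a small Python sketch that converts the two ResponseChunkSize values mentioned above into bytes, assuming (as described) that the setting is multiplied by 1024. The helper name chunk_bytes is hypothetical, and the sketch does not say where the actual limit lies; it only shows the sizes involved.

```python
KIB = 1024  # the setting is described as a count of 1024-byte units

def chunk_bytes(response_chunk_size):
    """Convert a ResponseChunkSize setting to a size in bytes."""
    return response_chunk_size * KIB

# The value that crashed our server, and the value that worked.
for value in (400000, 4000):
    size = chunk_bytes(value)
    print(f"ResponseChunkSize={value}: {size:,} bytes "
          f"(about {size / (1024 * 1024):.1f} MiB)")
```

The mistaken value asks for a chunk of roughly 390 MiB, while the corrected value asks for about 4 MiB.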

Please do the following:

1. Set up the system to obtain core dumps:

http://publib.boulder.ibm.com/httpserv/ihsdiag/coredumps.html

2. Make sure you have the latest IHSDIAG for debugging IHS issues:

http://www-1.ibm.com/support/docview.wss?uid=swg24008409

3. Run IHSDIAG against the core that is obtained:

http://publib.boulder.ibm.com/httpserv/ihsdiag/gather_crash_doc.html

4. Run IHSDIAG against a good system to get system information:

http://publib.boulder.ibm.com/httpserv/ihsdiag/describeconfig.html
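The error.log above also contains the notice "CoreDumpDirectory not set", so before following the steps it may be worth checking whether your httpd.conf sets that directive. Below is a rough Python sketch of such a check. The helper name core_dump_directory and the sample path /tmp/ihs-cores are assumptions for illustration only, and the sketch does not follow Include directives or look inside other configuration files.

```python
def core_dump_directory(conf_text):
    """Return the CoreDumpDirectory value found in httpd.conf text, or None.

    Hypothetical helper: a plain line scan that skips comments and
    ignores Include directives.
    """
    value = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        # Directive names are matched without regard to case.
        if parts[0].lower() == "coredumpdirectory" and len(parts) == 2:
            value = parts[1].strip().strip('"')  # keep the last occurrence
    return value

# Sample configuration text; the directory path is made up.
sample_conf = """
# CoreDumpDirectory /old/location
ServerName www.example.com
CoreDumpDirectory /tmp/ihs-cores
"""

print(core_dump_directory(sample_conf))
print(core_dump_directory("ServerName www.example.com"))
```

A result of None suggests the directive is missing from that file, which matches the notice in the log. The IBM page in step 1 remains the reference for the full setup.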
