/usr/share/openmpi/help-mpi-btl-tcp.txt is in openmpi-common 2.0.2-2.
This file is owned by root:root, with mode 0o644.
The actual contents of the file can be viewed below.
# -*- text -*-
#
# Copyright (c) 2009-2016 Cisco Systems, Inc. All rights reserved.
# Copyright (c) 2015-2016 The University of Tennessee and The University
# of Tennessee Research Foundation. All rights
# reserved.
# $COPYRIGHT$
#
# Additional copyrights may follow
#
# $HEADER$
#
# This is the US/English help file for Open MPI's TCP support
# (the TCP BTL).
#
[invalid if_inexclude]
WARNING: An invalid value was given for btl_tcp_if_%s. This
value will be ignored.
Local host: %s
Value: %s
Message: %s
#
[invalid minimum port]
WARNING: An invalid value was given for btl_tcp_port_min_%s. Legal
values are in the range [1 .. 2^16-1]. This value will be ignored
(reset to the default value of 1024).
Local host: %s
Value: %d
#
[client connect fail]
WARNING: Open MPI failed to TCP connect to a peer MPI process. This
should not happen.
Your Open MPI job may now fail.
Local host: %s
PID: %d
Message: %s
Error: %s (%d)
#
[client handshake fail]
WARNING: Open MPI failed to handshake with a connecting peer MPI
process over TCP. This should not happen.
Your Open MPI job may now fail.
Local host: %s
PID: %d
Message: %s
#
[accept failed]
WARNING: The accept(2) system call failed on a TCP socket. While this
should generally never happen on a well-configured HPC system, the
most common causes when it does occur are:
* The process ran out of file descriptors
* The operating system ran out of file descriptors
* The operating system ran out of memory
Your Open MPI job will likely hang (or crash) until the cause of the
failure is fixed (e.g., more file descriptors and/or memory become
available), and may eventually time out or abort.
Local host: %s
PID: %d
Errno: %d (%s)
#
[unsuported progress thread]
WARNING: Support for the TCP progress thread has not been compiled in.
Falling back to normal progress.
Local host: %s
Value: %s
Message: %s
#
[peer hung up]
An MPI communication peer process has unexpectedly disconnected. This
usually indicates a failure in the peer process (e.g., a crash or
otherwise exiting without calling MPI_FINALIZE first).
Although this local MPI process will likely now behave unpredictably
(it may even hang or crash), the root cause of this problem is the
failure of the peer -- that is what you need to investigate. For
example, there may be a core file that you can examine. More
generally: such peer hangups are frequently caused by application bugs
or other external events.
Local host: %s
Local PID: %d
Peer host: %s
#
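The [accept failed] message above names file-descriptor exhaustion as a common cause. As a rough diagnostic sketch, which is not part of Open MPI, the per-process descriptor limits on a Unix-like host can be inspected with Python's standard `resource` module. The helper names `fd_limits` and `can_raise_soft_limit` are hypothetical and chosen only for this illustration.

```python
import resource


def fd_limits():
    """Return the (soft, hard) limits on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def can_raise_soft_limit(soft, hard):
    """Return True if the soft limit sits below the hard limit.

    Hypothetical helper for illustration: an unprivileged process may raise
    its soft limit up to, but not beyond, the hard limit.
    """
    if soft == resource.RLIM_INFINITY:
        return False  # already unlimited, nothing to raise
    return hard == resource.RLIM_INFINITY or soft < hard


if __name__ == "__main__":
    soft, hard = fd_limits()
    print(f"soft={soft} hard={hard} raisable={can_raise_soft_limit(soft, hard)}")
```

If the soft limit turns out to be low and raisable, increasing it in the launching shell before starting the job is one possible remedy. Whether that addresses a given failure depends on which of the listed causes actually applies.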