I am using `IntelOneAPI/2025.2` with `IMPI v2021.16`, running IMB-MPI1 v2021.10.
The official documentation says:
> In the case of two processes, Sendrecv is equivalent to the PingPing benchmark of IMB1.x. For two processes, it reports the bidirectional bandwidth of the system, as obtained by the optimized MPI_Sendrecv function.

But when I run the benchmarks on exactly the same two cores, Sendrecv reports higher bandwidth than PingPing at every non-zero message size:
`mpirun -np 2 ./IMB-MPI1 .....`
| #bytes | Mbytes/sec (PingPong) | Mbytes/sec (PingPing) | Mbytes/sec (SendRecv) |
|---|---|---|---|
| 0 | 0 | 0 | 0 |
| 1 | 4.31 | 3.68 | 7.43 |
| 2 | 8.9 | 7.35 | 14.87 |
| 4 | 16.61 | 14.47 | 28.59 |
| 8 | 32.51 | 29.6 | 58.55 |
| 16 | 68.8 | 53.84 | 118.52 |
| 32 | 141.79 | 115.19 | 238.32 |
| 64 | 255.81 | 181.23 | 373.96 |
| 128 | 500.49 | 391.98 | 807.68 |
| 256 | 927.73 | 715.7 | 1512.49 |
| 512 | 1498.66 | 1203.17 | 2466.14 |
| 1024 | 2532.21 | 1998.36 | 3949.3 |
| 2048 | 4160.18 | 3214.35 | 6291.44 |
| 4096 | 5527.35 | 4414.81 | 9261.44 |
| 8192 | 6522.92 | 5468.8 | 11183.32 |
| 16384 | 7174.1 | 6546.23 | 12956.97 |
| 32768 | 7839.75 | 8239.51 | 17546.81 |
| 65536 | 10287.31 | 8336.17 | 18407 |
| 131072 | 11192.94 | 13060.11 | 26332.25 |
| 262144 | 18100.42 | 13692.36 | 27133.69 |
| 524288 | 19710.77 | 13276.27 | 22186.71 |
| 1048576 | 14577.86 | 8797.04 | 13967.69 |
| 2097152 | 13295.74 | 7716.41 | 15339.54 |
| 4194304 | 13594.43 | 7233.45 | 14542.28 |
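To put a number on the gap, here is a small sketch that divides the SendRecv column by the PingPing column for each message size. The values are copied from the table above (the 0-byte row is left out to avoid dividing by zero); the names `ROWS` and `sendrecv_to_pingping_ratios` are mine and not part of IMB.

```python
# (bytes, PingPing Mbytes/sec, SendRecv Mbytes/sec), copied from the table above.
# The 0-byte row is omitted because both bandwidths are 0.
ROWS = [
    (1, 3.68, 7.43),
    (2, 7.35, 14.87),
    (4, 14.47, 28.59),
    (8, 29.6, 58.55),
    (16, 53.84, 118.52),
    (32, 115.19, 238.32),
    (64, 181.23, 373.96),
    (128, 391.98, 807.68),
    (256, 715.7, 1512.49),
    (512, 1203.17, 2466.14),
    (1024, 1998.36, 3949.3),
    (2048, 3214.35, 6291.44),
    (4096, 4414.81, 9261.44),
    (8192, 5468.8, 11183.32),
    (16384, 6546.23, 12956.97),
    (32768, 8239.51, 17546.81),
    (65536, 8336.17, 18407.0),
    (131072, 13060.11, 26332.25),
    (262144, 13692.36, 27133.69),
    (524288, 13276.27, 22186.71),
    (1048576, 8797.04, 13967.69),
    (2097152, 7716.41, 15339.54),
    (4194304, 7233.45, 14542.28),
]


def sendrecv_to_pingping_ratios(rows):
    """Return (bytes, SendRecv / PingPing) pairs for each row."""
    return [(size, sendrecv / pingping) for size, pingping, sendrecv in rows]


if __name__ == "__main__":
    # Print one line per message size: size and bandwidth ratio.
    for size, ratio in sendrecv_to_pingping_ratios(ROWS):
        print(f"{size:>8} {ratio:5.2f}")
```

On these numbers the ratio stays between roughly 1.6 and 2.2 and sits close to 2 for most sizes, so the difference looks like a near-constant factor rather than noise.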