Between the 2080 and the 1080, the performance of the server itself (Device <-> Host transfers) is essentially unchanged, but the Device to Device Bandwidth differs considerably.
○ GeForce RTX 2080 Ti
# ./bandwidthTest --device=all
[CUDA Bandwidth Test] - Starting...
!!!!!Cumulative Bandwidth to be computed from all the devices !!!!!!
Running on...
Device 0: GeForce RTX 2080 Ti
Device 1: GeForce RTX 2080 Ti
Device 2: GeForce RTX 2080 Ti
Device 3: GeForce RTX 2080 Ti
Quick Mode
Host to Device Bandwidth, 4 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(GB/s)
32000000 49.0
Device to Host Bandwidth, 4 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(GB/s)
32000000 52.7
Device to Device Bandwidth, 4 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(GB/s)
32000000 2082.6
Result = PASS
○ GeForce GTX 1080 Ti
# ./bandwidthTest --device=all
[CUDA Bandwidth Test] - Starting...
!!!!!Cumulative Bandwidth to be computed from all the devices !!!!!!
Running on...
Device 0: GeForce GTX 1080 Ti
Device 1: GeForce GTX 1080 Ti
Device 2: GeForce GTX 1080 Ti
Device 3: GeForce GTX 1080 Ti
Quick Mode
Host to Device Bandwidth, 4 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(GB/s)
32000000 48.0
Device to Host Bandwidth, 4 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(GB/s)
32000000 52.7
Device to Device Bandwidth, 4 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(GB/s)
32000000 1406.6
Result = PASS
○ Reference: Tesla C2070
# ./bandwidthTest --device=all
[CUDA Bandwidth Test] - Starting...
!!!!!Cumulative Bandwidth to be computed from all the devices !!!!!!
Running on...
Device 0: Tesla C2070
Device 1: Tesla C2070
Device 2: Tesla C2070
Device 3: Tesla C2070
Quick Mode
Host to Device Bandwidth, 4 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 21778.5
Device to Host Bandwidth, 4 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 21781.8
Device to Device Bandwidth, 4 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 398520.2
Result = PASS
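To make the comparison concrete, the figures above can be normalized to one unit and compared directly. The following is only a minimal sketch: the numbers are copied from the output above, while the names `results_gbps`, `ratio`, and `per_device` are hypothetical helpers introduced for illustration. It assumes decimal units (1 GB/s = 1000 MB/s) for the Tesla C2070 rows, which the output itself does not confirm, and it treats the reported values as cumulative over the 4 devices, as the tool's banner states.

```python
# Figures copied from the bandwidthTest output above (cumulative over 4 devices).
# Assumption: decimal units (1 GB/s = 1000 MB/s) for the Tesla C2070 rows.
NUM_DEVICES = 4

results_gbps = {
    "GeForce RTX 2080 Ti": {"h2d": 49.0, "d2h": 52.7, "d2d": 2082.6},
    "GeForce GTX 1080 Ti": {"h2d": 48.0, "d2h": 52.7, "d2d": 1406.6},
    "Tesla C2070": {
        "h2d": 21778.5 / 1000,
        "d2h": 21781.8 / 1000,
        "d2d": 398520.2 / 1000,
    },
}


def ratio(gpu_a, gpu_b, key):
    """Return how many times faster gpu_a is than gpu_b for one transfer type."""
    return results_gbps[gpu_a][key] / results_gbps[gpu_b][key]


def per_device(gpu, key):
    """Average bandwidth per device, dividing the cumulative figure evenly."""
    return results_gbps[gpu][key] / NUM_DEVICES


# Print the 2080 Ti / 1080 Ti ratio for each transfer type.
for key in ("h2d", "d2h", "d2d"):
    r = ratio("GeForce RTX 2080 Ti", "GeForce GTX 1080 Ti", key)
    print(f"{key}: 2080 Ti / 1080 Ti = {r:.2f}x")

# Print the average Device to Device bandwidth per GPU.
for gpu in results_gbps:
    print(f"{gpu}: {per_device(gpu, 'd2d'):.1f} GB/s per device (d2d)")
```

Under these assumptions, the Host to Device and Device to Host ratios stay close to 1, while the Device to Device ratio is roughly 1.5, which matches the observation that only the on-device memory bandwidth changed between the two generations.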