SGI (US) Altix 330 Benchmarking vs. Dual-Core Opteron

2006-02-07 | SuperComputer

SGI (US) Performance Report:
"Comparing the Performance of the SGI Altix 330
　with similar Dual Core Opteron-based Systems", 02.02.2006.
　Roberto Gomperts, principal scientist, SGI and
　Martin Hilgeman, chemical application engineer, SGI
　http://www.sgi.com/pdfs/3905.pdf

　1.0 Introduction
　"SGI Applications Engineering performed benchmark tests to
　 compare the performance of the SGI Altix 330 using Intel
　 Itanium2 processors against comparable AMD dual core
　 Opteron based systems."

　"Standard benchmark test cases from the following chemistry
　 applications were used:
　・Gaussian 03 rev C.02 (Gaussian)
　・DMol3 from Accelrys® Materials Studio® (Accelrys)
　・CASTEP from Accelrys Materials Studio (Accelrys)
　・VASP (University of Vienna),
　・NAMD (University of Illinois at Urbana-Champaign),
　・Sander (Amber), and
　・PMEMD (Amber) "

　"Results of the tests revealed that the SGI Altix 330 using shared
　 memory and unmatched I/O throughput achieved superior
　 performance and ran faster than twice the number of cores in
　 comparable AMD dual core Opteron-based systems."

　"In addition, a Parallel Throughput System test revealed that the
　 SGI Altix 330 built on SGI NUMAflex architecture experienced
　 only 1% degradation in performance when running Gaussian
　 test397 on a fully loaded system (24.4 minutes on a fully loaded
　 system compared to 24.1 minutes in a stand alone test) as compared
　 to a 33% degradation in performance for the comparable
　 AMD dual core Opteron-based system (71 minutes on a fully
　 loaded system compared to 53 minutes in a stand alone test)."
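　I have only skimmed this report myself, but since it is a benchmark against a competitor,
　it seems wise to read it while paying close attention to how the Opteron systems were configured.
　(There is also information suggesting that the AMD Core Math Library (ACML) was not used on the Opteron side.)
　The first thing I noticed personally was the interconnect information discussed below.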
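
　As a quick sanity check on the throughput figures quoted above, here is a minimal sketch that recomputes the slowdown from the stand-alone and fully loaded timings. The function name and the definition (slowdown relative to the stand-alone run) are my own assumptions, not something the report spells out.

```python
def degradation_pct(standalone_min: float, loaded_min: float) -> float:
    """Slowdown of the loaded run relative to the stand-alone run, in percent.

    Assumed definition; the report does not state its formula.
    """
    return (loaded_min - standalone_min) / standalone_min * 100.0


# Timings in minutes for Gaussian test397, as quoted from the report:
# (stand-alone, fully loaded).
runs = {
    "SGI Altix 330": (24.1, 24.4),
    "Dual-core Opteron": (53.0, 71.0),
}

for system, (standalone, loaded) in runs.items():
    pct = degradation_pct(standalone, loaded)
    print(f"{system}: {pct:.1f}% slower when fully loaded")
```

　With the rounded minutes this gives roughly 1.2% for the Altix and roughly 34% for the Opteron system, close to the 1% and 33% in the report. The small gap on the Opteron side is presumably due to rounding of the published timings.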

SGI (US) Altix 330
　http://www.sgi.com/products/servers/altix/330/

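The SGI Altix 330 implements its interconnect with the proprietary NUMAflex architecture, but I cannot
find any information on the interconnect used on the Opteron side. Cray XD1, PathScale InfiniPath HTX,
PCI-Express-based InfiniBand, Quadrics Elan4, Myrinet, and 10GbE differ considerably
in performance (each has its own character).
　NUMAflex is listed at 6.4GB/second, but for data-heavy (I/O-intensive)
applications, I/O balance needs careful consideration on a NUMA system
(I will write about this issue on another occasion).
　NASA Columbia, ranked 4th on the Top500, is an SGI Altix 3700 (10,240 Itanium2) system
based on SGI NUMAlink, but it also uses the Voltaire Grid Director ISR 9288, a 288 port
Multi-service switch.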
　NASA Project: Columbia
　　http://www.nas.nasa.gov/About/Projects/Columbia/columbia.html
　Voltaire Grid Director ISR 9288
　　http://www.voltaire.com/isr_9288.htm
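I am very interested in how NUMAlink and InfiniBand are divided up. My guess is that data communication
between processors goes over NUMAlink, while storage is reached over InfiniBand links assigned to each
fixed group of processor boxes, going through the Voltaire Grid Director to Fibre Channel storage
(a physical separation of data paths according to the nature and purpose of the data).
　The Voltaire Grid Director ISR 9288 has been adopted for the new system at Tokyo Institute of Technology,
where it is planned to route between the Opteron cluster (InfiniBand) and the storage (Fibre Channel).
Improving total performance requires a system built on a solid foundation.
　Speaking of the Tokyo Tech system, the ClearSpeed floating-point processor is apparently
connected via 133MHz PCI-X, but in the future I would like to see it connected
over HyperTransport via HTX.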
　"Recent Topics" (最近の話題), Ando's Processor Information Page, January 29, 2006
　　http://www.geocities.jp/andosprocinfo/wadai06/20060129.htm
　　"1. AMD's floating-point accelerator"
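
　To put the 133MHz PCI-X attachment in perspective, here is a rough sketch of the theoretical peak of such a bus next to the 6.4GB/second NUMAflex figure mentioned earlier in this post. It assumes a 64-bit PCI-X bus (my assumption about the card) and ignores protocol overhead, so the numbers are upper bounds, not measurements. The helper name is hypothetical.

```python
def peak_bandwidth_mb_s(bus_width_bits: int, clock_mhz: float) -> float:
    """Theoretical peak of a parallel bus in MB/s: bytes per transfer x million transfers per second."""
    return bus_width_bits / 8 * clock_mhz


# Assumption: 64-bit PCI-X clocked at 133 MHz, one transfer per clock.
pcix_mb_s = peak_bandwidth_mb_s(64, 133)

# NUMAflex figure quoted in this post (6.4GB/second), expressed in MB/s.
numaflex_mb_s = 6.4 * 1000

print(f"PCI-X 133 theoretical peak: {pcix_mb_s:.0f} MB/s")
print(f"NUMAflex figure is about {numaflex_mb_s / pcix_mb_s:.1f}x that peak")
```

　Under these assumptions the PCI-X path tops out at about 1 GB/s, which is one reason a faster attachment for the accelerator would be welcome.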

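Even a cluster built on a high-speed interconnect can offer users a "single system image" environment
(I have done this myself). Personally, I regard a system built on a NUMA configuration
as a "single system" (not a single system "image"!).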

For example, see:
　Materials on TruCluster/VAXcluster cluster technology [05/01/20], 2006-01-07
　Oracle Cluster File System on Linux [05/08/13], 2006-01-07

SGI in the US (and the West) clearly does solid research on foundational technologies such as benchmarking and optimization.

徒然なるままに

Mail: topography "AT" mail.goo.ne.jp