Beowulf Cluster - Cost-effective Supercomputing
Beowulf clusters are a cost-effective approach to supercomputing. By
combining commodity off-the-shelf hardware with high-performance open
source software, Beowulf clusters deliver massively parallel computing
power that rivals traditional supercomputers at a fraction of the cost.
At Chaogic Systems, we custom design and manufacture Beowulf clusters
according to specific application needs. Whether the workload is
scientific, engineering, or financial, we carefully analyze the intended
applications and optimize critical hardware and software configurations
to minimize computational bottlenecks and maximize application
performance.
One of our first clusters was a 64-processor cluster of dual-Athlon nodes
built for a University of Houston physics research group headed by
Professor Kevin Bassler. Prof. Bassler's research interests include
complex dynamic systems, such as lattice structures of superconducting
materials and emergent behaviors of simple agent networks. His work often
involves mathematical modeling and simulations that require massive
computing power.
The cluster delivered 51.96 Gflops on the Linpack benchmark with the
problem size N set to 60000. The Linpack benchmark solves a dense linear
system of order N using Gaussian elimination, and it is the industry
standard for benchmarking supercomputers. In comparison, a typical desktop
PC with a 2GHz Intel Pentium 4 has roughly 1 Gflops of computing power.
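As a rough check on the figure above, the rate can be recomputed from the problem size and the run time. The sketch below assumes the conventional operation count for an LU-based dense solve, 2/3 N^3 + 3/2 N^2 floating-point operations. The function name `linpack_gflops` is ours, and the 2771.39 second run time is taken from the benchmark output further down this page.

```python
def linpack_gflops(n: int, seconds: float) -> float:
    """Estimate the Linpack rate in Gflops for a dense solve of order n.

    Assumption: the solve costs 2/3*n^3 + 3/2*n^2 floating-point operations.
    """
    flops = (2.0 / 3.0) * n**3 + 1.5 * n**2
    return flops / seconds / 1e9


if __name__ == "__main__":
    # N and the run time are the values reported for this cluster.
    rate = linpack_gflops(60000, 2771.39)
    print(f"{rate:.2f} Gflops")
```

With N = 60000 and 2771.39 seconds this works out to about 51.96 Gflops, which agrees with the reported 5.196e+01.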
Hardware/Software Configurations
Dual-CPU Computing Nodes (x 32):
AMD Athlon MP 2200+ CPU (x 2)
Corsair 1GB PC2100 Registered ECC DDR RAM
Tyan Tiger MPX S2466N Motherboard
Seagate 40GB 7200RPM IDE Hard Drive (Local Caching)
Built-in 3Com 3C920 100Mbit NIC
Dual-CPU Master Node:
AMD Athlon MP 2200+ CPU (x 2)
Corsair 2GB PC2100 Registered ECC DDR RAM
Tyan Thunder K7X S2468UGN Motherboard
Seagate 36.7GB 10000RPM U160 SCSI Hard Drive (x 2)
Built-in 3Com 3C920 100Mbit NIC (x 2)
3Com 3C996B-T Gigabit Copper Server NIC
3Com 3C996-SX Gigabit Fiber-SX Server NIC
RAID Array Storage:
Raidstor 8 Bay IDE Array w/ 120GB HD, 840GB @ RAID5
Raidstor 12 Bay IDE Array w/ 200GB HD, 2.20TB @ RAID5
Inter-connect:
KTI 24 port 100Mbit Switch w/ 800Mbit Trunking (x 3)
Software:
Operating System: RedHat Linux 7.3 / Kernel 2.4.18
Message Passing Library: MPICH 1.2.3, LAM 6.5.6
Math Library: ATLAS 3.4.1, LAPACK 3.0
Scheduler: OpenPBS 2.3.16
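The benchmark problem size can be related to the memory listed above. The sketch below estimates how much RAM a dense matrix of order N = 60000 needs. It assumes 8-byte double precision storage, an even spread of the matrix over the 32 computing nodes (the master node's memory is ignored), and one process per CPU for the 8 x 8 process grid. The helper name `matrix_gib` is ours.

```python
GIB = 1024 ** 3  # bytes per GiB


def matrix_gib(n: int, bytes_per_element: int = 8) -> float:
    """Size in GiB of a dense n x n matrix (8-byte doubles by default)."""
    return n * n * bytes_per_element / GIB


if __name__ == "__main__":
    n = 60000          # problem size used in the benchmark run
    nodes = 32         # computing nodes, 1GB of RAM each
    processes = 8 * 8  # P x Q process grid, assumed one process per CPU

    total = matrix_gib(n)
    print(f"matrix size: {total:.1f} GiB")
    print(f"per node:    {total / nodes:.2f} GiB")
    print(f"per process: {total / processes * 1024:.0f} MiB")
```

Under these assumptions the matrix takes roughly 26.8 GiB, or about 0.84 GiB per computing node, so N = 60000 is close to the largest problem that 1GB nodes can hold in memory.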
Linpack Benchmark Results
============================================================================
HPLinpack 1.0 -- High-Performance Linpack benchmark -- September 27, 2000
Written by A. Petitet and R. Clint Whaley, Innovative Computing Labs., UTK
============================================================================
An explanation of the input/output parameters follows:
T/V : Wall time / encoded variant.
N : The order of the coefficient matrix A.
NB : The partitioning blocking factor.
P : The number of process rows.
Q : The number of process columns.
Time : Time in seconds to solve the linear system.
Gflops : Rate of execution for solving the linear system.
The following parameter values will be used:
N : 60000
NB : 60
P : 8
Q : 8
PFACT : Right
NBMIN : 4
NDIV : 2
RFACT : Right
BCAST : 1ringM
DEPTH : 1
SWAP : Mix (threshold = 64)
L1 : transposed form
U : transposed form
EQUIL : yes
ALIGN : 8 double precision words
----------------------------------------------------------------------------
- The matrix A is randomly generated for each test.
- The following scaled residual checks will be computed:
1) ||Ax-b||_oo / ( eps * ||A||_1 * N )
2) ||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 )
3) ||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo )
- The relative machine precision (eps) is taken to be 1.110223e-16
- Computational tests pass if scaled residuals are less than 16.0
============================================================================
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
W11R2R4        60000    60     8     8            2771.39          5.196e+01
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 0.0522793 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 0.0127078 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0024242 ...... PASSED
============================================================================
Finished 1 tests with the following results:
1 tests completed and passed residual checks,
0 tests completed and failed residual checks,
0 tests skipped because of illegal input values.
----------------------------------------------------------------------------
End of Tests.
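The three residual checks in the output above can be reproduced on a small problem. The sketch below is not HPL itself: it is a plain, single-process Gaussian elimination with partial pivoting in pure Python, run on a hypothetical 50 x 50 random matrix whose right-hand side is chosen so that the exact solution is all ones. The function names are ours. The eps value and the 16.0 threshold are copied from the output.

```python
import random

EPS = 1.110223e-16  # relative machine precision reported in the output
THRESHOLD = 16.0    # a check passes if its scaled residual is below this


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    # Work on an augmented copy so the inputs stay untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for k in range(n):
        # Swap in the row with the largest entry in column k.
        p = max(range(k, n), key=lambda i: abs(m[i][k]))
        m[k], m[p] = m[p], m[k]
        for i in range(k + 1, n):
            f = m[i][k] / m[k][k]
            for j in range(k, n + 1):
                m[i][j] -= f * m[k][j]
    # Back substitution on the upper triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def scaled_residuals(a, x, b):
    """Return the three scaled residuals in the order HPL prints them."""
    n = len(a)
    r_inf = max(abs(sum(a[i][j] * x[j] for j in range(n)) - b[i])
                for i in range(n))
    a_1 = max(sum(abs(a[i][j]) for i in range(n)) for j in range(n))  # max column sum
    a_inf = max(sum(abs(v) for v in row) for row in a)                # max row sum
    x_1 = sum(abs(v) for v in x)
    x_inf = max(abs(v) for v in x)
    return (
        r_inf / (EPS * a_1 * n),
        r_inf / (EPS * a_1 * x_1),
        r_inf / (EPS * a_inf * x_inf),
    )


def run_demo(n=50, seed=0):
    """Build a random test system with known solution and check it."""
    rng = random.Random(seed)
    a = [[rng.uniform(-0.5, 0.5) for _ in range(n)] for _ in range(n)]
    b = [sum(row) for row in a]  # right-hand side for x = (1, ..., 1)
    x = solve(a, b)
    return x, scaled_residuals(a, x, b)


if __name__ == "__main__":
    _, residuals = run_demo()
    for value in residuals:
        status = "PASSED" if value < THRESHOLD else "FAILED"
        print(f"{value:.7f} ...... {status}")
```

HPL distributes the same factorization over the P x Q process grid and uses a randomly generated right-hand side. The residual formulas are the part this sketch shares with the real benchmark.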