Data-Intensive Scalable Computing Laboratory (DISCL)

Resources

The research infrastructure of the Data-Intensive Scalable Computing Laboratory (DISCL) in the Computer Science Department at Texas Tech University includes DISCFarm, a 16-node, 128-core Dell PowerEdge cluster; advanced storage devices; a 5-node GPU cluster testbed (each compute node with dual eight-core Intel Xeon E5-2660 processors, dual 512-core NVidia Tesla M2090 GPUs, advanced SSD storage, and Mellanox QDR InfiniBand interconnection); workstations; and other advanced computing and communication facilities.

Data-Intensive Scalable Computing Instrument (DISCI)

Hostname: disci.hpcc.ttu.edu

The Data-Intensive Scalable Computing Instrument (DISCI) consists of one head node, sixteen compute nodes, sixteen data-intensive nodes, and five data storage nodes. The head node is where users log in to build and launch their jobs; it contains two Xeon E5-2650v2 2.6 GHz 8-core processors and 64 GB of memory. Compute nodes are mainly responsible for executing parallel jobs; each includes two Xeon E5-2450v2 2.5 GHz 8-core processors and 32 GB of memory. Data-intensive nodes are designed for in-situ processing that minimizes data movement; each has two Xeon E5-2450v2 2.5 GHz 8-core processors and 96 GB of memory. Each storage node is equipped with two Xeon E5-2650v2 2.6 GHz 8-core processors and 128 GB of memory.

The DISCI cluster is connected through four-channel quad data rate (QDR) InfiniBand networking that provides 40 Gb/s of bandwidth among the nodes. Each storage node is attached to four data-intensive nodes. A 36-port InfiniBand switch provides full connectivity among the data-intensive nodes and compute nodes.

DISCI has a storage system capable of holding up to 78.4 terabytes of data. The storage system combines hard disk drives (HDDs) with flash-memory-based SSDs to maximize data access performance. The head node contains two 300 GB 15K RPM SAS HDDs. Each compute node has a 500 GB Near-Line SAS HDD for computing tasks. Each data-intensive node is equipped with three 500 GB SAS HDDs and two 200 GB SSDs as online storage for applications. Each storage node uses eight 4 TB Near-Line SAS HDDs as offline storage.

DISCI is used as an experimental platform and is expected to overcome the shortcomings of existing computing-centric HPC instruments for data-intensive scientific computing. DISCI will increase the productivity of data-intensive discovery and innovation in many disciplines, such as computational chemistry, climate sciences, computational biology, high-energy physics, and information retrieval. DISCI will also offer enhanced training in scientific computing and HPC.

Front view: head node and compute nodes

Back view: connections

Data-Intensive Scalable Computing Farm (DISCFarm)

Hostname: discfarm.hpcc.ttu.edu

The Data-Intensive Scalable Computing Farm (DISCFarm) is a Dell PowerEdge cluster composed of one PowerEdge R515 rack server node and 15 PowerEdge R415 nodes, with a total of 32 processors and 128 cores. The nodes are fully connected via a PowerConnect 2848 network switch with 48 1-Gigabit Ethernet ports. The PowerEdge R515 server node has dual quad-core 2.6GHz AMD Opteron 4130 processors, 8GB of memory, and a RAID-5 disk array with 3TB of storage capacity composed of 7200 RPM Near-Line SAS drives. Each PowerEdge R415 node has dual quad-core 2.6GHz AMD Opteron 4130 processors, 4GB of memory, and a 500GB 7200RPM Near-Line SAS hard drive. Two PowerEdge R415 nodes are equipped with CRUCIAL Technology RealSSD C300 64GB SATA 6Gb/s SSDs. All nodes are connected to a KVM and controlled by a Dell 1U KMM console with a touchpad keyboard and a 17-inch LCD.

The DISCFarm cluster is used as an experimental platform for research on parallel and distributed computing, parallel I/O and storage, high-performance computing, cloud computing, computer architectures, and systems software, with a focus on building scalable computing systems for data-intensive applications in high-performance scientific computing and high-end enterprise computing. It is also used as a testbed for students' course projects from Computer Science and other departments at Texas Tech University.

Front view: head node and compute nodes

Back view: connections

Hadoop Cluster (via StackVelocity and CAC@TTU)

Hostname: hadoop.hpcc.ttu.edu

This Hadoop cluster is made possible via StackVelocity and CAC@TTU.

GPU Cluster Testbed

Front view of a node; inside view of a node

Hardware Specifications

This cluster (website) consists of 5 nodes: 1 head node and 4 compute nodes. The detailed hardware description of each node is as follows:

Head Node

CPU: 2 x Intel Xeon E5430 (Details)

RAM: 16GB (8x2GB) @ DDR3-1333MHz

HDD: 7200 RPM Hard Disk Drive (~106GB of NFS drives)

GPU: none

Ethernet: connected

Infiniband: Mellanox QDR (Quad Data Rate - 40Gb/s)

Compute Nodes (per node specs)

CPU: 2 x Intel Xeon E5-2660 (Details)

RAM: 64GB (8x8GB) @ DDR3-1333MHz

SSD: Various combinations of 100GB-200GB Dell Solid State Drives

GPU: 2 x NVidia Tesla M2090 (Details)

Ethernet: connected

Infiniband: Mellanox QDR (Quad Data Rate - 40Gb/s)

Software Installed

Operating System: Rocks+ (version 6.0.1) by StackIQ (Details)

Compilers:

GNU - version 4.4.6

OpenMPI - version 1.4.3

MVAPICH - version 1.2.0

CUDA (nvcc) - version 4.2

Intel (2013 compilers coming soon)
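
As a quick sanity check of the MPI toolchain listed above, a minimal C program such as the sketch below (illustrative only; the file and executable names are arbitrary) can be compiled with the installed OpenMPI's mpicc and launched with mpirun, printing one line per rank with the host it runs on.

/* mpi_hello.c - minimal MPI sanity check (illustrative sketch, not part of the cluster software).
 * Build with OpenMPI 1.4.3:  mpicc -O2 -o mpi_hello mpi_hello.c
 * Run on 4 processes:        mpirun -np 4 ./mpi_hello
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, len;
    char host[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's rank        */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes  */
    MPI_Get_processor_name(host, &len);     /* node the rank is placed on */

    printf("Hello from rank %d of %d on %s\n", rank, size, host);

    MPI_Finalize();
    return 0;
}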

Acknowledgment: We are grateful to Dell Inc. and NVidia for their generous sponsorship and donation of this cluster testbed.

Storage Devices

The DISCFarm cluster testbed has 11TB of total storage capacity. The head node has a RAID-5 disk array with 3TB of storage capacity composed of 7200 RPM Near-Line SAS drives. Each PowerEdge R415 node has a 500GB 7200RPM Near-Line SAS hard drive. Eight nodes are configured with the PVFS2 parallel file system (version 2.8.2 as of October 2011), providing 4TB of shared storage capacity.

1TB NearLine SAS 7200RPM HardDrive

500GB NearLine SAS 7200RPM HardDrive

Two CRUCIAL Technology RealSSD C300 64GB SATA 6Gb/s SSDs are installed on two PowerEdge R415 nodes for research on improving SSD performance for data-intensive applications.

RealSSD C300 64GB SSD

PCIe OCZ RevoDrive X2, 100GB:

PVFS2-IO-25-17

PVFS2-IO-25-16

Access : /mnt/ssd/ssd(1~4)

OCZ VERTEXZ 120GB

PVFS2-IO-25-15

PVFS2-IO-25-14

Access : /mnt/ssd

(Now located on compute-0-2 and compute-0-6)

Crucial M4 SSD 64GB

PVFS2-IO-25-13

PVFS2-IO-25-12

Access : /mnt/ssd

CORSAIR Force SATA 3, 120GB

PVFS2-IO-25-11

Access : /mnt/ssd

(Now located on compute-0-3)

INTEL SSDSC2MH12, 120GB

Located on compute-0-4

INTEL SSDSA2BZ10, 100GB

Located on compute-0-5
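
The SSD mount points listed above are primarily used for I/O experiments on data-intensive workloads. As a rough illustration only (the output path, block size, and test size below are arbitrary choices, not part of the cluster configuration), a sequential-write bandwidth measurement on one of these mounts could look like the following C sketch:

/* ssd_write_bw.c - rough sequential-write bandwidth sketch (illustrative only).
 * Build:  gcc -std=gnu99 -O2 -o ssd_write_bw ssd_write_bw.c -lrt
 * Usage:  ./ssd_write_bw /mnt/ssd/testfile      (any file on an SSD mount)
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <time.h>

#define BLOCK_SIZE (4 * 1024 * 1024)   /* 4 MB per write call              */
#define NUM_BLOCKS 256                 /* 1 GB total (arbitrary test size) */

int main(int argc, char **argv)
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <output file on an SSD mount>\n", argv[0]);
        return 1;
    }

    char *buf = malloc(BLOCK_SIZE);
    memset(buf, 0xA5, BLOCK_SIZE);     /* fill the buffer with a fixed pattern */

    int fd = open(argv[1], O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return 1; }

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < NUM_BLOCKS; i++) {
        if (write(fd, buf, BLOCK_SIZE) != BLOCK_SIZE) { perror("write"); return 1; }
    }
    fsync(fd);                         /* flush to the device before stopping the clock */
    clock_gettime(CLOCK_MONOTONIC, &t1);
    close(fd);

    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    double mb   = (double)BLOCK_SIZE * NUM_BLOCKS / (1024.0 * 1024.0);
    printf("Wrote %.0f MB in %.2f s  ->  %.1f MB/s\n", mb, secs, mb / secs);

    free(buf);
    return 0;
}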

GPU boards

GeForce GTX 480: Based on the Fermi third-generation Streaming Multiprocessor (SM) architecture, the GeForce GTX 480 features 480 CUDA cores with a peak performance of 1344.96 GFLOPs. The clock speed of the CUDA cores is 1401MHz. It has 1536MB (1.5GB) of GDDR5 memory with a 384-bit memory interface and a peak theoretical bandwidth of 177.4GB/s.

Quadro FX 5800: Based on the NVIDIA Unified Architecture technology, the Quadro FX 5800 features 240 CUDA cores with a peak performance of 933.12 GFLOPs. The clock speed of the CUDA cores is 650MHz. It has 4096MB (4GB) of GDDR3 memory with a 512-bit memory interface and a peak theoretical bandwidth of 102GB/s.
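
For reference, the GTX 480 peak figure quoted above follows directly from the listed core count and core clock, assuming the usual 2 single-precision floating-point operations (one fused multiply-add) per CUDA core per cycle; the short C sketch below reproduces the arithmetic. The FX 5800 figure is left as quoted; it presumably reflects that card's shader clock and dual-issue rate rather than the 650MHz clock listed here.

/* peak_gflops.c - reproduce the GTX 480 peak-throughput figure quoted above.
 * Peak GFLOPs = CUDA cores x core clock (GHz) x FLOPs per core per cycle.
 * Build: gcc -O2 -o peak_gflops peak_gflops.c
 */
#include <stdio.h>

int main(void)
{
    int    cores     = 480;    /* GTX 480 CUDA cores                     */
    double clock_ghz = 1.401;  /* CUDA core clock, 1401MHz               */
    double flops     = 2.0;    /* one fused multiply-add = 2 FLOPs/cycle */

    /* Prints 1344.96, matching the figure given in the text above. */
    printf("GTX 480 peak: %.2f GFLOPs\n", cores * clock_ghz * flops);
    return 0;
}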

Workstations

8 Apple iMac 27-inch Desktops

Processor: 2.7GHz Quad-Core Intel Core i5

Memory: 4GB (2x2GB) 1333MHz DDR3 SDRAM

Hard Drive: 1TB Serial ATA drive

Graphics: AMD Radeon HD 6770M with 512MB GDDR5

3 Dell Desktops

CPU: Intel(R) Core(TM)2 Quad Q6700 @ 2.66GHz

Memory: 2GB DDR2 SDRAM

Hard Drive: 160GB

Printers

HP LaserJet 9050dn Printer & HP Color LaserJet 5550dn Printer

IP address: 129.118.164.28
