Analytic data, such as the temperature and power usage of each piece of data center equipment, are becoming increasingly crucial in the High-Performance Computing industry, especially for petascale and future exascale systems consisting of tens or even hundreds of thousands of servers. These data are essential for accurate, real-time system monitoring. By analyzing them, researchers and data center administrators can find anomalies in their data centers and anticipate failures.
Examining new ideas for data gathering and analysis by running tests against a production data center could be harmful and could have unintended consequences. What is still lacking is a simulated data center against which researchers and data center administrators can test their management and data-gathering toolsets before running those tools in a production environment.
The data center simulator can simulate numerous pieces of emulated Redfish-enabled equipment, such as compute nodes (servers), cooling units, power units, and network switches. We have used container technology, specifically Docker Swarm, to simulate a scalable data center, and Redfish Mockup Servers to emulate each piece of Redfish-enabled equipment. By using container technology, the proposed Data Center Simulator gains efficiency, reusability, simplicity, and, most importantly, scalability.
The Redfish Mockup Servers, in turn, have been adjusted to behave as similarly as possible to real equipment. We have also used different kinds of Mockup Servers, with different datasets and different adjusted response times, to make our infrastructure heterogeneous and consequently to add more fidelity to it.
In a data center, physical parameter data such as temperatures, power usage, and fan speeds can be collected from numerous data sources. These sources range from manufacturer-embedded sensors on the server motherboard to self-contained sensor devices produced by different companies.
To simulate a data center, we first need to emulate Redfish-enabled data center equipment, including compute nodes (servers), cooling equipment, power equipment, and network equipment. Different tools provided by DMTF have been used to create mockup datasets for this equipment.
Redfish Mockup Creator is a Python 3.4 program that creates a Redfish Mockup directory structure from a live Redfish service. The program executes Redfish GET requests against a Redfish service and saves all the responses in a directory structure as a dataset. This dataset can be considered a snapshot of the target Redfish service. Redfish Mockup Creator has been developed by DMTF and is publicly available from DMTF's official GitHub page at “https://github.com/DMTF/Redfish-Mockup-Creator”.
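The snapshot idea can be illustrated with a minimal sketch. It is not the DMTF tool's code: the names `find_links`, `snapshot`, `FAKE_SERVICE`, and `fake_fetch` are hypothetical, and the sketch assumes that every resource is saved as an `index.json` file in a directory that mirrors its URI and that links are discovered through `@odata.id` properties. An in-memory dictionary stands in for the live Redfish service so the example runs without any hardware.

```python
import json
from pathlib import Path


def find_links(node):
    """Collect every "@odata.id" value found anywhere inside a JSON document."""
    links = set()
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "@odata.id" and isinstance(value, str):
                links.add(value)
            else:
                links |= find_links(value)
    elif isinstance(node, list):
        for item in node:
            links |= find_links(item)
    return links


def snapshot(fetch, out_dir, root="/redfish/v1/"):
    """Crawl a service from `root`, saving each response as <uri>/index.json.

    `fetch` is any callable that maps a URI to a decoded JSON body, so a real
    HTTP GET or a fake can be plugged in. Returns the number of saved resources.
    """
    seen = set()
    queue = [root]
    while queue:
        uri = queue.pop().split("#")[0]      # drop JSON-pointer fragments
        key = uri.rstrip("/")
        if key in seen or not key.startswith("/redfish/v1"):
            continue
        seen.add(key)
        body = fetch(uri)
        target = Path(out_dir) / key.lstrip("/") / "index.json"
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(json.dumps(body, indent=2))
        queue.extend(find_links(body))
    return len(seen)


# Hypothetical stand-in for a live Redfish service (illustrative data only).
FAKE_SERVICE = {
    "/redfish/v1": {"@odata.id": "/redfish/v1/",
                    "Systems": {"@odata.id": "/redfish/v1/Systems"}},
    "/redfish/v1/Systems": {"@odata.id": "/redfish/v1/Systems",
                            "Members": [{"@odata.id": "/redfish/v1/Systems/1"}]},
    "/redfish/v1/Systems/1": {"@odata.id": "/redfish/v1/Systems/1",
                              "PowerState": "On"},
}


def fake_fetch(uri):
    """Answer a GET from the in-memory fake instead of the network."""
    return FAKE_SERVICE[uri.rstrip("/")]
```

Calling `snapshot(fake_fetch, "mockup")` would write one `index.json` per resource under `mockup/redfish/v1/`. Replacing `fake_fetch` with a function that performs an authenticated HTTP GET would point the same crawl at a real service.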
Redfish Mockup Server is a Python program that uses mockup datasets to serve Redfish requests. As a server, it can answer REST GET requests based on the data inside a mockup, and it can also handle REST PATCH, POST, and DELETE requests that update the corresponding data inside the mockup. Redfish Mockup Server has also been developed by DMTF and is publicly available from DMTF's official GitHub page at “https://github.com/DMTF/Redfish-Mockup-Server”.
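To make the serving side concrete, the following is a rough sketch of a mockup server built only on the Python standard library. It is not the DMTF implementation. `MockupStore`, `make_handler`, `serve`, and the `delay_s` parameter are hypothetical names, only GET and PATCH are covered, updates are kept in memory, and the mockup is assumed to store each resource as `<uri>/index.json`. The optional delay illustrates one way a response time could be adjusted per emulated device.

```python
import json
import time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from pathlib import Path


class MockupStore:
    """Loads resources from a mockup directory and keeps PATCHes in memory."""

    def __init__(self, mockup_dir):
        self.mockup_dir = Path(mockup_dir)
        self.cache = {}                      # URI -> resource dict

    def get(self, uri):
        key = uri.split("?")[0].rstrip("/")
        if ".." in key:                      # refuse paths that leave the mockup
            return None
        if key not in self.cache:
            path = self.mockup_dir / key.lstrip("/") / "index.json"
            if not path.is_file():
                return None
            self.cache[key] = json.loads(path.read_text())
        return self.cache[key]

    def patch(self, uri, changes):
        resource = self.get(uri)
        if resource is not None:
            resource.update(changes)         # shallow merge, in memory only
        return resource


def make_handler(store, delay_s=0.0):
    """Build a request handler bound to `store`; `delay_s` slows every reply."""

    class Handler(BaseHTTPRequestHandler):
        def _reply(self, resource):
            time.sleep(delay_s)              # emulate a slower device
            payload = b"" if resource is None else json.dumps(resource).encode()
            self.send_response(404 if resource is None else 200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)

        def do_GET(self):
            self._reply(store.get(self.path))

        def do_PATCH(self):
            length = int(self.headers.get("Content-Length", 0))
            changes = json.loads(self.rfile.read(length) or b"{}")
            self._reply(store.patch(self.path, changes))

        def log_message(self, *args):        # keep the console quiet
            pass

    return Handler


def serve(mockup_dir, port=8000, delay_s=0.0):
    """Serve the mockup until interrupted (blocks the calling thread)."""
    handler = make_handler(MockupStore(mockup_dir), delay_s)
    ThreadingHTTPServer(("0.0.0.0", port), handler).serve_forever()
```

Running `serve("mockup", port=8000, delay_s=0.2)` would answer GET requests for any resource found in the `mockup` directory, with each reply delayed by roughly 200 ms.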
Data Center Simulator supports at least three different kinds of Redfish-enabled servers. The Redfish Mockup Creator tool has been used to take snapshots and create mockup datasets for three different kinds of servers in Quanah, our supercomputer at the Texas Tech University High-Performance Computing Center. They are listed below:
We have also used at least 15 other mockup datasets created for different kinds of Redfish-enabled equipment, as follows (these mockups are not based on real Redfish-enabled equipment and were created manually):
To emulate a piece of equipment, we need to package all the aforementioned mockups, the Redfish Mockup Server, and everything the Redfish Mockup Server needs to run (including the runtime environment, libraries, configuration files, and so on) in a Docker image. Docker Swarm is then used to create numerous containers based on this image. This also enables us to scale each kind of equipment out to a specific number, so we are able to simulate a real data center with a specific number of different pieces of Redfish-enabled equipment.
The Docker platform has two distinct components: Docker Engine, which is responsible for creating and running containers, and Docker Hub, a cloud service for distributing container images. Docker has also developed further tooling, including Docker Swarm, its native cluster manager.
Swarm is the native clustering feature of Docker. A Swarm cluster consists of one or more Docker hosts that act as Swarm Managers and one or more Docker hosts that act as Swarm Workers. Docker Engine on all these hosts has to run in Swarm mode. Swarm Managers are responsible for cluster management, orchestration, scheduling, and delegation, whereas Swarm Workers run containers. To deploy an application to a Swarm cluster (in our case the DMTF Redfish Mockup Server tool), a service is submitted to the Swarm Manager, which then dispatches containers to the Swarm Workers. When creating a service, we also need to specify the number of replicas (the number of containers that need to run in the service), the port that the service exposes to the outside world, and the network that the service's tasks connect to.
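The three service parameters named above map onto options of the `docker service create` command. The sketch below only assembles the argument list and does not run Docker. The helper name `service_create_args`, the service name, the image name, and the network name are hypothetical placeholders, not values taken from the simulator.

```python
def service_create_args(name, image, replicas, network,
                        published_port=None, target_port=None):
    """Return the argv for a `docker service create` call (nothing is executed).

    replicas       -> number of containers the service should run
    network        -> network that the service's tasks attach to
    published_port -> optional port exposed to the outside world
    """
    args = ["docker", "service", "create",
            "--name", name,
            "--replicas", str(replicas),
            "--network", network]
    if published_port is not None:
        inner = target_port if target_port is not None else published_port
        args += ["--publish", f"{published_port}:{inner}"]
    args.append(image)                       # the image must come last
    return args


# Hypothetical example: 500 emulated compute nodes on a network named "dcnet".
example = service_create_args("compute-nodes", "redfish-mockup:latest",
                              replicas=500, network="dcnet")
```

On a Swarm Manager, the resulting list could be passed to `subprocess.run(example, check=True)`. Whether a published port is needed depends on the network driver in use, so it is left optional here.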
Docker includes different network drivers to support different types of networking. Among them, we are interested in the Docker MacvLan/IPvLan network drivers, which support multi-host networking in Docker Swarm.
IPvLan is a network driver supported by Docker that uses the Linux kernel IPvLan driver to expose the underlay network or host interfaces directly to the containers running on that host. The Linux implementation of IPvLan is very lightweight because, rather than relying on a bridge for isolation, each IPvLan network is simply associated with a Linux Ethernet interface, which keeps networks separate while providing connectivity to the physical network. This makes it possible to remove the bridge that traditionally resides between the Docker host network interface and the container interfaces, and lets the container interfaces attach directly to the Docker host network interface.
We have put several technologies together to create Data Center Simulator. For the operating system layer, we used CentOS Linux release 7.3.1611 (Core). To support containerization, we used Docker Engine version 17.06.0-ce, the current version of Docker at the time of writing. For native Docker orchestration, we used Docker Swarm mode, which comes out of the box with Docker Engine. We also used the Docker IPvLan network driver to support networking.
How far can we scale out a data center simulation? Although the answer depends on different aspects of the environment that the simulation runs on, the short version is that a data center simulator with hundreds of pieces of emulated equipment needs the single-node version, and a data center simulator with thousands of pieces of emulated equipment needs the multi-node version.
We have simulated a data center with up to 8,500 pieces of Redfish-enabled equipment by using 18 servers, as follows:
Each Swarm Worker with 16 GB of memory is able to run up to 500 pieces of emulated Redfish-enabled equipment. Theoretically, there is no limit on the simulation scale: if enough resources are available, even huge data centers can be simulated.
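The sizing rule above lends itself to a quick back-of-the-envelope calculation. The helper below is a sketch with a hypothetical name (`workers_needed`). It assumes the figure of 500 emulated devices per 16 GB Swarm Worker holds uniformly and ignores any hosts reserved for other roles, such as Swarm Managers.

```python
import math

DEVICES_PER_WORKER = 500   # reported capacity of one 16 GB Swarm Worker


def workers_needed(devices, per_worker=DEVICES_PER_WORKER):
    """Smallest number of workers that can host `devices` emulated devices."""
    return math.ceil(devices / per_worker)


print(workers_needed(8500))   # prints 17
```

Under these assumptions, 8,500 emulated devices call for 17 workers. How that figure relates to the 18 servers used in the experiment (for example, whether the remaining host served as a Swarm Manager) is not derived here.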
Any tool that can use the Redfish REST API can act as a data center simulator client. A wide range of tools can serve as the simulator client, from general tools such as browsers to “redfishtool”, a specialized command-line tool for accessing the Redfish API that is provided by DMTF and is publicly available from DMTF's official GitHub page at “https://github.com/DMTF/Redfishtool”.
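A client can also be a few lines of script. The sketch below collects temperature readings by following the Chassis collection to each chassis and then to its Thermal resource. The function names are hypothetical, and the sketch assumes plain HTTP without authentication as well as a mockup that exposes the `Members`, `Thermal`, `Temperatures`, `Name`, and `ReadingCelsius` properties. The HTTP call is injectable so the traversal logic can be exercised without a running simulator.

```python
import json
import urllib.request


def http_get_json(url, timeout=5.0):
    """GET `url` and decode the JSON body (no authentication, by assumption)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def read_temperatures(base_url, get_json=http_get_json):
    """Return (chassis URI, sensor name, reading in Celsius) for every sensor."""
    readings = []
    collection = get_json(base_url + "/redfish/v1/Chassis")
    for member in collection.get("Members", []):
        chassis_uri = member["@odata.id"]
        chassis = get_json(base_url + chassis_uri)
        thermal_link = chassis.get("Thermal", {}).get("@odata.id")
        if not thermal_link:
            continue                          # this chassis has no thermal data
        thermal = get_json(base_url + thermal_link)
        for sensor in thermal.get("Temperatures", []):
            readings.append((chassis_uri,
                             sensor.get("Name"),
                             sensor.get("ReadingCelsius")))
    return readings
```

For an emulated device reachable at a hypothetical address, `read_temperatures("http://10.101.1.1:8000")` would return one tuple per temperature sensor in its mockup.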
As shown in the following figure, if the machine that runs the clients is on the same network as the data center simulator, the client is able to access all the emulated Redfish-enabled equipment in the simulator. In other words, all that is needed to access the data center simulator is to assign the client host's NIC an IP address in the range of the emulated Redfish-enabled equipment's IP addresses.
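Whether a candidate client address falls inside the equipment's address range can be checked with Python's standard `ipaddress` module. The helper name and the subnet `10.101.0.0/16` below are hypothetical and serve only to illustrate the check.

```python
import ipaddress


def client_in_equipment_range(client_ip, equipment_subnet):
    """True if `client_ip` belongs to the subnet used by the emulated equipment."""
    return ipaddress.ip_address(client_ip) in ipaddress.ip_network(equipment_subnet)


# Hypothetical subnet for the emulated equipment.
SUBNET = "10.101.0.0/16"
print(client_in_equipment_range("10.101.200.5", SUBNET))   # prints True
print(client_in_equipment_range("192.168.1.20", SUBNET))   # prints False
```

The client address must also not collide with an address already assigned to an emulated device, which this check does not cover.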