Abstract: Energy consumption is by far the most important contributor to HPC cluster operational costs, and it accounts for a significant share of the total cost of ownership. Advanced energy-saving techniques in HPC components have received significant research and development effort, but a simple measure that can dramatically reduce energy consumption is often overlooked. We show that, in capacity computing, where many small to medium-sized jobs have to be solved at the lowest cost, a practical energy-saving approach is to scale in the application on large-memory nodes. We evaluate scaling-in, i.e., decreasing the number of application processes and compute nodes (servers) used to solve a fixed-size problem, using a set of HPC applications running in a production system. Using standard-memory nodes, we obtain average energy savings of 36%, already a substantial figure. We show that the main source of these energy savings is a decrease in the node-hours (node_hours = #nodes x exe_time), which is a consequence of the more efficient use of hardware resources. Scaling-in is limited by the per-node memory capacity. We therefore consider using large-memory nodes to enable a greater degree of scaling-in. We show that the additional energy savings, of up to 52%, mean that in many cases the investment in upgrading the hardware would be recovered in a typical system lifetime of less than five years.; Peer Reviewed; Postprint (published version)
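The node-hours relation quoted in the abstract (node_hours = #nodes x exe_time) can be illustrated with a minimal sketch. The function names and the job figures below are hypothetical and are not taken from the paper. The sketch also assumes that energy is roughly proportional to node-hours (constant average power per node), which is a simplification rather than the authors' measurement method.

```python
def node_hours(num_nodes: int, exe_time_hours: float) -> float:
    """Node-hours as defined in the abstract: #nodes x execution time."""
    return num_nodes * exe_time_hours


def relative_saving(baseline: float, scaled_in: float) -> float:
    """Fractional reduction of scaled_in relative to baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - scaled_in) / baseline


# Hypothetical job: fewer nodes run longer, yet use fewer node-hours overall.
baseline_nh = node_hours(num_nodes=16, exe_time_hours=1.0)  # 16 nodes for 1 h
scaled_in_nh = node_hours(num_nodes=4, exe_time_hours=3.0)  # 4 nodes for 3 h

# Assumption: energy scales with node-hours, so this is also the energy saving.
saving = relative_saving(baseline_nh, scaled_in_nh)
print(f"node-hours: {baseline_nh} -> {scaled_in_nh}, saving {saving:.0%}")
```

In this made-up case the run takes three times longer on a quarter of the nodes, so the node-hours fall from 16 to 12. Under the stated proportionality assumption, that reduction is the kind of saving scaling-in targets.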
Relation: http://dl.acm.org/citation.cfm?id=2989083; info:eu-repo/grantAgreement/SEV-2015-0493; info:eu-repo/grantAgreement/MINECO/1PE/TIN2015-65316-P; info:eu-repo/grantAgreement/EC/H2020/671578/EU/European Exascale Processor Memory Node Design/ExaNoDe; info:eu-repo/grantAgreement/SVP-2014-068501; Zivanovic, D., Radulovic, M., Llort, G., Zaragoza, D., Strassburg, J., Carpenter, P., Radojkovic, P., Ayguade, E. Large-memory nodes for energy efficient high-performance computing. A: International Symposium on Memory Systems. "MEMSYS 2016: proceedings of the Second International Symposium on Memory Systems: Alexandria, VA, USA: October 03-06, 2016". Alexandria, VA: Association for Computing Machinery (ACM), 2016, p. 3-9.; http://hdl.handle.net/2117/97864