HPC Is a Technology in Flux
Part 1: The role of hardware in the age of the personal supercomputer.
Published July 2, 2008
Computing power available for engineering has grown exponentially in recent years, riding the surge in processing power driven by Moore’s law. But in the past five years, a new development has moved beyond simply producing faster processors and has redefined the analytical limits that once tied the hands of engineers. High-performance computing (HPC) has harnessed the principle behind traditional supercomputers (multiple, usually commercial processors linked in a single system) to push computational power even further. This technology allows engineers to use compute-intensive processes to work with larger data sets and take on complex problems that many desktop systems cannot adequately handle.

What Is HPC?

HPC systems are, at one level, similar to the systems running on your desktop. The basic components are the same in many respects, primarily commercial off-the-shelf hardware. Both types of systems are predominantly based on x86 technology (see Figure 1, page 38). And in both cases, vendors attempt to balance the capabilities of the memory, I/O system, and bus structure. In each case, however, bottlenecks arise, depending on the demands placed on the system.
In terms of differences, the desktop system is optimized for graphics and MCAD and shoehorned into a form factor that will fit either under or on top of the desk. As a consequence, it relies on shared memory. On the other hand, HPC systems are not necessarily set up to run MCAD packages because they don’t do well with interactive graphics and are usually based on distributed-memory architecture.

You can now buy a four-socket workstation (to get eight cores on a desktop), but because of the shared-memory architecture and its memory-bandwidth limitations, you might not be able to use all the cores efficiently. In contrast, HPC systems have better memory bandwidth and thus are able to deliver more efficient use of cores.

HPC systems have certain fundamental characteristics. All are based on a server environment where the processing takes place. Further, according to Steve Conway, research VP for HPC at IDC, a leading provider of market intelligence for information technology, about two-thirds of HPC systems use cluster architecture (see Figure 2, below). Clusters consist of multiple processors linked together to form a single system. This ensures a high density of processors, giving the systems their “supercomputer” power.
The power of the systems is further enhanced through the use of high-end components. “Usually, these systems use the fastest processors, much larger amounts of memory, and much higher I/O capabilities,” says Greg Clifford, HPC automotive segment manager at IBM.

The essential element of HPC is parallel processing, which breaks an application or project into pieces so that separate processors can execute the pieces simultaneously. This approach makes the most of the available processing power and allows HPC systems to take on floating point–intensive applications. And herein lies the key benefit: “You can do more testing, more simulation, in a shorter amount of time. The end result is speed to market,” says Peter Lillian, senior product manager for Dell HPC Solutions.

Another characteristic of HPC systems is their modular structure, where cores, nodes, servers, and clusters can be added or removed with relative ease, making them flexible and scalable. “You start out with these building blocks, and you can either build a house or a cathedral,” says Lynn Lewis, Jr., executive alliance manager for manufacturing at SGI.

HPC systems can also be defined by their relationship to those who use them. “These systems are not under the sole and immediate control of an individual user or even an individual team,” says Michael Schulman, HPC product line manager at Sun Microsystems.

As stated earlier, HPC is a technology in flux. For example, in 2000, nearly 100 percent of the systems in use had a Unix operating system. Last year, Linux became dominant, with 66 percent of the market. According to IDC’s Conway, “Linux has penetrated very fast and very hard (see Figure 3, page 39). We expect Windows to have an important place and to grow in the market. It entered around 2001. But right now it has flattened out at around 4 percent.”

On the other hand, interconnects, which play a critical role in parallel processing, seem to have stabilized. Here the dominant technologies are Gigabit Ethernet, Myricom’s Myrinet, and InfiniBand.
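To make the parallel-processing idea described above concrete, here is a minimal sketch in Python. It is only an illustration on a single machine, not how any vendor's HPC software works: the task (summing squares) and the names `split_range`, `sum_squares`, and `parallel_sum_of_squares` are hypothetical, chosen for this example. The job is broken into pieces, each piece is handed to a separate worker process, and the partial results are combined at the end.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
from typing import List, Tuple


def split_range(n: int, pieces: int) -> List[Tuple[int, int]]:
    """Break range(n) into `pieces` contiguous (start, stop) chunks."""
    step, extra = divmod(n, pieces)
    chunks = []
    start = 0
    for i in range(pieces):
        # The first `extra` chunks take one additional item each.
        stop = start + step + (1 if i < extra else 0)
        chunks.append((start, stop))
        start = stop
    return chunks


def sum_squares(chunk: Tuple[int, int]) -> int:
    """Work done on one piece: the sum of i*i for i in [start, stop)."""
    start, stop = chunk
    return sum(i * i for i in range(start, stop))


def parallel_sum_of_squares(n: int, workers: int = 4,
                            executor_cls=ProcessPoolExecutor) -> int:
    """Split the job, run the pieces simultaneously, combine the results.

    The default runs each piece in its own process. Passing
    ThreadPoolExecutor runs the same pieces on threads instead.
    """
    chunks = split_range(n, workers)
    with executor_cls(max_workers=workers) as pool:
        partial_sums = pool.map(sum_squares, chunks)
        return sum(partial_sums)
```

When run as a script with the default process pool, the call to `parallel_sum_of_squares` should sit under an `if __name__ == "__main__":` guard, which Python's process-based pools require on some platforms. On a real cluster the pieces would travel over the interconnect to other nodes rather than to local processes, but the split, compute, and combine pattern is the same.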
Building the System

With the technology changing so rapidly, it’s important to be aware of the limitations of its operational life. “Two years is about the end of the lives of x86-based systems,” says SGI’s Lewis. “Their usability diminishes. You will have a high-performance compute system that will be two years in the compute room, focused on high-performance technical computing, and then you can take the same system, which has outlived its state-of-the-art usefulness, and do your financials on it or make it a Web server.”
The cost of that flexibility can range from $5,000 to $250,000, and into the millions for large enterprise systems.

Hardware Players & Percentages

“If you go back 10 years, hardware providers serving the CAE market were very diverse,” says Knute Christensen, manager of the HPC marketing group for HP. “Nobody probably had more than 25 percent. There were a lot more players, and there were a lot more choices. … We don’t see a startup emerging that would be a serious competitor in the short term. One of the reasons is that HP and IBM have economies of scale.”

While the vendors with large market share enjoy economies of scale, those at the other end of the spectrum have their own claim to fame. “SGI is one of the two companies that are really focused in this area,” says SGI’s Lewis. “SGI and Cray Research are the folks that are still really dedicated. This is our market, and this is what we do.”

A Question of Balance

Different types of applications tend to stress different components in the system. With some applications, the I/O capability does not need to be high, but with others, I/O performance is essential. “You have to get data to the processor,” says SGI’s Lewis. “Moving data around is the key to success.”

A major bottleneck is memory access. All too often, data sits in memory, and the processors cannot get to it fast enough. If the processors must wait for data, everything slows down. “Memory speed is more important than the CPU speed,” says Sun’s Schulman. “If you are using the different cores and they are all accessing the same memory, you have a contention problem…. You want to let as many cores on a chip run a single application. You don’t want two people running different applications at the same time…. Most of the applications in the CAE world can scale to a minimum of four cores. So we are seeing a good gain as we go up to four cores. Beyond that, we’re not seeing that great a performance increase.”

One way to optimize performance is to use distributed resource management software, which identifies inactive servers and directs jobs to them. This keeps all the servers busy without oversubscribing any one of them.

IBM, Hewlett-Packard, Intel, Microsoft, and AMD have introduced reference architectures to ensure the balance that optimizes performance. In these architectures, hardware and software are integrated and tested together in advance.
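The job-routing idea behind distributed resource management, described above, can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the algorithm of any actual product: the `Server` class, the `dispatch` function, the node names, and the job names are all hypothetical. Each job goes to the server with the most free cores, and a job that fits nowhere waits instead of oversubscribing a server.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Server:
    """A hypothetical compute server tracked by the dispatcher."""
    name: str
    cores: int
    busy_cores: int = 0
    jobs: List[str] = field(default_factory=list)

    @property
    def free_cores(self) -> int:
        return self.cores - self.busy_cores


def dispatch(servers: List[Server],
             jobs: List[Tuple[str, int]]) -> Tuple[Dict[str, str], List[str]]:
    """Place each (job_name, cores_needed) on the least-busy server that fits.

    Returns a mapping of job name to server name, plus the list of jobs
    that had to wait because no server had enough free cores.
    """
    placed: Dict[str, str] = {}
    waiting: List[str] = []
    for job_name, cores_needed in jobs:
        # Only servers with enough idle cores are candidates, so no
        # server is ever oversubscribed.
        candidates = [s for s in servers if s.free_cores >= cores_needed]
        if not candidates:
            waiting.append(job_name)
            continue
        # Pick the server with the most idle capacity (first one wins ties).
        target = max(candidates, key=lambda s: s.free_cores)
        target.busy_cores += cores_needed
        target.jobs.append(job_name)
        placed[job_name] = target.name
    return placed, waiting
```

For instance, with a four-core `node1` and an eight-core `node2`, three four-core jobs fill both servers, and a further two-core job lands in the waiting list. A production scheduler would also handle job completion, priorities, and memory or license requirements, all of which this sketch leaves out.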
Future Trends

Because simply increasing processor clock speeds will eventually fail to meet the demand for more computing power, companies will turn to more and larger HPC implementations, consisting of greater numbers of processing cores. “Most of the increase in performance is going to come through people scaling up the size of their clusters,” says IBM’s Clifford. “Increasing the number of cores in a data center is going to be a continued trend.”

Technology leaders expect this trend to be supported by improvements in software’s ability to take advantage of HPC systems’ processing power. “You are going to see a lot more proliferation of clusters into the manufacturing and commercial environments because the applications will start taking better advantage of it, because the applications will have been written to do parallel computing,” says Dell’s Lillian. According to Clifford, “You are going to see more and more of analysis techniques such as stochastic analysis or design of experiment, which can leverage the compute power more than we have seen in the past.”

Of course, as system size increases and use expands, power and cooling issues grow. Even for a small engineering firm, an HPC system consisting of just one partially filled rack can become a serious issue. Energy costs will be more of a concern, but companies are working to improve the energy efficiency of their systems at the same time.

Finally, while the demand for the capabilities of HPC systems may be growing, the resources to purchase and support them are not always available. One solution to this problem is Capacity on Demand, a service offered by IBM. Companies of all sizes can go to the IBM data center and rent system time by the day, week, or month, whatever is required to perform the task. This way, they don’t have to purchase equipment that might not be used continuously.

More Info: Cray Research, Dell, Hewlett-Packard, IBM, IDC, InfiniBand Trade Association, Intel Corp., Microsoft Corp., Myricom, SGI, Sun Microsystems

Tom Kevan is a New Hampshire-based freelance writer specializing in technology. Send your comments about this article to DE-Editors@deskeng.com.