Performance in HPC improves because AVX widens floating-point operations from 128 bits to 256 bits, which effectively doubles peak floating-point throughput in HPC environments. In addition, Intel's Ethernet controllers and adapters can talk directly to the processor cache, which reduces Ethernet-related latency [10]. Reliability enhancements comprise enhanced Machine Check Architecture (MCA) features, such as MCA Recovery Execution Path, which extends software-assisted error recovery to include uncorrectable data errors; MCA I/O, which provides information on uncorrected I/O errors to the OS; and PCI Express Live Error Recovery (LER), which enables the system to contain and recover from PCI Express bus errors. Another major change is the on-chip direct PCI Express (Gen 3.0) support of the Xeon E7 v2, which improves the connection between each CPU and its peripheral devices compared with its predecessors, with the chips supporting 128 lanes of I/O in addition to the standard Intel Quick Path Interconnect (QPI) …
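To make the AVX claim above concrete, the following minimal sketch counts how many floating-point elements fit in a 128-bit versus a 256-bit vector register. The helper name `lanes_per_register` is hypothetical and used only for illustration. The resulting 2x ratio is a theoretical peak per instruction; real workloads may see less, depending on memory bandwidth and how well the code vectorizes.

```python
def lanes_per_register(register_bits: int, element_bits: int) -> int:
    """Number of elements of a given width that fit in one vector register."""
    return register_bits // element_bits


# Compare 128-bit and 256-bit vectors for the two common float widths.
for name, element_bits in (("single precision", 32), ("double precision", 64)):
    narrow = lanes_per_register(128, element_bits)
    wide = lanes_per_register(256, element_bits)
    ratio = wide // narrow  # elements processed per instruction, wide vs narrow
    print(f"{name}: {narrow} -> {wide} elements per instruction ({ratio}x)")
```

For double precision, the element count per instruction goes from 2 to 4; for single precision, from 4 to 8.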
Other enhancements include new memory controller configurations, with two Scalable Memory Interconnect (SMI) Gen 2 links per home agent/memory controller, for a total of four links per processor socket. Intel's Jordan Creek Memory Extension Buffer connects to each SMI link and offers two DDR3 back channels per SMI, which can be configured to operate in one of two modes: lock-step mode for enhanced reliability or performance mode for higher performance. The chip now supports up to 1.5TB of RAM per socket, which means a 4S configuration tops out at 6TB and an 8S system at 12TB. This is a 3x improvement over the previous generation, enabled by supporting more DIMMs per socket (24 vs 16) at a larger capacity (64GB vs 32GB): 24 × 64GB = 1,536GB, compared with 16 × 32GB = 512GB (see Table I). The CPU comes with three Quick Path Interconnect (QPI) links at speeds of up to 8GT/s. The QPI links are also utilized more efficiently thanks to a home snoop protocol. While the details go beyond the scope of this article, it basically reduces the number of communications when a CPU asks for data that is neither in its own cache nor local RAM and thus provides better scalability at a minor increase in