In-depth analysis of server fault tolerance technology

Time: 2019-04-09


Servers offer higher availability and reliability than PCs. As informatization deepens and key business platforms are increasingly moved onto IT systems, servers are under heavier pressure than ever. Application demands from industries and departments such as ISP, NCP, finance, telecommunications, securities, energy, and scientific research constantly challenge them.


At its core, the challenge is round-the-clock (24/7) operation. Keeping the server running normally in an emergency, and making sure a failure does not interrupt the business, has become the top priority of server fault tolerance technology.


"Fault tolerance", as the name implies, is a server's ability to tolerate and correct the errors and faults that arise while the system is running. It is the goal pursued for server stability in enterprise applications. The commonly cited figure of 99.999% is an intuitive expression of the demand for highly stable server systems. Fault-tolerant servers, which can tolerate certain errors (failures), typically have functional modules that repair themselves automatically and support redundancy. When an error or failure occurs, these parts can be repaired or switched over in time so that the server keeps running. At present, server fault tolerance technology falls mainly into three categories: server clusters, dual-machine hot standby, and single-machine fault tolerance.
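
To make the 99.999% figure concrete, the short Python sketch below converts an availability target into a yearly downtime budget. It is only an illustration: the function name `allowed_downtime_minutes` is made up for this example, and the calculation assumes a 365-day year.

```python
# Illustrative arithmetic only: the function name is hypothetical,
# and a 365-day year is assumed (leap years are ignored).
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for an availability target."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability_percent must be between 0 and 100")
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return MINUTES_PER_YEAR * unavailable_fraction


if __name__ == "__main__":
    for target in (99.9, 99.99, 99.999):
        minutes = allowed_downtime_minutes(target)
        print(f"{target}% availability allows about {minutes:.2f} minutes of downtime per year")
```

Under these assumptions, a 99.999% target leaves roughly 5.26 minutes of downtime per year, which is why it is treated as a benchmark for highly stable systems.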


Server fault tolerance technology is not a recent invention; it has been around since the 1980s. The term "fault tolerant" originated with Stratus, a well-known company at the time. In the 1980s, the first generation of fault-tolerant technology entered the commercial market, where it was used mainly in finance, telecommunications, securities, aviation, and similar industries.


Server fault tolerance technology then developed further, passing through a second generation based on the Intel i860, a third generation based on HP PA-RISC, and a fourth generation based on the IA architecture. Today the technology focuses more on the single server. Compared with the other approaches, single-machine fault tolerance costs less, offers a higher degree of fault tolerance, and meets the needs of most users. The rest of this article concentrates on single-machine and dual-machine (redundant) fault tolerance.


As mentioned earlier, server fault tolerance technology consists mainly of server clusters, dual-machine hot standby, and single-machine fault tolerance. The three rank from low to high in fault tolerance level: cluster technology is the lowest, and single-machine fault tolerance is the highest.


Dual-machine hot standby is a system-level fault tolerance technology. A typical setup consists of two servers plus a shared external disk array, or two servers each with its own internal RAID array, combined with the corresponding dual-machine hot standby software.


Dual-machine hot standby relies on a "double insurance" mechanism: if either server fails, the other takes over promptly so that the business keeps running. However, because this approach often requires the second server to sit in standby at all times, it wastes some of the hardware investment and leaves computing resources underused.
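
The takeover behaviour described above can be sketched as a small Python simulation. This is a simplified model, not the logic of any particular hot standby product: the `Node` and `HotStandbyPair` classes, the server names, and the threshold of three missed heartbeats are all assumptions made for illustration, and no real networking or shared storage is involved.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """A server in the pair (hypothetical model; health is set by hand here)."""
    name: str
    healthy: bool = True


class HotStandbyPair:
    """Two-node pair: one node serves traffic, the other waits as a hot standby."""

    def __init__(self, primary: Node, standby: Node, max_missed_heartbeats: int = 3):
        self.active = primary
        self.standby = standby
        self.max_missed = max_missed_heartbeats  # assumed threshold, not a standard value
        self.missed = 0

    def heartbeat_tick(self) -> str:
        """Check the active node once and return the name of the node now serving."""
        if self.active.healthy:
            self.missed = 0
        else:
            self.missed += 1
            # Fail over only after repeated misses, and only to a healthy standby.
            if self.missed >= self.max_missed and self.standby.healthy:
                self.active, self.standby = self.standby, self.active
                self.missed = 0
        return self.active.name


if __name__ == "__main__":
    server_a, server_b = Node("server-a"), Node("server-b")
    pair = HotStandbyPair(server_a, server_b)
    server_a.healthy = False  # simulate a failure of the active server
    for tick in range(1, 4):
        print(f"tick {tick}: serving node is {pair.heartbeat_tick()}")
```

In this model the standby does no useful work until the failover happens, which is the resource cost the paragraph above points out.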


In contrast, single-machine fault tolerance is achieved mainly through component redundancy. Its fault tolerance capability is higher than that of server clusters and dual-machine hot standby.


Fault-tolerant servers typically provide redundant copies of CPUs, memory, disks, network cards, and even power supplies, so that the failure of any single component causes neither system downtime nor data loss. Many industry-standard x86 servers can implement this kind of redundant fault-tolerant mechanism more cost-effectively.
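
A rough way to see why component redundancy helps is the standard formula for parallel redundancy: if one working copy is enough and copies fail independently, the availability of n copies is 1 - (1 - a)^n. The Python sketch below applies it. The function name and the 0.99 figure are illustrative assumptions, and the independence assumption is optimistic, because real components can share failure causes.

```python
def parallel_availability(component_availability: float, copies: int) -> float:
    """Availability of `copies` redundant components when one working copy suffices.

    Assumes failures are independent, which real hardware only approximates.
    """
    if not 0.0 <= component_availability <= 1.0:
        raise ValueError("component_availability must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    # The group is down only when every copy is down at the same time.
    return 1.0 - (1.0 - component_availability) ** copies


if __name__ == "__main__":
    single = 0.99  # hypothetical availability of one power supply
    for copies in (1, 2, 3):
        print(f"{copies} copies: availability {parallel_availability(single, copies):.6f}")
```

Under these assumptions, duplicating a component that is available 99% of the time raises the availability of the pair to 99.99%.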


Fault-tolerant servers are designed to minimize the impact of failures through redundant, synchronized hardware components. Current fault-tolerant server designs center on the processor, and many server manufacturers now offer their own fault-tolerant servers.


HP, for example, offers NonStop servers (including NonStop S and Integrity NonStop) that target business-critical fault tolerance. They fall into two categories by processor: NonStop S servers with MIPS processors and Integrity NonStop servers with Intel Itanium processors.


Integrity NonStop introduces a number of new designs, and the product family spans entry-level, midrange, and high-end servers. Last year HP also expanded its Itanium server family with the NS2100 and NS2200 for heterogeneous environments.


Two other well-known fault-tolerant server vendors are NEC, with its Express5800/ft servers, and Stratus, with its ftServer line. Stratus has relatively mature experience in fault-tolerant server technology and has developed server products based on different processors, such as the Motorola M68000, Intel i860, and HP PA-RISC, running its proprietary VOS operating system. The company gradually moved to a common platform based on Linux and Windows in place of the dedicated VOS operating system, to reduce the cost of deploying fault-tolerant servers.


NEC, through its investment in Stratus, has adopted a similar strategy for developing and marketing fault-tolerant servers. NEC launched its first fault-tolerant server based on the IA architecture as early as 2001. Its Express5800/ft series is rated at 99.999% reliability on Windows and Linux platforms, and its real-time protection technology is based on the Stratus Continuous Processing design.


Fault tolerance technology is now gradually spreading from traditional mission-critical industries such as telecommunications, securities, and finance to basic industries such as manufacturing, energy, logistics, and transportation. In addition, fault-tolerant servers will pay more attention to total cost of ownership (TCO), and more users will abandon traditional dual-machine hot standby and complex, maintenance-heavy cluster servers in favor of server platforms with built-in fault tolerance.

