Time is money. Even a few minutes of downtime can result in significant costs and cause internal business operations to come to a standstill. Downtime can also adversely impact a company’s relationship with its customers, business suppliers and partners. Reliability, or lack thereof, can potentially damage a company’s reputation and result in lost business. That's why always-on has become a global requirement and impacts every aspect of our lives. Here are some examples:

  • Healthcare – Healthcare institutions need to access patient information and be HIPAA compliant all of the time. System downtime can affect access to electronic records, quality of care, patient privacy and regulatory compliance.
  • Public safety – First responders take care of emergencies 24 x 7 x 365. Application downtime can result in loss of community trust, problems with computer-aided dispatch and public safety answering points, and even lost lives.
  • Financial services – Financial services organizations manage thousands of transactions per second. System downtime can result in loss of revenue for the business and its clients, as well as issues with privacy, compliance, customer trust, brand reputation and transaction processing systems.
  • Manufacturing – Manufacturers need to keep their production lines up and running. System downtime can result in loss of revenue as well as quality, cost and compliance issues.
  • Retail – Retailers have sales targets to meet day in and day out. Transaction system downtime can mean lost transactions and lost data (sales data, customer data), as well as lost or abandoned opportunities and a tarnished reputation.
  • Security – Building security companies prevent external threats to organizations. Security application downtime can result in safety/security risks, an uncomfortable building environment and compliance issues.

Mission-critical workloads can be especially impacted by downtime. A 71 percent majority of organizations now require a minimum of 99.99 percent uptime for mission-critical hardware, operating systems and line-of-business applications. An hour of downtime can mean millions in lost revenue. In fact, a recent study found that 98 percent of firms say hourly downtime costs exceed $100K and 81 percent estimate that an hour of downtime costs their companies over $300K.1  A separate study found that an hour of downtime for mission-critical applications can cost $500K for companies with 5-10K employees and $1.5M for companies with greater than 10K employees.2
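To put those figures in perspective, an uptime target can be converted into a yearly downtime budget. The short Python sketch below is only an illustration: the function names are hypothetical, it assumes a 365-day year, and it assumes every hour of downtime costs the same flat amount (here the $300K per hour figure cited above). Under those assumptions, a 99.99 percent target leaves roughly 52.6 minutes of downtime per year.

```python
# Illustrative sketch only: converts an uptime target into a yearly
# downtime budget. Function names are hypothetical, and a flat hourly
# cost is an assumption made for the sake of the arithmetic.

HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def allowed_downtime_hours(uptime_percent: float) -> float:
    """Hours of downtime per year that an uptime target permits."""
    return HOURS_PER_YEAR * (100.0 - uptime_percent) / 100.0


def annual_downtime_cost(uptime_percent: float, hourly_cost: float) -> float:
    """Yearly cost if the whole downtime budget is used at a flat hourly cost."""
    return allowed_downtime_hours(uptime_percent) * hourly_cost


if __name__ == "__main__":
    for target in (99.9, 99.99, 99.999):
        minutes = allowed_downtime_hours(target) * 60
        cost = annual_downtime_cost(target, 300_000)
        print(f"{target}% uptime: {minutes:.1f} min/year, about ${cost:,.0f} at $300K/hour")
```

Each additional "nine" cuts the downtime budget by a factor of ten, which is why the difference between 99.9 and 99.99 percent matters so much for mission-critical workloads.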

Needless to say, server reliability, availability and serviceability (RAS) are of the utmost importance in doing away with unwanted downtime. Let's talk about the characteristics of RAS.

Reliability – Contains built-in features designed to keep systems up and running

  • Has error detection and self-healing capabilities
  • Minimizes outage opportunities
  • Processes correct results all the time

Availability – Keeps running despite problems

  • Reduces frequency and duration of outages
  • Is self-diagnosing: works around faulty components or self-heals
  • Never stops or slows down

Serviceability – Minimizes outages/downtime

  • Avoids repeat failures with accurate diagnostics
  • Executes concurrent repair on higher failure rate items
  • Is easy to repair and upgrade

To ensure business continuity and increase end-user productivity, it is imperative that you maximize uptime of server hardware and server operating systems. How do you do that? Ensure your servers are the best in the industry in RAS. It just so happens, Lenovo servers are the leader. All Lenovo servers have strong RAS capabilities, but the Lenovo System x3850 X6 and System x3950 X6 servers have advanced RAS features not found in other servers. The differentiated X6 self-healing technology proactively identifies potential failures and transparently takes necessary corrective actions. The Lenovo X6 servers provide mainframe-like RAS because they integrate across the hardware and software stack. A recent study found that Lenovo System x servers (and IBM Power Systems) averaged the lowest percentage of server outages compared to HP ProLiant and Integrity servers, Dell PowerEdge, Oracle x86 and SPARC hardware platforms.1

Lenovo X6 servers have five levels of RAS (four of them go beyond the standard RAS available on other Intel processor-based servers).

1) Standard Intel processor-based Server RAS – Strong RAS capabilities available on all Lenovo servers.

  • Strong error prevention
  • Error detection/correction

2) Intel Run Sure Technology – This is enterprise-level RAS only available on the Intel E7-4800/8800 processors used in the x3850 X6 and x3950 X6.

  • MCA consumed error recovery
  • QPI faildown
  • Double device data correction (DDDC) for memory
  • PCIe live error recovery

3) Lenovo Platform RAS Innovation – This includes more platform-level RAS features for higher availability.

  • Automated processor failover
  • Automated firmware backup
  • Automated memory page sorting and page retire
  • Advanced transaction recovery

4) Lenovo Management Innovation – This includes greater solution-level RAS management with X6 software stack integration.

  • VMware virtualization
  • Microsoft virtualization

5) X6 modular design – A design that reduces service time by enabling quick, easy replacement of upgradeable or failed components.

  • Modular elements slide in and out of server chassis like books on a book shelf
  • There is no top cover to remove, and modules are easily accessed from the front or the back

To learn more about unique X6 RAS features, watch the video "RAS Features of X6 Servers."

In addition to the above X6 RAS features, Lenovo XClarity, a new centralized systems management solution, also increases RAS for Lenovo servers.

  • Provides the tools administrators need to deploy platforms more quickly and manage them more easily
  • Even allows servers to "call home" if they detect an issue, so a potential problem may be fixed before it causes an outage
  • Collects and downloads diagnostic data, including logs, service data and inventory to help identify the cause of the issue

Lenovo X6 technologies drive the outstanding system availability and uninterrupted application performance needed to maximize uptime when hosting mission-critical applications. Find out more about the System x3850 X6 and System x3950 X6 servers.


1 ITIC 2015-2016 Global Server Hardware and Server OS Reliability Report

2 IDC Storage Quickpoll 2013