Knowing the beneficial or productive usage time for large high performance computing (HPC) platforms is important for computing metrics that capture the reliability, availability, and serviceability (RAS) of the platform. Currently, application execution time is generally accounted as all productive system time, yet large-scale, long-running applications incur fault tolerance overhead such as c...