This paper presents a simulation-based performance prediction framework for large-scale, data-intensive applications on large-scale machines. The framework consists of two components: application emulators and a suite of simulators. Application emulators provide a parameterized model of data access and computation patterns of the applications and enable changing critical application components...