Abstract

Background: Motivated by the size and availability of cell line drug sensitivity data, researchers have been developing machine learning (ML) models for predicting drug response to advance cancer treatment. As studies continue generating data, a common question is whether the generalization performance of existing prediction models can be further improved with more training data.

Methods: We utilize empirical cu...