Completing the Results of the 2013 Boston Marathon
Abstract
The 2013 Boston Marathon was disrupted by two bombs placed near the finish line. The bombs resulted in three deaths and several hundred injuries. Of lesser concern, in the immediate aftermath, was the fact that nearly 6,000 runners failed to finish the race. We were approached by the marathon's organizers, the Boston Athletic Association (BAA), and asked to recommend a procedure for projecting finish times for the runners who could not complete the race. With assistance from the BAA, we created a dataset consisting of all the runners in the 2013 race who reached the halfway point but failed to finish, as well as all runners from the 2010 and 2011 Boston Marathons. The data consist of split times for each of the 5 km sections of the course, as well as the final 2.2 km (from 40 km to the finish). The statistical objective is to predict the missing split times for the runners who failed to finish in 2013. We set this problem in the context of the matrix completion problem, examples of which include imputing missing data in DNA microarray experiments and the Netflix Prize problem. We propose five prediction methods and create a validation dataset to measure their performance by mean squared error and other measures. The best method used local regression based on a K-nearest-neighbors algorithm (KNN method), though several other methods produced results of similar quality. We show how the results were used to create projected times for the 2013 runners and discuss the potential for future application of the same methodology. We present the whole project as an example of reproducible research, in that we are able to make the full data and all the algorithms we have used publicly available, which may facilitate future research extending the methods or proposing completely different approaches.
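To make the matrix-completion framing concrete, the following is a minimal sketch of a nearest-neighbor approach to filling in missing splits. It is not the authors' KNN method: it uses a plain average over neighbors rather than the local regression the abstract describes, and the function name `knn_complete_splits`, the choice of Euclidean distance, and the toy reference data are all assumptions made for illustration.

```python
import numpy as np


def knn_complete_splits(reference, observed, k=3):
    """Fill in a runner's missing splits from the k most similar finishers.

    reference: (n_runners, n_splits) array of complete split times, e.g.
               runners from earlier races who finished (hypothetical data).
    observed:  1-D sequence with the first m split times of a runner who
               did not finish (0 < m < n_splits).
    Returns a 1-D array of length n_splits: the observed splits followed
    by the predicted ones.
    """
    reference = np.asarray(reference, dtype=float)
    observed = np.asarray(observed, dtype=float)
    m = observed.shape[0]
    if not 0 < m < reference.shape[1]:
        raise ValueError("observed must cover some, but not all, splits")
    k = min(k, reference.shape[0])

    # Distance is measured only on the splits both runners have recorded.
    dist = np.linalg.norm(reference[:, :m] - observed, axis=1)
    nearest = np.argsort(dist)[:k]

    # Assumption: a plain average of the neighbors' remaining splits,
    # which is simpler than fitting a local regression on the neighbors.
    predicted = reference[nearest, m:].mean(axis=0)
    return np.concatenate([observed, predicted])


# Toy usage with made-up split times (minutes) for three finishers.
toy_reference = np.array([
    [10.0, 10.0, 10.0],
    [20.0, 20.0, 20.0],
    [30.0, 30.0, 30.0],
])
print(knn_complete_splits(toy_reference, [11.0, 11.0], k=2))
```

In a real application each row would hold the eight 5 km splits and the final 2.2 km split, and the predicted splits would be summed with the observed ones to give a projected finish time.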
Related resources
Situational Awareness from Social Media
This paper describes VIStology’s HADRian system for semantically integrating disparate information sources into a common operational picture (COP) for humanitarian assistance/disaster relief (HADR) operations. Here the system is applied to the task of determining where unexploded or additional bombs were being reported via Twitter in the hours immediately after the Boston Marathon bombing in Ap...
A Case Study on Unconstrained Facial Recognition Using the Boston Marathon Bombings Suspects
The investigation surrounding the Boston Marathon bombings was a missed opportunity for automated facial recognition to assist law enforcement in identifying suspects. We simulate the identification scenario presented by the investigation using two state-of-the-art commercial face recognition systems, and gauge the maturity of face recognition technology in matching low quality face images of u...
Acute coronary thrombosis in Boston marathon runners.
To the Editor: Regular exercise reduces the incidence of coronary atherosclerotic disease and decreases mortality after myocardial infarction,1 but vigorous activity increases the risk of myocardial infarction and sudden death among patients with occult and diagnosed coronary artery disease.2,3 We describe three male athletes in good condition without diagnosed coronary artery disease who prese...
Comparing Social Media and Traditional Surveys around the Boston Marathon Bombing
Sociological surveys have been a key instrument in understanding social phenomena, but do the introduction and popularity of social media threaten to usurp the survey’s place? The significant amount of data one can capture from social media sites like Twitter make such sources appealing. Limited work has tried to triangulate these sources pragmatically for research. This paper documents experie...