Electronic Theses and Dissertations
Author
Michael James Higgins
Abstract
Applications of Integer Programming Methods to Solve Statistical Problems
by Michael James Higgins
Doctor of Philosophy in Statistics
University of California, Berkeley
Jasjeet Sekhon, Co-chair
Deborah Nolan, Co-chair
Many problems in statistics are inherently discrete. When one of these problems also contains an optimization component, integer programming may be used to facilitate a solution to the statistical problem. We use integer programming techniques to help solve problems in two areas: optimal blocking of a randomized controlled experiment with several treatment categories, and statistical auditing using stratified random samples.
We develop a new method for blocking in randomized experiments that works for an arbitrary number of treatments. We analyze the following problem: given a threshold for the minimum number of units to be contained in a block, and given a distance measure between any two units in the finite population, block the units so that the maximum distance between any two units within a block is minimized. This blocking criterion can minimize covariate imbalance, which is a common goal in experimental design. Finding an optimal blocking is an NP-hard problem. However, using ideas from graph theory, we provide the first polynomial-time, approximately optimal blocking algorithm for the case of more than two treatment categories. In the case of just two such categories, our approach is more efficient than existing methods. We derive the variances of estimators for sample average treatment effects under the Neyman-Rubin potential outcomes model for arbitrary blocking assignments and an arbitrary number of treatments.
In addition, statistical election audits can be used to collect evidence that the set of winners (the outcome) of an election according to the machine count is correct, that is, that it agrees with the outcome that a full hand count of the audit trail would show. The strength of evidence is measured by the p-value of the hypothesis that the machine outcome is wrong. Smaller p-values are stronger evidence that the outcome is correct. Most states that have election audits of any kind require audit samples stratified by county for contests that cross county lines. Previous work on p-values for stratified samples based on the largest weighted overstatement of the margin used upper bounds that can be quite weak. Sharper p-values than those found by previous work can be found by solving a 0-1
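The blocking criterion stated in the abstract (minimum block size, minimize the largest within-block distance) can be made concrete with a small sketch. The code below is only an illustration of the objective, not the thesis's polynomial-time approximation algorithm: it searches every partition exhaustively, so it is usable only for a handful of units. All function names (`max_within_block_distance`, `partitions`, `brute_force_blocking`) and the toy data are assumptions introduced here.

```python
from itertools import combinations


def max_within_block_distance(blocks, dist):
    """Return the largest distance between any two units that share a block."""
    return max(
        (dist(a, b) for block in blocks for a, b in combinations(block, 2)),
        default=0.0,
    )


def partitions(units):
    """Yield every partition of the list `units` into non-empty blocks."""
    if not units:
        yield []
        return
    first, rest = units[0], units[1:]
    for partial in partitions(rest):
        # Option 1: `first` starts a block of its own.
        yield [[first]] + partial
        # Option 2: `first` joins one of the existing blocks.
        for i in range(len(partial)):
            yield partial[:i] + [[first] + partial[i]] + partial[i + 1:]


def brute_force_blocking(units, dist, min_size):
    """Exhaustively find a blocking with every block of size >= min_size
    that minimizes the maximum within-block distance (exponential time)."""
    best, best_cost = None, float("inf")
    for candidate in partitions(units):
        if all(len(block) >= min_size for block in candidate):
            cost = max_within_block_distance(candidate, dist)
            if cost < best_cost:
                best, best_cost = candidate, cost
    return best, best_cost


# Hypothetical one-dimensional covariate values for six units.
units = [0, 1, 2, 10, 11, 13]
blocking, cost = brute_force_blocking(units, lambda a, b: abs(a - b), min_size=2)
print(blocking, cost)
```

On this toy input the search groups the three small values together and the three large values together, with a maximum within-block distance of 3. The number of partitions grows as the Bell numbers, which is consistent with the abstract's point that an efficient approximation is needed for realistic sample sizes.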
Similar resources
Topical Categorization of Large Collections of Electronic Theses and Dissertations
Electronic Theses and Dissertations (ETDs) form an important part of scholarly work. Many universities in the USA, and other parts of the world, require their students to submit their theses and dissertations in electronic form. The ETDs are hosted by the respective universities, and no single point of access exists to the different ETD collections. Various initiatives like NDLTD have aimed to ...
Overview of a Guide for Electronic Theses and Dissertations
This chapter provides an overview of a guide for electronic theses and dissertations that is being prepared as requested by UNESCO to help with the expansion of ETD activities around the world. It roughly follows the outline developed through discussions involving the many partners working on that guide, coordinated by Shalini Urs. It builds upon experiences related to the evolution of the Netw...
Automating the Preservation of Electronic Theses and Dissertations with Archivematica: Poster - iPres 2013 - Lisbon
This poster describes the tools, services, and workflows that Simon Fraser University is using to automate the movement of its ETDs (Electronic Theses and Dissertations) from its user-facing Thesis Registration System to the Archivematica digital preservation platform. The poster also describes Simon Fraser University’s plans to expand its digital preservation services using Archivematica, incl...