A Model of the Commit Size Distribution of Open Source
Abstract
A fundamental unit of work in programming is the code contribution (“commit”) that a developer makes to the code base of the project being worked on. We use statistical methods to derive a model of the probability distribution of commit sizes in open source projects, and we show that the model is applicable to different project sizes. We use both graphical and statistical methods to validate the goodness of fit of our model. By measuring and modeling a fundamental dimension of programming, we help improve software development tools and our understanding of software development.
Similar resources
Implementation of Hyperbolic Tangent Function to Estimate Size Distribution of Rock Fragmentation by Blasting in Open Pit Mines
Rock fragmentation is one of the desired results of rock blasting, so controlling and predicting it has a direct effect on the operational costs of mining. There are different ways to predict the size distribution of fragmented rocks. Mathematical relations have been widely used in these predictions. From among three proposed mathematical relations, one was selected in this stud...
Developer Belief vs. Reality: The Case of the Commit Size Distribution
The design of software development tools follows from what the developers of such tools believe is true about software development. A key aspect of such beliefs is the size of code contributions (commits) to a software project. In this paper, we show that what tool developers think is true about the size of code contributions is different by more than an order of magnitude from reality. We pres...
Empirical Evidence on Developer's Commit Activity for Open-Source Software Projects
The manner of development is an important factor for the success of open-source software (OSS). Through mining the information of developer’s commits, researchers within the community of software engineering can investigate evolutionary aspects of OSS projects and analyze developer’s behaviors and collaboration. In this paper we conducted statistical analyses on commit activity for four OSS pro...
A Stochastic Model for Prioritized Outpatient Scheduling in a Radiology Center
This paper discusses the scheduling problem of outpatients in a radiology center with an emphasis on priority. For greater compatibility with real-world conditions, we assume that the elapsed times in the different stages are uncertain and follow a specific distribution function. The objective is to minimize outpatients’ total time spent in the radiology center. The problem is formulated as a fle...
Estimating Commit Sizes Efficiently
The quantitative analysis of software projects can provide insights that let us better understand open source and other software development projects. An important variable used in the analysis of software projects is the amount of work being contributed, the commit size. Unfortunately, post-facto, the commit size can only be estimated, not measured. This paper presents several algorithms for e...