Model-Based Mining of Source Code Repositories

نویسندگان

  • Markus Scheidgen
  • Joachim Fischer
چکیده

The Mining Software Repositories (MSR) field analyzes the rich data available in source code repositories (SCR) to uncover interesting and actionable information about software system evolution. Major obstacles in MSR are the heterogeneity of software projects and the amount of data that is processed. Model-driven software engineering (MDSE) can deal with heterogeneity by abstraction as its core strength, but only recent efforts in adopting NoSQL-databases for persisting and processing very large models made MDSE a feasible approach for MSR. This paper is a work in progress report on srcrepo: a model-based MSR system. Srcrepo uses the NoSQL-based EMF-model persistence layer EMF-Fragments and Eclipse’s MoDisco reverse engineering framework to create EMF-models of whole SCRs that comprise all code of all revisions at an abstract syntax tree (AST) level. An OCL-like language is used as an accessible way to finally gather information such as software metrics from these SCR models.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Creating and Analyzing Source Code Repository Models - A Model-based Approach to Mining Software Repositories

With mining software repositories (MSR), we analyze the rich data created during the whole evolution of one or more software projects. One major obstacle in MSR is the heterogeneity and complexity of source code as a data source. With model-based technology in general and reverse engineering in particular, we can use abstraction to overcome this obstacle. But, this raises a new question: can we...

متن کامل

Mining Container Image Repositories for Software Configuration and Beyond

This paper introduces the idea of mining container image repositories for configuration and other deployment information of software systems. Unlike traditional software repositories (e.g., source code repositories and app stores), image repositories encapsulate the entire execution ecosystem for running target software, including its configurations, dependent libraries and components, and OS-l...

متن کامل

Mining API Usage Patterns by Applying Method Categorization to Improve Code Completion

Developers often face difficulties while using APIs. API usage patterns can aid them in using APIs efficiently, which are extracted from source code stored in software repositories. Previous approaches have mined repositories to extract API usage patterns by simply applying data mining techniques to the collection of method invocations of API objects. In these approaches, respective functional ...

متن کامل

Software Repositories: A Source for Traceability Links

This paper analyzes six open source projects in order to assess software repositories, such as those managed by Subversion, as a source for uncovering/discovering traceability links between different types of software artifacts. Our finding suggests that software repositories store a variety of artifacts that are central to open source development and use. Furthermore, a heuristic-based approac...

متن کامل

Mining Unstructured Software Repositories Using Ir Models

MINING SOFTWARE REPOSITORIES, which is the process of analyzing the data related to software development practices, is an emerging field which aims to aid development teams in their day to day tasks. However, data in many software repositories is currently unused because the data is unstructured, and therefore difficult to mine and analyze. Information Retrieval (IR) techniques, which were deve...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014