Unique file identification in the National Software Reference Library

Author

  • Steve Mead
Abstract

The National Software Reference Library (NSRL) provides a repository of known software, file profiles, and file signatures for use by law enforcement and other organizations involved in computer forensic investigations. The NSRL comprises three major elements:

1. A physical library of commercial software packages.
2. A database of information about each file within each software package.
3. A smaller database of the most widely used information, updated and released quarterly. This database is called the NSRL Reference Data Set (RDS) and is NIST Special Database #28 [18].

During a forensic investigation, hundreds of thousands of files may be encountered. The NSRL is used to identify known files, which can reduce the time spent examining a computer: files that match common operating systems and applications do not need to be searched, either manually or electronically, for evidence. The NSRL is also used to determine which software applications are present on a system. This may suggest how the computer was being used and indicate how and where to search for evidence.

This paper examines whether the techniques used to create file signatures in the NSRL produce unique results, a core characteristic that the NSRL depends on for the majority of its uses. The uniqueness of the file identification is analyzed via two methods: an empirical analysis of the file signatures within the NSRL, and research into the recent attacks on the hash algorithms used to generate those signatures. The research addresses the following questions:

1. Are the file signatures in the NSRL unique? The NSRL was examined for distinct files that generated the same signature (i.e., a collision).
2. How likely is it that collisions will occur in the future? The probability of future collisions depends directly on the randomness of the file signatures.

We ran statistical tests to answer the following questions:

• Do file signatures appear to be random?
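The two questions above can be illustrated with a minimal sketch. It is not the paper's method: it assumes SHA-1 from Python's `hashlib` as the signature algorithm, and the function names (`signature`, `find_collisions`, `ones_fraction`) are hypothetical. The collision check groups distinct contents by signature, and the bit-frequency check is only one very simple indicator of whether signatures look random.

```python
import hashlib
from collections import defaultdict


def signature(data: bytes, algorithm: str = "sha1") -> str:
    """Return the hex digest of `data` under the named hashlib algorithm."""
    return hashlib.new(algorithm, data).hexdigest()


def find_collisions(contents, algorithm="sha1"):
    """Group distinct byte strings by signature.

    Returns only the signatures shared by more than one distinct content.
    Identical contents seen twice are duplicates, not collisions.
    """
    groups = defaultdict(set)
    for data in contents:
        groups[signature(data, algorithm)].add(data)
    return {sig: items for sig, items in groups.items() if len(items) > 1}


def ones_fraction(hex_digests):
    """Fraction of 1 bits across all digests.

    A value near 0.5 is what random-looking output would give; this is a
    crude frequency check, not a full statistical test suite.
    """
    one_bits = 0
    total_bits = 0
    for digest in hex_digests:
        one_bits += bin(int(digest, 16)).count("1")
        total_bits += len(digest) * 4  # each hex character encodes 4 bits
    return one_bits / total_bits
```

In use, `find_collisions` would be fed the contents of every file in a collection, and an empty result means no two distinct files shared a signature. Holding full file contents in memory is impractical at the scale of millions of files, so a real analysis would more likely compare a second, independent signature per file.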


Similar resources

1.1 Perl based framework for distributed processing

The National Software Reference Library (NSRL) of the U.S. National Institute of Standards and Technology (NIST) collects software from various sources and publishes file profiles computed from this software (such as MD5 and SHA-1 hashes) as a Reference Data Set (RDS) of information. The RDS can be used in the forensic examination of file systems, for example, to speed the process of identifyin...


Extensions of the UNIX File Command and Magic File for File Type Identification

File format identification is a core requirement for digital archives. The UNIX file command is among the most promising technologies for file type identification. This report describes extensions to the file command and magic file that enhance their utility for file format identification in archival systems. A File Format Library (database) has been created to manage information about file for...


Selection of Hashing Algorithms

INTRODUCTION The National Software Reference Library (NSRL) Reference Data Set (RDS) is built on file signature generation technology that is used primarily in cryptography. The selection of the specific file signature generation routines is based on customer requirements and the necessity to provide a level of confidence in the reference data that will allow it to be used in the U.S. Courts. T...


Testing the National Software Reference Library

The National Software Reference Library (NSRL) is an essential data source for forensic investigators, providing in its Reference Data Set (RDS) a set of hash values of known software. However, the NSRL RDS has not previously been tested against a broad spectrum of real-world data. The current work did this using a corpus of 36 million files on 2337 drives from 21 countries. These experiments a...


Comparison of the Procedure Classification Systems of Selected Countries with Iran

Introduction: Today, delivering health care of the desired quality is impossible without a complete and effective procedure classification system. In such a system, the results of care and treatment are recorded in the patient's file using standard codes. These codes are the basis for analysis of the information by health care personnel, investigators, policy-makers and the health - p...



Journal title:
  • Digital Investigation

Volume 3  Issue 

Pages  -

Publication date 2006