Checking Questionable Entry of Personally Identifiable Information Encrypted by One-Way Hash Transformation
نویسندگان
چکیده
BACKGROUND As one of the several effective solutions for personal privacy protection, a global unique identifier (GUID) is linked with hash codes that are generated from combinations of personally identifiable information (PII) by a one-way hash algorithm. On the GUID server, no PII is permitted to be stored, and only GUID and hash codes are allowed. The quality of PII entry is critical to the GUID system. OBJECTIVE The goal of our study was to explore a method of checking questionable entry of PII in this context without using or sending any portion of PII while registering a subject. METHODS According to the principle of GUID system, all possible combination patterns of PII fields were analyzed and used to generate hash codes, which were stored on the GUID server. Based on the matching rules of the GUID system, an error-checking algorithm was developed using set theory to check PII entry errors. We selected 200,000 simulated individuals with randomly-planted errors to evaluate the proposed algorithm. These errors were placed in the required PII fields or optional PII fields. The performance of the proposed algorithm was also tested in the registering system of study subjects. RESULTS There are 127,700 error-planted subjects, of which 114,464 (89.64%) can still be identified as the previous one and remaining 13,236 (10.36%, 13,236/127,700) are discriminated as new subjects. As expected, 100% of nonidentified subjects had errors within the required PII fields. The possibility that a subject is identified is related to the count and the type of incorrect PII field. For all identified subjects, their errors can be found by the proposed algorithm. The scope of questionable PII fields is also associated with the count and the type of the incorrect PII field. The best situation is to precisely find the exact incorrect PII fields, and the worst situation is to shrink the questionable scope only to a set of 13 PII fields. In the application, the proposed algorithm can give a hint of questionable PII entry and perform as an effective tool. CONCLUSIONS The GUID system has high error tolerance and may correctly identify and associate a subject even with few PII field errors. Correct data entry, especially required PII fields, is critical to avoiding false splits. In the context of one-way hash transformation, the questionable input of PII may be identified by applying set theory operators based on the hash codes. The count and the type of incorrect PII fields play an important role in identifying a subject and locating questionable PII fields.
منابع مشابه
Algebraic Lower Bounds for Computing on Encrypted Data
In cryptography, there has been tremendous success in building primitives out of homomorphic semantically-secure encryption schemes, using homomorphic properties in a black-box way. A few notable examples of such primitives include items like private information retrieval schemes and collision-resistant hash functions (e.g. [14, 6, 13]). In this paper, we illustrate a general methodology for de...
متن کاملSufficient Conditions for Collision-Resistant Hashing
We present several new constructions of collision-resistant hash-functions (CRHFs) from general assumptions. We start with a simple construction of CRHF from any homomorphic encryption. Then, we strengthen this result by presenting constructions of CRHF from two other primitives that are implied by homomorphic-encryption: one-round private information retrieval (PIR) protocols and homomorphic o...
متن کاملAspect Oriented UML to ECORE Model Transformation
With the emerging concept of model transformation, information can be extracted from one or more source models to produce the target models. The conversion of these models can be done automatically with specific transformation languages. This conversion requires mapping between both models with the help of dynamic hash tables. Hash tables store reference links between the elements of the source...
متن کاملPrivacy Preserving Search of Multimedia
The advancement of information technology is rapidly integrating the physical world where we live and the online world where we retrieve and share information. One immediate example of such integration is the increasing popularity of storing and managing personal data using third-party web services, as part of the emerging trend of cloud computing. Secure management of sensitive data stored onl...
متن کاملA New Model of Data Protection on Cloud Storage
This paper focuses on studying cloud storage data protection model and implementing encrypted storage of user data in double-key form. User data are encrypted with symmetric encryption algorithm and this secret key is encrypted with asymmetric encryption algorithm. The private key is managed and controlled by users. In this way, users guarantee the security of their own data with the sole priva...
متن کامل