An assessment of the Nam Pehchan computer program for the identification of names of south Asian ethnic origin.

Authors

  • C Cummins
  • H Winter
  • K K Cheng
  • R Maric
  • P Silcocks
  • C Varghese
Abstract

BACKGROUND An assessment was made of the usefulness and accuracy of a computer program for the identification of the south Asian population through the classification of names on a disease register.

METHODS The computer program, Nam Pehchan, was used to classify names as either south Asian or non south Asian. The results were compared with a reference standard, which combined use of the program with visual inspection. The latter was facilitated by a computer-generated dictionary of common non south Asian names. The data set consisted of 356,555 cases of incident cancer (ICD9: 140-208) registered between 1990 and 1992 by Thames, Trent, West Midlands and Yorkshire cancer registries.

RESULTS Nam Pehchan classified 5506 cases as south Asian. Visual inspection identified 2024 false positives (36.8 per cent of all cases identified as south Asian by Nam Pehchan) and 363 false negatives (9.5 per cent of those identified by the reference standard). Compared with the reference standard, Nam Pehchan had a sensitivity of 90.5 per cent and a positive predictive value of 63.2 per cent.

CONCLUSION The Nam Pehchan program quickly identified a high proportion of the names classified as south Asian by the reference standard, but the high false positive rate means that the program alone is not an adequate single strategy. The time-consuming process of inspection of program negatives for large data sets can be substantially reduced by comparison with dictionaries of common non south Asian names.
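The reported accuracy figures can be approximately reproduced from the three counts in the results. The sketch below is illustrative only and is not part of Nam Pehchan: the function name `classification_metrics` and its parameters are hypothetical, and it assumes that true positives equal the cases flagged by the program minus the false positives, and that the reference standard total equals true positives plus false negatives. Under those assumptions the positive predictive value matches the published 63.2 per cent, while the sensitivity comes out within about a tenth of a percentage point of the published 90.5 per cent, a gap that may reflect rounding or details not given in the abstract.

```python
def classification_metrics(flagged, false_positives, false_negatives):
    """Derive accuracy measures from counts against a reference standard.

    Hypothetical helper: assumes every flagged case is either a true or a
    false positive, and that the reference standard's positives are the
    true positives plus the false negatives.
    """
    true_positives = flagged - false_positives
    reference_positives = true_positives + false_negatives
    return {
        "true_positives": true_positives,
        "reference_positives": reference_positives,
        # Share of reference-standard south Asian cases the program found.
        "sensitivity": true_positives / reference_positives,
        # Share of program-flagged cases confirmed by the reference standard.
        "ppv": true_positives / flagged,
        # Share of program-flagged cases rejected on visual inspection.
        "false_positive_share": false_positives / flagged,
    }


# Counts taken from the RESULTS paragraph of the abstract.
metrics = classification_metrics(
    flagged=5506, false_positives=2024, false_negatives=363
)

for name, value in metrics.items():
    if isinstance(value, float):
        print(f"{name}: {value * 100:.1f} per cent")
    else:
        print(f"{name}: {value}")
```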


Related articles

Use of name recognition software, census data and multiple imputation to predict missing data on ethnicity: application to cancer registry records

BACKGROUND Information on ethnicity is commonly used by health services and researchers to plan services, ensure equality of access, and for epidemiological studies. In common with other important demographic and clinical data it is often incompletely recorded. This paper presents a method for imputing missing data on the ethnicity of cancer patients, developed for a regional cancer registry in...


Validation and utility of a computerized South Asian names and group recognition algorithm in ascertaining South Asian ethnicity in the national renal registry.

BACKGROUND The UK Renal Registry (UKRR) reports on equity and quality of renal replacement therapy (RRT). Ethnic origin is a key variable, but it is only recorded for 76% patients overall in the UKRR and there is wide variation in the degree of its completeness between renal centres. Most South Asians have distinctive names. AIM To test the relative performance of a computerized name recognit...


Development and validation of a computerized South Asian Names and Group Recognition Algorithm (SANGRA) for use in British health-related studies.

BACKGROUND Studies on ethnic variations in health have played an important role in aetiological and health services research. Most routine datasets, however, do not include information on ethnicity. South Asians, one of the largest minority ethnic groups in Britain, have distinctive names that also allow differentiation of the main sub-groups with their important differences in health-related e...


Effect of Globalization on National Sovereignty, Including the Role of the Islamic Republic of Iran and the Ethnic Identity: (Turkmen Tribe in Golestan Province)

Abstract: The main purpose of this research is to explain the role of globalization in the relationship between the sovereignty of the country and the identity of the Turkmens. The research is developmental and applied in its goal and qualitative in nature. The statistical population in the qualitative section consists of social and political scientists. Expert...


Rice Yield Distribution and Risk Assessment in South Asian Countries: A Statistical Investigation

In recent decades, rice yields in South Asian countries have grown tremendously on the one hand and fluctuated noticeably on the other. The objective of this study was to examine the rice yield distributions, estimate yield risks at country level, and compare risks between five countries, namely Afghanistan, Bangladesh, Nepal, Sri Lanka, and Pakistan. Anderson Darling (AD) test was applied to t...



Journal:
  • Journal of public health medicine

Volume 21, Issue 4

Pages: -

Publication date: 1999