Search results for: synthetic minority over sampling technique

Number of results: 1974657

Journal: SAR and QSAR in Environmental Research 2014
S B Gunturi N Ramamurthi

Computational models to predict the developmental toxicity of compounds are built on imbalanced datasets wherein the toxicants outnumber the non-toxicants. Consequently, the results are biased towards the majority class (toxicants). To overcome this problem and to obtain sensitive but also accurate classifiers, we followed an integrated approach wherein (i) Synthetic Minority Over Sampling (SMO...
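The oversampling idea named in this abstract can be sketched as follows. This is a minimal, generic illustration of SMOTE-style interpolation between minority-class neighbors, not the authors' implementation; the function name `smote` and the parameters `minority`, `n_new`, `k`, and `seed` are hypothetical and chosen only for this example.

```python
import random
from math import dist


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority points (illustrative sketch only).

    Each synthetic point lies on the line segment between a randomly
    chosen minority sample and one of its k nearest minority neighbors.
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbors of base, excluding base itself
        neighbors = sorted(
            (p for p in minority if p is not base),
            key=lambda p: dist(base, p),
        )[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbor))
        )
    return synthetic


# Hypothetical toy data: four minority-class points in two dimensions.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority, n_new=6, k=2)
```

Because every synthetic point is a convex combination of two existing minority samples, the new points stay inside the region already covered by the minority class rather than duplicating samples exactly.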

2018
Ahmet Okutan Shanchieh Jay Yang Katie McConky

If cyber incidents are predicted a reasonable amount of time before they occur, defensive actions to prevent their destructive effects could be planned. Unfortunately, most of the time we do not have enough observables of the malicious activities before they are already under way. Therefore, this work suggests using unconventional signals extracted from various data sources with different time...

2014
Ali Zughrat

Support Vector Machines (SVMs) are a popular machine learning technique that has proven very effective in solving many classical problems with balanced data sets in various application areas. However, this technique is also said to perform poorly when applied to the problem of learning from heavily imbalanced data sets where the majority classes significantly outnumber the minority...

2014
Panagiotis Moutafis Ioannis A. Kakadiaris

In this paper, we propose a method to improve nearest neighbor classification accuracy under a semi-supervised setting. We call our approach GS4 (i.e., Generating Synthetic Samples Semi-Supervised). Existing self-training approaches classify unlabeled samples by exploiting local information. These samples are then incorporated into the training set of labeled data. However, errors are propagate...

2013
Kung-Jeng Wang Bunjira Makond Kung-Min Wang

BACKGROUND: Breast cancer is one of the most critical cancers and is a major cause of cancer death among women. It is essential to know the survivability of the patients in order to ease the decision-making process regarding medical treatment and financial preparation. Recently, the breast cancer data sets have been imbalanced (i.e., the number of surviving patients outnumbers the number of non-s...

Journal: Int. Arab J. Inf. Technol. 2014
Hossein Abbasimehr Mostafa Setak Mohammad Jafar Tarokh

Customer churn is a major concern of most firms in all industries. The aim of customer churn prediction is to detect customers with a high tendency to leave a company. Although many modeling techniques have been used in the field of churn prediction, the performance of ensemble methods has not yet been thoroughly investigated. Therefore, in this paper, we perform a comparative assessment of the perfo...

Journal: Healthcare Analytics 2022

This study aims to train and validate machine learning and deep learning models to identify patients with risky alcohol and drug misuse in a Screening, Brief Intervention, and Referral to Treatment (SBIRT) program. An observational cohort of 6978 adults was admitted in the western region of Alabama at three medical facilities between January and December 2019. Data were cleaned and pre-processed using data imputation techniques an aug...

The reliability of software depends on its fault-prone modules: the fewer fault-prone units software contains, the more we may trust it. Therefore, if we are able to predict the number of fault-prone modules of software, it will be possible to judge the software's reliability. In predicting software fault-prone modules, one of the contributing features is software metric by which one ...

Journal: Statistical Analysis and Data Mining 2014
Cameron A. MacKenzie Theodore B. Trafalis Kash Barker

Recent advances in data mining have integrated kernel functions with Bayesian probabilistic analysis of Gaussian distributions. These machine learning approaches can incorporate prior information with new data to calculate probabilistic rather than deterministic values for unknown parameters. This paper extensively analyzes a specific Bayesian kernel model that uses a kernel function to calcula...

Journal: International Journal on Semantic Web and Information Systems 2022

Student retention is a widely recognized challenge in the educational community, and addressing it can assist institutes in the formation of appropriate and effective pedagogical interventions. This study intends to predict students at risk of low performance during an ongoing course and those graduating later than the tentative timeline, as well as predicting the capacity of the campus. The data constitutes demographics, learning, academic related attrib...
