Explainable predictive modeling for limited spectral data
Abstract
Feature selection of high-dimensional labeled data with limited observations is critical for making powerful predictive modeling accessible, scalable, and interpretable for domain experts. Spectroscopy data, which records the interaction between matter and electromagnetic radiation, holds a particularly large amount of information in a single sample. Since acquiring such data is a complex task, it is crucial to exploit the best analytical tools to extract the necessary information. In this paper, we investigate the most commonly used feature selection techniques and introduce the application of recent explainable AI techniques to interpret the prediction outcomes of spectral data. Interpretation of the prediction outcome is beneficial for domain experts, as it ensures the transparency and faithfulness of the ML models to domain knowledge. Due to instrument resolution limitations, pinpointing important regions of the spectroscopy data creates a pathway to optimize the data collection process through miniaturization of the spectrometer device. Reducing the device size and power, and therefore cost, is a requirement for real-world deployment of such a sensor-to-prediction system as a whole. Furthermore, we consider a wide range of machine learning models that have been proven to be successful in predicting the Cetane Number of fuels. We specifically design three different scenarios to ensure that the evaluation of the ML models is robust for real-time practice of the developed methodologies and to uncover the hidden effect of noise sources on the final outcome. The evaluation is performed on both the full model and the reduced model using a real dataset. Finally, we propose a correctness metric to assess the conformance of the selected subset of features to domain expertise. As a result, Support Vector Regression yields better accuracy and generalization and leads to a less computationally complex, more efficient model than a Neural Network. More importantly, from original deploying complex, models.
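The workflow summarized in the abstract (select a small subset of spectral features, then check that subset against expert knowledge) can be illustrated with a minimal sketch. Everything here is an assumption for illustration, not the paper's method: the filter is a plain Pearson-correlation ranking, the data are synthetic, and the names `select_top_k` and `conformance`, as well as the definition of the conformance score (the fraction of selected features that fall inside expert-marked regions), are hypothetical.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def select_top_k(spectra, targets, k):
    """Hypothetical filter: keep the k channels most correlated with the target."""
    n_features = len(spectra[0])
    scored = []
    for j in range(n_features):
        column = [row[j] for row in spectra]
        scored.append((abs(pearson(column, targets)), j))
    scored.sort(reverse=True)
    return sorted(j for _, j in scored[:k])


def conformance(selected, expert_regions):
    """Assumed metric: share of selected channels inside any expert region (inclusive bounds)."""
    if not selected:
        return 0.0
    hits = sum(
        any(lo <= j <= hi for lo, hi in expert_regions) for j in selected
    )
    return hits / len(selected)


# Synthetic stand-in for limited spectral data: 30 samples, 200 channels.
# Only channels 80..89 carry signal about the target (an invented setup).
rng = random.Random(0)
n_samples, n_features = 30, 200
informative = (80, 89)

spectra, targets = [], []
for _ in range(n_samples):
    y = rng.uniform(40.0, 60.0)  # invented target values
    row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]  # noise in every channel
    for j in range(informative[0], informative[1] + 1):
        row[j] += y - 50.0  # add target-dependent signal to informative channels
    spectra.append(row)
    targets.append(y)

selected = select_top_k(spectra, targets, k=10)
score = conformance(selected, [informative])
print("selected channels:", selected)
print("conformance:", score)
```

A score near 1 would indicate that the reduced feature set agrees with the regions an expert considers meaningful; a low score would flag a subset that predicts well for reasons the domain knowledge does not support.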
Journal
Journal title: Chemometrics and Intelligent Laboratory Systems
Year: 2022
ISSN: 1873-3239, 0169-7439
DOI: https://doi.org/10.1016/j.chemolab.2022.104572