Discovering high utility-occupancy patterns from uncertain data

Authors

Abstract

It is widely known that there is a lot of useful information hidden in big data, leading to a new saying: "data is money." Thus, it is prevalent for individuals to mine crucial information for utilization in many real-world applications. In the past, studies have considered frequency. Unfortunately, doing so neglects other aspects, such as utility, interest, or risk. Thus, it is sensible to discover high-utility itemsets (HUIs) in transaction databases while utilizing not only the quantity but also the predefined utility. To find patterns that can represent the supporting transaction, a recent study was conducted to mine high utility-occupancy patterns whose contribution to the utility of the entire transaction is greater than a certain value. Moreover, in realistic applications, patterns may not exist in transactions with certainty but may be connected to an existence probability. In this paper, a novel algorithm, called High-Utility-Occupancy Pattern Mining in Uncertain databases (UHUOPM), is proposed. The patterns found by the algorithm are called Potential High Utility Occupancy Patterns (PHUOPs). This algorithm divides user preferences into three factors, including support, probability, and utility occupancy. To reduce memory cost and time consumption and to prune the search space of the algorithm mentioned above, the probability-utility-occupancy list (PUO-list) and the probability-frequency-utility table (PFU-table) are used, which assist in providing the downward closure property. Furthermore, an original tree structure, called the support count tree (SC-tree), is constructed by the algorithm. Finally, substantial experiments were conducted to evaluate the performance of the proposed UHUOPM algorithm on both real-life and synthetic datasets, particularly in terms of effectiveness and efficiency.
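The three factors named in the abstract can be illustrated with a small brute-force sketch. It is not the UHUOPM algorithm: it uses no PUO-list, PFU-table, or SC-tree, and it simply enumerates every candidate pattern. The definitions are assumptions made for illustration: support is the number of transactions containing the pattern, probability is the sum of the existence probabilities of those transactions, and utility occupancy is the average share of each supporting transaction's utility contributed by the pattern. The toy database, the unit utilities, the absolute thresholds, and the function names are all hypothetical.

```python
from itertools import combinations

# Hypothetical toy database: (existence probability, {item: quantity}).
TRANSACTIONS = [
    (0.9, {"a": 2, "b": 1}),
    (0.6, {"a": 1, "c": 3}),
    (0.8, {"a": 1, "b": 2, "c": 1}),
    (0.5, {"c": 2}),
]
# Assumed predefined utility (e.g. unit profit) of each item.
UNIT_UTILITY = {"a": 5, "b": 2, "c": 1}


def transaction_utility(items):
    """Total utility of one transaction: sum of quantity * unit utility."""
    return sum(qty * UNIT_UTILITY[item] for item, qty in items.items())


def measures(pattern, transactions=TRANSACTIONS):
    """Return (support, probability, utility occupancy) under the assumed definitions."""
    support = 0
    probability = 0.0
    occupancy_sum = 0.0
    for prob, items in transactions:
        if not all(item in items for item in pattern):
            continue  # this transaction does not support the pattern
        support += 1
        probability += prob
        pattern_utility = sum(items[i] * UNIT_UTILITY[i] for i in pattern)
        occupancy_sum += pattern_utility / transaction_utility(items)
    occupancy = occupancy_sum / support if support else 0.0
    return support, probability, occupancy


def mine(min_support, min_probability, min_occupancy, transactions=TRANSACTIONS):
    """Enumerate all patterns and keep those meeting all three thresholds."""
    all_items = sorted({item for _, items in transactions for item in items})
    found = {}
    for size in range(1, len(all_items) + 1):
        for pattern in combinations(all_items, size):
            sup, pro, uo = measures(pattern, transactions)
            if sup >= min_support and pro >= min_probability and uo >= min_occupancy:
                found[pattern] = (sup, pro, uo)
    return found


if __name__ == "__main__":
    for pattern, (sup, pro, uo) in mine(2, 1.5, 0.6).items():
        print(pattern, sup, round(pro, 2), round(uo, 4))
```

In this toy data the pattern ("a", "b") has a higher utility occupancy than ("a",) alone, which shows that raw occupancy does not shrink as a pattern grows. That is consistent with the abstract's point that additional structures are needed to obtain a downward closure property for pruning.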

Similar resources

Exploiting High Utility Occupancy Patterns

Most studies have considered frequency as the sole interestingness measure for identifying high-quality patterns. However, each object is different in nature, in terms of criteria such as utility, risk, or interest. Another limitation of frequent patterns is that they generally have a low occupancy and may not be truly representative. Thus, this paper extends the occupancy measure...

Mining of high utility-probability sequential patterns from uncertain databases

High-utility sequential pattern mining (HUSPM) has become an important issue in the field of data mining. Several HUSPM algorithms have been designed to mine high-utility sequential patterns (HUPSPs). They have been applied in several real-life situations such as for consumer behavior analysis and event detection in sensor networks. Nonetheless, most studies on HUSPM have focused on mining HUPS...

Efficient Mining of Uncertain Data for High-Utility Itemsets

High-utility itemset mining (HUIM) is emerging as an important research topic in data mining. Most algorithms for HUIM can only handle precise data; however, uncertainty is embedded in big data collected from experimental measurements or noisy sensors in real-life applications. In this paper, an efficient algorithm, namely Mining Uncertain data for High-Utility Itemsets (MUHUI), is ...

Discovering Novelty in Gene Data: From Sequential Patterns to Visualization

Data mining techniques allow users to discover novelty in huge amounts of data. Frequent pattern methods have proved to be efficient, but the extracted patterns are often too numerous and thus difficult to analyse by end-users. In this paper, we focus on sequential pattern mining and propose a new visualization system, which aims at helping end-users to analyse extracted knowledge and to highli...

Discovering Neuronal Connectivity from Serial Patterns in Spike Train Data

Repeating patterns of precisely-timed activity across a group of neurons (called frequent episodes) are indicative of networks in the underlying neural tissue. This paper develops statistical methods to determine functional connectivity among neurons based on “non-overlapping” occurrences of episodes. We study the distribution of episode counts and develop a two-phase strategy for identifying f...

Journal

Journal title: Information Sciences

Year: 2021

ISSN: 0020-0255, 1872-6291

DOI: https://doi.org/10.1016/j.ins.2020.10.001