COMS 4771 Spring 2015 Features

Author

  • Daniel Hsu
Abstract

In many applications, no linear classifier over the “raw” set of features will perfectly separate the data. One recourse is to find additional features that are predictive of the label. This is called feature engineering, and it is often a substantial part of the job of a machine learning practitioner. In some applications, it is possible to “throw in the kitchen sink”, i.e., to include every feature that might be relevant. For instance, in document classification, one can include a feature for each word in the vocabulary that indicates whether the word is present in the given document (or counts its number of occurrences). One can also include a feature for each pair of consecutive words (“bi-grams”), each triple of consecutive words (“tri-grams”), and so on. In general, it is common to automatically generate features from existing features x ∈ R^d, such as the quadratic interaction features x ↦ (x_1x_2, x_1x_3, …, x_1x_d, x_2x_3, …, x_{d−1}x_d), a vector with (d choose 2) = d(d−1)/2 coordinates, as well as higher-order interaction features. The main drawback of these “kitchen sink” feature expansions is that it may be computationally expensive to work explicitly in the expanded feature space. Fortunately, there are some ways around this.
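As a concrete illustration of the quadratic interaction expansion mentioned above, here is a minimal Python sketch (added here for illustration, not part of the original notes) that maps x ∈ R^d to its d(d−1)/2 pairwise-product features:

```python
import numpy as np
from itertools import combinations

def quadratic_interactions(x):
    """Map x in R^d to the vector of all pairwise products x_i * x_j with i < j."""
    x = np.asarray(x, dtype=float)
    return np.array([x[i] * x[j] for i, j in combinations(range(len(x)), 2)])

x = np.array([1.0, 2.0, 3.0, 4.0])   # d = 4
phi = quadratic_interactions(x)      # length d(d-1)/2 = 6
print(phi)                           # [ 2.  3.  4.  6.  8. 12.]
```

Even this toy example shows the cost of working explicitly in the expanded space: the number of features grows quadratically in d.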


Similar resources

COMS 4771 Spring 2015 Expectation-Maximization

Example 1 (Mixture of K Poisson distributions). The sample spaces are X = Z_+ := {0, 1, 2, …} and Y = [K] := {1, 2, …, K}. The parameter space is Θ = ∆^{K−1} × R_{++}^K, where ∆^{K−1} := {π = (π_1, π_2, …, π_K) ∈ R_+^K : ∑_{j=1}^K π_j = 1}, R_+ := {t ∈ R : t ≥ 0}, and R_{++} := {t ∈ R : t > 0}. Each distribution P_θ in P = {P_θ : θ ∈ Θ} is as follows. If (X, Y) ∼ P_θ for θ = (π, λ_1, λ_2, …, λ_K), then Y ...
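The abstract is truncated here, but under the standard generative reading of such a mixture (Y drawn from π, then X drawn from a Poisson with rate λ_Y; this reading is an assumption, not something stated above), sampling can be sketched in a few lines of Python:

```python
import numpy as np

def sample_poisson_mixture(pi, lam, n, seed=None):
    """Draw n pairs (x, y): y ~ Categorical(pi), then x | y = j ~ Poisson(lam[j]).

    Components are indexed 0, ..., K-1 here; the text above uses 1, ..., K.
    """
    rng = np.random.default_rng(seed)
    y = rng.choice(len(pi), size=n, p=pi)   # mixture component labels
    x = rng.poisson(np.asarray(lam)[y])     # counts from the chosen component
    return x, y

# Example with K = 2: mixing weights (0.3, 0.7) and Poisson rates (2.0, 10.0).
x, y = sample_poisson_mixture([0.3, 0.7], [2.0, 10.0], n=5, seed=0)
print(x, y)
```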


COMS 4771: Homework 2 Solution

f(x) = 1 if w·x + b > 0, and −1 otherwise. Consider the d+1 points x^(0) = (0, 0, …, 0), x^(1) = (1, 0, …, 0), x^(2) = (0, 1, …, 0), …, x^(d) = (0, 0, …, 1). Suppose these d+1 points are labeled arbitrarily by y = (y_0, y_1, …, y_d)^T ∈ {−1, 1}^{d+1}. Let b = 0.5·y_0 and w = (w_1, w_2, …, w_d) with w_i = y_i for i ∈ {1, 2, …, d}. Then f labels all d+1 points correctly, so the VC dimension of the perceptron is a...
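The construction can be checked numerically. The following short Python sketch (added here as an illustration, not part of the original solution) enumerates every labeling of the d+1 points and verifies that the stated choice of w and b realizes it:

```python
import numpy as np
from itertools import product

d = 3
# The d+1 points from the solution: the origin and the d standard basis vectors.
points = np.vstack([np.zeros(d), np.eye(d)])

# For every labeling y in {-1, +1}^(d+1), take b = 0.5 * y_0 and w_i = y_i,
# and check that sign(w . x + b) reproduces every label.
for y in product([-1, 1], repeat=d + 1):
    y = np.array(y)
    w, b = y[1:].astype(float), 0.5 * y[0]
    preds = np.where(points @ w + b > 0, 1, -1)
    assert np.array_equal(preds, y)

print("all", 2 ** (d + 1), "labelings of these d+1 points are realized")
```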


I-COMS: Interprotein-COrrelated Mutations Server

Interprotein contact prediction using multiple sequence alignments (MSAs) is a useful approach to help detect protein-protein interfaces. Different computational methods have been developed in recent years as an approximation to solve this problem. However, as there are discrepancies in the results provided by them, there is still no consensus on which is the best performing methodology. To add...


COMS 6998: Advanced Complexity, Spring 2017, Lecture 2: January 26, 2017

1. The Andreev (1987) bound: an explicit function f with formula size Ω̃(n^{5/2}).
2. The relationship between formula size and circuit depth, and how to accomplish “depth reduction” for a general circuit.
3. Monotone formulas for monotone functions (example: a monotone formula for the majority function MAJ).


New economy: from crisis of dot-coms to virtual business

This article discusses the objective regularities and mechanisms of formation of the network economy under the conditions of the crisis of the «new economy». The network economy is considered as the basis of the next business cycle. The author analyzes features of the institutional transformation of economic relations under the influence of the development of cloud technologies and virtual organizations.




Publication date: 2015