Tuesday 13 November 2012

Buying air miles is a bad idea

For United Airlines, a saver award round-trip ticket from Boston to Shanghai requires 65,000 miles. The same round trip usually costs around $1,000 in cash, so each mile is worth roughly $0.015. When I buy miles, each mile costs $0.035, more than twice what it's worth. So it's not worth it to buy miles.
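Written out, the comparison is just
$$\frac{\$1000}{65{,}000\text{ miles}} \approx \$0.0154 \text{ per mile} < \$0.035 \text{ per mile purchased}$$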

My mistake was comparing against what it costs to earn miles by taking flights. But miles earned that way are essentially free, because I'm taking the flight anyway.

So in conclusion, it is almost always a bad idea to buy air miles.

Thursday 1 November 2012

Law of total probability

If \(\{B_n:n=1, 2, 3, \ldots\}\) is a finite or countably infinite partition of a sample space (i.e., a set of pairwise disjoint events whose union is the entire sample space), then for any event \(A\) of the same probability space:
$$Pr(A) = \sum_n Pr(A\cap B_n)$$
or, alternatively,
$$Pr(A) = \sum_n Pr(A|B_n)Pr(B_n)$$
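As a quick illustration (the numbers here are my own, purely for example): suppose an urn is picked uniformly at random, where urn \(B_1\) holds 2 red and 2 blue balls and urn \(B_2\) holds 1 red and 3 blue balls. Then
$$Pr(\text{red}) = Pr(\text{red}|B_1)Pr(B_1) + Pr(\text{red}|B_2)Pr(B_2) = \tfrac{1}{2}\cdot\tfrac{1}{2} + \tfrac{1}{4}\cdot\tfrac{1}{2} = \tfrac{3}{8}$$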

Decision Tree

Decision trees are classifiers for instances represented as feature vectors. Each internal node tests a feature value, with one branch for each possible value of that feature, and each leaf specifies the category.

The central choice in the algorithm to build a decision tree is selecting which attribute to test at each node in the tree. We would like to select the attribute that is most useful for classifying examples. A good quantitative measure of the worth of an attribute is information gain, which measures how well a given attribute separates the training examples according to their target classification.

Information gain is defined in terms of entropy, which characterizes the (im)purity of an arbitrary collection of examples. Given a collection S containing positive and negative examples of some target concept, the entropy of S relative to this boolean classification is
$$Entropy(S)\equiv -p_\oplus \log_2 p_\oplus-p_\ominus\log_2 p_\ominus$$
where \(p_\oplus\) is the proportion of positive examples in S and \(p_\ominus\) is the proportion of negative examples in S, with the convention that \(0\log_2 0 \equiv 0\).
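For instance, for a hypothetical collection of 14 examples with 9 positive and 5 negative (illustrative numbers, not from any particular data set):
$$Entropy(S) = -\tfrac{9}{14}\log_2\tfrac{9}{14} - \tfrac{5}{14}\log_2\tfrac{5}{14} \approx 0.940$$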

Information gain is simply the expected reduction in entropy caused by partitioning the examples according to an attribute. The gain Gain(S, A) of an attribute A, relative to a collection of examples S, is defined as
$$Gain(S, A) \equiv Entropy(S) - \sum_{v\in Values(A)}\frac{|S_v|}{|S|}Entropy(S_v)$$
where Values(A) is the set of all possible values of attribute A and \(S_v\) is the subset of S for which attribute A has value v.
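
Here is a minimal Python sketch of these two formulas. The function names, and the representation of examples as dicts mapping attribute names to values, are my own choices for illustration, not from any particular library:

import math
from collections import Counter

def entropy(labels):
    # Entropy of a collection of discrete class labels. Classes with zero
    # count never appear in Counter, so the 0*log2(0) = 0 convention is
    # handled implicitly.
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(examples, labels, attribute):
    # examples: list of dicts mapping attribute name -> value
    # labels:   list of class labels, aligned with examples
    # Gain(S, A) = Entropy(S) - sum over v of |S_v|/|S| * Entropy(S_v)
    n = len(examples)
    remainder = 0.0
    for v in set(ex[attribute] for ex in examples):
        s_v = [lab for ex, lab in zip(examples, labels) if ex[attribute] == v]
        remainder += (len(s_v) / n) * entropy(s_v)
    return entropy(labels) - remainder

# Sanity check: 9 positive and 5 negative labels give entropy of about 0.940.
print(entropy(['+'] * 9 + ['-'] * 5))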