Wednesday, January 5, 2011

stanford machine learning course - lecture#11

My takeaways from the 11th lecture of the Stanford machine learning course.

This lecture first describes Bayesian statistics.
Lectures in the beginning of this course talked about parameter ($θ$) fitting by maximizing the likelihood function. We treated $θ$ as a fixed but unknown constant and tried to estimate it by statistical procedures such as ML (maximum likelihood). This approach is called frequentist statistics. In Bayesian statistics, we instead think of $θ$ as a *random variable* whose value is unknown. In this approach, we specify a prior distribution (a probability mass/density function) on $θ$ that expresses our "prior beliefs" about it. Using this prior, we derive formulas to compute/predict the output variable for some input, given a training set.
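To make the frequentist/Bayesian contrast concrete, here is a minimal toy sketch (not from the lecture, which works through Bayesian regression instead): estimating a coin's heads probability $θ$ with a Beta prior. The posterior and the predictive probability both have closed forms in this case.

```python
# Toy Bayesian estimation: theta (a coin's heads probability) is a
# random variable with a Beta(a, b) prior. After observing the data,
# the posterior is Beta(a + heads, b + tails), and predicting the
# next flip integrates theta out under that posterior.

def posterior_predictive_heads(heads, tails, a=1.0, b=1.0):
    """P(next flip is heads | data), with a Beta(a, b) prior on theta."""
    # The posterior over theta is Beta(a + heads, b + tails);
    # its mean equals the posterior predictive probability of heads.
    return (a + heads) / (a + heads + b + tails)

# With a uniform prior (a = b = 1) and 7 heads out of 10 flips, the
# prediction (8/12 ~ 0.667) is pulled toward 1/2 compared with the
# maximum-likelihood estimate of 0.7 -- i.e. Laplace smoothing.
print(posterior_predictive_heads(7, 3))
```

The frequentist MLE would just report `heads / (heads + tails)`; the prior's pseudo-counts `a` and `b` are what make this a Bayesian estimate.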

Then online learning is briefly mentioned, where the algorithm is supposed to keep making predictions continuously even while it is learning, whereas the algorithms we talked about earlier were trained once at the beginning and then used for prediction.
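The classic example of this setting is the perceptron run online: predict on each example as it arrives, then update immediately on a mistake. A minimal sketch (the toy data stream here is my own, not from the lecture):

```python
# Online learning with the perceptron: the learner predicts on each
# (features, label) pair as it arrives, and updates its weights only
# when the prediction is wrong (a mistake-driven update).

def perceptron_online(stream):
    """Process (x, y) pairs one at a time; labels y are +1 or -1.
    Returns the final weight vector and the number of online mistakes."""
    w = None
    mistakes = 0
    for x, y in stream:
        if w is None:
            w = [0.0] * len(x)          # lazily sized to the first example
        score = sum(wi * xi for wi, xi in zip(w, x))
        pred = 1 if score >= 0 else -1
        if pred != y:                   # mistake: move w toward y * x
            mistakes += 1
            w = [wi + y * xi for wi, xi in zip(w, x)]
    return w, mistakes

# Toy linearly separable stream: the label is the sign of the first feature.
stream = [([2.0, 1.0], 1), ([-1.5, 0.5], -1), ([3.0, -1.0], 1), ([-2.0, -1.0], -1)]
w, mistakes = perceptron_online(stream)
```

The point of the online framing is that performance is measured by the total number of mistakes made *while* learning, not by error on a held-out test set after training finishes.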

Next comes the most important/practical part of this lecture (in fact, of the course): guidelines on how to apply machine learning algorithms. I should probably watch this part again and again. Also, the (very informative) slides shown are available here [warning: pdf]. In summary, it teaches you 3 things.

Diagnostics: How to diagnose what is going wrong when you're not getting the expected results, so that you can pinpoint the right problem and fix it.
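The central diagnostic in the lecture compares training and test error to tell bias from variance. A minimal sketch of that decision rule (the numeric thresholds and error values below are illustrative, not from the slides):

```python
# Bias/variance diagnostic: compare training and test error against the
# performance you are aiming for. High test error with high training
# error suggests high bias (underfitting); high test error with a large
# train/test gap suggests high variance (overfitting).

def diagnose(train_error, test_error, desired_error):
    if test_error <= desired_error:
        return "ok: performance meets the target"
    if train_error > desired_error:
        return "high bias: even training error misses the target -- try a richer model or more features"
    return "high variance: large train/test gap -- try more training data or regularization"

# Illustrative numbers: training error is tiny, test error is not.
print(diagnose(train_error=0.02, test_error=0.15, desired_error=0.05))
```

The value of a diagnostic like this is that each outcome points at a different fix, instead of guessing which knob to turn next.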

Error Analysis, Ablative Analysis: Error analysis tells you which parts of the system, if improved, would give you the maximum accuracy boost. Ablative analysis tells you which features are important and are adding the most to the accuracy.
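Ablative analysis can be sketched mechanically: start from the full system and remove one component at a time, recording how much accuracy is lost. The component names, contribution numbers, and the `evaluate` function below are all hypothetical stand-ins for a real pipeline:

```python
# Ablative analysis sketch: measure each component's contribution as the
# accuracy drop when that single component is removed from the full system.

def ablative_analysis(components, evaluate):
    """Return (baseline accuracy, {component: accuracy lost without it})."""
    baseline = evaluate(components)
    report = {}
    for c in components:
        remaining = [x for x in components if x != c]
        report[c] = baseline - evaluate(remaining)
    return baseline, report

# Hypothetical spam-filter example: evaluate() here just sums fixed
# per-component contributions, standing in for retraining + scoring.
contributions = {"spelling features": 0.02, "sender features": 0.10, "text features": 0.70}
evaluate = lambda comps: 0.1 + sum(contributions[c] for c in comps)

baseline, report = ablative_analysis(list(contributions), evaluate)
# In this toy system, "text features" account for most of the accuracy.
```

Error analysis runs in the opposite direction: you start from a baseline, add components one at a time (or inspect misclassified examples), and see where the accuracy gains come from.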

How to start on a learning problem: This is like software development. One approach is to spend time upfront, design things carefully, and come up with beautiful algorithms that just work. The other approach is to quickly hack together something that works and continuously improve it.