
Nah. Least squares is optimal under a normal distribution, but that does not imply it stops working when the distribution is not normal. It might even be optimal for some other distributions, I just cannot remember, and I have to watch the frying pan.


Least squares is disastrously bad for anything with a fat tail (e.g., power-law decay). The reason is that in those cases one or two outliers will dominate the sum of squared residuals.
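A quick numeric illustration of the point (my own sketch, not from the thread): a single fat-tail outlier drags the least-squares location estimate (the mean) far from the bulk of the data, while a robust estimate (the median) barely moves.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=10.0, scale=1.0, size=99)  # well-behaved data around 10
x = np.append(x, 1e6)                         # one fat-tail outlier

mean = x.mean()        # least-squares estimate of location
median = np.median(x)  # robust estimate

# The mean is pulled thousands of units away; the median stays near 10.
print(mean, median)
```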

It's not optimal for any other distribution. For a general error density g(x), you want to maximize sum(log(g(x_i - x0))) (or equivalently prod(g(x_i - x0))) - this reduces to least squares only if log(g(x)) = C - D x^2 with D > 0, i.e. only if g is Gaussian.
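A sanity check of the claim (my own sketch): for Gaussian noise, the negative log-likelihood of a location x0 is, up to constants, sum((x_i - x0)^2), so the maximum-likelihood estimate should coincide with the least-squares answer, the sample mean.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(loc=3.0, scale=2.0, size=500)

# Negative log-likelihood under Gaussian noise is, up to constants,
# sum((x - x0)^2): the MLE is exactly the least-squares fit.
grid = np.linspace(0.0, 6.0, 6001)  # step 1e-3
nll = np.array([np.sum((x - x0) ** 2) for x0 in grid])
mle = grid[np.argmin(nll)]

print(mle, x.mean())  # agree to grid resolution
```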


That sounds neat. Have you got a reference to it handy?


I don't know of any references; most of this just falls out pretty easily once you try to do the math, and that's probably faster than reading a book.

I.e., set up the maximum-likelihood problem, take logs, and you immediately get least squares for a Gaussian g(x). If g(x) = exp(-|x|), you get L1 minimization. Other distributions give you other things.
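The Laplace case can be checked numerically too (again my own sketch): with g(x) = exp(-|x|)/2, the negative log-likelihood of a location x0 is sum(|x_i - x0|) up to a constant, i.e. L1 minimization, whose solution is the sample median.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.laplace(loc=3.0, scale=1.0, size=501)

# Negative log-likelihood under Laplace noise is, up to constants,
# sum(|x - x0|): L1 minimization, minimized at the sample median.
grid = np.linspace(0.0, 6.0, 6001)  # step 1e-3
nll = np.array([np.sum(np.abs(x - x0)) for x0 in grid])
mle = grid[np.argmin(nll)]

print(mle, np.median(x))  # agree to grid resolution
```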

The general topic to investigate is robust regression: https://en.wikipedia.org/wiki/Robust_regression



