Least squares regression through the origin #375
itsalissonsilva wants to merge 1 commit into leanprover:main from
Conversation
Hi, thanks for your interest in contributing! However, I think this is likely more in scope for Mathlib. I would ask on Zulip, as my assumption is that they would want a more general formulation.
Hey Chris, I was thinking purely about the ML aspect of it, which seemed like something that would fit in CSLib. Since I plan to formalize other algorithms as well, would decision-tree classifiers or k-NN also fit in Mathlib, or would they fit in CSLib? Sorry for the confusion.
Quick comment:
Also @itsalissonsilva: numerical algorithms are traditionally a CS subject that mathematics departments might not teach. So I can't say with 100% certainty that least squares regression is outside the scope of "undergrad CS curricula", especially since it also shows up in ML courses, which are standard these days. But it is clearly in scope for SciLean, which is a repository for algorithms of scientific computing.
This PR adds ordinary least squares regression through the origin. It defines Sxx = ∑ xᵢ², Sxy = ∑ xᵢ yᵢ, the estimator aStar = Sxy / Sxx, and the squared loss:
loss(a) = ∑ (a * xᵢ - yᵢ)²,
and proves that aStar minimizes loss(a) assuming Sxx ≠ 0. Tests are included under CslibTests.
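For reference, the optimality claim follows by completing the square; this is the standard derivation, not text taken from the PR itself:

```latex
\begin{aligned}
\mathrm{loss}(a) &= \sum_i (a x_i - y_i)^2
  = a^2 S_{xx} - 2a\, S_{xy} + \sum_i y_i^2 \\
  &= S_{xx}\left(a - \frac{S_{xy}}{S_{xx}}\right)^2
   + \sum_i y_i^2 - \frac{S_{xy}^2}{S_{xx}}.
\end{aligned}
```

Since Sxx = ∑ xᵢ² is a sum of squares, Sxx ≠ 0 implies Sxx > 0, so the first term is minimized (to zero) exactly at a = Sxy / Sxx = aStar.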
Local checks: `lake build`, `lake test`.
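For concreteness, the definitions and the main statement might look roughly like this in Lean 4 with Mathlib; this is a sketch under assumed names, and the PR's actual identifiers, namespaces, and proof may differ:

```lean
import Mathlib

open Finset

variable {n : ℕ} (x y : Fin n → ℝ)

/-- Sum of squares of the inputs, `∑ xᵢ²`. -/
noncomputable def Sxx : ℝ := ∑ i, x i ^ 2

/-- Cross term `∑ xᵢ yᵢ`. -/
noncomputable def Sxy : ℝ := ∑ i, x i * y i

/-- The least-squares slope for regression through the origin. -/
noncomputable def aStar : ℝ := Sxy x y / Sxx x

/-- Squared loss of a candidate slope `a`. -/
noncomputable def loss (a : ℝ) : ℝ := ∑ i, (a * x i - y i) ^ 2

/-- `aStar` minimizes the squared loss whenever `Sxx ≠ 0`. -/
theorem loss_aStar_le (h : Sxx x ≠ 0) (a : ℝ) :
    loss x y (aStar x y) ≤ loss x y a := by
  -- Proof idea: complete the square, as in the derivation above;
  -- omitted in this sketch.
  sorry
```

The hypothesis `Sxx x ≠ 0` suffices because `Sxx` is a sum of squares and hence nonnegative, so the hypothesis forces it to be strictly positive.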