Least Squares Regression in C/C++
Solution 1
The gold standard for this is LAPACK. In particular, you want xGELS (the x is the type prefix, e.g. DGELS for double precision).
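For a feel of the problem xGELS solves (minimize ||Ax - b|| for an overdetermined system), here is a minimal self-contained sketch. It is not LAPACK's code: it forms the normal equations and uses Gaussian elimination, whereas xGELS works from a QR or LQ factorization, which behaves better on ill-conditioned problems. The name solve_least_squares and the row-major layout are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Solve min ||A x - b||_2 for a dense row-major m x n matrix A (m >= n)
// via the normal equations (A^T A) x = A^T b. Illustrative only; assumes
// A has full column rank. For real work, call LAPACK instead.
std::vector<double> solve_least_squares(const std::vector<double>& A,
                                        const std::vector<double>& b,
                                        std::size_t m, std::size_t n) {
    assert(A.size() == m * n && b.size() == m && m >= n);
    const std::size_t w = n + 1;  // width of the augmented system

    // Build the augmented n x (n+1) system [A^T A | A^T b].
    std::vector<double> M(n * w, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j)
            for (std::size_t k = 0; k < m; ++k)
                M[i * w + j] += A[k * n + i] * A[k * n + j];
        for (std::size_t k = 0; k < m; ++k)
            M[i * w + n] += A[k * n + i] * b[k];
    }

    // Forward elimination with partial pivoting.
    for (std::size_t col = 0; col < n; ++col) {
        std::size_t pivot = col;
        for (std::size_t r = col + 1; r < n; ++r)
            if (std::fabs(M[r * w + col]) > std::fabs(M[pivot * w + col]))
                pivot = r;
        assert(std::fabs(M[pivot * w + col]) > 1e-12);  // rank deficiency not handled
        for (std::size_t c = 0; c < w; ++c)
            std::swap(M[col * w + c], M[pivot * w + c]);
        for (std::size_t r = col + 1; r < n; ++r) {
            const double factor = M[r * w + col] / M[col * w + col];
            for (std::size_t c = col; c < w; ++c)
                M[r * w + c] -= factor * M[col * w + c];
        }
    }

    // Back substitution.
    std::vector<double> x(n);
    for (std::size_t i = n; i-- > 0;) {
        double sum = M[i * w + n];
        for (std::size_t j = i + 1; j < n; ++j) sum -= M[i * w + j] * x[j];
        x[i] = sum / M[i * w + i];
    }
    return x;
}
```

To fit a straight line y = c0 + c1*x, each row of A is (1, x_i) and b holds the y_i; the returned vector is (c0, c1).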
Solution 2
When I've had to deal with large datasets and large parameter sets for non-linear parameter fitting, I used a combination of RANSAC and Levenberg-Marquardt. I'm talking thousands of parameters with tens of thousands of data points.
RANSAC is a robust algorithm that reduces the influence of outliers by fitting to small random subsets of the data and keeping the model that the most points agree with. It's not strictly least squares, but it can be wrapped around many fitting methods.
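As a rough illustration of the RANSAC idea, here is a sketch that fits a straight line: sample two points, count how many points lie close to the line through them, keep the largest consensus set, then do an ordinary least-squares fit on that set only. The names Point, Line, fit_line and ransac_line, the vertical-distance threshold and the fixed seed are all assumptions for the example, not part of any library.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

struct Point { double x, y; };
struct Line { double slope, intercept; };

// Ordinary least-squares line through the given points.
Line fit_line(const std::vector<Point>& pts) {
    assert(pts.size() >= 2);
    double sx = 0, sy = 0, sxx = 0, sxy = 0;
    for (const Point& p : pts) {
        sx += p.x; sy += p.y; sxx += p.x * p.x; sxy += p.x * p.y;
    }
    const double n = static_cast<double>(pts.size());
    const double denom = n * sxx - sx * sx;
    assert(std::fabs(denom) > 1e-12);  // vertical lines are not handled
    const double slope = (n * sxy - sx * sy) / denom;
    return {slope, (sy - slope * sx) / n};
}

// RANSAC: repeatedly fit a line to two random points, keep the largest
// set of points within `threshold` (vertical distance) of that line,
// then refit on that consensus set.
Line ransac_line(const std::vector<Point>& pts, int iterations,
                 double threshold, unsigned seed) {
    assert(pts.size() >= 2);
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> pick(0, pts.size() - 1);

    std::vector<Point> best_inliers;
    for (int it = 0; it < iterations; ++it) {
        const Point& a = pts[pick(rng)];
        const Point& b = pts[pick(rng)];
        if (std::fabs(a.x - b.x) < 1e-12) continue;  // same point or vertical pair

        const double slope = (b.y - a.y) / (b.x - a.x);
        const double intercept = a.y - slope * a.x;

        std::vector<Point> inliers;
        for (const Point& p : pts)
            if (std::fabs(p.y - (slope * p.x + intercept)) <= threshold)
                inliers.push_back(p);

        if (inliers.size() > best_inliers.size()) best_inliers = inliers;
    }
    assert(best_inliers.size() >= 2);  // no usable sample was drawn
    return fit_line(best_inliers);
}
```

The final refit is where the actual fitting method plugs in; for a non-linear model you would swap fit_line for something like Levenberg-Marquardt.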
Levenberg-Marquardt is an efficient way to solve non-linear least-squares problems numerically. The convergence rate in most cases is between that of steepest descent and Newton's method, without requiring the calculation of second derivatives. I've found it to be faster than conjugate gradient in the cases I've examined.
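To make the "no second derivatives" point concrete, here is a small sketch of Levenberg-Marquardt for a two-parameter model y = a*exp(b*x). Only the first derivatives of the model (the Jacobian) are used; the damping factor lambda moves the step between a Gauss-Newton step (small lambda) and a short gradient-descent-like step (large lambda). The model, the names ExpModel, sum_sq and fit_exp_lm, and the factor-of-10 damping schedule are assumptions chosen for illustration, not a reference implementation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct ExpModel { double a, b; };  // y = a * exp(b * x)

// Sum of squared residuals for the given parameters.
double sum_sq(const std::vector<double>& x, const std::vector<double>& y,
              ExpModel p) {
    double s = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double r = y[i] - p.a * std::exp(p.b * x[i]);
        s += r * r;
    }
    return s;
}

ExpModel fit_exp_lm(const std::vector<double>& x, const std::vector<double>& y,
                    ExpModel p, int max_iter = 200) {
    assert(x.size() == y.size() && x.size() >= 2);
    double lambda = 1e-3;
    double cost = sum_sq(x, y, p);

    for (int it = 0; it < max_iter; ++it) {
        // Accumulate J^T J (h00, h01, h11) and J^T r (g0, g1).
        double h00 = 0, h01 = 0, h11 = 0, g0 = 0, g1 = 0;
        for (std::size_t i = 0; i < x.size(); ++i) {
            const double e = std::exp(p.b * x[i]);
            const double r = y[i] - p.a * e;
            const double j0 = e;               // d(model)/da
            const double j1 = p.a * x[i] * e;  // d(model)/db
            h00 += j0 * j0; h01 += j0 * j1; h11 += j1 * j1;
            g0 += j0 * r;   g1 += j1 * r;
        }

        // Solve (J^T J + lambda * diag(J^T J)) * delta = J^T r (2x2 system).
        const double a00 = h00 * (1.0 + lambda);
        const double a11 = h11 * (1.0 + lambda);
        const double det = a00 * a11 - h01 * h01;
        if (!(det > 0.0)) break;  // degenerate Jacobian, give up
        const double d0 = (a11 * g0 - h01 * g1) / det;
        const double d1 = (a00 * g1 - h01 * g0) / det;

        const ExpModel trial{p.a + d0, p.b + d1};
        const double trial_cost = sum_sq(x, y, trial);
        if (trial_cost < cost) {
            p = trial;           // accept the step and trust Gauss-Newton more
            cost = trial_cost;
            lambda *= 0.1;
        } else {
            lambda *= 10.0;      // reject the step and damp harder
            if (lambda > 1e12) break;  // no further improvement found
        }
    }
    return p;
}
```

With thousands of parameters the 2x2 closed-form solve becomes a full linear solve per iteration, which is where a library such as LAPACK comes back in.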
The way I did this was to set up RANSAC as an outer loop around the LM method. This is very robust but slow. If you don't need the additional robustness, you can just use LM.
Solution 3
Get ROOT and use TGraph::Fit() (or TGraphErrors::Fit())?
It's a big, heavy piece of software to install just for the fitter, though. Works for me because I already have it installed.
Or use GSL.
Solution 4
If you want to implement an optimization algorithm yourself, Levenberg-Marquardt seems to be quite difficult to implement. If really fast convergence is not needed, take a look at the Nelder-Mead simplex optimization algorithm. It can be implemented from scratch in a few hours.
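A from-scratch Nelder-Mead could look roughly like the sketch below. It uses the usual textbook coefficients (reflection 1, expansion 2, contraction 0.5, shrink 0.5) and needs only function values, no derivatives. The names Vec, Vertex and nelder_mead, the axis-aligned starting simplex and the stopping tolerances are assumptions for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

using Vec = std::vector<double>;
struct Vertex { Vec x; double f; };

// Minimize f starting near x0 with the Nelder-Mead simplex method.
Vec nelder_mead(const std::function<double(const Vec&)>& f, const Vec& x0,
                double step = 1.0, int max_iter = 2000,
                double f_tol = 1e-12, double x_tol = 1e-6) {
    const std::size_t n = x0.size();
    assert(n >= 1);

    // Initial simplex: x0 plus one vertex offset along each axis.
    std::vector<Vertex> s;
    s.push_back(Vertex{x0, f(x0)});
    for (std::size_t i = 0; i < n; ++i) {
        Vec v = x0;
        v[i] += step;
        s.push_back(Vertex{v, f(v)});
    }

    // Point c + t * (c - w): t = 1 reflects w through c, t = 2 expands,
    // t = 0.5 contracts outside, t = -0.5 contracts inside.
    auto along = [n](const Vec& c, const Vec& w, double t) {
        Vec p(n);
        for (std::size_t j = 0; j < n; ++j) p[j] = c[j] + t * (c[j] - w[j]);
        return p;
    };
    auto by_value = [](const Vertex& a, const Vertex& b) { return a.f < b.f; };

    for (int it = 0; it < max_iter; ++it) {
        std::sort(s.begin(), s.end(), by_value);  // s[0] best, s[n] worst

        // Stop when the simplex is both flat and small.
        double size = 0.0;
        for (std::size_t i = 1; i <= n; ++i)
            for (std::size_t j = 0; j < n; ++j)
                size = std::max(size, std::fabs(s[i].x[j] - s[0].x[j]));
        if (s[n].f - s[0].f < f_tol && size < x_tol) break;

        // Centroid of every vertex except the worst.
        Vec c(n, 0.0);
        for (std::size_t i = 0; i < n; ++i)
            for (std::size_t j = 0; j < n; ++j)
                c[j] += s[i].x[j] / static_cast<double>(n);

        const Vec xr = along(c, s[n].x, 1.0);
        const double fr = f(xr);
        if (fr < s[0].f) {                       // new best: try expanding
            const Vec xe = along(c, s[n].x, 2.0);
            const double fe = f(xe);
            s[n] = (fe < fr) ? Vertex{xe, fe} : Vertex{xr, fr};
        } else if (fr < s[n - 1].f) {            // decent: accept reflection
            s[n] = Vertex{xr, fr};
        } else {                                 // poor: contract
            const Vec xc = along(c, s[n].x, fr < s[n].f ? 0.5 : -0.5);
            const double fc = f(xc);
            if (fc < std::min(fr, s[n].f)) {
                s[n] = Vertex{xc, fc};
            } else {                             // shrink toward the best vertex
                for (std::size_t i = 1; i <= n; ++i) {
                    for (std::size_t j = 0; j < n; ++j)
                        s[i].x[j] = s[0].x[j] + 0.5 * (s[i].x[j] - s[0].x[j]);
                    s[i].f = f(s[i].x);
                }
            }
        }
    }
    return std::min_element(s.begin(), s.end(), by_value)->x;
}
```

For least squares, the function you hand it is simply the sum of squared residuals of your model as a function of the parameter vector. Expect it to get slow as the number of parameters grows.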
Ohanes Dadian
Updated on April 15, 2022
Comments
-
Ohanes Dadian about 1 month
How would one go about implementing least squares regression for factor analysis in C/C++?
-
captncraig over 12 years: This is a FORTRAN solution, albeit a good one. The point is valid, however, that there are existing libraries and statistical packages in C that are much easier to use than rolling your own.
-
kennytm over 12 years: @Captn: There's a C/C++ port of LAPACK.
-
Michael Anderson over 12 years: Looks like the GSL library mentioned by dmckee supports Levenberg-Marquardt. This would be a good starting point if you want to go down this route. I think GSL may have been unusable for us due to its GPL license.
-
Stephen Canon over 12 years: What KennyTM said. Most platforms that provide LAPACK also provide C interfaces.
-
Kena over 12 years: +1 for mentioning RANSAC, because it's a nifty algorithm that doesn't get the exposure it deserves.