After I finished the “Machine Learning” course on Coursera, I wanted to try it in Erlang. Unfortunately, Erlang is notorious for its inability to do intensive math calculations, and in short: no matrix calculations - no machine learning. The solution: use C and call the C functions from Erlang. Or better: use the GPU (via C) and call those functions from Erlang. I guess this approach no longer qualifies as “pure Erlang”, but it is the best we can get. In fact, most of the Python ML libraries, so famous in the ML community, are built on the NumPy library, which - guess what - is written in C (at least the important parts of it).
So, the first thing to do was to google “erlang matrix operations cuda gpu”… Almost nothing. I found two projects: OpenCL bindings for Erlang (https://github.com/tonyrog/cl) and Kevin Smith’s Pteracuda. I liked Kevin’s approach, since using the Thrust library hides the raw C/CUDA boilerplate; some algorithms even become one-liners when implemented with Thrust.
I forked Pteracuda and added matrix operations (CUBLAS-based), but since it had parts I didn’t need (strings, search, min/max), I decided to create a separate project, and so NumEr was born :)
Basically, it is a bunch of Erlang NIF functions for BLAS operations on vectors and matrices. Both are implemented natively as Thrust host/device vectors, and special “buffer” classes are used to transfer them from Erlang to CUDA and back.
Here is an example:
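A minimal sketch of what this looks like at the NIF level - the module and function names below (numer_nifs and friends) are illustrative assumptions, so check the repository for the exact API:

%% hypothetical names, for illustration only
{ok, Ctx} = numer_nifs:new_context(),
%% a 2x3 matrix, given as a list of rows
M = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]],
%% create a GPU buffer from the Erlang term
{ok, Buf} = numer_nifs:new_matrix_float_buffer(Ctx, M, ?ROW_MAJOR),
%% read it back from the GPU into an Erlang list of rows
{ok, _M2} = numer_nifs:read_buffer(Buf),
ok = numer_nifs:destroy_buffer(Buf),
ok = numer_nifs:destroy_context(Ctx).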
As you can see, one of the parameters of the matrix buffer is “?ROW_MAJOR”. The idea is kind of borrowed from the Boost library, but it is not yet fully implemented in NumEr - currently only row-major matrices are supported. Under the hood, however, the numbers are stored in the Thrust vectors in column-major format. I chose to do it this way because the CUBLAS library uses column-major storage, being a derivative of the FORTRAN BLAS library.
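To make the difference concrete, here is how a small matrix ends up in memory under the two conventions (just an illustration, not NumEr code):

%% the 2x3 matrix
%%   | 1 2 3 |
%%   | 4 5 6 |
M = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]].
%% row-major storage (what you pass in):          1 2 3 4 5 6
%% column-major storage (what CUBLAS works with): 1 4 2 5 3 6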
There are several modules that wrap the NIF functions, such as: numer_blas.erl - for BLAS operations, numer_buffer.erl - for operations with buffers (new, delete, read, write), etc.
Using the numer_buffer module, the above example looks like this:
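A sketch under the same caveat - the exact arities of numer_buffer:new/write/read/delete are assumptions:

{ok, Buf} = numer_buffer:new(matrix, float, ?ROW_MAJOR, 2, 3),
ok = numer_buffer:write(Buf, [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]),
{ok, _M} = numer_buffer:read(Buf),
ok = numer_buffer:delete(Buf).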
Now, the more interesting part: calculations with vectors and matrices. Here is the BLAS GEMV example (GEMV computes y = alpha*A*x + beta*y):
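A sketch of a GEMV call; the numer_blas:gemv argument list below mirrors the standard BLAS SGEMV parameters (transpose flag, dimensions, alpha, A, x, beta, y), but the exact NumEr arity is an assumption:

%% Y = Alpha*A*X + Beta*Y, with a 2x3 matrix A and a 3-element vector X
A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]],
X = [1.0, 1.0, 1.0],
Y0 = [0.0, 0.0],
Alpha = 1.0,
Beta = 0.0,
{ok, BufA} = numer_buffer:new(matrix, float, ?ROW_MAJOR, 2, 3),
ok = numer_buffer:write(BufA, A),
{ok, BufX} = numer_buffer:new(float, 3),
ok = numer_buffer:write(BufX, X),
{ok, BufY} = numer_buffer:new(float, 2),
ok = numer_buffer:write(BufY, Y0),
%% no_transpose, M=2 rows, N=3 columns (assumed argument order)
ok = numer_blas:gemv(no_transpose, 2, 3, Alpha, BufA, BufX, Beta, BufY),
{ok, Y} = numer_buffer:read(BufY),   %% Y = [6.0, 15.0]
ok = numer_buffer:delete(BufA),
ok = numer_buffer:delete(BufX),
ok = numer_buffer:delete(BufY).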
As for the ML part of the whole exercise, there is an implementation of the Logistic Regression algorithm (without regularization) - take a look at the numer_logreg.erl module. The numer_ml.erl module contains an all-native implementation of Logistic Regression, while numer_logreg.erl uses buffers and NIFs. I used the former to compare the speed of the native implementation against the “Erlang” one.
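For reference, the quantity both modules compute is the logistic regression hypothesis - the sigmoid of a linear combination of the features. In plain Erlang (purely illustrative, not the NumEr implementation) it would be:

-module(logreg_sketch).
-export([hypothesis/2]).

%% sigmoid(Z) = 1 / (1 + e^-Z)
sigmoid(Z) -> 1.0 / (1.0 + math:exp(-Z)).

%% h(X) = sigmoid(Theta' * X), for one sample X and parameters Theta
hypothesis(X, Theta) ->
    Z = lists:sum([Xi * Ti || {Xi, Ti} <- lists:zip(X, Theta)]),
    sigmoid(Z).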
Since using buffer operations can make the code awkward to read, there is also a helper module - numer_helpers.erl - which can be used for prototyping the algorithms. WARNING: it is extremely slow and suitable ONLY for prototyping. Here is how you can use it:
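A sketch of the idea - plain Erlang lists in, plain Erlang lists out; the exact numer_helpers function names and argument lists are assumptions:

A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]],
X = [1.0, 1.0, 1.0],
%% buffers are created, transferred and destroyed behind the
%% scenes on every call - convenient, but slow
Y = numer_helpers:gemv(no_transpose, 1.0, A, X, 0.0, [0.0, 0.0]),
%% Y = [6.0, 15.0]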
It is much more readable and handy for one-off calculations, but in the ML “training” stage (with hundreds of iterations) it will be unusable, since every call incurs buffer transfers to and from the GPU.
The project is still a work in progress and needs a lot of polishing, so if anyone is willing to give a hand, I’ll be more than happy. Any suggestions for improving the framework are also very welcome.
That’s all folks! Happy hacking! :-)
Link to the GitHub repository: https://github.com/vascokk/NumEr