Source Code by Shai Shalev-Shwartz
Terms and Conditions
The code below is given under the GNU Lesser General
Public License.
Use at your own risk!
The code was developed on Cygwin using g++ version 3.4.4.
Most of the code was also tested on Linux using g++ version 4.0.3.
Utilities
pdfjoin
This small shell script merges (joins) the pages of several PDF files into a single PDF file.
The script requires pdflatex to be installed.
It was written by Ambuj Tewari. A much more sophisticated set of scripts also exists (look for PDFjam in your favorite search engine).
Learning Tools
The following libraries are used extensively in my code.
My code usually assumes that they reside in a directory named learning_tools.
- infra - Linear algebra library for machine learning applications, implemented on top of the BLAS and ATLAS libraries. The library was originally written by Ofer Dekel as a very efficient MATLAB-like interface to C++. It was refined by Joseph Keshet and by me.
- infra utilities - Utilities for handling infra file formats and dataset formats.
- kernels - Implementation of Mercer kernels for discriminative algorithms. The standard kernels are computed very efficiently using the BLAS and ATLAS libraries.
- active set - Implementation of an active set for iterative support-vector machine algorithms. It was written as a support-vector data structure for Hildreth-like algorithms [Hildreth, 1957]. See my paper "Efficient Learning of Label Ranking by Soft Projections onto Polyhedra", available from my homepage.
- cmdline - C++ library for parsing command line arguments. Written by Joseph Keshet.
- simple sparse vector - Implementation of a very simple sparse vector using the STL map.
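To illustrate the idea behind the simple sparse vector item, here is a minimal sketch of a sparse vector backed by std::map. It is not the distributed library: the class name SparseVector and the methods set, get, nnz, dot and add_to are hypothetical names chosen for this example only.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical sparse vector (illustration only, not the library's API):
// only non-zero entries are stored, keyed by coordinate index.
class SparseVector {
public:
    // Set coordinate i to v; storing a zero erases the entry.
    void set(std::size_t i, double v) {
        if (v == 0.0) entries_.erase(i);
        else entries_[i] = v;
    }

    // Read coordinate i; absent coordinates are implicitly zero.
    double get(std::size_t i) const {
        std::map<std::size_t, double>::const_iterator it = entries_.find(i);
        return it == entries_.end() ? 0.0 : it->second;
    }

    // Number of explicitly stored (non-zero) entries.
    std::size_t nnz() const { return entries_.size(); }

    // Inner product with a dense vector; visits only stored entries.
    double dot(const std::vector<double>& dense) const {
        double sum = 0.0;
        std::map<std::size_t, double>::const_iterator it;
        for (it = entries_.begin(); it != entries_.end(); ++it) {
            assert(it->first < dense.size());
            sum += it->second * dense[it->first];
        }
        return sum;
    }

    // dense += scale * (*this), touching only stored coordinates.
    void add_to(std::vector<double>& dense, double scale) const {
        std::map<std::size_t, double>::const_iterator it;
        for (it = entries_.begin(); it != entries_.end(); ++it) {
            assert(it->first < dense.size());
            dense[it->first] += scale * it->second;
        }
    }

private:
    std::map<std::size_t, double> entries_;
};
```

An ordered map keeps the code short and the indices sorted, at the price of logarithmic-time access per coordinate; a sorted array of index/value pairs would likely be faster for read-heavy workloads.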
Label Ranking
- Sopopo - This code implements the Sopopo algorithm for label ranking. The code was written together with Yoram Singer. See the paper "Efficient Learning of Label Ranking by Soft Projections onto Polyhedra", available from my homepage. Refer to the README file for installation details.
Online Multiclass Prediction
- Dense data, additive updates -
This code implements three multiclass online learning algorithms:
- Perceptron with max-update.
- PA-I with max-update.
- SOPOPO for multiclass.
See the papers "Online Passive Aggressive Algorithms" and
"Efficient Learning of Label Ranking by Soft Projections onto Polyhedra"
available from my homepage.
The data is assumed to be in infra format.
Several datasets in the dense infra format are available on the datasets page.
Make sure to edit the Makefile according to where you put
the learning tools.
The bipartite solver implementation was written together with Yoram Singer.
- Sparse data, multiplicative updates -
This code implements three multiclass online learning algorithms with multiplicative updates:
- Winnow/EG with max-update.
- PA-I version of EG with max-update.
- SOPOPO version of EG for multiclass.
See the paper "Online Learning meets Optimization in the Dual"
available through my homepage.
The implementation is based on an efficient primal-dual interior point method specifically
tailored to this problem.
The data is assumed to be in sparse format.
The Enron dataset, preprocessed into this format, is available on my datasets page.
Make sure to edit the Makefile according to where you put
the learning tools.
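For readers unfamiliar with the max-update listed under the dense, additive-update algorithms, the following is a rough sketch of one online round for the multiclass Perceptron and for PA-I, assuming one weight vector per label and the multi-prototype update described in "Online Passive Aggressive Algorithms". It is not the distributed code and does not use infra; the names Vec, dot, top_rival, max_update, perceptron_step and pa1_step are hypothetical, and plain std::vector stands in for the library's vectors.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<double> Vec;

double dot(const Vec& a, const Vec& b) {
    assert(a.size() == b.size());
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// Highest-scoring label other than the correct label y (hypothetical helper).
std::size_t top_rival(const std::vector<Vec>& W, const Vec& x, std::size_t y) {
    assert(W.size() >= 2 && y < W.size());
    std::size_t best = (y == 0) ? 1 : 0;
    for (std::size_t r = 0; r < W.size(); ++r)
        if (r != y && dot(W[r], x) > dot(W[best], x)) best = r;
    return best;
}

// Max-update: only the correct label y and the top rival r are changed.
void max_update(std::vector<Vec>& W, const Vec& x,
                std::size_t y, std::size_t r, double tau) {
    for (std::size_t i = 0; i < x.size(); ++i) {
        W[y][i] += tau * x[i];
        W[r][i] -= tau * x[i];
    }
}

// Perceptron round: update with step 1 on a mistake (ties count as mistakes).
// Returns true if an update was made.
bool perceptron_step(std::vector<Vec>& W, const Vec& x, std::size_t y) {
    const std::size_t r = top_rival(W, x, y);
    if (dot(W[y], x) > dot(W[r], x)) return false;
    max_update(W, x, y, r, 1.0);
    return true;
}

// PA-I round: step size min(C, loss / (2 ||x||^2)), where the loss is the
// hinge loss on the margin between y and the top rival. Returns the step size.
double pa1_step(std::vector<Vec>& W, const Vec& x, std::size_t y, double C) {
    const std::size_t r = top_rival(W, x, y);
    const double loss = std::max(0.0, 1.0 - (dot(W[y], x) - dot(W[r], x)));
    const double norm2 = dot(x, x);
    if (loss == 0.0 || norm2 == 0.0) return 0.0;
    const double tau = std::min(C, loss / (2.0 * norm2));
    max_update(W, x, y, r, tau);
    return tau;
}
```

The factor 2 in the PA-I step size reflects that the update moves two weight vectors by the same amount in opposite directions.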
Pegasos -- Solving SVM
- Pegasos - This code implements the Pegasos algorithm for solving SVM in the primal. See the paper "Pegasos: Primal Estimated sub-GrAdient SOlver for SVM", available from my homepage. Refer to the README file for installation details.
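As a rough illustration of what solving SVM in the primal with stochastic sub-gradient steps looks like, here is a minimal sketch of the single-example variant of the Pegasos update, including the optional projection step. It is not the distributed implementation: the names Vec, inner, pegasos_step and pegasos_train are hypothetical, labels are assumed to be +1 or -1, and the data is assumed to be dense.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

typedef std::vector<double> Vec;

double inner(const Vec& a, const Vec& b) {
    assert(a.size() == b.size());
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// One sub-gradient step on example (x, y) at iteration t >= 1,
// with learning rate eta = 1 / (lambda * t).
void pegasos_step(Vec& w, const Vec& x, int y, double lambda, std::size_t t) {
    assert(t >= 1 && lambda > 0.0 && (y == 1 || y == -1));
    const double eta = 1.0 / (lambda * static_cast<double>(t));
    // The margin is checked with the weights from before this step.
    const bool violated = y * inner(w, x) < 1.0;
    // Shrink: gradient of the regularizer (lambda / 2) * ||w||^2.
    for (std::size_t i = 0; i < w.size(); ++i) w[i] *= (1.0 - eta * lambda);
    // Hinge-loss sub-gradient, applied only when the margin is violated.
    if (violated)
        for (std::size_t i = 0; i < w.size(); ++i) w[i] += eta * y * x[i];
    // Optional projection onto the ball of radius 1 / sqrt(lambda).
    const double norm = std::sqrt(inner(w, w));
    const double radius = 1.0 / std::sqrt(lambda);
    if (norm > radius)
        for (std::size_t i = 0; i < w.size(); ++i) w[i] *= radius / norm;
}

// Run T steps, sampling one training example uniformly at random per step.
Vec pegasos_train(const std::vector<Vec>& X, const std::vector<int>& Y,
                  double lambda, std::size_t T, unsigned seed) {
    assert(!X.empty() && X.size() == Y.size());
    Vec w(X[0].size(), 0.0);
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> pick(0, X.size() - 1);
    for (std::size_t t = 1; t <= T; ++t) {
        const std::size_t i = pick(rng);
        pegasos_step(w, X[i], Y[i], lambda, t);
    }
    return w;
}
```

The sketch has no bias term and no mini-batches; the seed parameter is there only to make runs reproducible.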