Learning Task-Specific Similarity

by Greg Shakhnarovich
PhD Thesis
Department of Electrical Engineering and Computer Science
Massachusetts Institute of Technology, 2005
Thesis Advisor: Trevor Darrell, MIT
Committee: Bill Freeman, MIT
Michael Collins, MIT
Shimon Ullman, Weizmann Institute of Science


The right measure of similarity between examples is important in many areas of computer science. In particular, it is a critical component in example-based learning methods. Similarity is commonly defined in terms of a conventional distance function, but such a definition does not necessarily capture the inherent meaning of similarity, which tends to depend on the underlying task. We develop an algorithmic approach to learning similarity from examples of what objects are deemed similar according to the task-specific notion of similarity at hand, as well as optional negative examples. Our learning algorithm constructs, in a greedy fashion, an encoding of the data. This encoding can be seen as an embedding into a space where a weighted Hamming distance is correlated with the unknown similarity. This allows us to predict when two previously unseen examples are similar and, importantly, to efficiently search a very large database for examples similar to a query.
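To make the idea of an embedding with a weighted Hamming distance concrete, here is a minimal sketch in Python. It is an illustration only, not the algorithm developed in the thesis: the bit functions (a threshold on a single feature), the thresholds, and the weights are hand-picked assumptions, whereas the thesis learns the encoding greedily from similar and dissimilar pairs. The names encode, weighted_hamming, and nearest are hypothetical, and the linear scan in nearest stands in for the efficient search mentioned above.

```python
from typing import List, Sequence, Tuple

# Each bit is defined by a (feature index, threshold) pair. These are
# hand-picked for illustration; they are NOT learned as in the thesis.
BitSpec = Tuple[int, float]


def encode(x: Sequence[float], bits: Sequence[BitSpec]) -> List[int]:
    """Map a real-valued example to a binary code, one bit per (index, threshold)."""
    return [1 if x[dim] > threshold else 0 for dim, threshold in bits]


def weighted_hamming(a: Sequence[int], b: Sequence[int],
                     weights: Sequence[float]) -> float:
    """Sum the weights of the positions where the two codes disagree."""
    return sum(w for ai, bi, w in zip(a, b, weights) if ai != bi)


def nearest(query: Sequence[int], database: Sequence[Sequence[int]],
            weights: Sequence[float], k: int) -> List[int]:
    """Return indices of the k database codes closest to the query.

    This is a plain linear scan, used here only to keep the sketch short.
    """
    order = sorted(range(len(database)),
                   key=lambda i: weighted_hamming(query, database[i], weights))
    return order[:k]


# Hypothetical encoding: three bits over two features, with assumed weights.
bits = [(0, 0.5), (1, 0.5), (1, 0.95)]
weights = [0.5, 2.0, 1.0]

query_code = encode([0.2, 0.9], bits)
database_codes = [[1, 1, 1], [0, 1, 0], [0, 0, 0]]
closest = nearest(query_code, database_codes, weights, k=2)
```

Under these assumptions, two examples are predicted to be similar when the weighted Hamming distance between their codes is small, and a query is answered by ranking database codes by that distance.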

This approach is tested on a set of standard machine learning benchmark problems. The model of similarity learned with our algorithm provides an improvement over standard example-based classification and regression. We also apply this framework to problems in computer vision: articulated pose estimation of humans from single images, articulated tracking in video, and matching image regions subject to generic visual similarity.

Thesis chapters