 INFORMATION SCIENCE SEMINAR

Machine Learning for Non-Traditional Performance Criteria

 

Speaker: Rich Caruana, Assistant Professor, Computer Science, Cornell University

Date: Wednesday, October 6, 2004, 4:15-5:15 p.m.

Location: Cornell Information Science, 301 College Avenue, Seminar Room

 

Abstract -

We now have a variety of supervised learning methods that yield excellent performance on tried-and-true criteria such as accuracy and squared error. But these methods often yield sub-optimal results on other criteria for which they were not designed, such as precision and recall, break-even point, F-score, area under the ROC curve, and probability calibration. Sometimes it is possible to modify a learning method so that it works better on a different metric. For example, years ago we modified backprop so that it was better at predicting orderings (e.g. precision, recall, and ROC area). More recently, Thorsten Joachims, Alex Niculescu, and I experimented with an SVM that was also modified to predict orderings. But no single learning method, such as neural nets or SVMs, currently dominates the others, and it is painful to modify each learning method so that it works well on each criterion. In the talk I'll present progress we have made in devising supervised learning methods that are near-optimal for any efficiently computable performance metric. These methods currently outperform all other learning algorithms we have compared them to, partly because they build on top of other learning algorithms instead of competing against them. I'll also discuss recent work on understanding how different performance metrics compare to each other, and how we might go about devising better super-metrics. This is joint work with Alex Niculescu and David Skalak and was recently funded by the NSF.
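As a rough illustration of why a threshold criterion such as accuracy and an ordering criterion such as ROC area can disagree, here is a minimal Python sketch. It is not taken from the talk: the labels, the two sets of scores, and the helper names `accuracy` and `roc_auc` are all made up for this example. The two hypothetical models make identical decisions at a 0.5 threshold, yet rank the positive and negative cases differently.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of cases classified correctly when scores are cut at `threshold`."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the fraction of (positive, negative)
    pairs in which the positive case gets the higher score (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical data: three positive cases followed by three negative cases.
labels = [1, 1, 1, 0, 0, 0]
scores_a = [0.90, 0.80, 0.40, 0.60, 0.30, 0.20]  # illustrative model A
scores_b = [0.60, 0.55, 0.40, 0.95, 0.30, 0.20]  # illustrative model B

for name, scores in (("A", scores_a), ("B", scores_b)):
    print(name, round(accuracy(labels, scores), 3), round(roc_auc(labels, scores), 3))
```

Both models classify 4 of the 6 cases correctly, but model A orders 8 of the 9 positive-negative pairs correctly while model B orders only 6, so a method tuned for accuracy alone would have no reason to prefer A.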


Bio -

Rich received his Ph.D. in Computer Science from CMU in 1998, where he worked with Tom Mitchell and Herb Simon. Most of his research is in machine learning and data mining. Rich likes to work on real problems and let those problems force him to develop improved learning algorithms. About half of his applied work is in machine learning for medical decision making, but he is interested in any hard learning problem for which someone really cares about the results. Rich joined Cornell's Computer Science Department in 2001.

 

For more information, please contact Jeff Hancock.