Poster Sessions
There will be two poster sessions, one Friday evening and one Saturday
evening. The poster locations indicate where the poster will be presented,
please make sure to hang your poster in the right location. Attendees will get
a map at registration.
Posters - Friday, 6pm
Location: Duffield Atrium
Studying Algorithmic Scope in Sequences of Natural Computing Algorithms to Address Small Traveling Salesman Problems, Romerl Cabale Elizes and Michael GarganoPoster Location: 1
Abstract Researchers have implemented a variety of algorithms including Lin-Kernighan, Hybrid, Genetic, Tabu-Search, Backtracking, and other relevant algorithms to apply a near optimal algorithm that will make the Traveling Salesman Problem (TSP) close to a non-NP hard problem. Although optimal tours have been reached in many of the well-known TSP problems, there is no one optimal solution that clearly addresses TSP through all city instances. The focus of this research is two fold: (1) to investigate the feasibility of sequence-based hybrid algorithms as sufficient heuristics to TSP and (2) to determine the importance of algorithmic scope in the execution of sequences of these algorithms. The algorithmic scope is the total number of algorithms in the sequence that found better solutions in solving a specific TSP city data set. The research will reveal some algorithm(s) such as genetic algorithm, ant colony optimization, and simulated annealing will bear the brunt of computation in the sequence thus deeming other algorithm(s) in that sequence as wasteful. The Life Cycle Model will serve as a construct for this research. Predicting Diverse Subsets Using Structural SVMs, Yisong Yue and Thorsten JoachimsPoster Location: 2
Abstract In many retrieval tasks, one important goal involves retrieving a diverse set of results (e.g., documents covering a wide range of topics for a search query). First of all, this reduces redundancy, effectively presenting more information with the presented results. Secondly, search queries are often ambiguous at some level. For example, the query “Jaguar” can refer to many different topics (such as the car or the feline). A set of documents with high topic diversity ensures that fewer users abandon the query because none of the results are relevant to them. Unlike existing approaches to learning retrieval functions, we present a method that explicitly trains to diversify results. In particular, we formulate the learning problem of predicting a diverse subset and derive a training algorithm based on structural SVMs. Discovering Cyclic Causal Models by Independent Components Analysis, Gustavo Lacerda, Peter Spirtes, Joseph Ramsey and Patrik O. HoyerPoster Location: 3
Abstract We generalize Shimizu et al’s (2006) ICA-based approach for discovering linear
non-Gaussian acyclic (LiNGAM) Structural Equation Models (SEMs) from causally sufficient, continuous-valued observational data. By relaxing the assumption that the generating SEM’s graph is acyclic, we solve the more general problem of linear non-Gaussian (LiNG) SEM discovery. In the large sample limit, LiNG discovery algorithms output the set of distribution-equivalent SEMs that represent the population distribution. We
also give sufficient conditions under which only one of the distribution-equivalent output SEMs is “stable”, and apply a LiNG discovery algorithm to simulated data. Data-Mining Dynamical Systems: Automated Symbolic System Identification for Exploratory Analysis, Michael Douglas Schmidt and Hod LipsonPoster Location: 4
Abstract This paper describes a new algorithm for automatically reverse-engineering symbolic analytical models of dynamical systems directly from experimental observations, for the purpose of modeling, control and exploratory analysis. The new algorithm builds on genetic programming techniques used in symbolic regression to infer differential equations from time series data. We introduce the core algorithm for building coherent mathematical models efficiently and then describe its application to system identification. The method is demonstrated on a number of nonlinear mechanical and biological systems. Privacy-Preserving Belief Propagation and Sampling, Michael Kearns, Jinsong Tan and Jennifer WortmanPoster Location: 5
Abstract We provide provably privacy-preserving versions of belief propagation,
Gibbs sampling, and other local algorithms --- distributed multiparty
protocols in which each party or vertex learns only its final local
value, and absolutely nothing else. Emotional Robotics: Tug of War, David Grant Cooper, Dov Katz and Hava T. SiegelmannPoster Location: 6
Abstract Emotional communication skills are dominant in biological systems. Although
the rules that govern creating and broadcasting emotional cues are inherently
complex, their effectiveness makes them attractive for biological systems.
Emotional communication requires very low bandwidth and is generally easy to
interpret. Despite the advantages of emotional communication, little or no
research has explored which emotional cues are the most effective when used
by a robot. To study this question, we introduce an interactive environment
in which a person can learn the robot's emotional responses through
interaction. We then present a one player game in which a person attempts to
attract the robot's attention, make it move towards and stay close to the
person. We further develop this concept into a two player version, in which the
players engage in a "Tug of War" game, competing for the
robot's heart. We propose our system as a potential test bed for human-robot
interaction, both for engineers, and clinical psychologists.
A Rate-Distortion One-Class Model and its Applications to Clustering, Koby Crammer, Partha Pratim Talukdar and Fernando PereiraPoster Location: 7
Abstract We study the problem of one-class classification, in which we seek a rule to separate a coherent subset of instances similar to a few positive examples from a large pool of instances. We find that the problem can be formulated naturally in terms of a rate-distortion tradeoff, which can be analyzed precisely and leads to an efficient algorithm that competes well with two previous one-class methods. We also show that our model can be extended naturally to clustering problems in which it is important to remove background clutter to improve cluster purity. Supervised Dimensionality Reduction with Group Constraints, M. Arthur Munson and Rich CaruanaPoster Location: 8
Abstract Analyzing a learned model to gain knowledge about the problem domain is hard when the model uses hundreds or thousands of features. Supervised dimensionality reduction methods can summarize the relevant information in the inputs and lead to models better suited to knowledge discovery. Many domains contain diverse features that should not be summarized together; however, existing reduction algorithms mix these features when finding low dimensional representations. We propose the problem of thematic supervised dimensionality reduction, a variant of supervised dimensionality reduction in which a) the original features are grouped, and b) each of the projected dimensions is constrained to depend on a single group of input features. We adapt existing dimensionality reduction methods to respect group constraints and empirically evaluate their performance on three real problems from medicine, ecology, and image recognition.
An On-Line, Non-Stationary IP Clustering Algorithm, Robert Beverly and Karen SollinsPoster Location: 9
Abstract We pose partitioning a k-bit Internet Protocol (IP) address space as a supervised learning task. Given incomplete and sparse training data, we require partitions that provide accurate predictions on new addresses. We develop an IP-specific clustering algorithm that provides a natural means to penalize complexity, limit memory consumption, and is amenable to on-line, non-stationary learning. To validate our approach, we model an agent's partial view of the
Internet using both synthetic and real data sets. On a latency prediction task, our algorithm outperforms previous approaches, provides O(k) running time, and is fast in practice. Using
synthetic changes injected into real data, we show the ability to accommodate structural and dynamic changes inherent in networks. Statistical Power Analysis for Improved Learning of Bayesian Network Structure, Andrew Fast, Michael Hay and David JensenPoster Location: 10
Abstract We present a novel hybrid algorithm for learning the structure of a Bayesian network from data. Our algorithm, Power, outperforms current state-of-the-art hybrid learning algorithms in both likelihood of the final model and runtime of the algorithm. We demonstrate that current hybrid algorithms produce many false negative errors due to low-power statistical tests. These errors can be attributed to a failure to consider the size of effect when deciding when to run a test. In order to avoid these errors, Power incorporates techniques from statistical power analysis to reason about the size of effect and statistical power of the hypothesis tests. In addition, Power avoids exponential increases in the runtime of constraint-based algorithms in certain datasets by limiting the size of the conditioning sets considered by the algorithm. Topic Models Conditioned on Arbitrary Features with Dirichlet-multinomial Regression, David Mimno and Andrew McCallumPoster Location: 11
Abstract Although fully generative models have been successfully used to model the contents of text documents, they are often awkward to apply to combinations of text data and document metadata. In this paper we propose a Dirichlet-multinomial regression (DMR) topic model that includes a log-linear prior on document-topic distributions that is a function of observed features of the document, such as author, publication venue, references, and dates. We show that by selecting appropriate features, DMR topic models can meet or exceed the performance of several previously published topic models designed for specific data.
Essential Phenomena of General Intelligence, Marc Pickett, Don Miner and Tim OatesPoster Location: 12
Abstract We present a set of cognitive phenomena that should be exhibited by a
generally intelligent system. To date, we know of few systems that
address more than a handful of these phenomena, and none that are able
to explain all of them. Thus, these phenomena should motivate a
system's design, test its generality, and can be used to point out
fundamental shortcomings. The phenomena encourage autonomous
learning, development of representations, and domain independence,
which we argue are critical for general intelligence.
Collective Inference on Markov Models for Modeling Bird Migration, Daniel Sheldon, M. A. Saleh Elmohamed and Dexter KozenPoster Location: 13
Abstract We investigate a family of inference problems on Markov models, where
many sample paths are drawn from a Markov chain and partial
information is revealed to an observer who attempts to reconstruct the
sample paths. We present algorithms and hardness results for several
variants of this problem which arise by revealing different
information to the observer and imposing different requirements for
the reconstruction of sample paths. Our algorithms are analogous to
the classical Viterbi algorithm for Hidden Markov Models, which finds
the single most probable sample path given a sequence of observations. Our
work is motivated by an important application in ecology: inferring
bird migration paths from a large database of observations. Useful Topological Features of Planning State Spaces, Blazej Bulka and Marie desJardinsPoster Location: 14
Abstract The topological characteristics of the state space graph for a planning problem are related to the internal structure of the underlying problem, and can affect the efficiency of planning. We show some important topological features for benchmark planning domains and state spaces, and discuss how one could determine or approximate these features without extensive exploration of the state space graph. We also show that the knowledge of some of these features can improve planning performance. Combining Collective Classification and Link Prediction, Galileo Mark Namata, Mustafa Bilgic and Lise GetoorPoster Location: 15
Abstract The problems of object classification (labeling the object features) and link prediction (predicting link existence) have been largely studied independently. Commonly, object classification is performed assuming a complete set of known links and link prediction is done assuming a fully observed set of node attributes. In most real world domains, however, attributes and links are often missing or incorrect. In this paper, we propose an approach that addresses these two problems by iteratively interleaving object classification and link prediction in a collective algorithm. We investigate empirically the conditions under which an integrated approach to object classification and link prediction improves performance, and find that performance improves over a wide range of network types, and algorithm settings. Two View Learning over Structured and Non-Identical Output Spaces, Kuzman Ganchev, Joao Graca, John Blitzer and Ben TaskarPoster Location: 16
Abstract For many interesting machine learning problems, we have ample
unlabeled data and limited labeled training data. For some of these
problems a class of semi-supervised learning methods has recently been
developed that rely on the assumption that multiple views of each
example should agree on its label. In this paper we present a new
algorithm for probabilistic two-view learning. Our algorithm works on
structured and unstructured problems and easily generalizes to partial
agreement scenarios. For the full agreement case, our algorithm
works by minimizing the Bhattacharyya distance between the models
of each view, and performs better than CoBoosting and two-view
Perceptron on several flat and structured classification problems. In
addition to demonstrating these improvements we demonstrate that our
framework can give significant improvements on two views with
differing output spaces. From Backbone to Bias: Assessing Probabilistic Inference for Heuristic Search, Eric I Hsu, Christian J Muise, J Christopher Beck and Sheila A McIlraithPoster Location: 17
Abstract Backbone variables have the same assignment
in all solutions to a given constraint satisfaction
problem; more generally, bias represents the proportion
of solutions that assign a variable a particular
value. Intuitively such constructs would
seem important to efficient search, but their study
to date has assumed a mostly conceptual perspective,
in terms of indicating problem hardness
or motivating and interpreting heuristics. In
this work, we first measure the ability of both
existing and novel probabilistic message-passing
techniques to directly estimate bias (and identify
backbones) for the specific problem of Boolean
Satisfiability (SAT). We confirm that methods
like Belief Propagation and Survey Propagation–
plus Expectation Maximization-based variants–
do produce good estimates with distinctive properties.
We demonstrate the use of bias estimation
within a modern SAT solver, and exhibit a
correlation between accurate, stable, estimates
and successful backtracking search. The same
process also yields a family of search heuristics
that can dramatically improve search efficiency
for the hard random problems considered in this
study. Computing Most Probable Worlds for Action Probabilistic Logic Programs, Samir Khuller, Vanina Martinez, Dana Nau, Gerardo Ignacio Simari, Amy Sliva and VS SubrahmanianPoster Location: 18
Abstract Probabilistic logic programs have primarily studied the problem of
entailment of probabilistic atoms. However, there are some
interesting applications where we are interested in finding a
possible world that is most probable. Our first result shows that
the problem of computing such "maximally probable worlds" (MPW) is
intractable. We subsequently show that we can often greatly reduce
the size of the linear program used in past work (by Ng and
Subrahmanian) and yet solve the problem exactly. However, the
intractability results still make computational efficiency quite impossible.
We therefore also develop several heuristics to solve the
MPW problem and report extensive experimental results on the
accuracy and efficiency of such heuristics. Utilizing Emotions in Strategic Real-Time Artificial Intelligence, Megan Marie Olsen, Kyle Harrington and Hava SiegelmannPoster Location: 19
Abstract Strategic real-time systems are of high potential and their applications are growing, although they are mostly prevalent in video games, military training, and military planning. We propose a paradigm to advance current systems by introducing emotions into the simulated agents that make decisions and solve situations cooperatively. By utilizing emotional reactions and communication, we hope to advance these systems so that the decision process better mimics human behavior. Since our system allows sharing of emotions with nearby agents it utilizes both internal and external emotional control. Learning Predictive Models Of Weather Dynamics, Sooho Park and Nicholas RoyPoster Location: 20
Abstract We address the problem of identifying the most informative path for a mobile
sensor in a highly dynamic weather system. In choosing plans for the sensor
based on predicted weather conditions, the computational demands of
numerical weather simulation allow the information gain to be
evaluated at very few sensing locations, even using greedy planning strategies.
We describe the use of Kernel Regularized Least-Squares to learn
an approximate model of computationally-intractable weather
dynamics. We use the learned model to predict the information gained by
different sensor trajectories through the weather system. We show in
physically-motivated weather simulations that the learned model allows us to
evaluate considerably more trajectories than conventional prediction algorithms, leading to
improved sensor trajectory selection and improved overall weather prediction
accuracy. New Insights into Performance of A* Search, Hang Trung DinhPoster Location: 21
Abstract We study the behavior of the classical A* search algorithm when coupled with a heuristic that provides estimates, accurate to within a small multiplicative factor, of the optimal cost path to a solution. We prove general upper bounds on the time complexity of A* search on trees, for both admissible and unconstrained heuristic functions, that depend on both the accuracy of the heuristic and the distribution of solution objective values. We go on to provide nearly matching lower bounds that are attained even by non-adversarially chosen solution sets induced by a simple stochastic model. Finally, we provide an upper bound for A* search on graphs, depending on graph eigenvalues. Our rigorous results indicate that in many cases and in average case, the effective branching factor of A* search is significantly reduced if given an approximately accurate heuristic. The main conclusion stating ``the effect of a heuristic function is to reduce the effective depth of search rather than the effective branching factor'' in (Korf & Reid, 1998; Korf et al., 2001; Korf, 2000a; Korf, 2000b) is wrong in general, and it was based on serious mathematical flaws. Detecting hand-ball events in video sequences, Nicholas Miller and Richard MannPoster Location: 22
Abstract We analyze video sequences of a hand interacting with a ball. The hand motion is segmented using a piecewise polynomial motion model inspired by research in motor control. Next, an adaptive gravitational model of the ball is used to locate hand-ball events. We show that hand-ball events can be automatically classified from velocity and acceleration profiles.
SNePS Agent for QPR Suicide Prevention Method, Ali Patrice SeyedPoster Location: 23
Abstract 11th leading cause of death in the United States in 2004, accounting for 32,439 deaths and an estimated 8 to 25
attempted suicides occur per every suicide
death. The goal of this
project is to create a SNePS (Semantic Net-
work Processing System) KRR system to
model the cognitive processing crucial to as-
sist an individual contemplating suicide. Visualizing and Clustering Multivariate Time Series Data to Detect Medical Conditions, Patricia Ordonez, Marie desJardins, Jim Fackler, Carolyn Feltes and Christoph LehmannPoster Location: 24
Abstract Efficient unsupervised algorithms for the detection
of patterns in time series data, often
called motifs, have been used in many applications,
such as identifying words in different
languages, detecting anomalies in ECG readings,
and finding similarities between images.
We present an approach named Multivariate
Time Series Amalgamation (MTSA) that
creates a personalized multivariate time series
representation, a Multivariate Amalgam
(MVA) of physiological data and laboratory
results that physicians can trust and visually
interpret. We then apply a technique that
has demonstrated success with interpretation
of univariate data, named Symbolic Aggregate
Approximation (SAX), to visualize patterns
in the MVAs that may differentiate between
medical conditions such renal and respiratory
failure. Learning Dynamics with Recurrent Neural Nets, James A LenfesteyPoster Location: 25
Abstract We are exploring the problem of how a robot learn to use its own body by first
modeling the dynamical relationship between motor control signals and sensory
phenomena. Our approach has been to train recurrent neural networks via a
hybrid evolutionary algorithm and gradient descent. Prelimary results suggest
the ability to represent the time evolution of arbitrary dynamical systems.
A New Kernel for SVM MLLR based Speaker Recognition, Zahi N Karam and William M CampbellPoster Location: 26
Abstract Speaker recognition using support vector machines
(SVMs) with features derived from
generative models has been shown to perform
well. Typically, a universal background
model (UBM) is adapted to each utterance
yielding a set of features that are used in an
SVM.We consider the case where the UBM is
a Gaussian mixture model (GMM), and maximum
likelihood linear regression (MLLR)
adaptation is used to adapt the means of the
UBM. We examine two possible SVM feature
expansions that arise in this context:
the first, a GMM supervector is constructed
by stacking the means of the adapted GMM,
and the second consists of the elements of the
MLLR transform. We examine several kernels
associated with these expansions. We
show that both expansions are equivalent
given an appropriate choice of kernels. Experiments
performed on the NIST SRE 2006
corpus clearly highlight that our choice of
kernels, which are motivated by distance metrics
between GMMs, outperform ad-hoc ones.
We also show that by using multi-class MLLR
adaptation we can approach state of the art
performance. Forest Reranking: Discriminative Parsing with Non-Local Features, Liang HuangPoster Location: 27
Abstract Conventional n-best reranking techniques often suffer from the limited scope of the
n-best list, which rules out many potentially good alternatives. We instead propose forest reranking, a method that reranks a packed forest of exponentially many parses. Although exact inference is intractable with non-local features, we present an approximation algorithm that makes discriminative training practical over the whole Treebank. Our final result, an F-score of 91.7, outperforms both 50-best and 100-best reranking baselines, and is better than any previously reported systems trained on the Treebank.
Groves of Trees, Daria Sorokina, Rich Caruana and Mirek Riedewald
Poster Location: 28
Abstract
We present Groves of trees --- a new method for building an additive ensemble of trees. Groves are based on additive models enhanced by techniques of gradient boosting and bagging and combined with a new algorithm for training additive models. Like bagging (but unlike boosting), the ensemble can use large trees and does not suffer from overfitting. Like gradient boosting, it can be adapted to optimize an arbitrary loss function. Like backfitted additive models, it makes use of an additive structure of the response. As a result, it significantly outperforms other ensembles on regression problems, and on classification problems is consistently one of the top performing methods.
An important task where Groves are useful is discovering additive structure and interactions. Discovering additive structure is an important step towards understanding a complex multi-dimensional function because it allows the function to be expressed as the sum of lower-dimensional components. When variables interact, however, their effects are not additive and must be modeled and interpreted simultaneously. We present a new approach for the problem of interaction detection. Our method is based on comparing the performance of unrestricted and restricted additive Groves. We demonstrate the results of interaction detection on real data describing the abundance of different species of birds in the prairies east of the southern Rocky Mountains.
Posters - Saturday, 6pm
Location: Upson Hall, 5th Floor
Learning from Collective Behavior, Michael Kearns and Jennifer WortmanPoster Location: 1
Abstract Inspired by longstanding lines of research in sociology and related fields, and by more recent large-population human subject experiments on the Internet and the Web, we initiate a study of the computational issues in learning to model collective behavior from observed data. We define formal models for efficient learning in such settings, and provide both general theory and specific learning algorithms for these models. Dynamic Visual Category Learning, Tom YehPoster Location: 2
Abstract Dynamic visual category learning calls for efficient adaptation as
new training images become available or new categories are
defined, existing training images or categories become modified or
obsolete, or when categories are divided into subcategories or
merged together. We develop novel methods for efficient
incremental learning of SVM-based visual category classifiers to
handle such dynamic tasks. Our method exploits previous classifier
estimates to more efficiently learn the optimal parameters for the
current set of training images and categories. We show
empirically that for dynamic visual category tasks, our
incremental learning methods are significantly faster than batch
retraining. Learning When to Take Advice: A Statistical Test for Achieving A Correlated Equilibrium, Greg Hines and Kate LarsonPoster Location: 3
Abstract In this paper we study a multiagent learning problem where agents can either learn via repeated interactions, or can follow the advice of a mediator who suggests possible actions to take. We present an algorithm that each agent can use so that, with high probability, they can verify whether or not the mediator's advice is useful. In particular, if the mediator's advice is useful then agents will reach a correlated equilibrium, but if the mediator's advice was not useful, then agents are not harmed by using our test, and can fall back to their original learning algorithm. We then generalize our algorithm and show that in the limit it always correctly verifies the mediator's advice. Ranking in a Multiple Instance Setting, Charles Bergeron, Jed Zaretzki, Curt Breneman and Kristin P. BennettPoster Location: 4
Abstract This paper introduces a novel machine learning model called
multiple instance ranking (MIRank) that enables ranking to be
performed in a multiple instance learning setting. The motivation
for MIRank stems from the hydrogen abstraction problem in
computational chemistry, that of predicting the grouping of
hydrogen atoms from which a hydrogen is abstracted (removed)
during metabolism. The model predicts the preferred hydrogen
grouping within a molecule by ranking the groups, with the
ambiguity of not knowing which hydrogen within the preferred
grouping is actually abstracted. This paper formulates MIRank in
its general context and proposes an algorithm for solving MIRank
problems using successive linear programming. The method
outperforms multiple instance classification models on several
real and synthetic datasets. Conformant Planning through QBF, Alexandra Goultiaeva and Fahiem BacchusPoster Location: 5
Abstract Conformant planning involves finding a plan that is guaranteed to achieve the goal in the face of incomplete information. Unlike classical planning, conformant planning cannot be directly encoded into SAT. It can, however, be directly encoded in the more expressive language of Quantified Boolean Formulas (QBF). The resulting QBF can then be solved in different ways including through the application of generic QBF solvers. In this paper we investigate how recent advances in QBF solving, specifically the use of circuit-based representations, can be exploited to solve conformant planning problems more effectively. We develop some new ways of encoding these problems so as to better exploit the circuit-based representation, and show empirically that this can lead to significant performance improvements. An Empirical Evaluation of Supervised Learning in High Dimensions, Ainur Yessenalina, Nikos Karampatziakis and Rich CaruanaPoster Location: 6
Abstract In this paper we perform an empirical evaluation
of supervised learning methods on highdimensional
data. We evaluate learning performance
on three metrics: accuracy, AUC,
and squared loss. We also study the effect
of increasing dimensionality on the relative
performance of the learning algorithms. Our
findings are consistent with previous studies
for problems of relatively low dimension,
but suggest that as dimensionality increases
the relative performance of the various learning
algorithms changes. Structured Learning with Approximate Inference, Alex Kulesza and Fernando PereiraPoster Location: 7
Abstract In many structured prediction problems, the highest-scoring labeling
is hard to compute exactly, leading to the use of approximate
inference methods. However, when inference is used in a learning
algorithm, a good approximation of the score may not be sufficient. We
show in particular that learning can fail even with an approximate
inference method with rigorous approximation guarantees. There are
two reasons for this. First, approximate methods can effectively
reduce the expressivity of an underlying model by making it impossible
to choose parameters that reliably give good predictions. Second,
approximations can respond to parameter changes in such a way that
standard learning algorithms are misled. In contrast, we give two
positive results in the form of learning bounds for the use of
LP-relaxed inference in structured perceptron and empirical risk
minimization settings. We argue that without understanding
combinations of inference and learning, such as these, that are
appropriately compatible, learning performance under approximate
inference cannot be guaranteed. Convex Clustering with Exemplar-based Models, Danial LashkariPoster Location: 8
Abstract Clustering is often formulated as the maximum
likelihood estimation of a mixture model that explains the data. The
EM algorithm widely used to solve the resulting optimization problem
is inherently a gradient-descent method and is sensitive to
initialization. The resulting solution is a local optimum in the
neighborhood of the initial guess. This sensitivity to
initialization presents a significant challenge in clustering large
data sets into many clusters. In this paper, we present a different
approach to approximate mixture fitting for clustering. We introduce
an exemplar-based likelihood function that approximates the exact
likelihood. This formulation leads to a convex minimization problem
and an efficient algorithm with guaranteed convergence to the
globally optimal solution. The resulting clustering can be thought
of as a probabilistic mapping of the data points to the set of
exemplars that minimizes the average distance and the
information-theoretic cost of mapping. We present experimental
results illustrating the performance of our algorithm and its
comparison with the conventional approach to mixture model
clustering.
Copeland Voting: Ties Matter, Piotr Faliszewski, Edith Hemaspaandra and Henning SchnoorPoster Location: 9
Abstract We study the complexity of manipulation for a family of election systems derived from Copeland voting via introducing a parameter alpha that describes how ties in head-to-head contests are valued. We show that the thus obtained problem of manipulation for unweighted Copeland^alpha elections is NP-complete even if the size of the manipulating coalition is limited to two. Our result holds for all rational values of alpha such that 0 < alpha < 1 except for alpha = 1/2 . Better Alignments = Better Translations?, Joao Graca, Kuzman Ganchev and Ben TaskarPoster Location: 10
Abstract Automatic word alignment is a key step in training statistical machine
translation systems. Despite much recent work on word alignment
methods, alignment accuracy increases often produce little or no
improvements in machine translation quality. In this work we analyze a
recently proposed agreement-constrained EM algorithm for unsupervised
alignment models. We attempt to tease apart the effects that this
simple but effective modification has on alignment precision and
recall trade-offs, and how rare and common words are affected across
several language pairs. We propose and extensively evaluate a simple
method for using alignment models to produce alignments better-suited
for phrase-based MT systems, and show significant gains (as measured
by BLEU score) in end-to-end translation systems for six languages
pairs used in recent MT competitions. Training Structural SVMs when Exact Inference is Intractable, Thomas W Finley and Thorsten JoachimsPoster Location: 11
Abstract While discriminative training (e.g., CRF, structural SVM) holds much promise for machine translation, image segmentation, and clustering, the complex inference these applications require make exact training intractable. This leads to a need for approximate training methods. Unfortunately, knowledge about how to perform efficient and effective approximate training is limited. Focusing on structural SVMs, we provide and explore algorithms for two different classes of approximate training algorithms, which we call undergenerating (e.g., greedy) and overgenerating (e.g., relaxations) algorithms. We provide a theoretical and empirical analysis of both types of approximate trained structural SVMs, focusing on fully connected pairwise Markov random fields. We find that models trained with overgenerating methods have theoretic advantages over undergenerating methods, are empirically robust relative to their undergenerating brethren, and relaxed trained models favor non-fractional predictions from relaxed predictors. Deep Belief Net Learning for Online Terrain Classification on a Mobile Robot, Raia Hadsell, Ayse Erkan, Pierre Sermanet, Marco Scoffier, Urs Muller and Yann LeCunPoster Location: 12
Abstract We present an online classifier that is able to accurately classify
complex terrain at distances up to the horizon, thus allowing
high-level strategic planning for a mobile robot. A deep belief
network is trained to extract informative and meaningful features from
an input image, and the features are used to train a realtime
classifier to predict traversability. Supervision for the training
comes from a short-range stereo module, which robustly labels the
image using 5 classes. The process was developed and tested on the
LAGR mobile robot. Reading the Markets: Forecasting Public Opinion of Political Candidates by News Analysis, Kevin Lerman, Ari Gilder, Mark Dredze and Fernando PereiraPoster Location: 13
Abstract Media reporting of current events shapes public opinion which can in turn influence events. This is particularly true of political elections, in which candidates both respond to and shape public perception of their campaigns. We use computational linguistics to automatically predict changes in public perception of candidates during the 2004 US Presidential election season. Our system uses daily newspaper articles to predict shifts in public opinion as reflected in prediction markets. We discuss various types of features designed for this problem. The news system improves market prediction over baseline market systems. Learning from Labeled Features using Generalized Expectation Criteria, Gregory Druck, Gideon Mann and Andrew McCallumPoster Location: 14
Abstract It is difficult to apply machine learning to new domains because there are often no existing labeled problem instances. In this paper, we provide a solution to this problem that leverages domain knowledge in the form of correlations between input features and classes. Specifically, we propose a method for training discriminative probabilistic models with ``labeled features" and unlabeled instances. Unlike previous approaches that use labeled features to create labeled instances, we use labeled features directly to constrain the model's predictions on unlabeled instances. We express these soft constraints using generalized expectation (GE) criteria --- terms in a parameter estimation objective function that assign scores to values of a model expectation. The complete objective function also includes a Gaussian prior on parameters, which encourages generalization by spreading parameter weight to unlabeled features. Experimental results on text classification data sets show that this method outperforms heuristic approaches to training classifiers with labeled features, and experiments with human annotators show that it is more beneficial to spend limited annotation time labeling features rather than labeling instances. A Discriminative Semi-Markov Model for Robust Scene Text Recognition, Jerod J Weinman, Erik Learned-Miller and Allen R HansonPoster Location: 15
Abstract We present a system for recognizing scene text that is aided by the use of a lexicon, yet also allows non-lexicon words. Our discriminative semi-Markov probabilistic model fully integrates both character and word segmentation with recognition. The system uses wavelet image features for recognition and requires only approximate location of the text baseline and font size, but no binarization or prior word segmentation is necessary. To facilitate inference with a large lexicon, we compare several methods for approximate Viterbi beam search. Our system performs robustly on low-resolution images of signs containing text in fonts atypical of documents. Using Inadmissible Heuristics for Bounded Suboptimal Searc, Jordan Thayer, Wheeler Ruml and Ephrat BittonPoster Location: 16
Abstract Applications often demand we tackle problems that are too large to
solve optimally. We introduce two general approaches to heuristic
search with inadmissible heuristics and show how to guarantee a bound
on the suboptimality of the resulting solutions while ignoring many
duplicate states. We also propose a new inadmissible heuristic based
on on-line temporal difference learning. We show empirically that
these new approaches to suboptimal heuristic search can outperform
weighted A* on standard grid-world path-finding and temporal planning
benchmarks when finding near-optimal solutions. Using Functions on a Model Graph for Inductive Transfer, Eric Eaton, Marie desJardins and Terran LanePoster Location: 17
Abstract In this paper, we propose a novel graph-based method for knowledge transfer. We embed a set of learned background models in a graph that captures the transferability between the models. We then learn a function on this graph that automatically determines the parameters to transfer to each learning task. Transfer to a new problem proceeds by mapping the problem into the graph, then using the function to determine the parameters to transfer in learning the new model. This method is analogous to inductive transfer along a manifold that captures the transfer relationships between the tasks. Minimax regret based elicitation of generalized additive utilities, Darius Braziunas and Craig BoutilierPoster Location: 18
Abstract We describe the semantic foundations for elicitation of generalized additively independent (GAI) utilities using the minimax regret criterion, and propose several new query types and strategies for this purpose. Computational feasibility is obtained by exploiting the local GAI structure in the model. Our results provide a practical approach for implementing preference-based constrained configuration optimization as well as effective search in multiattribute product databases. Classifying and Clustering Dialects of North American English, Keelan EvaniniPoster Location: 19
Abstract This paper presents the results of experiments in which machine learning techniques were applied to the problem of determining regional dialect boundaries. Specifically, decision trees classification and k-means clustering were applied to a corpus of phonetic measurements taken from a large survey of North American English vowels. Pairwise classification and clustering experiments were done for all combinations of ten dialect regions determined by dialectologists. The results show which of these dialect regions are most distinct and similar, suggesting which of the distinctions that are usually used by linguists are the most meaningful. Furthermore, the classification trees are analyzed to show which vowel formants are most informative for each dialect region. Learning to Create Data-Integrating Queries, Partha Pratim Talukdar, Marie Jacob, Mohammad Salman Mehmood, Koby Crammer, Zack Ives and Fernando PereiraPoster Location: 20
Abstract In today's world of online information, the complexity of data
resources available for querying --- databases, data warehouses,
virtual integrated schemas --- continues to grow at an astounding
pace. Perhaps no area has seen this problem as acutely as the life
sciences, where hundreds of large, complex, interlinked, and
overlapping data resources have become available on fields like
proteomics, genomics, diseases, and chemical compounds.
With-in any of these data-bases, schemas are often massive,
but additionally, many biologists and pharmaceutical researchers
need to pose integrative queries across multiple sources,
exploiting external references, similar terms, and other means of
interlinking data. In this paper we develop and implement a model
in which the user poses keyword queries that are matched
against source relations and their attributes; the system uses
sequences of associations (foreign keys, URLs, references,
partial schema mappings, synonyms, taxonomies, etc. to create
multiple ranked (weighted) queries linking the matches to keywords;
the user is given sets of answers from the union of potential
queries, and the user provides feedback from which the system
ultimately learns to weight sources and associations according to
the user's specific information need. We evaluate the effectiveness
of our methods against ``gold standard'' weightings from expert
biologists and demonstrate their scalability. VOILA: Efficient Feature-value Acquisition for Classification, Mustafa Bilgic and Lise GetoorPoster Location: 21
Abstract We address the problem of efficient feature-value acquisition for classification in domains in which there are varying costs associated with both feature acquisition and misclassification. The objective is to minimize the sum of the information acquisition cost and misclassification cost. Any decision theoretic strategy tackling this problem needs to compute value of information for sets of features. Having calculated this information, different acquisition strategies are possible (acquiring one feature at time, acquiring features in sets, etc.). However, because the value of information calculation for arbitrary subsets of features is computationally intractable, most traditional approaches have been greedy, computing values of features one at a time. We make the problem of value of information calculation tractable in practice by introducing a novel data structure called the Value of Information Lattice (VOILA). VOILA exploits dependencies between missing features and makes sharing of information value computations between different feature subsets possible. To the best of our knowledge, performance differences between greedy acquisition, acquiring features in sets, and a mixed strategy have not been investigated empirically in the past, due to inherent intractability of the problem. With the help of VOILA, we are able to evaluate these strategies on five real world datasets under various cost assumptions. We show that VOILA reduces computation time dramatically. We also show that the mixed strategy outperforms both greedy acquisition and acquisition in sets. A consolidated actor-critic model for temporal difference learning with function approximation and application to POMDPs, Christopher Allen Niedzwiedz, Itamar Elhanany, Zhenzhen Liu and Scott LivingstonPoster Location: 22
Abstract Practical problems in artificial intelligence of-
ten involve both large state and/or action spaces.
In high-dimensional cases, function approxima-
tion methods, such as neural networks, are often
used to overcome limitations of traditional tab-
ular schemes. In the context of reinforcement
learning, the actor-critic architecture has received
much attention in recent years, in which an ac-
tor network maps states to actions and a seper-
ate critic network produces value function ap-
proximation given a state-action pair. This pa-
per presents a novel approach for consolidating
the actor and critic networks into a single net-
work that provides the functionality offered by
the two separate networks and a glance at current
work being undertaken with respect to Partially
Observable Markov Decision Processes.
Protein Function Prediction from Topological and Relational Features, Samuel Huang, Galileo Namata, Carl Kingsford and Lise GetoorPoster Location: 23
Abstract Protein-protein interaction (PPI) networks provide rich structural data about the set of proteins in a given organism. With high-throughput techniques making more and more PPI networks widely available, there is increasing interest in computationally studying these noisy networks to extract biological knowledge and avoid more costly wetlab experiments. In this paper, we present a novel manner of using of these networks for protein function prediction. Using topological and relational features of proteins within these networks, we apply machine learning techniques to predict which set of functions a given protein has.
Epipolar Geometry for Humanoid Robotic Heads, Justin Wildrick Hart, Brian Scassellati and Steven ZuckerPoster Location: 24
Abstract Epipolar Geometry describes the projective relationship between two camera views in terms of the intrinsic parameters of each camera and their position and orientation with respect to one another. It is generally expressed as either the fundamental matrix or the essential matrix, which can be computed using corresponding image points in both views. In an active vision system, one or more cameras are mounted either on a movable platform or on a device that is capable of manipulating its environment. In the former case, if more than one camera is used, changes in their position or orientation with respect to each other will result in changes in their epipolar geometry. On humanoid robots, such changes occur whenever the vision system performs a saccade or verges upon an object. In this paper, we present a method that improves on existing algorithms for camera calibration and the estimation of epipolar geometry by estimating the kinematics of the active vision system and the mounting of the cameras with respect to one another. This knowledge allows us to update our knowledge of the cameras\' positions as the robot moves, allowing for the essential matrix to be updated in constant time. Preserving the Privacy of Sensitive Relationships in Graph Data, Elena Zheleva and Lise GetoorPoster Location: 25
Abstract In this paper, we focus on the problem of preserving the privacy of sensitive
relationships in graph data. We refer to the problem of inferring sensitive
relationships from anonymized graph data as link re-identification.
We propose five different graph-anonymization strategies, which vary in terms of
the amount of data removed (and hence their utility) and the amount of privacy preserved.
We assume the adversary has an accurate predictive model for links, and we show
experimentally the success of different link re-identification strategies under varying structural characteristics of the data. Parsing images of people with an MDP, Alexander Sorokin and David ForsythPoster Location: 26
Abstract We consider the task of finding people and their configuration in a
single frame. We phrase this task as a decision process resulting in
an MDP. We discuss what policy iteration and value iteration mean in
our framework. We finally mention our annotation efforts.
Large-Scale acquisition of labeled data for computer vision, Alexander Sorokin and David ForsythPoster Location: 27
Abstract We discuss a method of obtatining large collection of annotated data
at minimal cost for specific research project. Our approach has the
benefits of: low cost of entry, low overall cost, task-oriented data
annotation, fast turn-around, almost infinite scalability.
|