Poster Sessions
There will be two poster sessions, one Friday evening and one Saturday
evening. The poster locations indicate where the poster will be presented,
please make sure to hang your poster in the right location. Attendees will get
a map at registration.
Posters - Friday, 6pm
Location: Duffield Atrium
Ascending Auctions for Markets with Structured Externalities, Michael Pavlin and Craig BoutilierPoster Location: 1
Abstract While ascending auctions have found considerable popularity applied to combinatorial allocation problems, recent proposals for ascending auctions which account for externalities have placed significant demands on bidding agents by requiring that they accurately value and communicate bids on the full outcome space (including the allocations awarded to competitors). This requirement is unreasonable in most practical settings. We identify a large class of problems, including wireless networks with interference, where externalities are characterized by aggregate resource usage. We propose a more general allocation model in which such problems can be naturally encoded. We adapt recent ascending auctions for combinatorial allocations to this model and show that, given myopic bidders, our mechanism optimizes social welfare. We illustrate its performance in a multi-hop wireless network domain with interference. The Impact of Network Topology on Pure Nash Equilibria in Graphical Games, Bistra Dilkina, Carla Gomes and Ashish SabharwalPoster Location: 2
Abstract Graphical games capture some of the key aspects relevant to the study and
design of multi-agent systems. It is often of interest to find the conditions
under which a game is stable, i.e., the players have reached a consensus on
their actions. In this paper, we characterize how different topologies of the
interaction network underlying a graphical game with random payoffs affect the
probability of existence of a pure Nash equilibrium. We show that for tree topologies with unbounded diameter the probability of a pure Nash equilibrium vanishes as the number of players grows large. On the positive side, we define several families of graphs for which the probability of a pure Nash equilibrium is at least 1-1/e even as the number of players goes to infinity. We also show
how adding a small number of connections can increase the probability of pure
Nash.
Structured Local Training and Biased Potential Funtions for Conditional Random Fields with Application to Coreference Resolution, Yejin Choi and Claire CardiePoster Location: 3
Abstract Conditional Random Fields (CRFs) have shown
great success for problems involving structured output
variables. However, for many real-world NLP
applications, exact maximum-likelihood training is
intractable because computing the global normalization
factor even approximately can be extremely
hard. In addition, optimizing likelihood often does
not correlate with maximizing task-specific evaluation
measures. In this paper, we present a novel
training procedure, structured local training, that
maximizes likelihood while exploiting the benefits
of global inference during training: hidden variables
are used to capture interactions between local
inference and global inference. Furthermore,
we introduce biased potential functions that empirically
drive CRFs towards performance improvements
w.r.t. the preferred evaluation measure for
the learning task. We report promising experimental
results on two coreference data sets using two
task-specific evaluation measures. Analysis of Patterns in Web Traffic Using an Improved Gibbs Sampling Algorithm, Rui Yan, Igor Jurisica, Julie Waterhouse and Eric PoulinPoster Location: 4
Abstract In this paper, we propose a new, approximate Gibbs sampling algorithm and apply it to traffic data generated by online shoppers. Experimental results show that there are 23.6% of the patterns from our algorithm that are overlooked by the Gibbs sampling algorithm. Agent Influence as a Predictor of Difficulty for Decentralized Problem-Solving, Martin W. Allen and Shlomo ZilbersteinPoster Location: 5
Abstract We study the effect of problem structure on the performance of dynamic programming for decentralized partially observable Markov decision processes. It is shown that restricting agent influence over problem dynamics can make problems easier to solve. Experiments establish that agent influence correlates with problem difficulty: as the gap between agent influences grows, problems tend to become easier. The measure thus provides a general-purpose, automatic characterization of decentralized problems, identifying those for which optimal methods are more or less likely to work. Such a measure is also of possible use as a heuristic in the design of algorithms that create task decompositions and control hierarchies in order to simplify multiagent problems.
U-Rest: An Unsupervised Record Extraction SysTem, Yuan Kui Shen and David R KargerPoster Location: 6
Abstract In this paper, we describe a system that can
extract record structures from web pages with
no direct human supervision. Records are
commonly occurring HTML-embedded data
tuples that describe people, offered courses,
products, company profiles etc. We present a
simplified framework for studying the problem
of unsupervised record extraction – one
which separates the algorithms from the feature
engineering. Our system, U-REST formalizes
an approach to the problem of unsupervised
record extraction using a simple a
two-stage machine learning framework. The
first stage involves clustering, where structurally
similar regions are discovered, and
the second stage involves classification, where
discovered groupings (clusters of regions) are
ranked by their likelihood of being records.
In our work, we describe, and summarize the
results of an extensive survey of features for
both stages. We conclude by comparing UREST
to related systems. The results of our
empirical evaluation show encouraging improvements
in extraction accuracy. A Situation-Calculus Semantics for an Expressive Fragment of PDDL, Yuxiao Hu, Jens Classen and Gerhard LakemeyerPoster Location: 7
Abstract The Planning Domain Definition Language (PDDL) has become a common language to specify planning problems, facilitating the formulation of benchmarks and a direct comparison of planners. Over the years PDDL has been extended beyond STRIPS and ADL in various directions, for example, by adding time and concurrent actions. The current semantics of PDDL is purely meta-theoretic and quite complex, which makes an analysis difficult. Moreover, relating the language to other action formalisms is also nontrivial. We propose an alternative semantics for an expressive fragment of PDDL within the situation calculus. This yields at least two advantages. For one, the new semantics is purely declarative, making it amenable to an analysis in terms of logical entailments. For another, it facilitates the comparison with and mapping to other formalisms that are defined on top of the same logic, such as the agent control language Golog. In particular we obtain the semantical foundation for embedding efficient PDDL-based planners into the more expressive, yet computationally expensive Golog, thus combining the benefits of both. Other by-products of our investigations are a simpler account of durative actions in the situation calculus and a new notion of compulsory actions. Group-Constrained Feature Extraction, Art Munson and Rich CaruanaPoster Location: 8
Abstract Analyzing a learned model to gain knowledge about the problem domain
is hard when the model uses hundreds or thousands of features.
Feature selection can reduce the number of features used in the model,
but often the best performing models still need tens to hundreds of
features. Feature extraction (i.e. synthesis) can further reduce the
dimensionality of the feature space, but current methods ignore the
expert's preference for how features are logically related. Although
the resulting new set of features is small, the features are often
difficult to understand: each new feature may be a function of dozens
of seemingly unrelated features from the original space. To ease
model analysis, we propose new feature extraction methods that are
constrained by human-generated groups of the original features. The
extracted features are easier to understand because each one
summarizes only one logical group of attributes. We adapt existing
feature extraction methods to obey group constraints and empirically
measure how well they preserve predictive performance while reducing
dimensionality. Finally, we show how extracted features can be
visually understood in relationship to the original features.
A Support Vector Method for Optimizing Average Precision, Yisong Yue, Thomas Finley, Filip Radlinski and Thorsten JoachimsPoster Location: 9
Abstract Mean average precision (MAP) is among the most
common error metrics used for evaluating ranked
retrieval systems. However, learning ranking functions
that optimize for MAP has proven difficult,
since most machine learning algorithms are designed
to optimize for accuracy. In this paper we
present an SVM learning algorithm which directly
optimizes for MAP. We formulate the optimization
problem concisely, present our algorithm, and
prove it to efficiently find the optimal solution for
our formulation. We evaluate our approach using
the TREC 9 and TREC 10 Web Track corpus
(WT10g), comparing it to regular SVMs optimized
for accuracy and ROC-area, as well as to
the methods used by TREC participants. In most
cases we show our method to produce statistically
significant improvements in MAP scores. A Multi-Agent System that Attains Longevity Via Death, Megan M Olsen and Hava T SiegelmannPoster Location: 10
Abstract We propose a novel approach to self-regenerating systems which require continuous operation, such as security surveillance. For that aim we introduce HADES, a self-regenerating cooperative multi-agent system simulation with local monitoring. When agents of HADES find local failures they repair them. However, in extreme cases repair may not be possible and irregular aggressive agents will multiply. These irregular agents may use all of the system's resources and thus take over the system. To optimize system longevity, we identify protocols for killing these irregular agents. Our primary contribution is a double communication protocol of alert and death signals among the agents, making the multi-agent system robust to failures and attacks. Improving User Taught Task Models, Phillip Michalak and James AllenPoster Location: 11
Abstract Task models are essential components in many approaches to user modelling because they provide the context with which to interpret,predict, and respond to user behavior. The quality of such models is critical to their ability to support these functions. Typically these models are painstakingly constructed by hand for each domain. This paper describes work on improving task models that are automatically acquired from demonstration. The improved task models are better suited to modelling and recognizing user behavior than the baseline models. Modifications to a standard planning algorithm which achieve this result are described and applied to an example learned task model, showing the utility of incorporating plan-based reasoning into task learning systems. Task Selection for Multi-Task Bayesian Network Structure Learning, Alexandru Niculescu-Mizil and Rich CaruanaPoster Location: 12
Abstract We tackle the task selection problem for multi-task Bayes Net
structure learning. We present a method for automatically selecting,
from a pool of available tasks, a set of related tasks to be used by
the multi-task structure learning algorithm. The proposed task
selection method takes advantage of the fact that, unlike in other
multi-task learning scenarios, in the multi-task structure learning
setting there is a clear definition of task relatedness: two tasks are
related if they have similar structures. Experiments on a synthetic
problem generated from the ALARM network and on a real bird ecology
problem show that multi-task structure learning using the proposed
task selection method outperforms it's non selective counterpart. Synthesis of Human Motion and Video Texture with a Binary Latent Variable Model, Graham W. Taylor, Geoffrey E. Hinton and Sam RoweisPoster Location: 13
Abstract We propose a non-linear generative model for human motion data that uses an undirected model with binary latent variables and real-valued "visible" variables that represent joint angles. The latent and visible variables at each time step receive directed connections from
the visible variables at the last few time-steps. Such an architecture makes on-line inference efficient and allows us to use a simple approximate learning procedure. After training, the model finds a single set of parameters that simultaneously capture several different kinds of motion. We demonstrate the power of our approach by synthesizing various motion sequences and by performing on-line filling in of data lost during motion capture. We briefly demonstrate that this model can be successfully applied to the synthesis of video texture. Graph clustering with network structure indices, Matthew J RattiganPoster Location: 14
Abstract Graph clustering has become ubiquitous in the study of relational data sets. We examine two simple algorithms: A graphical adaptation of the k-means algorithm and the Girvan-Newman method based on edge betweenness centrality. We show that they can be effective at discovering the latent groups or communities that are defined by the link structure of a graph. However, both approaches rely on prohibitively expensive computations given the size of modern relational data sets. Network structure indices (NSIs) are a proven technique for indexing network structure and efficiently finding short paths. By incorporating NSIs into these graph clustering algorithms, we can overcome these complexity limitations. We also present promising quantitative and qualitative evaluations of the modified algorithms on synthetic and real data sets.
Cutting Plane Algorithms for Variational Inference, David A Sontag and Tommi JaakkolaPoster Location: 15
Abstract We propose a cutting-plane style algorithm for finding the maximum a posteriori (MAP) state and approximately inferring marginal probabilities in discrete Markov Random Fields (MRFs). The variational formulation of both problems consists of an optimization over the marginal polytope, with the latter having an additional non-linear entropy term in the objective. While there has been significant progress towards approximating the entropy term, the marginal polytope is generally approximated by the local consistency constraints, which give only a loose outer bound. Our algorithm efficiently finds linear constraints that are violated by points outside of the marginal polytope, making use of the cut polytope, which has been studied extensively in the context of MAX-CUT. We demonstrate empirically that our algorithm finds the MAP solution for a larger class of MRFs than before. We also show that tighter yet efficient relaxations of the marginal polytope result in more accurate pseudomarginals. Transductive Structured Classification through Constrained Min-Cuts, Kuzman Ganchev and Fernando PereiraPoster Location: 16
Abstract We extend the Blum and Chawla (2001) graph min-cut algorithm to structured problems. This extension can alternatively be viewed as a joint inference method over a set of training and test instances where parts of the instances interact through a pre-specified associative network. The method has has an efficient approximation through a linear-programming relaxation. On small training data sets, the method achieves up to 34.8% relative error reduction.
Conformant Planning Through SAT, Alexandra Goultiaeva and Fahiem BacchusPoster Location: 17
Abstract This paper describes an approach to solving conformant planning problems using satisfiability. It is based on the idea that if a conformant planning problem is transformed into a QBF in CNF form without introducing any new variables, then universal reduction can be used to turn it into a SAT problem. But such transformation to CNF could take exponential time. This paper explores a way to
utilize universal reduction without explicitly converting to CNF, using logical transformation, decomposition and multiple SAT calls. The Role of Kinematic Conditioning in Dexterous Machines, Stephen Hart and Roderic GrupenPoster Location: 18
Abstract This paper examines how classical conditioning measures can be used in a non-traditional way: as plans to solve behavioral problems. Such an approach provides a novel means of responding to input errors and for managing sensor geometries. First, we examine how the classical measures of kinematic conditioning introduced in Yoshikawa provide a domain-general approach for posturing a robot in unknown environments. The formalism Yoshikawa used to derive a measure of manipulability (MoM) is reexamined in the context of stereo reconstruction, resulting in a measure of localizability (MoL). We argue that, in light of unknown interactions with the environment, a robot should condition itself isotropically for imparting velocities and forces. Next, we consider other conditioning metrics that keep the system within its range of motion and away from self-collisions. Furthermore, we show how to sequence conditioning controllers in a concurrent control framework to impart intrinsic conditioning goals. To illustrate the value of framing robotics problems in the form of conditioning goals, we demonstrate the critical role they can play in an industrially relevent inspection task using a coarsely calibrated system. Sparse Message Passing Algorithms for Weighted Maximum Satisfiability, Aron Culotta, Andrew McCallum, Ashish Sabharwal and Bart SelmanPoster Location: 19
Abstract Weighted maximum satisfiability is a well-studied problem that has
important applicability to artificial intelligence (for instance, MAP
inference in Bayesian networks). General-purpose stochastic search
algorithms have proven to be accurate and efficient for large problem
instances; however, these algorithms largely ignore structural
properties of the input. For example, many problems are highly
clustered, in that they contain a collection of loosely coupled
subproblems (e.g. pipelines of NLP tasks). In this paper, we
propose a message passing algorithm to solve weighted maximum
satisfiability problems that exhibit this clustering property. Our
algorithm fuses local solutions to each subproblem into a global
solution by iteratively passing summary information between clusters
and recomputing local solutions. Because the size of these messages
can become unwieldy for large problems, we explore several message
compression techniques to transmit the most valuable information as
compactly as possible. We empirically compare our algorithm against a
state-of-the-art stochastic solver and show that for certain classes
of problems our message passing algorithm finds significantly better
solutions. Improving Partitioning-Based Author Coreference by Incorporating Web Pages as Graph Nodes, Pallika H Kanani and Andrew K McCallumPoster Location: 20
Abstract Entity resolution in the research paper domain is an important, but difficult problem. It suffers from insufficient contextual information, hence using information from the web significantly improves performance. We formulate the author coreference problem as one of graph partitioning with discriminatively-trained edge weights. In previous work, the information obtained from the web is incorporated as additional features. We incorporate web information as additional nodes in the graph. Gathering information from the web is expensive, hence in this paper we provide new insights into strategies for achieving most performance gain using only a fraction of the queries. We compare and discuss two criteria for query selection and propose alternative queries for gathering relevant web information. Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns, Lisa Friedland and David JensenPoster Location: 21
Abstract We present a family of algorithms to uncover tribes—groups of individuals who share unusual sequences of affiliations. While much work inferring community structure describes large-scale trends, we instead search for small groups of tightly linked individuals who behave anomalously with respect to those trends. We apply the algorithms to a large temporal and relational data set consisting of millions of employment records from the National Association of Securities Dealers. The resulting tribes contain individuals at higher risk for fraud, are homogenous with respect to risk scores, and are geographically mobile, all at significant levels compared to random or to other sets of individuals who share affiliations. A Framework for Mixed-Initiative Clustering, Yifen Huang and Tom MitchellPoster Location: 22
Abstract Mixed-initiative clustering is a machine learning task that
integrates a machine's clustering capability and a user's guidance
in order to obtain the user's desired result. This task is different
from traditional autonomous clustering tasks by introducing user
criterion, a user's understanding of data and purpose of sorting. We
propose a framework for mixed-initiative clustering to solve
problems such as how to model a user's criterion and how to utilize
the criterion to improve the clustering performance. Our approach
consists of representing properties of the clustering model to a
user and letting the user gives feedback on these properties. We
also demonstrate the feasibility of this framework using an
application called "activity extraction from personal workstation
contents." Our work includes building structured activity
descriptions to represent a machine's speculation and a new
clustering algorithm, the SpeClustering model, that enables extended
user feedback. Furthermore, we identified the SpeClustering model as
an instantiation of mixed-initiative clustering. Winner-Take-All EM Clustering, Vasileios Kandylas, Lyle H Ungar and Dean P FosterPoster Location: 23
Abstract The EM algorithm is often used with mixture models to cluster data,
but for efficiency reasons it is sometimes desirable to produce hard
clusters. Several hard clustering limits of EM are known. For example,
$k$-means clustering can be derived from EM in a Gaussian mixture
model by taking the limit of all variances going to zero. We present a
new method of deriving Winner-Take-All versions of EM that can be used
for mixtures, such as heteroscedastic Gaussians, where it is not
possible to take that limit. The resulting clusters can have
non-convex boundaries, allowing for some of the clusters to reside
``inside'' others, producing dense foreground clusters embedded in a
more diffuse background. Experiments show that using unequal variances
can give better clusters on real data sets in terms of external
quality measures.
Introducing Directed Components Analysis: A Principal Components Analysis That Incorporates Domain Knowledge, Charles Bergeron, Kristin Bennett and George PlopperPoster Location: 24
Abstract This paper introduces Directed Components Analysis, a modeling
approach similar to Principal Components Analysis in which one or
more components are chosen from domain knowledge rather than the
maximum variability criterion. A proteomics application
illustrates how selecting a directed component that is of
biological interest results in a simple, interpretable model for
the differentiation of mesenchymal stem cells into osteoblasts
that yields fresh insights into this process. The main
contribution of this paper is the demonstration that machine
learning practitioners can make significant inferences despite the
availability of a limited number of datapoints. Semi-Automated Named Entity Annotation, Kuzman Ganchev, Fernando Pereira, Mark Mandel, Steven Carroll and Peter WhitePoster Location: 25
Abstract We investigate a way to partially automate corpus annotation for named entity recognition, by requiring only binary decisions from an annotator. Our approach is based on a linear sequence model trained using a k-best MIRA learning algorithm. We ask an annotator to decide whether each mention produced by a high recall tagger is a true mention or a false positive. We conclude that our approach can reduce by up to 58% the effort of extending a seed training corpus.
RLMario: A New Reinforcement Learning Domain, John Asmuth and Chris MansleyPoster Location: 26
Abstract Reinforcement learning is about designing agents that successfully
maximize their rewards in environments. In this project, we propose a
new domain of environments based on the popular Super Mario Brothers
games. This domain is a good match for existing algorithms by virtue
of being discrete and fully observable and would introduce new
challenges by way of having a relatively large state space and
possibly being non-stationary. This proposal outlines the environment
and a new algorithm designed to efficiently explore and build a stochastic model of this environment. A Bilevel Optimization Approach to Model Selection for SVMs, Gautam Kunapuli, Kristin Bennett, Jing Hu and Jong-Shi PangPoster Location: 27
Abstract SVMs and related classification models require
the solution of convex optimization
problems that have one or more regularization
hyper-parameters. Typically, the hyperparameters
are selected to minimize crossvalidated
estimates of the out-of-sample classification
error of the model. This crossvalidation
optimization problem can be formulated
as a bilevel program in which the
outer-level objective minimizes the average
number of misclassified points across the
cross-validation folds, subject to inner-level
constraints such that the classification functions
for each fold are (exactly or nearly) optimal
for the selected hyper-parameters. Feature
selection is included in the bilevel program
in the form of bound constraints in the
weights. The resulting bilevel problem is converted
to a mathematical program with linear
equilibrium constraints, which is solved using
state-of-the-art optimization methods. This
approach is significantly more versatile than
commonly used grid search procedures, enabling,
in particular, the use of models with
many hyper-parameters. Identifying Expressions of Opinion in Context, Eric John Breck, Yejin Choi and Claire CardiePoster Location: 28
Abstract While traditional information extraction systems have been built to
answer questions about facts, subjective information extraction
systems will answer questions about feelings and opinions. A crucial
step towards this goal is identifying the words and phrases that
express opinions in text.
Indeed, although much previous work has relied on the identification of opinion
expressions for a variety of sentiment-based NLP tasks, none has
focused directly on this important supporting task.
Moreover, none of the proposed
methods for identification of opinion expressions has been evaluated at the
task that they were designed to perform.
We present an approach for identifying opinion expressions that uses conditional
random fields and we evaluate the approach at the expression-level using a
standard sentiment corpus. Our approach achieves expression-level performance
within 5% of the human interannotator agreement.
Posters - Saturday, 6pm
Location: Upson Hall, 5th Floor
Combining Multiple Heuristics Online, Matthew Streeter, Daniel Golovin and Stephen F. SmithPoster Location: 1
Abstract We present black-box techniques for learning how to interleave the execution of multiple heuristics in order to improve average-case performance. In our model, a user is given a set of heuristics whose only observable behavior is their running time. Each heuristic can compute a solution to any problem instance, but its running time varies across instances. The user solves each instance by interleaving runs of the heuristics according to a task-switching schedule. We present (i) exact and approximation algorithms for computing an optimal task-switching schedule offline, (ii) sample complexity bounds for learning a task-switching schedule from training data, and (iii) a no-regret strategy for selecting task-switching schedules online. We demonstrate the power of our results using data from recent solver competitions. We outline how to extend our results to the case in which the heuristics are randomized, and the user may periodically restart each heuristic with a fresh random seed.
Complexity of Bribery in Elections, Piotr Faliszewski, Lane A. Hemaspaandra and Edith HemaspaandraPoster Location: 2
Abstract We study the complexity of influencing elections through bribery:
How computationally complex is it for an external actor to determine
whether by a certain amount of bribing voters a specified candidate can
be made the election's winner? For example, we provide a complete
classification of the complexity of bribery for the broad class of elections
(including plurality, Borda, k-approval, and veto) known as scoring protocols.
Matrix Tile Analysis, Inmar Givoni, Vincent Cheung and Brendan J. FreyPoster Location: 3
Abstract Many tasks require finding groups of elements in a matrix of numbers, symbols or class likelihoods. One approach is to use efficient bi- or tri-linear
factorization techniques including PCA, ICA, sparse matrix factorization and
plaid analysis. These techniques are not appropriate when addition and
multiplication of matrix elements are not sensibly defined. More directly,
methods like bi-clustering can be used to classify matrix elements, but these
methods make the overly-restrictive assumption that the class of each element is a function of a row class and a column class. We introduce a general computational problem, `matrix tile analysis' (MTA), which consists of
decomposing a matrix into a set of non-overlapping tiles, each of which is
defined by a subset of usually nonadjacent rows and columns. MTA does not
require an algebra for combining tiles, but must search over an exponential
number of discrete combinations of tile assignments. We describe a loopy BP(sum-product) algorithm and an ICM algorithm for performing MTA. We compare the effectiveness of these methods to PCA and the plaid method on hundreds of randomly generated tasks. Using double-gene-knockout data, we show that MTA finds groups of interacting yeast genes that have biologically-related functions. Active Exploration for Learning Rankings from Clickthrough Data, Filip Radlinski and Thorsten JoachimsPoster Location: 4
Abstract We address the task of learning rankings of documents from search engine
logs of user behavior. Previous work on this problem has relied on
passively collected clickthrough data. In contrast, we show that an
active exploration strategy can provide data that leads to much faster
learning. Specifically, we develop a Bayesian approach for selecting
rankings to present users so that interations result in more informative
training data. Our results using the TREC-10 Web corpus, as well as
synthetic data, demonstrate that a directed exploration strategy quickly
leads to users being presented improved rankings in an online learning
setting. We find that active exploration substantially outperforms
passive observation and random exploration. Online Policy Improvement in Large POMDPs via an Error Minimization Search, Stephane Ross, Brahim Chaib-draa and Joelle PineauPoster Location: 5
Abstract Partially Observable Markov Decision Processes (POMDPs) provide a rich mathematical framework for planning under uncertainty. However, most real world systems are modelled by huge POMDPs that cannot be solved due to their high complexity. To palliate to this difficulty, we propose combining existing offline approaches with an online search process, called AEMS, that can improve locally an approximate policy computed offline, by reducing its error and providing better performance guarantees. We propose different heuristics to guide this search process, and provide theoretical guarantees on the convergence to epsilon-optimal solutions. Our experimental results show that our approach can provide better solution quality within a smaller overall time than state-of-the-art algorithms and allow for interesting online/offline computation tradeoff. Comparisons of Sequence Labeling Algorithms and Extensions, Yunsong Guo and Nam NguyenPoster Location: 6
Abstract In this paper, we survey a variety of current state-of-art models for structured learning problems, such as Hidden Markov
Model (HMM), Conditional Random Fields (CRF), Averaged Perceptron (AP), Support Vector Machine for structured output
($SVM^{struct}$), Max Margin Markov Networks (M$^3$N), and an integration of search and learning algorithm (SEARN). With all
due tuning efforts of various parameters of each model, on the data sets we have applied the models to, we found that
SVM$^{struct}$ enjoys better performance compared with the others. We also propose a new Structured Learning Ensemble (SLE)
method to combine these structured learning models. Empirical results show that our SLE algorithm is able to provide more
accurate solutions compared with the best results of the individual models. Detecting Statistical Interactions with Groves of Trees, Daria Sorokina, Rich Caruana, Mirek Riedewald and Daniel FinkPoster Location: 7
Abstract Discovery of additive structure is an important step towards understanding a complex multi-dimensional function, because it allows for expressing this function as the sum of lower-dimensional components. When variables interact, their effects cannot be decomposed into independent lower-dimensional contributions and hence must be modeled simultaneously. Variable interactions make learning harder and complicate the analysis of the relationships between predictors and response. A method that reliably determined which variables do and do not interact would make model interpretation much simpler. We propose a new approach for the problem of interaction detection based on comparing performance of different regression models. Our method is based on a new machine learning algorithm, a grove of trees, which combines additive models with regression trees in a way that allows variable interactions to be carefully controlled. By comparing the performance of restricted and unrestricted groves of trees, the existence and degree of variable interactions in the response function can be reliably detected and estimated.
Email Keyword Summarization and Visualization with Topic Models, Mark Dredze and Hanna M WallachPoster Location: 8
Abstract One of the most sought after tools for email triage is summarization. We
present an approach for email summarization focusing on
keyword summarization instead of full sentence descriptions.
We generate salient keywords using topic models,
which can draw from the entire mailbox instead of just a single email.
Our second contribution is to develop a new method for visualizing a
keyword summarization using tag clouds, which naturally expose to the
user information contained in a topic model. We discuss our approach and
present preliminary examples from our system. Biographies, Bollywood, Boom-boxes and Blenders: Domain Adaptation for Sentiment Classification, John Blitzer, Mark Dredze and Fernando PereiraPoster Location: 9
Abstract Automatic sentiment classification has been extensively studied and
applied in recent years. However, sentiment is expressed differently
in different domains, and annotating corpora for every possible domain
of interest is impractical. We investigate domain adaptation for
sentiment classifiers, focusing on online reviews for different types
of products. First, we extend to sentiment classification the
recently-proposed structural correspondence learning (SCL) algorithm,
reducing the relative error due to adaptation between domains by an
average of 30% over the original SCL algorithm and 46% over a
supervised baseline. Second, we identify a measure of domain
similarity that correlates well with the potential for adaptation of a
classifier from one domain to another. This measure could for instance
be used to select a small set of domains to annotate whose trained
classifiers would transfer well to many other domains. A Reinforcement Learning Agent to Control Seizures in a Computational Model of Epilepsy, Robert D Vincent and Joelle PineauPoster Location: 10
Abstract Our research explores the design of a new generation of implantable electrical stimulation devices for the treatment of epilepsy. These devices would implement an adaptive stimulation strategy by incorporating feedback from a sensor to monitor the patient's status. This adaptive approach could improve both the efficiency and the efficacy of the device. We describe a reinforcement learning agent that could serve as the control algorithm in such a device. We also present a biologically plausible model of an epileptic neural network using integrate-and-fire neurons with partially stochastic inputs and multiple time scales of firing-dependent inhibitory behavior. We show that this network can exhibit behavior ranging from gentle random firing to the violent hypersynchronous activation which is characteristic of epilepsy. We train the reinforcement learning agent to control the dynamics of this network and reduce the occurrence of seizure-like events.
Orthogonal clustering using predictability, Abhishek Gupta, Dean Foster and Lyle UngarPoster Location: 11
Abstract Faceted classification is a flexible way of organizing information
based on mutiple independent labelings.
Faceting is usually contrasted with clustering. Both have
similar objectives of organizing data, but while clustering does this in an
automated way, faceting requires considerable manual labeling. In
this paper we aim to solve the problem of generating these facets
automatically. We propose a modification of the CPCM algorithm, the
Ortho-CPCM to solve this orthogonal clustering problem. The resulting
clusters inherit properties of CPCM, namely invariance to linear
transformations and tend to eliminate noise
features by driving their weights to zero. Recommendations using Absorbing Random Walks, Ajit Singh, Asela Gunawardana, Chris Meek and Arun SurendranPoster Location: 12
Abstract Collaborative filtering attempts to find items of interest for a user by utilizing the preferences of other users. In this paper we describe an approach to filtering that explicitly uses social relationships, such as friendship, to find items of interest to a user. Modeling user-item relations as a bipartite graph we augment it with user-user (social) links and propose an absorbing random walk that induces a set of stationary distributions, one per user, over all items. These distributions can be interpreted as personalized rankings for each user. We exploit sparsity of both user-item and user-user relationships to improve the efficiency of our algorithm. R-CAST-MED: Applying Intelligent Agents to Support Emergency Medical Decision Making Teams, Shizhuo Zhu, Joanna Abraham, Sharoda A. Paul, Madhu Reddy, John Yen, Mark Pfaff and Christopher DeFlitchPoster Location: 13
Abstract Decision-making is a crucial aspect of emergency response during mass casualty incidents (MCIs). MCIs require rapid decisions to be taken by geographically-dispersed teams in an environment characterized by insufficient information, ineffective collaboration and inadequate resources. Despite the increasing adoption of decision support systems in healthcare, there is limited evidence of their value in large-scale disasters. We conducted focus groups with emergency medical services and emergency department personnel who revealed that one of the main challenges in emergency response during MCIs is information management. Therefore, to alleviate the issues arising from ineffective information management, we propose R-CAST-MED, an intelligent agent architecture built on Recognition-Primed Decision-making (RPD) and Shared Mental Models (SMMs). A simulation of R-CAST-MED showed that this tool enabled efficient information management by identifying relevant information, inferring missing information and sharing information with other agents, which led to effective collaboration and coordination of tasks across teams. Online Learning for Offroad Robots: Using Spatial Label Propagation to Learn Long-Range Traversability, Raia Hadsell, Pierre Sermanet, Jan Ben, Ayse Naz Erkan, Jeff Han, Beat Flepp, Urs Muller and Yann LeCunPoster Location: 14
Abstract We present a solution to the problem of long-range obstacle/path
recognition in autonomous robots. The system uses sparse
traversability information from a stereo module to train a classifier
online. The trained classifier can then predict the traversability of
the entire scene. A distance-normalized image pyramid makes it
possible to efficiently train on each frame seen by the robot, using
large windows that contain contextual information as well as shape,
color, and texture. Traversability labels are initially obtained for
each target using a stereo module, then propagated to other views of
the same target using temporal and spatial concurrences, thus training
the classifier to be view-invariant. A ring buffer simulates
short-term memory and ensures that the discriminative learning is
balanced and consistent. This long-range obstacle detection system
sees obstacles and paths at 30-40 meters, far beyond the maximum
stereo range of 12 meters, and adapts very quickly to new
environments. Experiments were run on the LAGR robot platform.
Expertise Modeling for Matching Papers with Reviewers, David Mimno and Andrew McCallumPoster Location: 15
Abstract An essential part of an expert-finding task, such as matching reviewers to submitted papers, is the ability to model the expertise of a person based on documents. We evaluate several measures of the association between an author in an existing collection of research papers and a previously unseen document. We compare two language model based approaches with a novel topic model, Author-Persona-Topic (APT). In this model, each author can write under one or more "personas", which are represented as independent distributions over hidden topics. Examples of previous papers written by prospective reviewers are gathered from the Rexa database, which extracts and disambiguates author mentions from documents gathered from the web. We evaluate the models using a reviewer matching task based on human relevance judgments determining how well the expertise of proposed reviewers matches a submission. We find that the APT topic model outperforms the other models. Boosting-based Learning Agents for Experience Classification, Po-Chun Chen, Xiaocong Fan, Shizhuo Zhu and John YenPoster Location: 16
Abstract The capability of learning from experience is of critical importance in developing multi-agent systems supporting dynamic group decision making. In this paper, we introduce a hierarchical learning approach, aiming to support hierarchical group decision making where the decision makers at lower levels only have partial view of the whole picture. To further understand such a hierarchical learning concept, we implemented a learning component within the R-CAST agent architecture, with lower-level learners using the LogitBoost algorithm with decision stumps. The boosting-based learning agents were then used in our experiments to classify experience instances. The results indicate that hierarchical learning can significantly improve decision accuracy when lower-level decision makers only have limited information accessibility. Improved Memory-Bounded Dynamic Programming for Decentralized POMDPs, Sven Seuken and Shlomo ZilbersteinPoster Location: 17
Abstract Scalability of algorithms for solving decentralized POMDPs presents a major
research challenge because of its double exponential worst-case complexity.
Initial algorithms have only been able to solve very small problems. One
exception is the Memory-Bounded Dynamic Programming (MBDP) algorithm--an
approximation technique that has proved extremely efficient in handling large
problem horizons. In this paper we make several key contributions. We
generalize the MBDP algorithm and improve its scalability by reducing the
complexity with respect to the size of the observation space from exponential
to polynomial. We derive error bounds on solution quality with respect to this
new approximation and analyze the convergence behavior. To evaluate the
effectiveness of the improvements we introduce a new, larger benchmark problem.
Preliminary experimental results show that despite the high complexity of
decentralized POMDPs, scalable solution techniques such as MBDP perform
surprisingly well. Finally, we discuss a challenging open research question and
conclude with promising future directions of research. Solution Backjumping for #SAT, Jessica Davies, Eric Hsu and Horst SamulowitzPoster Location: 18
Abstract Model counting (#SAT) can be used to represent and solve many
different kinds of real-world problems (e.g., Bayes
Nets). Consequently, any further improvement in our ability to solve
#SAT has awaiting applications. State-of-the-art #SAT solvers
utilize modified versions of DPLL, the classical backtracking search
algorithm for solving the satisfiability problem (SAT). Inspired by the sort of solution learning exploited by the concept of "cubes" in QBF solvers, we enhance regular DPLL for #SAT with a backjumping scheme. This algorithmically reduces the amount of the solution space that must be explored. Learning Annotated Hierarchies from Relational Data, Daniel M. Roy, Charles Kemp, Vikash Mansinghka and Joshua TenenbaumPoster Location: 19
Abstract The objects in many real-world domains can be organized into hierarchies, where
each internal node picks out a category of objects. Given a collection of fea-
tures and relations defined over a set of objects, an annotated hierarchy includes a
specification of the categories that are most useful for describing each individual
feature and relation. We define a generative model for annotated hierarchies and
the features and relations that they describe, and develop a Markov chain Monte
Carlo scheme for learning annotated hierarchies. We show that our model discov-
ers interpretable structure in several real-world data sets. Using Expectation Maximization to Find Likely Assignments for Solving Constraint Satisfaction Problems, Eric I Hsu, Matthew Kitching, Fahiem Bacchus and Sheila McIlraithPoster Location: 20
Abstract We present a new probabilistic framework for finding likely variable
assignments in difficult constraint satisfaction problems. Finding
such assignments is key to efficient search, but practical efforts
have largely been limited to random guessing and heuristically
designed weighting systems. In contrast, we derive a new version of
Belief Propagation (BP) using the method of Expectation Maximization
(EM). This allows us to differentiate between variables that are
strongly biased toward particular values and those that are largely
extraneous. Using EM also eliminates the threat of nonconvergence
associated with regular BP. Theoretically, the derivation exhibits
appealing primal/dual semantics. Empirically, it produces an
"EMBP"-based heuristic that outperforms existing techniques for
guiding variable and value ordering during backtracking search. Aggregation in MDPs: Approximate Linear Programming and Policy Iteration, Marek PetrikPoster Location: 21
Abstract This paper investigates a connection between
approximate linear programming and approximate
policy iteration. We show that
when aggregation is considered, the differences
between them are only superficial. This
relationship may be used in the future to relate
aggregation methods for linear programs
to those for MDPs. A Heuristic Search Approach to Planning with Temporally Extended Preferences, Jorge A Baier, Fahiem Bacchus and Sheila A McIlraithPoster Location: 22
Abstract In this paper we propose a suite of techniques for planning with temporally
extended preferences (TEPs). To this end, we propose a method for
compiling TEP planning problems into simpler domains containing
only final-state (simple) preferences and metric functions. With
this simplified problem in hand, we propose a variety of heuristic
functions for planning with final-state preferences, together
with an incremental best-first planning algorithm. A key feature of the
planning algorithm is its ability to prune the search space.
We identify conditions under which our planning algorithm will
generate optimal plans. We implemented our algorithm as an
extension to the TLPlan planning system and report on extensive
testing performed to evaluate the effectiveness of our heuristics
and algorithm. Our planner, HPlan-P, competed in the 5th
International Planning Competition, achieving distinguished
performance in the qualitative preferences track.
Learning Visual Features for Outdoor Localization, Xuming He, Volodymyr Mnih, Yani Ioannou and Richard ZemelPoster Location: 23
Abstract We consider the problem of localization in an outdoor environment
using only images from a single consumer-grade camera. Our aim is
to learn a representation of an environment after a small number of
excursions through it, to enable localization from images on a new
excursion. This task is challenging due to: the high dimensionality
of the inputs; variability in images taken at the same location; and
the noisy ground truth data. We propose a discriminative
probabilistic framework in which a set of stable and distinctive
visual features are incrementally learned to form a prediction of
camera pose from the raw images. After learning these features, we
show how they can be used as an observation model in a Bayesian
filtering scheme for online localization. We apply our framework to
a real-world dataset and compare its performance to other
image-based localization methods.
Every Object has a Story: The Process of Creating PersonalSoundtrack, Greg ElliottPoster Location: 24
Abstract
Every object has a story. Objects of design represent a complex history of
iterations, design choices, motivations, assumptions, reworkings, impasses,
etc. In Reflective Design (Sengers et al., 2005), we are encouraged to fuse
reflection into design allowing it to shape, motivate, and inform our work.
The stories behind reflectively-designed objects have an even richer texture.
Tragically, the story is seldom visible by inspection of an object alone, and
often is lost as the process of creation fades from the designer's memory.
Critical Design tells a story by embedding alternate perspectives in the actual
design [(Sengers et al., 2005)(Dunne, 2000)(Dunne & Raby, 2001)]; however,
because this story is abstractly packaged into the design itself, it is likely
legible only to its creators. Ludic Design promotes reflection through usage,
where a story is told through interaction with an object (Gaver et al., 2004).
This story, however, is not the story of creation but a story of interaction.
Neither Critical Design nor Ludic Design provide users or designers with a
background story. Objects are presented as is, perhaps accompanied by a user
study or a few implications for design. I propose that the simple act of
telling an object's story via autobiography greatly enhances the practice of
Reflective Design. Process reveals assumptions, philosophical commitments, the
significant role of intuition, and other nuances that may be hidden from the
creator, other designers and users. Additionally, documentation of process may
help prevent a misunderstanding or misappropriating of its Critical Design
elements (Sengers et al., 2005). I'll begin with a bit of meta-reflection on
writing this paper in the next section. The third section moves into the
origins of the device, while the remainder of the paper is broken into three
main philosophical motivations which serve to ground and structure the
disorganization inherent in documenting process.
|