Seminar Days on Model Selection

Joint seminar with Alexander Schliep of the MPI molekulare Genetik Berlin/Rutgers, Raymond Hemmecke of the Otto-von-Guericke Universität Magdeburg/TU Darmstadt, and Nihat Ay and Thomas Kahle of the MPI Mathematik in den Naturwissenschaften.

Schedule, summer semester 2009

Date  Location  Topic  Speaker
16.9. ÚTIA, Prague, CZ Announcement with practical hints and registration information Organized by Milan Studený and František Matúš
10:30 A survey of some issues that arise in statistical model selection Joe Whittaker, Lancaster University, UK
Statistical model selection has for many years been of interest to the academic and scientific profession. Here we discuss some of the current issues and, in particular, make a comparison to what was available thirty years ago.
11:30 coffee break
11:45 Polytopes of Bayesian network structures Raymond Hemmecke, TU Darmstadt & OvGU Magdeburg
In this talk we consider polytopes associated with Bayesian network structures. The study of these polytopes is fundamental for improving existing algorithms for learning such Bayesian network structures. By restricting our attention to special Bayesian networks, e.g. those that have an undirected tree or forest as the essential graph, we obtain very nice structural results on these associated polytopes. In particular, they allow us to give a mathematical proof that the GES algorithm (greedy equivalence search) is guaranteed to find an optimal Bayesian network structure of this special type. We give an example showing that this need not hold in the general case.
[This is joint work with Silvia Lindner, Milan Studený and Jirka Vomlel.]
12:45 lunch break
14:00 Maximizing the multi-information of stochastic processes Nihat Ay, MPI Leipzig
I will discuss various approaches to the geometry of the set of stochastic matrices motivated by information geometry. Based on this, I introduce a notion of multi-information for stochastic processes and study the corresponding maximization problem. The main result sets bounds on the entropy of a Markov process with maximal multi-information. Finally, I comment on some open problems related to the Kolmogorov-Sinai entropy of dynamical systems.
15:00 coffee break
15:15 Limiting in closures of exponential families František Matúš, ASCR
The closures of exponential families of probability measures in the variation distance and reversed information divergence have been used to better understand the optimization of likelihood and entropies. Existing results on the closures will be reviewed, including a quantum setting if time permits. For the standard exponential families on Euclidean spaces with finite supports, limiting in the mean parametrization towards the boundary can be described in considerable detail. This provides control of the behavior of information divergences and variance functions around boundaries. The resulting expansions of variance functions at boundary points are used to prove that an exponential family has a finite support and quadratic variance function if and only if it is the product of multinomial families up to an affine transformation.
 
 
6.4. TU Darmstadt 11:00 Model Selection for Coevolution in Virology, Drug Resistance Development, and Biophysics Kay Hamacher (TU Darmstadt)
Evolution reveals itself on the level of single amino acid changes in viral and other proteins. Particular positions in a protein are under varying selective pressure and show different dynamics. We are interested in models and measures for the correlated evolutionary dynamics, that is, the coevolution of amino acids and nucleotides in biomolecules. Having such models at hand would then allow us to leverage our biophysical models and structural arguments to annotate selective advantages in the molecular phenotype.
12:15 Moment matrices and real root finding Philipp Rostalski (ETH Zurich)
The problem of determining whether there exists a measure whose first moments agree with a given multi-sequence of real numbers is known as the truncated moment problem. The main tool for analyzing this question is the so-called moment matrix, a positive semi-definite matrix with quasi-Hankel structure whose entries consist of the given moments. We will analyze the algebraic structure of this matrix and show how it can be turned into an algorithm for computing all real roots, or even all roots in a given semi-algebraic subset.
14:30 Model Selection and Testing for Mixtures and Hidden Markov Models Hajo Holzmann (University of Marburg)
We investigate likelihood-based model selection and testing for mixtures and hidden Markov models (HMMs), in particular for choosing the number of components in the mixture or the number of states in the HMM. We discuss in which situations hypothesis testing or model selection based on information criteria is more suitable. Concerning model testing, we investigate testing for homogeneity in finite mixtures and testing for two states in an HMM. We also consider further non-standard testing situations such as testing for bimodality in two-component mixtures. The procedures are illustrated by several examples from economics and biology.
15:45 Selecting Dynamic Alternatives in Logical Signaling Networks Utz-Uwe Haus (University of Magdeburg)
Logical models of signaling events have proved to be a useful tool to model the signaling behavior of biological systems. We discuss how infeasibility of satisfiability problems underlying these models can be analyzed to gain insight into the dynamics of the a priori static models. A surprising computational feature of these non-monotonic Boolean systems is discussed. We show how this problem can be embedded into much more general questions about combinatorial aspects of discretized dynamical systems.

Schedule, winter semester 2008/09

Date  Location  Topic  Speaker
30.10. ZIB seminar room 10:30 Matroids and Conditional Independence Thomas Kahle, MPI Mathematik in den Naturwissenschaften, Leipzig
11:45 The implication problem for conditional independence Milan Studený, Academy of Sciences of the Czech Republic, Prague
14:15 Model selection by empirical risk penalization: outlook and some applications to machine learning Gilles Blanchard, Fraunhofer FIRST, Berlin
20.2. MPI Leipzig 10:00 On the Optimization of binary functions using probabilistic relaxation Luigi Malago, AIRLab, Politecnico di Milano
 
11:15 Markov Bases and Beyond Johannes Rauh, MPI Mathematik in den Naturwissenschaften, Leipzig
In many applications (e.g. hypothesis testing and disclosure limitation) one wants to investigate the set of probability measures or contingency tables satisfying a given set of linear constraints. The main examples of such constraints are fixed expectation values or marginal distributions and conditional distributions of subsystems. Markov bases can be used to explore these sets. However, Markov bases are not always applicable, for example when the linear constraints are not given by integral equations. Some ideas on how to generalize the Markov bases technique are presented.
14:00 Structure validation in clustering by stability analysis Joachim M. Buhmann, Department of Computer Science, ETH Zürich
Partitioning of data sets into groups defines an important preprocessing step for compression, prototype extraction or outlier removal. Various criteria of connectedness or proximity have been proposed to group data according to structural similarity, but in general it is unclear which method or model to use. In the spirit of information theory we propose a decision process to determine the extractable information from data conditioned on a hypothesis class of structures. Maximizing the amount of information which can be reliably learned from data in the presence of noise selects appropriate models. Empirical evidence for this model selection concept is provided by cluster validation in bioinformatics and in computer security, i.e., the analysis of microarray data and multilabel clustering of Boolean data for role-based access control.
15:00 Mathematical models of cancer progression Niko Beerenwinkel, Department of Biosystems Science and Engineering, ETH Zürich
Cancer progression is an evolutionary process that is driven by mutation and selection in a population of tumor cells. We discuss mathematical models of cancer progression, starting from traditional multistage theory. Each stage is associated with the occurrence of genetic alterations and their fixation in the population. We describe the accumulation of mutations using conjunctive Bayesian networks, an exponential family of waiting time models in which the occurrence of mutations is constrained by a partial order. Two opposing limit cases arise if mutations either follow a linear order or occur independently. We derive analytical expressions for the waiting time until a specific number of mutations have accumulated and show how the waiting time relates to the dependency structure among mutations.

Schedule, summer semester 2008

Date  Location  Topic  Speaker
25.4. FU Berlin, A3 Room 005 10:00 Model selection in transcription factor binding site analysis Benjamin Georgi, MPI molekulare Genetik Berlin
11:00 An introduction to coalescent theory: the next hunting ground for algebraic statistics? David Bryant, Dept. Mathematics University of Auckland
Coffee break
13:00 Studený's analysis of graphical models on few variables Thomas Kahle, MPI Mathematik in den Naturwissenschaften, Leipzig
14:00 Algebraic Statistics for model selection: an outlook Alexander Schliep, MPI molekulare Genetik Berlin
8.5. MPI Leipzig 10:30 Construction of Toric Gröbner Bases Christian Haase, FU Berlin
11:45 Consistency of information criteria Mathias Drton, University of Chicago
14:00 Geometry and Biology Jürgen Jost, MPI Mathematics in the Sciences Leipzig
15:00 Discussion on Markov Bases / Inequalities Johannes Rauh, MPI Mathematics in the Sciences Leipzig
10.6. MPI Berlin, Lecture Hall, Ground floor 10:30 From Arabidopsis roots to bilinear equations Dustin Cartwright, UC Berkeley
11:45 Lattice point problems arising from model selection problems Raymond Hemmecke, Otto-von-Guericke University Magdeburg
Lunch
14:00 Statistical Analysis of Digital Gene Expression Hugues Richard, MPI molekulare Genetik Berlin
11.7. OvGU Magdeburg, Building 2, Room 311 10.30-11.30 Model discrimination using multiple steady state information Carsten Conradi, MPI Magdeburg
11.45-12.15 Static versus Dynamic Logic and Integer Programming Models for Signaling Networks Kathrin Niermann, OvGU Magdeburg
12.15-12.45 Reconstructing biological models from experimental data Markus Durzinsky, OvGU Magdeburg
Lunch
14.15-15.15 Denoising and Dimension Reduction in Kernel Feature Space Klaus-Robert Müller, TU Berlin
15.30-16.30 Open Problems in Algebraic Statistics Bernd Sturmfels, UC Berkeley, TU Berlin
23.7. TU Berlin, MA 649 11:00 Tropical Geometry in Applied Mathematics Michael Joswig, TU Darmstadt
Lunch
13:30 Markov Bases in Statistical Genetics Caroline Uhler, UC Berkeley
14:30 Solving Constraint Optimization Problems via Importance Sampling in the Grand Canonical Ensemble Karl-Heinz Zimmermann, Technische Universität Hamburg-Harburg
Coffee
16:00 Toric Dynamical Systems Anne Shiu, UC Berkeley