One such example is a rank-maximal stable matching, which is a stable matching such that the greatest number of men and women gain their first-choice partner, and subject to that, their second choice, and so on. A polynomial-time algorithm for computing such a matching was described by Gusfield and Irving in However, this algorithm relies upon exponential weights which, when used in practice, may result in overflows or memory issues.
In this talk I describe a new polynomial-time algorithm to compute a rank-maximal stable matching using a combinatorial approach, without the need to revert to exponential weights. Kyle's talk will cover parallel search and heuristics within a constraint programming toolkit Choco , and how my MSci project proposes a new and unexplored method of exploiting parallelism using randomness, restarts and no-good sharing between threads to achieve significant speedup times.
Speaker: Anisse Ismaili In this work, we consider a three sided student-project-resource matching-allocation problem, in which students have preferences on projects, and projects on students. While students are many-to-one matched to projects, indivisible resources are also many-to-one allocated to projects whose capacities are thus endogenously determined by the sum of resources allocated to them. Traditionally, this problem is divided into two separate problems, e. Although both problems are well-understood, unless the expectations used in the first problem are correct, we obtain a suboptimal outcome.
Thus, it is desirable to solve this problem as a whole without dividing it in two. Then, we show that no strategyproof mechanism satisfies fairness and very weak efficiency requirements. Given this impossibility result, we develop a new strategyproof mechanism that strikes a good balance between fairness and efficiency, which is assessed by experiments. Makoto Yokoo. In this work, we consider a three sided student-project-resource matching-allocation problem, in which students have preferences on projects, and projects on students.
Speaker: David Richerby Any decision problem has a natural counting problem associated with it: instead of asking "Is there a solution? I will discuss the complexity of counting solutions modulo some fixed integer, especially my work on counting graph homomorphisms modulo 2. Graph homomorphisms can be seen as a generalization of graph colourings. We show that, for broad classes C of graphs, counting homomorphisms from an arbitrary input graph to a fixed graph H in C modulo 2 can either be done in polynomial time or is NP-hard, depending on the structure of H, with no problems of intermediate complexity.
We conjecture that this dichotomy applies to all graphs H. On the face of it, counting modulo 2 is the decision problem "Is the number of solutions even? Unexpected phenomena occur: for example, Valiant discovered a restriction of 3-SAT where satisfying assignments can be counted modulo 7 in polynomial time but it is NP-hard to count them modulo 2. Similar effects occur with graph homomorphisms. Computational experiments, sample size fifteen billion 19 February, Speaker: Ciaran McCreesh We have a fairly good understanding of how many solvers for hard decision problems behave on randomly generated instances: as the random parameter is altered, we move from an under-constrained and easy region, through a phase transition which is computationally hard, and into an over-constrained and easy region.
But what about optimisation problems? In an attempt to understand what makes the maximum clique problem easy or hard, we've recently run experiments involving over fifteen billion problem instances. I'll talk briefly about what we've found, but mostly I'll focus on the practicalities of carrying out this kind of work. I'll cover topics like automation, reproducibility, instance generation, how to make hardware and software reliable and scalable, and how to handle the volume of results.
I'll also talk about the various options for computational resources that available for empirical algorithmics, both within the School, and through EPSRC national resources. Speaker: Yiannis Giannakopoulos The central theme in this talk is the study of the behaviour of autonomous agents in congestion games; important special cases of such games include traffic routing in networks and scheduling.
We give exponential lower bounds on the Price of Stability PoS of weighted congestion games with polynomial cost functions. We further show that the PoS remains exponential even for singleton games and approximate equilibria. An appealing additional feature of our algorithm is that it uses only best-improvement steps in the actual game, as opposed to earlier approaches that first had to transform the game itself. Speaker: Jan De Muijnck-Hughes Dependent types provide an expressive environment in which one can specify, and reason about, the properties of our software programs.
In this talk I will introduce dependent types and examine how we can reason about program behaviour using finite state machines and type-level modelling. A light buffet lunch will be provided. Determining variant repeat length in a case of a Myotonic Dystrophy type 1 DM1 patient directly from raw, unaligned Next Generation Sequencing reads using dynamic programming 22 January, Speaker: Adam Kurkiewicz Myotonic Dystrophy type 1 DM1 is a rare genetic disorder, which results in a range of symptoms, involving multiple organs and tissues, with muscle wasting muscular dystrophy and myotonia inability to relax contracted muscle being the two major symptoms.
Although the cause of DM1 has been known for more than 20 years, viable treatment is yet to be discovered. One promising avenue is studying a small subpopulation of DM1 patients, who feature so called "variant repeats". These patients warrant our attention, because they are on average less affected than otherwise similar patients without the variant repeats. An important question when looking at the genetic information is the length of the variant repeat.
So far this would be established by manual inspection of multiple so-called "NGS reads" by a trained biologist, and would be both laborious and inexact. We've implemented an algorithm, which translates the problem into a multi-objective optimisation involving sequence alignment. However, the algorithm we've discovered is even slower than the biologist, and has an expected running time of months or even years for real-world instances of the problem.
Further algorithmic ideas are needed to solve the problem. Paths between colourings of sparse graphs 12 December, I will give a short proof of an existing theorem that addresses the conjecture for graphs with bounded maximum average degree. I will also discuss some progress on the conjecture for planar graphs. Speaker: Sofiat Olaosebikan In this presentation, I will talk about the Student-Project Allocation problem where students have preferences over projects, lecturers have preferences over students, and preferences may involve ties SPA-ST.
Our main finding based on this experiment is that, whilst super-stable matchings can be elusive, the probability of such a matching existing is significantly higher if ties are restricted to the lecturers' preference lists. An improved algorithm for a class of linear programming problems 20 November, Speaker: Paul Cockshott In the period from the late s to the mid s the Russian mathematician Kantorovich developed what came to be known in the West as Linear Programming.
Slightly later a similar algorithm, the Simplex one, was independently developed Danzig. The latter were developed by another Russian theorist Leontief, and have now become a standard tool of economic analysis. I will then go on to show how the open source LP-solve package can be used, with an appropriate pre-processor, to construct multi-year national economic plans.
Testing logically defined properties on graphs of bounded degree 13 November, Work in progress solving equations over free groups 06 November, Population structure and evolution: a survey of the graph Moran process 23 October, Speaker: John Lapinskas When a new mutation is introduced into a population of organisms, it will either die out completely extinction or become prevalent genetic fixation. The driving force behind evolution is that a mutation which confers a small advantage is quite likely to fixate, even in a very large population, whereas a mutation which confers a small disadvantage is exponentially unlikely to fixate.
In , Lieberman, Hauert and Nowak proposed the graph Moran process to examine the effects of population structure. Individuals are modelled as vertices on a graph, and only adjacent individuals can interact with each other. In this talk, I will survey a number a recent advances in our understanding of the model, including an almost-tight upper bound on expected absorption time and an answer to the question: can graph structure suppress evolution almost entirely, so that beneficial, neutral or harmful mutations all fixate with roughly the same probability?
Spoiler: It depends! The talk is intended to be as accessible as possible for people in different fields - it will assume only elementary graph theory and discrete probability, and no biology whatsoever. Speaker: Paulius Dilkas As autonomous vehicles and systems pervade the world around us, it becomes increasingly important to model their behaviour and decision-making in a variety of situations in order to ensure their reliability and safety. For this purpose we turn to bigraphical reactive systems BRSs , a formalism for describing how systems evolve in time and space.
We propose a new type of BRS, a nondeterministic BRS, which is capable of representing nondeterministic choices and probabilistic outcomes as well as rewards for reaching a certain state or choosing an action, allowing us to model the full expressiveness of Markov decision processes. In this talk I will use two example scenarios of agent movement in grid-like and structured spaces to introduce the new capabilities of nondeterministic BRSs and show how they can be programmed and visualised.
The examples explore topics such as collecting objects, tracking visited locations, nesting spaces inside each other, and uncertainty about the topology of space. All members of the university interested in forming research collaborations with colleagues in theoretical computer science are warmly invited to come and discuss their research problems. Multi-objective mixed integer programming: An exact algorithm 04 September, Speaker: William Pettersson Many heuristic or approximate methods have been used to find all non-dominated objective vectors to a multi-objective mixed integer program MOMIP , but no generic exact algorithms yet exist.
The first exact algorithm for the bi-objective case was only given in We present here an exact MOMIP algorithm that generalises to an arbitrary number of objective functions, the first such algorithm. Working with an arbitrary number of objective functions, say k, is particularly difficult as the set of non-dominated objective vectors consists of not necessarily convex polytopes of arbitrary dimension up to k. Our new algorithm consists of three phases: 1 a collection phase, in which the algorithm finds all polytopes which contain some part of the solution, 2 a convexity phase, which takes all such polytopes and creates a set of non-intersecting convex polytopes which cover the same set, and 3 a dominance phase, which then compares such polytopes to determine where in the objective space some polytopes may be either partially or totally dominated.
These three phases are presented so as to encourage further work to improve the performance of the algorithm, and a Python implementation currently in use for computational experiments will soon be made available for use by other researchers. To determine the number of seats in a parliament after elections, two main classes of methods are used: the divisor methods and the quota methods. It is known that the divisor methods are monotone and so do not suffer from the no-show paradox either. We formulate a condition that eliminates the no-show paradox from a quota method and for other quota methods quantify the occurrence of this paradox.
We consider Pareto-efficient assignments of players with preferences on each other to rooms. In his interesting result, Morrill demonstrates how can one borrow ideas from Edmonds' algorithm to find a pareto-improvement of a given assignment in case of twin rooms and strict preferences.
In this talk, we consider generalizations where rooms may accommodate more than two agents and preferences might not be strict. It turns out that some of these extensions are not tractable while for certain other ones there is a polynomial time algorithm. The key to some of the presented results is a connection between Pareto-efficient assignments and maximum-weight matchings in graphs. AI-augmented algorithms -- how I learned to stop worrying and love choice 23 July, Speaker: Lars Kotthoff Often, there is more than one way to solve a problem. It could be a different parameter setting, a different piece of software, or an entirely different approach.
Choosing the best way is usually a difficult task, even for experts. AI and machine learning allow to leverage performance differences of algorithms for a wide definition of "algorithm" on different problems and choose the best algorithm for a given problem automatically. In AI itself, these techniques have redefined the state of the art in several areas and led to innovative approaches to solving challenging problems.
In this talk, I will give examples of how AI can help to solve challenging computational problems, what techniques have been applied, and how you can do the same. I will argue that AI has fundamental implications for software development, engineering, and computer science in general -- stop making decisions when coding, having more algorithmic choices is better! Normalisers in permutation groups as automorphisms of linear codes 05 June, The best solution so far is to use backtrack search to look for the normalising elements. However, the current algorithm does not perform well in certain kinds of groups.
This includes the groups that have all orbits of size at most 2. By considering the problem as computing automorphisms of linear codes, the performance of the algorithm has been greatly improved. In this talk I will demonstrate some tests used in this algorithm to reduce the search size. Speaker: Martin Schirneck It is a long-standing open problem whether there exists an output-polynomial algorithm enumerating all minimal hitting sets of a hypergraph. A stronger requirement is to ask for an algorithm that outputs them in lexicographical order.
We give an algorithm that is optimal under ETH. A polynomial-time approximation algorithm for all-terminal network reliability 22 May, Speaker: Heng Guo All-terminal network reliability is the probability that, in a undirected graph, assuming each edge fails independently, the remaining graph is still connected. Our main contribution is to confirm a conjecture by Gorodezky and Pak Random Struct. Algorithms, , that the expected running time of the "cluster-popping" algorithm in bi-directed graphs is bounded by a polynomial in the size of the input.
Applying Formal Methods in Healthcare 15 May, This talk shows different logic-based approaches to detect and resolve conflicts in the treatment of multiple chronic conditions, and model and compare treatment options for chronic conditions. We developed an automated approach focusing on the composition of static or behavioural design models via constraint solvers. This approach is directly applicable to many different domains, and can be used for the detection of conflicts in clinical pathways.
Clinical pathways are care plans which detail essential steps in the care of patients with a specific clinical problem, usually a chronic disease. A pathway includes recommendations of medications prescribed at different stages of the care plan. For patients with three or more chronic diseases known as multimorbidities the multiple pathways have to be applied together.
One common problem for such patients is the possible adverse interaction between medications given for different diseases. In this context, we have used an optimising SMT solver Z3 to quickly find the set of medications with the minimal number and severity of conflicts which is assumed to be the safest.
In our approach we can vary the measures of interest to obtain different outcomes and generate the best option, second best, etc. We use a combination of SMT solvers and theorem provers for checking correctness. Conversely, automata can be used to capture the stages and treatment options for a chronic disease, where traces in the model highlight different treatments plans that patients can follow. Adding further information to such models allow us to explore extensions and combinations of different variants of timed automata and associated tools in different ways.
We can combine patient information and comorbidities with the disease automaton through composition. She has research interests in formal methods and dependability, logics and models for true-concurrency, and foundations of modelling languages. She is currently applying her foundational work to the healthcare domain, and. Bowles received her PhD Dr. The language of stratified sets, Quine's NF, rewriting, and higher-order logic 08 May, Speaker: Murdoch Gabbay Russell's paradox is a famous inconsistency in naive set theory, that there is a set R such that R is a member of itself if and only if it is not a member of itself.
I will motivate and describe Quine's NF, which is a simple system to define and which depends on a beautiful and mysterious "stratification condition". I will give an accessible account of this stratification condition, embed it into higher-order logic to help make sense of it, and then approach NF from the point of view of term-rewriting to note that it has some nice properties. Speaker: Daniel Paulusma Joint work with Serge Gaspers and Shenwei Huang The Colouring problem is to decide if the vertices of a graph can be coloured with at most k colours for a given integer k, such that no two adjacent vertices are coloured alike.
We survey known results in both directions and discuss a number of new results for the second direction. In our talk we focus on the graph colouring techniques behind the polynomial-time result which have a wider applicability. Level 5 Student Presentations 19 April, Details are as follows:. In the stable marriage problem with incomplete lists we have a set of men and women who each express preference over each other.
Much work for this problem has been concerned with finding a stable matching, however in many cases we would rather find a matching including the maximum number of agents. In this case we want to find a matching of maximum cardinality that is the "least-unstable", either having as few blocking pairs or blocking agents as possible for a matching of that cardinality, unfortunately the problem of finding such a matching is NP-Hard. In this talk we will present our IP model for finding solutions to this problem, as well as our theoretical results for this problem.
We will also present empirical results based on random instances of this problem evaluated with our IP model. Mouse Maze is a game about a mouse named Squeaky who is attempting to escape an n by n grid containing obstacles placed by users, where n is any integer greater than 2. The goal of Mouse Maze is to find a maze that confounds Squeaky for as long as possible. Squeaky follows a deterministic rule which has been proven to always lead him to escape from the maze, meaning an upper bound on the number of turns Squeaky can be confounded for must exist.
Finding optimal mazes on larger grids is challenging because of the enormous number of possible obstacle configurations users can create. This talk discusses the viability of integer programming and constraint programming approaches to this problem. Brute force techniques are also discussed and compared to the integer and constraint programming approaches. Level 5 Student Presentations 18 April, The Capacitated Vehicle Routing Problem with Time Windows is a very important logistic problem because our economy is becoming increasingly globally connected in particular in the wake and rise of online trading.
Nowadays customers can order goods at any time from anywhere in the world and the order must be delivered to the customer within days. This trend puts significant strains on supply chains. Logistic businesses need to keep transportation costs low in order to stay competitive. Businesses use computer models and programs to effectively route their delivery vehicles to customers.
Various approaches have been proposed and used to solve this problem. Our work focuses on Constraint Programming models because of their high flexibility and readability. Constraint Programming models can be adapted and changed with relative ease on a day to day basis allowing businesses to react quickly to exceptional circumstances. When considering this problem, one often seeks a stable matching, meaning that there is no incentive for a pair not in the matching to break away from the given assignment.
If preference lists are allowed to express indifference between agents, stable matchings can have different sizes and the task of finding the maximum weakly stable matching is NP-hard. This talk provides an overview of diverse challenges designers of sensor systems have to face in terms of energy availability and harsh environmental conditions, etc. We demonstrate how our approach can be used to address the identified challenges and highlight opportunities for applying Formal Methods to the domain of WSN. However, scale of CPS makes it computationally intractable to use traditional formal techniques to verify how they can self-adapt.
This talk proposes a modelling language to represent self-adaptive CPS that supports compositionality. Using topological relationships of CPS, such as containment and connectivity, the language allows decomposing CPS into loosely coupled components. These can affect satisfaction of certain system requirements and may be therefore responsible to self-adapt in order to satisfy these requirements. The proposed language allows a system designer to explore and verify alternative self-adaptive behaviours enacted by different CPS components at different granularities.
These self-adaptive behaviours can be formally verified independently from the rest of the system, thus enabling computationally tractable verification. The modelling language combines the concurrency abstractions of Communicating Sequential Processes CSP with specialised contracts that can express locality and self-adaptation. Properties of interest are proved formally using FDR. The feasibility of the approach is demonstrated using an example of a smart art gallery. Applying Formal Methods in Healthcare 15 March, Student-Project Allocation 06 March, In universities all over the world, students must be assigned to dissertation projects as part of their degree programmes.
In many cases students are required to rank a subset of the available projects in order of preference, and likewise, lecturers rank in order of preference those students who have applied for their projects. A centralised allocation is then conducted which gives a matching of students to projects. A key consideration is that the matching should be stable, which ensures that no student and lecturer who are not matched together have an incentive to form an assignment with one another after the matching has been announced.
In this talk we consider the case where preference lists need not be strictly ordered, and may contain ties. In this scenario stable matchings can be of different sizes and it is known that the problem of finding a maximum-sized stable matching is NP-hard. The Student-Project Allocation problem with preferences over Projects SPA-P involves sets of students, projects and lecturers, where the students and lecturers each have preferences over the projects. In this context, we typically seek a stable matching of students to projects and lecturers.
Subgraph Isomorphism in Practice 20 February, Speaker: Ciaran McCreesh The subgraph isomorphism problem is to find a little "pattern" graph inside a larger "target" graph. Despite being NP-complete, practical algorithms can often solve instances with graphs involving many thousands of vertices. I'll give an overview of the state of the art in subgraph isomorphism solving, with a particular focus on search strategies.
I'll then explain some recent work we've carried out on moving away from simple backtracking search and on introducing small amounts of randomness. This can be hugely beneficial for satisfiable instances, and thanks to the "two watched literals" scheme which is the awesomest data structure in all of computer science , we can even make this change without affecting performance on unsatisfiable instances.
Finally, I'll talk about how we verify empirically that these techniques actually give an improvement, since they have no effect on worst-case theoretical complexity. Modelling of spatial stochastic systems and analysis of their spatio-temporal properties 13 February, We have initially developed a novel process algebra, specifically tailored for modelling ecological systems, and more generally for spatial stochastic systems. These systems can be seen as a collection of agents that can interact and are spatially located. To analyse properties of the dynamics of these stochastic systems, we worked with spatio-temporal logics and statistical model checking.
We introduced the novel Three-Valued Spatio-Temporal Logic TSTL , which extends the available analysis, looking at the spatio-temporal evolution of the satisfaction probabilities of given logical formulas, estimated through statistical model checking. We have also defined an automatic procedure that verifies if TSTL results are given with sufficient precision. I will present different case studies during the talk, to show various applications of our modelling language and spatio-temporal analysis. Parameterised complexity in 3-manifold topology 30 January, Speaker: William Pettersson A topologist is, according to the joke, a mathematician who does not see the difference between a coffee cup and a donut.
Aside from as a conversation starter, topology is also useful in molecular biology protein folding , astronomy coronal loops and the shape of the universe , physics group field theories and string theory and also has applications in further fields of mathematics such as geometric group theory, algebra and knot theory. In this talk I will give some background into the world of combinatorial topology, and also detail how topologists came to eventually use parameterised complexity to help solve some of their problems.
Speaker: Michel Steuwer The Lift project aims at generating high-performance code for parallel processors from portable high-level programs. Algorithms for Maximum Common Subgraph 16 January, Probabilistic model checking for UAV controller generation 05 December, Our generated controllers can now be used within the simulation models and ultimately in UAV software. Objects as communicating automata 21 November, Speaker: Roly Perera Abstract: Behind the interfaces of most objects lurk state machines, constraining the services they offer and consume at various points in their lifecycles.
Many familiar program errors result from using objects at one point in their lifecycle as if they were at another. I will describe ongoing work, inspired by session types, which models objects as communicating automata, making their states and transitions explicit. Formal models for target counting 14 November, Speaker: Michele Sevegnani The target counting problem is the computational task of estimating the total number of observable targets within a region using local counts performed by a set of networked sensors capable of counting but not identifying targets within their sensing range.
The complexity of this problem lies in the fact that multiple sensors may be observing the same target if it is located within the intersection of their overlapping sensing ranges. This may lead to wrong estimates as duplicate observations, together with the inability to distinguish different targets, introduce over-counting. In this talk, I will present models based on bigraphs of two different algorithms that have been developed to mitigate this issue. This is ongoing work in collaboration with Sven Linker at the University of Liverpool.
Speaker: Patrick Prosser MiniZinc is a free open-source constraint modelling language and is all the rage with the young, thrusting Constraint Programmer. I'll give an introductory talk and demo of MiniZinc, solving two problems: the now famous Crystal Maze and the fantastic Abarth paint shop! Be there Data-driven, probabilistic models for software usage 24 October, Speaker: Muffy Calder In the Populations project, we have been interested in how people actually use interactive software applications apps , so that we might redesign that app, or design future apps, to better support styles of use.
We had access to the user logs of instrumented apps, for tens of thousands of users, over several years. A key question for Oana and I has been how to derive models from those user traces, which would allow us to hypothesise about and analyse user styles. In previous talks, we have focussed on our use of temporal logics for expressing hypotheses and doing the analysis.
I will outline three different probabilistic, Markovian, admixture models, and the different kinds of questions hypotheses we can ask of each one. Speaker: Ornela Dardha Multiparty Session Types MPST is a typing discipline for message-passing distributed processes that can ensure properties such as absence of communication errors and deadlocks, and protocol conformance. We address this problem by 1 developing the first encoding of a full-fledged multiparty session pi-calculus into linear pi-calculus, and 2 using the encoding as the foundation of a practical toolchain for safe multiparty programming in Scala.
Our encoding is type-preserving and operationally sound and complete. Crucially, it keeps the distributed choreographic nature of MPST, illuminating that the safety properties of multiparty sessions can be precisely represented with a decomposition into binary linear channels. Previous works have only studied the relation between limited multiparty and binary sessions via centralised orchestration means.
We exploit these results to implement an automated generation of Scala APIs for multiparty sessions, abstracting existing libraries for binary communication channels. This allows multiparty systems to be safely implemented over binary message transports, as commonly found in practice. Our implementation is the first to support distributed multiparty delegation: our encoding yields it for free, via existing mechanisms for binary delegation. This year we celebrate 20 years of Scottish Theorem Proving seminars over 40 events. Therefore it is a great opportunity for the School of Computing Science to host the first seminar of to join the on-going celebration of the 60th anniversary of Computing in Glasgow.
Theorem proving research is notably strong in Scottish universities, with active groups and researchers in at least six departments.
The Scottish Theorem Proving Seminar series provide a common venue for communication and sharing of ideas by all these researchers. At least once a year, one of the departments hosts an informal seminar for the whole Scottish theorem proving community. As well as theorem proving, other related fields are also represented, e. This is a friendly venue for PhD students and RAs to discuss their work in progress and get feedback. Registration: Attendance is free, but participants are asked to register by 18th of May on Eventbrite. Lunch, tea, coffee and celebratory cake will be provided.
The holistic view makes the formal verification look like a restrictive or specialised approach. In this way, a research challenge appears: How to leverage formal verification to cope with the holistic character of cyber-physical systems? The major focus is on modelling and analysis of complex systems.
She has single authored a monograph at Springer Verlag. Speaker: Florian Weber The protocol description language Scribble, based on session types, is used as the input format for several tools that generate code to support the correct implementation of protocol roles. Scribble describes messages abstractly independently of their underlying transport mechanism. This means that working with legacy protocols and their existing textual message formats requires parsing and formatting code to bridge the gap between abstract Scribble messages and concrete protocol messages.
We introduce a small language for defining the mapping between abstract and concrete messages. We also describe a tool that automatically generates the corresponding parsers and formatters from this language. This tool has been integrated with an existing Scribble to Java transpiler. For instance, in an object that implements file access the method to read a file should not be invoked before calling the method to open that file. Behavioural types are particularly well-suited to object-oriented programming, as they make it possible to statically guarantee that method calls happen when an object has an internal state that is safe for the method's execution.
Following the typestates approach, one may declare for each possible state of the object the set of methods that can be safely executed in that state. Several languages already associate with a class a dynamic description of objects' behaviour declaring the admissible sequences of method calls. These descriptions, herein called usages, can be used to ensure at compile time that, in a given program, all the sequences of method calls to each object follow the order declared by its usage.
To ensure usages to be followed, objects are linear to prevent interferences unexpectedly changing their state. However, the typing systems referred to above have two shortcomings. First, type checking is typically inefficient, as a method's body is checked each time that method appears in a usage. Our work addresses these weaknesses:. Thus, type-safety in our setting means no run-time errors and complete execution of objects' usages.
Client usages have another advantage: they can, not only be inferred from the code, but also be used to produce pre- and post-conditions to methods that then allow to infer usages. We are developing this work in stages. First, we define a type system only with usages and prove type-safety our enhanced version. Subsequently, we extend the type system with client usages and get a more efficient type-checking. Afterwards we infer the client usages from the code, and finally, we infer pre- and post-conditions from client usages. Our aim is to provide an approach that takes a program in a Java-like language and automatically infers class usages that describe safe orders of method calls, but also type-checks client code against usages either inferred or user-defined so as to guarantee that the whole program does not go wrong.
Main research problem addressed is how to ensure that inherently concurrent, highly distributed, software systems behave correctly. The focus is on the development of techniques, program constructions, and tools that help creating safe and well-behaved systems, provably providing correctness guarantees.
The toolbox used includes static analysis of source code, capturing defects before deployment, with decidable, low complexity, property-driven, proof systems, using behavioural descriptions of programs. FATA Seminar - Discovery and recognition of emerging activities via directional statistical models and active learning 21 February, Speaker: Lei Fang Abstract: Human activity recognition plays a significant role in enabling pervasive applications as it abstracts low-level noisy sensor data into high-level human activities, which applications can respond to.
In this paper, we identify a new research question in activity recognition -- discovering and learning unknown activities that have not been pre-defined or observed. As pervasive systems intend to be deployed in a real-world environment for a long period of time, it is infeasible, to expect users will only perform a set of pre-defined activities.
Users might perform the same activities in a different manner, or perform a new type of activity. Failing to detect or update the activity model to incorporate new patterns or activities will outdate the model and result in unsatisfactory service delivery. To address this question, we propose a solution to not only discover and learn new activities over time, but also support incremental updating the activity model by employing directional statistical model hierarchical mixtures of von Mises-Fisher Distributions and active learning strategies.
His research interests include sensor networks, sensor data processing, statistical modelling, human activity recognition, etc. He got his Ph. We consider the problem, given a set A, of determining whether A contains a sum-free subset of size at least k.
We show that this problem is NP-complete in general, but is tractable with respect to certain parameterizations; in the cases where the decision problem is tractable, we also consider the complexity of counting all sum-free subsets of size exactly k. This is joint work in progress with Andrew Treglown University of Birmingham. Typical approaches refer to the dynamical behaviour of cars via differential equations. In this way, spatial aspects of cars, like the current position and the space needed for safe braking in case of emergencies, are only available indirectly.
However, for the verification of safety properties, e. In this talk, I present an approach with the intention to simplify safety proofs by abstracting away from the concrete dynamics of cars. Within proofs, explicit assumptions about the behaviour of cars have to be used. These assumptions, e. The contributions of this work are divided into three parts.
New features & announcements
I present an abstract model of traffic on multi-lane highways which hides the dynamics and only considers a local neighbourhood of each car. Subsequently, I define and briefly explain a modal logic based on this model to specify and verify safety properties of highway traffic. Finally, I present the application of this logic in form of a case study exploring minimal constraints for controllers ensuring safety on motorways. His main research areas are the application of logics to verification and specification of computer systems, especially modal logics and their proof systems, as well as formal diagrammatic systems.
Speaker: Christopher Stone Hyper-Heuristics is a search method for selecting and generating heuristics to solve combinatorial optimisation problems taking advantage of the abundance of heuristics developed to tackle a wide range of problem classes. Unfortunately heuristics, and the solutions they operate on, tend to have their own specific representation both in terms of underlying data structure and in the taxonomy used to describe their approach. This talk will present an approach based on graphs and graph transformations able to model multiple problem classes using the same data structure.
This will include a discussion on the trade-offs of this approach and an overview of the latest empirical results. Bio: Christopher L. His main research interests are related to computational intelligence with a focus on representation of NP-Hard problems Routing, packing and scheduling , generation of heuristics and graph transformations. Speaker: Ruth Hoffman With the wide applicability of probabilistic and stochastic hybrid systems in the real world, it is now more important than ever to be able to verify these systems for safety and reliability. Hybrid systems can be found anywhere, from thermostats to processes passing messages.
We will discuss the different types of hybrid systems and their discrete abstractions. The probabilistic hybrid systems we will be focusing on are autonomous unmanned aerial vehicles. The abstracted structures allow for existing quantitative and model checking tools to verify and analyse the system. Speaker: Wei Chen Abstract: Automatic malware classifiers often perform badly on the detection of new malware, i.
We study the machine-learning-based mobile malware classifiers and reveal one reason: the input features used by these classifiers can't capture general behavioural patterns of malware instances. We extract the best-performing syntax- based features like permissions and API calls, and some semantics-based features like happen-befores and unwanted behaviours, and train classifiers using popular supervised and semi-supervised learning methods.
By comparing their classification performance on industrial datasets collected across several years, we demonstrate that using semantics- based features can dramatically improve robustness of malware classifiers. Roland C. In Wei worked with Prof. Martin Hofmann on type-based verification in Munich.
He started his current RA with Prof. David Aspinall since , focusing on learning policies for mobile security. Wei's main research interests are in formal methods, in particular, type theory, combinatorial games, and Buechi automata with their applications in program analysis and verification. He is currently working on combining formal methods and machine learning to help with mobile security. Appropriately parameterizing polynomial-time solvable problems helps reducing unattractive polynomial running times. In particular, this "FPT in P" approach sheds new light on what makes a problem far from being solvable in linear time, in the same way as classical FPT algorithms help in illuminating what makes an NP-hard problem far from being solvable in polynomial time.
Surprisingly, this very interesting research direction has been too little explored so far; the known results are rather scattered and do not systematically refer to or exploit the toolbox of parameterized algorithm design. To this end, I will outline known results, explain some of the corresponding techniques, and highlight similarities and differences to the classical design of parameterized algorithms for NP-hard problems.
Speaker: Patrick Prosser In the maximum clique problem, we are given a simple graph with vertices and edges , and we are to find a largest set of vertices such that all pairs of vertices in that set are adjacent. In the maximum weight clique problem mwc , vertices have weight, and we are to find a set of pair-wise adjacent vertices such that the sum of the weights of those vertices is as big as can be. In my talk I will present an exact algorithm for this problem, present some real-world problems one that is very close to home that are in fact instances of mwc. I'll review the current state of empirical studies of mwc and suggest future directions of study.
We use this framework to describe a generic session type system for variants of the pi-calculus. In this generic system, standard properties, including fidelity, hold at the level of the framework and are then guaranteed to hold when the generic system is instantiated. We show that our system can capture existing systems including the session type system due to Gay and Hole, a type system for progress due to Vieira and Vasconcelos and a refinement type system due to Baltazar et al. The standard fidelity property is proved at the level of the generic system, and automatically hold when the system is instantiated.
In this context we seek a stable matching of doctors to hospitals, but for some instances, such a matching may not exist. Finally, we discuss an empirical evaluation of these models applied to randomly-generated instances of the problem. We find that on average, the CP model is about 1. We further observe that the number of blocking pairs admitted by a solution is very small, i. Most popular clustering algorithms work by approximately optimising modularity. We investigate the maximum modularity of Erdos-Renyi random graphs and find there are three different phases of the likely maximum modularity.
This is joint work with Prof. Colin McDiarmid. She has a particular interest in identifying community structure in networks and also more broadly in phase transitions, random graphs, network coding and positional games. Speaker: David Manlove We consider the problem of allocating applicants to courses, where each applicant has a subset of acceptable courses that she ranks in strict order of preference.
Each applicant and course has a capacity, indicating the maximum number of courses and applicants they can be assigned to, respectively. There are also prerequisite or corequisite constraints on courses e. We consider two different ways of extending preferences over individual courses to preferences over bundles of courses.
Subject to each definition, we present algorithms and complexity results relating to the problem of computing a Pareto optimal matching of applicants to courses. Speaker: Jorge A. I will briefly overview works that address this issue and discuss promising research avenues. Speaker: Baharak Rastegari In this work we study two-sided matching markets in which the participants do not fully know their preferences but can learn their preferences by conducting costly interviews. The main goal is then to find a good strategy for interviews to be carried out in order to minimize their use, whilst leading to a stable matching.
We argue that a meaningful comparison would be against an optimal offline algorithm that has access to agents' preference orderings under complete information. This is joint work with Paul Goldberg and David Manlove. Speaker: Josue Ortega I study the many-to-many matching problem induced by the popular dating app Tinder.
I provide empirical evidence suggesting that its matching procedure is unstable, and show, in a simplified setting, that its assignments can be setwise and even pairwise blocked. Tinder's mechanism can be improved by a known two-step procedure which guarantees setwise stability whenever achievable, i. I establish a link between strong substitutability and the maximin property that connects two areas of the literature that remained unrelated, and that can be merged to obtain a useful result: deciding who proposes first generates a tradeoff between the optimality versus the simplicity and privacy of the matching.
Speaker: Ruth Hoffmann With the rising popularity of autonomous systems and their increased deployment within the public domain, ensuring the safety of these systems is crucial. Although testing is a necessary part in the process of deploying such systems, simulation and formal verification are key tools, especially at the early stages of design. Simulation allows us to view the continuous dynamics and monitor behaviour of a system. On the other hand, formal verification of autonomous systems allows for a cheap, fast, and extensive way to check for safety and correct functionality of autonomous systems, that is not possible using simulations alone.
In this talk I will demonstrate a simulation and the corresponding probabilistic model of an unmanned aerial vehicle UAV in an exemplary autonomous scenario and present results of the discrete models. Further, I discuss a possible formal framework to abstract autonomous systems using simulations to inform probabilistic models.
Speaker: Florian Weber Session types are used to describe communication protocols. The type! This problem is NP-Complete. Therefore, there are many target graphs and many patterns graphs. Prior to performing a subgraph isomorphism test between a pattern P and target T the indices are used to determine if P is trivially not in T. The BIG DATA approach is to spend considerable effort creating sophisticated indices to avoid having to resort to backtracking search i. This only pays off when the majority of decision problems are unsatisfiable, and therefore problems must exist in the easy unsat region for filtering to work.
But what happens if we take a different approach? What happens if we put little effort into creating indices and more effort into crafting smarter subgraph isomorphism algorithms? Speaker: JamesTrimble Kidney exchange schemes have been employed successfully in many countries including the UK since to increase the number of kidney transplants from living donors. It is an NP-hard cycle-packing problem to determine the largest possible set of transplants for a given pool of donors and patients.
This talk will present two pieces of work we carried out recently - the first to develop a more scalable approach to optimising the kidney-exchange problem, and the second to help in policy development. New compact integer-programming models for kidney exchange. I will briefly describe the new models we have developed, which frequently outperform the existing state of the art, and present some LP relaxation tightness results.
A simulation project which we carried out for NHS Blood and Transplant to estimate the effects of several policy options. Joint work with David Manlove. Configuring deep Spiking Neural Networks SNNs is an exciting research avenue for low power spike event based computation. However, the spike generation function is non-differentiable and therefore not directly compatible with the standard error backpropagation algorithm. In this paper, we introduce a new general backpropagation mechanism for learning synaptic weights and axonal delays which overcomes the problem of non-differentiability of the spike function and uses a temporal credit assignment policy for backpropagating error to preceding layers.
We describe and release a GPU accelerated software implementation of our method which allows training both fully connected and convolutional neural network CNN architectures. Deep learning has become increasingly popular in both supervised and unsupervised machine learning thanks to its outstanding empirical performance.
However, because of their intrinsic complexity, most deep learning methods are largely treated as black box tools with little interpretability. Even though recent attempts have been made to facilitate the interpretability of deep neural networks DNNs , existing methods are susceptible to noise and lack of robustness. Therefore, scientists are justifiably cautious about the reproducibility of the discoveries, which is often related to the interpretability of the underlying statistical models. In this paper, we describe a method to increase the interpretability and reproducibility of DNNs by incorporating the idea of feature selection with controlled error rate.
By designing a new DNN architecture and integrating it with the recently proposed knockoffs framework, we perform feature selection with a controlled error rate, while maintaining high power. Optimizing task-related mathematical model is one of the most fundamental methodologies in statistic and learning areas.
However, generally designed schematic iterations may hard to investigate complex data distributions in real-world applications. Recently, training deep propagations i. Unfortunately, existing networks are often built in heuristic manners, thus lack of principled interpretations and solid theoretical supports. In this work, we provide a new paradigm, named Propagation and Optimization based Deep Model PODM , to bridge the gaps between these different mechanisms i.
On the one hand, we utilize PODM as a deeply trained solver for model optimization. Different from these existing network based iterations, which often lack theoretical investigations, we provide strict convergence analysis for PODM in the challenging nonconvex and nonsmooth scenarios. On the other hand, by relaxing the model constraints and performing end-to-end training, we also develop a PODM based strategy to integrate domain knowledge formulated as models and real data distributions learned by networks , resulting in a generic ensemble framework for challenging real-world applications.
Extensive experiments verify our theoretical results and demonstrate the superiority of PODM against these state-of-the-art approaches. Despite being virtually ubiquitous, sequence-to-sequence models are challenged by their lack of diversity and inability to be externally controlled. In this paper, we speculate that a fundamental shortcoming of sequence generation models is that the decoding is done strictly from left-to-right, meaning that outputs values generated earlier have a profound effect on those generated later.
To address this issue, we propose a novel middle-out decoder architecture that begins from an initial middle-word and simultaneously expands the sequence in both directions. To facilitate information flow and maintain consistent decoding, we introduce a dual self-attention mechanism that allows us to model complex dependencies between the outputs. We illustrate the performance of our model on the task of video captioning, as well as a synthetic sequence de-noising task.
Our middle-out decoder achieves significant improvements on de-noising and competitive performance in the task of video captioning, while quantifiably improving the caption diversity. Furthermore, we perform a qualitative analysis that demonstrates our ability to effectively control the generation process of our decoder. We present a neurosymbolic framework for the lifelong learning of algorithmic tasks that mix perception and procedural reasoning. Reusing high-level concepts across domains and learning complex procedures are key challenges in lifelong learning.
We show that a program synthesis approach that combines gradient descent with combinatorial search over programs can be a more effective response to these challenges than purely neural methods. Our framework, called HOUDINI, represents neural networks as strongly typed, differentiable functional programs that use symbolic higher-order combinators to compose a library of neural functions.
Our learning algorithm consists of: 1 a symbolic program synthesizer that performs a type-directed search over parameterized programs, and decides on the library functions to reuse, and the architectures to combine them, while learning a sequence of tasks; and 2 a neural module that trains these programs using stochastic gradient descent. We evaluate HOUDINI on three benchmarks that combine perception with the algorithmic tasks of counting, summing, and shortest-path computation.
Neural models operating over structured spaces such as knowledge graphs require a continuous embedding of the discrete elements of this space such as entities as well as the relationships between them. Relational embeddings with high expressivity, however, have high model complexity, making them computationally difficult to train.
We propose a new family of embeddings for knowledge graphs that interpolate between a method with high model complexity and one, namely Holographic embeddings HolE , with low dimensionality and high training efficiency. This interpolation, termed HolEx, is achieved by concatenating several linearly perturbed copies of original HolE. We formally characterize the number of perturbed copies needed to provably recover the full entity-entity or entity-relation interaction matrix, leveraging ideas from Haar wavelets and compressed sensing.
In practice, using just a handful of Haar-based or random perturbation vectors results in a much stronger knowledge completion system. In this paper, we provide a theoretical understanding of word embedding and its dimensionality. Motivated by the unitary-invariance of word embedding, we propose the Pairwise Inner Product PIP loss, a novel metric on the dissimilarity between word embeddings. Using techniques from matrix perturbation theory, we reveal a fundamental bias-variance trade-off in dimensionality selection for word embeddings.
This bias-variance trade-off sheds light on many empirical observations which were previously unexplained, for example the existence of an optimal dimensionality. Moreover, new insights and discoveries, like when and how word embeddings are robust to over-fitting, are revealed. By optimizing over the bias-variance trade-off of the PIP loss, we can explicitly answer the open question of dimensionality selection for word embedding.
Humans have a remarkable capacity to understand the physical dynamics of objects in their environment, flexibly capturing complex structures and interactions at multiple levels of detail. Inspired by this ability, we propose a hierarchical particle-based object representation that covers a wide variety of types of three-dimensional objects, including both arbitrary rigid geometrical shapes and deformable materials.
We then describe the Hierarchical Relation Network HRN , an end-to-end differentiable neural network based on hierarchical graph convolution, that learns to predict physical dynamics in this representation. Compared to other neural network baselines, the HRN accurately handles complex collisions and nonrigid deformations, generating plausible dynamics predictions at long time scales in novel settings, and scaling to large scene configurations.
These results demonstrate an architecture with the potential to form the basis of next-generation physics predictors for use in computer vision, robotics, and quantitative cognitive science. Estimating how uncertain an AI system is in its predictions is important to improve the safety of such systems. Different actions might be taken depending on the source of the uncertainty so it is important to be able to distinguish between them. Recently, baseline tasks and metrics have been defined and several practical methods to estimate uncertainty developed. PNs do this by parameterizing a prior distribution over predictive distributions.
The vulnerability of deep image classification networks to adversarial attack is now well known, but less well understood. Via a novel experimental analysis, we illustrate some facts about deep convolutional networks for image classification that shed new light on their behaviour and how it connects to the problem of adversaries. In short, the celebrated performance of these networks and their vulnerability to adversarial attack are simply two sides of the same coin: the input image-space directions along which the networks are most vulnerable to attack are the same directions which they use to achieve their classification performance in the first place.
We develop this result in two main steps. The first uncovers the fact that classes tend to be associated with specific image-space directions. This is shown by an examination of the class-score outputs of nets as functions of 1D movements along these directions. This provides a novel perspective on the existence of universal adversarial perturbations. The second is a clear demonstration of the tight coupling between classification performance and vulnerability to adversarial attack within the spaces spanned by these directions.
Thus, our analysis resolves the apparent contradiction between accuracy and vulnerability. It provides a new perspective on much of the prior art and reveals profound implications for efforts to construct neural nets that are both accurate and robust to adversarial attack. This paper introduces versatile filters to construct efficient convolutional neural network.
Considering the demands of efficient deep learning techniques running on cost-effective hardware, a number of methods have been developed to learn compact neural networks. Most of these works aim to slim down filters in different ways, e. In contrast, we treat filters from an additive perspective.
A series of secondary filters can be derived from a primary filter. These secondary filters all inherit in the primary filter without occupying more storage, but once been unfolded in computation they could significantly enhance the capability of the filter by integrating information extracted from different receptive fields. Besides spatial versatile filters, we additionally investigate versatile filters from the channel perspective. The new techniques are general to upgrade filters in existing CNNs. Experimental results on benchmark datasets and neural networks demonstrate that CNNs constructed with our versatile filters are able to achieve comparable accuracy as that of original filters, but require less memory and FLOPs.
We replace the output layer of deep neural nets, typically the softmax function, by a novel interpolating function. And we propose end-to-end training and testing algorithms for this new architecture. Compared to classical neural nets with softmax function as output activation, the surrogate with interpolating function as output activation combines advantages of both deep and manifold learning. The new framework demonstrates the following major advantages: First, it is better applicable to the case with insufficient training data. Second, it significantly improves the generalization accuracy on a wide variety of networks.
We use complex-weighted, deep networks to invert the effects of multimode optical fibre distortion of a coherent input image. The experimental data is used to train complex-weighted models with a range of regularisation approaches. A key benefit of the unitary constraint is that it allows us to learn a forward unitary model and analytically invert it to solve the inverse problem.
We demonstrate this approach, and show how it can improve performance by incorporating knowledge of the phase shift induced by the spatial light modulator. Neural networks are a powerful class of nonlinear functions that can be trained end-to-end on various applications. While the over-parametrization nature in many neural networks renders the ability to fit complex functions and the strong representation power to handle challenging tasks, it also leads to highly correlated neurons that can hurt the generalization ability and incur unnecessary computation cost.
As a result, how to regularize the network to avoid undesired representation redundancy becomes an important issue. To this end, we draw inspiration from a well-known problem in physics -- Thomson problem, where one seeks to find a state that distributes N electrons on a unit sphere as evenly as possible with minimum potential energy. In light of this intuition, we reduce the redundancy regularization problem to generic energy minimization, and propose a minimum hyperspherical energy MHE objective as generic regularization for neural networks. We also propose a few novel variants of MHE, and provide some insights from a theoretical point of view.
Finally, we apply neural networks with MHE regularization to several challenging tasks. Extensive experiments demonstrate the effectiveness of our intuition, by showing the superior performance with MHE regularization. Attention networks in multimodal learning provide an efficient way to utilize given visual information selectively. However, the computational cost to learn attention distributions for every pair of multimodal input channels is prohibitively expensive. To solve this problem, co-attention builds two separate attention distributions for each modality neglecting the interaction between multimodal inputs.
In this paper, we propose bilinear attention networks BAN that find bilinear attention distributions to utilize given vision-language information seamlessly. BAN considers bilinear interactions among two groups of input channels, while low-rank bilinear pooling extracts the joint representations for each pair of channels.
Furthermore, we propose a variant of multimodal residual networks to exploit eight-attention maps of the BAN efficiently. We quantitatively and qualitatively evaluate our model on visual question answering VQA 2. Data parallelism can boost the training speed of convolutional neural networks CNN , but could suffer from significant communication costs caused by gradient aggregation. To alleviate this problem, several scalar quantization techniques have been developed to compress the gradients.
But these techniques could perform poorly when used together with decentralized aggregation protocols like ring all-reduce RAR , mainly due to their inability to directly aggregate compressed gradients. In this paper, we empirically demonstrate the strong linear correlations between CNN gradients, and propose a gradient vector quantization technique, named GradiVeQ, to exploit these correlations through principal component analysis PCA for substantial gradient dimension reduction.
GradiveQ enables direct aggregation of compressed gradients, hence allows us to build a distributed learning system that parallelizes GradiveQ gradient compression and RAR communications. The goal of the Town Hall is to gather information from attendees on priorities for Diversity and Inclusion efforts and to address questions the attendees may have.
The Town Hall will be moderated by Susan Gonzales and the panel includes leaders of conference-affiliated affinity groups and members of the Foundation Board and the Organizing Committee.
Full table of contents
Brains are not unique in their computational abilities. Bacteria, plants, and unicellular organisms exhibit learning and plasticity; nervous systems speed-optimized information-processing that is ubiquitous across the tree of life and was already occurring at multiple scales before neurons evolved. Non-neural computation is especially critical for enabling individual cells to coordinate their activity toward the creation and repair of complex large-scale anatomies.
We have found that bioelectric signaling enables all types of cells to form networks that store pattern memories that guide large-scale growth and form. In this talk, I will introduce the basics of developmental bioelectricity, and show how novel conceptual and methodological advances have enabled rewriting pattern memories that guide morphogenesis without genomic editing. In effect, these strategies allow reprogramming the bioelectric software that implements multicellular patterning goal states.
I will show examples of applications in regenerative medicine and cognitive neuroplasticity, and illustrate future impacts on synthetic bioengineering, robotics, and machine learning. Voice cloning is a highly desired feature for personalized speech interfaces. We introduce a neural voice cloning system that learns to synthesize a person's voice from only a few audio samples. We study two approaches: speaker adaptation and speaker encoding. Speaker adaptation is based on fine-tuning a multi-speaker generative model. Speaker encoding is based on training a separate model to directly infer a new speaker embedding, which will be applied to a multi-speaker generative model.
In terms of naturalness of the speech and similarity to the original speaker, both approaches can achieve good performance, even with a few cloning audios. While speaker adaptation can achieve slightly better naturalness and similarity, cloning time and required memory for the speaker encoding approach are significantly less, making it more favorable for low-resource deployment.
We propose a metalearning approach for learning gradient-based reinforcement learning RL algorithms. The idea is to evolve a differentiable loss function, such that an agent, which optimizes its policy to minimize this loss, will achieve high rewards. The loss is parametrized via temporal convolutions over the agent's experience.
Because this loss is highly flexible in its ability to take into account the agent's history, it enables fast task learning. Empirical results show that our evolved policy gradient algorithm EPG achieves faster learning on several randomized environments compared to an off-the-shelf policy gradient method. We also demonstrate that EPG's learned loss can generalize to out-of-distribution test time tasks, and exhibits qualitatively different behavior from other popular metalearning algorithms.
While the problems have a long history in statistics, finite sample bounds for these problems have only been established recently. We provide optimal sample complexity algorithms for identity testing problem for all parameter ranges, and the first results for closeness testing. Our upper bounds are obtained by privatizing non-private estimators for these problems.
The non-private estimators are chosen to have small sensitivity. We propose a general framework to establish lower bounds on the sample complexity of statistical tasks under differential privacy. We show a bound on differentially private algorithms in terms of a coupling between the two hypothesis classes we aim to test. By constructing carefully chosen priors over the hypothesis classes, and using Le Cam's two point theorem we provide a general mechanism for proving lower bounds.
We believe that the framework can be used to obtain strong lower bounds for other statistical tasks under privacy. Goal-oriented dialog has been given attention due to its numerous applications in artificial intelligence. Goal-oriented dialogue tasks occur when a questioner asks an action-oriented question and an answerer responds with the intent of letting the questioner know a correct action to take.
To ask the adequate question, deep learning and reinforcement learning have been recently applied. However, these approaches struggle to find a competent recurrent neural questioner, owing to the complexity of learning a series of sentences. Motivated by theory of mind, we propose "Answerer in Questioner's Mind" AQM , a novel information theoretic algorithm for goal-oriented dialog.
With AQM, a questioner asks and infers based on an approximated probabilistic model of the answerer. In our experiments, AQM outperforms comparative algorithms by a large margin. The focus in machine learning has branched beyond training classifiers on a single task to investigating how previously acquired knowledge in a source domain can be leveraged to facilitate learning in a related target domain, known as inductive transfer learning.
Three active lines of research have independently explored transfer learning using neural networks.
In weight transfer, a model trained on the source domain is used as an initialization point for a network to be trained on the target domain. In deep metric learning, the source domain is used to construct an embedding that captures class structure in both the source and target domains. In few-shot learning, the focus is on generalizing well in the target domain based on a limited number of labeled examples.
We compare state-of-the-art methods from these three paradigms and also explore hybrid adapted-embedding methods that use limited target-domain data to fine tune embeddings constructed from source-domain data. We conduct a systematic comparison of methods in a variety of domains, varying the number of labeled instances available in the target domain k , as well as the number of target-domain classes. We hope our results will motivate a unification of research in weight transfer, deep metric learning, and few-shot learning.
There are now several large scale deployments of differential privacy used to collect statistical information about users. However, these deployments periodically recollect the data and recompute the statistics using algorithms designed for a single use. As a result, these systems do not provide meaningful privacy guarantees over long time scales.
In this paper, we introduce a new technique for local differential privacy that makes it possible to maintain up-to-date statistics over time, with privacy guarantees that degrade only in the number of changes in the underlying distribution rather than the number of collection periods. We use our technique for tracking a changing statistic in the setting where users are partitioned into an unknown collection of groups, and at every time period each user draws a single bit from a common but changing group-specific distribution.
We also provide an application to frequency and heavy-hitter estimation. We marry two powerful ideas: deep representation learning for visual recognition and language understanding, and symbolic program execution for reasoning. Our neural-symbolic visual question answering NS-VQA system first recovers a structural scene representation from the image and a program trace from the question. It then executes the program on the scene representation to obtain an answer.
Incorporating symbolic structure as prior knowledge offers three unique advantages. First, executing programs on a symbolic space is more robust to long program traces; our model can solve complex reasoning tasks better, achieving an accuracy of Second, the model is more data- and memory-efficient: it performs well after learning on a small number of training data; it can also encode an image into a compact representation, requiring less storage than existing methods for offline question answering.
Third, symbolic program execution offers full transparency to the reasoning process; we are thus able to interpret and diagnose each execution step. Due to the inherent model uncertainty, learning to infer Bayesian posterior from a few-shot dataset is an important step towards robust meta-learning. In this paper, we propose a novel Bayesian model-agnostic meta-learning method.
The proposed method combines efficient gradient-based meta-learning with nonparametric variational inference in a principled probabilistic framework. Unlike previous methods, during fast adaptation, the method is capable of learning complex uncertainty structure beyond a simple Gaussian approximation, and during meta-update, a novel Bayesian mechanism prevents meta-level overfitting.
Remaining a gradient-based method, it is also the first Bayesian model-agnostic meta-learning method applicable to various tasks including reinforcement learning. Experiment results show the accuracy and robustness of the proposed method in sinusoidal regression, image classification, active learning, and reinforcement learning. We design new differentially private algorithms for the Euclidean k-means problem, both in the centralized model and in the local model of differential privacy. In both models, our algorithms achieve significantly improved error guarantees than the previous state-of-the-art.
In addition, in the local model, our algorithm significantly reduces the number of interaction rounds. Although the problem has been widely studied in the context of differential privacy, all of the existing constructions achieve only super constant approximation factors.
We present, for the first time, efficient private algorithms for the problem with constant multiplicative error. Furthermore, we show how to modify our algorithms so they compute private coresets for k-means clustering in both models. We introduce a learning-based framework to optimize tensor programs for deep learning workloads. Efficient implementations of tensor operators, such as matrix multiplication and high dimensional convolution are key enablers of effective deep learning systems.
However, existing systems rely on manually optimized libraries such as cuDNN where only a narrow range of server class GPUs are well-supported. The reliance on hardware specific operator libraries limits the applicability of high-level graph optimizations and incurs significant engineering costs when deploying to new hardware targets.
We use learning to remove this engineering burden. We learn domain specific statistical cost models to guide the search of tensor operator implementations over billions of possible program variants. We further accelerate the search by effective model transfer across workloads.
Experimental results show that our framework delivers performance competitive with state-of-the-art hand-tuned libraries for low-power CPU, mobile GPU, and server-class GPU. In this paper we address the text to scene image generation problem. Generative models that capture the variability in complicated scenes containing rich semantics is a grand goal of image generation. Complicated scene images contain rich visual elements, compositional visual concepts, and complicated relations between objects. Generative models, as an analysis-by-synthesis process, should encompass the following three core components: 1 the generation process that composes the scene; 2 what are the primitive visual elements and how are they composed; 3 the rendering of abstract concepts into their pixel-level realizations.
We propose PNP-Net, a variational auto-encoder framework that addresses these three challenges: it flexibly composes images with a dynamic network structure, learns a set of distribution transformers that can compose distributions based on semantics, and decodes samples from these distributions into realistic images. Given the apparent difficulty of learning models that are robust to adversarial perturbations, we propose tackling the simpler problem of developing adversarially robust features. Specifically, given a dataset and metric of interest, the goal is to return a function or multiple functions that 1 is robust to adversarial perturbations, and 2 has significant variation across the datapoints.
We establish strong connections between adversarially robust features and a natural spectral property of the geometry of the dataset and metric of interest. This connection can be leveraged to provide both robust features, and a lower bound on the robustness of any function that has significant variance across the dataset.
Finally, we provide empirical evidence that the adversarially robust features given by this spectral approach can be fruitfully leveraged to learn a robust and accurate model. A central problem to understanding intelligence is the concept of generalisation. This allows previously learnt structure to be exploited to solve tasks in novel situations differing in their particularities. We take inspiration from neuroscience, specifically the hippocampal-entorhinal system known to be important for generalisation.
We propose that to generalise structural knowledge, the representations of the structure of the world, i. We show, under these principles, artificial neural networks embedded with hierarchy and fast Hebbian memory, can learn the statistics of memories and generalise structural knowledge. Spatial neuronal representations mirroring those found in the brain emerge, suggesting spatial cognition is an instance of more general organising principles.
We further unify many entorhinal cell types as basis functions for constructing transition graphs, and show these representations effectively utilise memories. We experimentally support model assumptions, showing a preserved relationship between entorhinal grid and hippocampal place cells across environments.
We introduce a new family of deep neural network models. Instead of specifying a discrete sequence of hidden layers, we parameterize the derivative of the hidden state using a neural network. The output of the network is computed using a blackbox differential equation solver. These continuous-depth models have constant memory cost, adapt their evaluation strategy to each input, and can explicitly trade numerical precision for speed. We demonstrate these properties in continuous-depth residual networks and continuous-time latent variable models. We also construct continuous normalizing flows, a generative model that can train by maximum likelihood, without partitioning or ordering the data dimensions.
For training, we show how to scalably backpropagate through any ODE solver, without access to its internal operations. This allows end-to-end training of ODEs within larger models. We design differentially private learning algorithms that are agnostic to the learning model assuming access to limited amount of unlabeled public data. Our private algorithm follows the paradigm of subsample-and-aggregate, in which any generic non-private learner is trained on disjoint subsets of the private training set, then for each classification query, the votes of the resulting classifiers ensemble are aggregated in a differentially private fashion.
We show that our algorithm makes a conservative use of the privacy budget. Next, we apply the knowledge transfer technique to construct a private learner that outputs a classifier, which can be used to answer unlimited number of queries. In the PAC model, we analyze our construction and prove upper bounds on the sample complexity for both the realizable and the non-realizable cases. As in non-private sample complexity, our bounds are completely characterized by the VC dimension of the concept class.
An explosion of high-throughput DNA sequencing in the past decade has led to a surge of interest in population-scale inference with whole-genome data. Recent work in population genetics has centered on designing inference methods for relatively simple model classes, and few scalable general-purpose inference techniques exist for more realistic, complex models. To achieve this, two inferential challenges need to be addressed: 1 population data are exchangeable, calling for methods that efficiently exploit the symmetries of the data, and 2 computing likelihoods is intractable as it requires integrating over a set of correlated, extremely high-dimensional latent variables.
These challenges are traditionally tackled by likelihood-free methods that use scientific simulators to generate datasets and reduce them to hand-designed, permutation-invariant summary statistics, often leading to inaccurate inference. In this work, we develop an exchangeable neural network that performs summary statistic-free, likelihood-free inference. Our framework can be applied in a black-box fashion across a variety of simulation-based tasks, both within and outside biology.
We demonstrate the power of our approach on the recombination hotspot testing problem, outperforming the state-of-the-art. In high dimensional settings, density estimation algorithms rely crucially on their inductive bias. Despite recent empirical success, the inductive bias of deep generative models is not well understood. In this paper we propose a framework to systematically investigate bias and generalization in deep generative models of images by probing the learning algorithm with carefully designed training datasets.
By measuring properties of the learned distribution, we are able to find interesting patterns of generalization. We verify that these patterns are consistent across datasets, common models and architectures. Prior work has investigated variations of prediction markets that preserve participants' differential privacy, which formed the basis of useful mechanisms for purchasing data for machine learning objectives.
Such markets required potentially unlimited financial subsidy, however, making them impractical. In this work, we design an adaptively-growing prediction market with a bounded financial subsidy, while achieving privacy, incentives to produce accurate predictions, and precision in the sense that market prices are not heavily impacted by the added privacy-preserving noise.
We briefly discuss how our mechanism can extend to the data-purchasing setting, and its relationship to traditional learning algorithms. Probability estimation is one of the fundamental tasks in statistics and machine learning. However, standard methods for probability estimation on discrete objects do not handle object structure in a satisfactory manner. In this paper, we derive a general Bayesian network formulation for probability estimation on leaf-labeled trees that enables flexible approximations which can generalize beyond observations.
We show that efficient algorithms for learning Bayesian networks can be easily extended to probability estimation on this challenging structured space. Experiments on both synthetic and real data show that our methods greatly outperform the current practice of using the empirical distribution, as well as a previous effort for probability estimation on trees. We study the problem of learning conditional generators from noisy labeled samples, where the labels are corrupted by random noise. A standard training of conditional GANs will not only produce samples with wrong labels, but also generate poor quality samples.
We consider two scenarios, depending on whether the noise model is known or not. The main idea is to corrupt the label of the generated sample before feeding to the adversarial discriminator, forcing the generator to produce samples with clean labels. This approach of passing through a matching noisy channel is justified by accompanying multiplicative approximation bounds between the loss of the RCGAN and the distance between the clean real distribution and the generator distribution. This shows that the proposed approach is robust, when used with a carefully chosen discriminator architecture, known as projection discriminator.
When the distribution of the noise is not known, we provide an extension of our architecture, which we call RCGAN-U, that learns the noise model simultaneously while training the generator. Distributed stochastic gradient descent is an important subroutine in distributed learning. A setting of particular interest is when the clients are mobile devices, where two important concerns are communication efficiency and the privacy of the clients. Several recent works have focused on reducing the communication cost or introducing privacy guarantees, but none of the proposed communication efficient methods are known to be privacy preserving and none of the known privacy mechanisms are known to be communication efficient.
To this end, we study algorithms that achieve both communication efficiency and differential privacy.
Net2Plan User’s Guide
We propose a new type of generative model for high-dimensional data that learns a manifold geometry of the data, rather than density, and can generate points evenly along this manifold. This is in contrast to existing generative models that represent data density, and are strongly affected by noise and other artifacts of data collection. We demonstrate how this approach corrects sampling biases and artifacts, thus improves several downstream data analysis tasks, such as clustering and classification.
Finally, we demonstrate that this approach is especially useful in biology where, despite the advent of single-cell technologies, rare subpopulations and gene-interaction relationships are affected by biased sampling. We show that SUGAR can generate hypothetical populations, and it is able to reveal intrinsic patterns and mutual-information relationships between genes on a single-cell RNA sequencing dataset of hematopoiesis. This paper addresses the mode collapse for generative adversarial networks GANs.
We view modes as a geometric structure of data distribution in a metric space. Under this geometric lens, we embed subsamples of the dataset from an arbitrary metric space into the L2 space, while preserving their pairwise distance distribution. Not only does this metric embedding determine the dimensionality of the latent space automatically, it also enables us to construct a mixture of Gaussians to draw latent space random vectors. We use the Gaussian mixture model in tandem with a simple augmentation of the objective function to train GANs.
Every major step of our method is supported by theoretical analysis, and our experiments on real and synthetic data confirm that the generator is able to produce samples spreading over most of the modes while avoiding unwanted samples, outperforming several recent GAN variants on a number of metrics and offering new features. Machine learning models are often susceptible to adversarial perturbations of their inputs. Even small perturbations can cause state-of-the-art classifiers with high "standard" accuracy to produce an incorrect prediction with high confidence.
To better understand this phenomenon, we study adversarially robust learning from the viewpoint of generalization. We show that already in a simple natural data model, the sample complexity of robust learning can be significantly larger than that of "standard" learning. This gap is information theoretic and holds irrespective of the training algorithm or the model family. We complement our theoretical results with experiments on popular image classification datasets and show that a similar gap exists here as well.
We postulate that the difficulty of training robust classifiers stems, at least partially, from this inherently larger sample complexity. A fundamental goal of systems neuroscience is to understand how neural activity gives rise to natural behavior. In order to achieve this goal, we must first build comprehensive models that offer quantitative descriptions of behavior.
We develop a new class of probabilistic models to tackle this challenge in the study of larval zebrafish, an important model organism for neuroscience. Larval zebrafish locomote via sequences of punctate swim bouts--brief flicks of the tail--which are naturally modeled as a marked point process. However, these sequences of swim bouts belie a set of discrete and continuous internal states, latent variables that are not captured by standard point process models.
We incorporate these variables as latent marks of a point process and explore various models for their dynamics. To infer the latent variables and fit the parameters of this model, we develop an amortized variational inference algorithm that targets the collapsed posterior distribution, analytically marginalizing out the discrete latent variables. With a dataset of over , swim bouts, we show that our models reveal interpretable discrete classes of swim bouts and continuous internal states like hunger that modulate their dynamics.
These models are a major step toward understanding the natural behavioral program of the larval zebrafish and, ultimately, its neural underpinnings. The loss functions of deep neural networks are complex and their geometric properties are not well understood. We show that the optima of these complex loss functions are in fact connected by simple curves, over which training and test accuracy are nearly constant.
We introduce a training procedure to discover these high-accuracy pathways between modes. Inspired by this new geometric insight, we also propose a new ensembling method entitled Fast Geometric Ensembling FGE. Using FGE we can train high-performing ensembles in the time required to train a single model. Adversarial sample attacks perturb benign inputs to induce DNN misbehaviors. Recent research has demonstrated the widespread presence and the devastating consequences of such attacks. Existing defense techniques either assume prior knowledge of specific attacks or may not work well on complex models due to their underlying assumptions.
Therefore, we propose a novel adversarial sample detection technique for face recognition models, based on interpretability. It features a novel bi-directional correspondence inference between attributes and internal neurons to identify neurons critical for individual attributes. The activation values of critical neurons are enhanced to amplify the reasoning part of the computation and the values of other neurons are weakened to suppress the uninterpretable part.
The classification results after such transformation are compared with those of the original model to detect adversaries. Sensory processing is often characterized as implementing probabilistic inference: networks of neurons compute posterior beliefs over unobserved causes given the sensory inputs. How these beliefs are computed and represented by neural responses is much-debated Fiser et al. Here, we show that these alternatives -- contrary to common assumptions -- are not mutually exclusive and that the very same system can be compatible with all of them.
As a central analytical result, we show that modeling neural responses in area V1 as samples from a posterior distribution over latents in a linear Gaussian model of the image implies that those neural responses form a linear Probabilistic Population Code PPC, Ma et al. In particular, the posterior distribution over some experimenter-defined variable like "orientation" is part of the exponential family with sufficient statistics that are linear in the neural sampling-based firing rates. Batch Normalization BatchNorm is a widely adopted technique that enables faster and more stable training of deep neural networks DNNs.
Despite its pervasiveness, the exact reasons for BatchNorm's effectiveness are still poorly understood. The popular belief is that this effectiveness stems from controlling the change of the layers' input distributions during training to reduce the so-called "internal covariate shift". In this work, we demonstrate that such distributional stability of layer inputs has little to do with the success of BatchNorm. Instead, we uncover a more fundamental impact of BatchNorm on the training process: it makes the optimization landscape significantly smoother.
This smoothness induces a more predictive and stable behavior of the gradients, allowing for faster training. We present a new approach for learning to solve SMT formulas. We phrase the challenge of solving SMT formulas as a tree search problem where at each step a transformation is applied to the input formula until the formula is solved. Our approach works in two phases: first, given a dataset of unsolved formulas we learn a policy that for each formula selects a suitable transformation to apply at each step in order to solve the formula, and second, we synthesize a strategy in the form of a loop-free program with branches.
This strategy is an interpretable representation of the policy decisions and is used to guide the SMT solver to decide formulas more efficiently, without requiring any modification to the solver itself and without needing to evaluate the learned policy at inference time. Learning long-term dependencies in extended temporal sequences requires credit assignment to events far back in the past.
The most common method for training recurrent neural networks, back-propagation through time BPTT , requires credit information to be propagated backwards through every single step of the forward computation, potentially over thousands or millions of time steps. This becomes computationally expensive or even infeasible when used with long sequences. Importantly, biological brains are unlikely to perform such detailed reverse replay over very long sequences of internal states consider days, months, or years.
However, humans are often reminded of past memories or mental states which are associated with the current mental state. We consider the hypothesis that such memory associations between past and present could be used for credit assignment through arbitrarily long sequences, propagating the credit assigned to the current state to the associated past state. Based on this principle, we study a novel algorithm which only back-propagates through a few of these temporal skip connections, realized by a learned attention mechanism that associates current states with relevant past states. We demonstrate in experiments that our method matches or outperforms regular BPTT and truncated BPTT in tasks involving particularly long-term dependencies, but without requiring the biologically implausible backward replay through the whole history of states.
Training a neural network using backpropagation algorithm requires passing error gradients sequentially through the network. The backward locking prevents us from updating network layers in parallel and fully leveraging the computing resources. Recently, there are several works trying to decouple and parallelize the backpropagation algorithm.
However, all of them suffer from severe accuracy loss or memory explosion when the neural network is deep. To address these challenging issues, we propose a novel parallel-objective formulation for the objective function of the neural network. After that, we introduce features replay algorithm and prove that it is guaranteed to converge to critical points for the non-convex problem under certain conditions.
Although the recent progress is substantial, deep learning methods can be vulnerable to the maliciously generated adversarial examples. In this paper, we present a novel training procedure and a thresholding test strategy, towards robust detection of adversarial examples. In training, we propose to minimize the reverse cross-entropy RCE , which encourages a deep network to learn latent representations that better distinguish adversarial examples from normal ones.
In testing, we propose to use a thresholding strategy as the detector to filter out adversarial examples for reliable predictions. Our method is simple to implement using standard algorithms, with little extra training cost compared to the common cross-entropy minimization.