PN Publications
Below you can find my publications and preprints on Phylogenetic Networks, with links and toggleable abstracts.
To filter by a certain topic, click one of the buttons.
2026
[8]
preprint
Tractable Maximization of Budgeted Phylogenetic Diversity on Networks Utilizing Node Scanwidth
arXiv:2605.23319, 2026.
Identifying a subset of taxa that maximizes Phylogenetic Diversity (PD) is a cornerstone of quantitative conservation planning. Traditionally, PD is defined over a phylogenetic tree in which leaves represent present-day taxa and branch lengths capture the estimated evolutionary distinctiveness. While PD maximization is computationally tractable on trees with unit costs, the problem becomes NP-hard when transitioning to phylogenetic networks or to budgeted versions in which protecting taxa incurs non-homogeneous costs. In this paper, we address these two challenges by providing definitions and a comprehensive analysis of three distinct variants of budgeted PD on networks. We conduct our study through the lens of a small structural parameter, node scanwidth (nsw), which measures the tree-likeness of a phylogenetic network. We show that two of the considered variants can be optimized in \(\mathcal{O}^*(2^{\mathsf{nsw}} \cdot B^2)\) time, where \(B\) is the budget. For the computationally harder, third variant, we provide an algorithm to compute PD scores in \(\mathcal{O}^*(3^{\mathsf{nsw}})\) time. We further contribute the first exact algorithms to compute node scanwidth, recognizing that the utility of algorithms based on nsw depends on the ability to compute nsw and its corresponding decomposition. Our approaches integrate data reduction rules, dynamic programming, and an Integer Linear Programming formulation. We validate our theoretical results through extensive experiments on highly reticulated, simulated networks containing several hundred taxa, using heterogeneous costs. Our implementation computes PD scores and optimal nsw in fractions of a second, even on the most challenging instances. Furthermore, our budgeted optimization algorithms significantly outperform existing benchmarks for computing PD on networks, which were previously limited to unit-cost scenarios. The software makes analyses even on networks with a thousand taxa tractable in practice.
@article{holtgrefe2026tractable,
title = {{Tractable Maximization of Budgeted Phylogenetic Diversity on Networks Utilizing Node Scanwidth}},
author = {Holtgrefe, Niels and Schestag, Jannik},
year = {2026},
jorunal = {arXiv preprint},
archivePrefix = {arXiv},
eprint = {2605.23319}
}
[7]
preprint
Average-Tree Phylogenetic Diversity Parameterized by Scanwidth and Invisibility
arXiv:2604.27745, 2026.
We investigate parameterized algorithms for computing the average-tree phylogenetic diversity (APD) in rooted phylogenetic networks, studying the problem under different structural parameters that capture the deviation of a network from a tree. Our primary parameter is the scanwidth, a measure of the tree-likeness of a given directed acyclic graph. We show that a subset of taxa with maximum APD can be found in polynomial time in phylogenetic networks of scanwidth at most 2, but becomes NP-hard in networks of scanwidth 3. Further, we design an algorithm that computes the APD of a given set of taxa in \(\mathcal{O}(2^{\mathsf{sw}} n)\) time, where \(\mathsf{sw}\) denotes the scanwidth and \(n\) the number of taxa in the input network. Finally, we give a linear-time algorithm for computing the APD of a given set of taxa if the network induced by these taxa is reticulation-visible. We generalize this algorithm to still run in polynomial time if each biconnected component of the induced network has only constantly many invisible reticulations.
@article{vanIersel2026averagetree,
title = {{Average-Tree Phylogenetic Diversity Parameterized by Scanwidth and Invisibility}},
author = {van Iersel, Leo and Jones, Mark and Schestag, Jannik and Scornavacca, Celine and Weller, Mathias},
year = {2026},
archivePrefix = {arXiv},
journal = {arXiv preprint},
eprint = {2604.27745}
}
2025
[6]
preprint
PaNDA: Efficient Optimization of Phylogenetic Diversity in Networks
bioRxiv:10.1101/2025.11.14.688467, 2025.
Phylogenetic diversity plays an important role in biodiversity, conservation, and evolutionary studies by measuring the diversity of a set of taxa based on their phylogenetic relationships. In phylogenetic trees, a subset of \( k \) taxa with maximum phylogenetic diversity can be found by a simple and efficient greedy algorithm. However, this algorithmic tractability is lost when considering phylogenetic networks, which incorporate reticulate evolutionary events such as hybridization and horizontal gene transfer. To address this challenge, we introduce PaNDA (Phylogenetic Network Diversity Algorithms), the first software package and interactive graphical user-interface for exploring, visualizing and maximizing diversity in phylogenetic networks. PaNDA includes a novel algorithm to find a subset of \( k \) taxa with maximum diversity, running in polynomial time for networks of bounded scanwidth, a measure of tree-likeness of a network that grows slower than the well-known level measure. This algorithm considers the variant of phylogenetic diversity on networks in which the branch lengths of all paths from the root to the selected taxa contribute towards their diversity. We demonstrate the scalability of this algorithm on simulated networks, successfully analyzing level-15 networks with up to 200 taxa in seconds. We also provide a proof-of-concept analysis using a phylogenetic network on Xiphophorus species, illustrating how the tool can support diversity studies based on real genomic data. The software is easily installable and freely available at https://github.com/nholtgrefe/panda. Additionally, we extend the definition of phylogene- tic diversity to semi-directed phylogenetic networks, which are mixed graphs increasingly used in phylogenetic analysis to model uncertainty of the root location. We prove that finding a subset of k taxa with maximum diversity remains NP-hard on semi-directed networks, but do present a polynomial-time algorithm for networks with bounded level.
@article{holtgrefe2025panda,
title = {{PaNDA}: Efficient Optimization of Phylogenetic Diversity in Networks},
author = {Holtgrefe, Niels and van Iersel, Leo and Meuwese, Ruben and Murakami, Yukihiro and Schestag, Jannik},
year = {2025},
publisher = {Cold Spring Harbor Laboratory},
journal = {bioRxiv: 10.1101/2025.11.14.688467},
doi = {10.1101/2025.11.14.688467}
}
[5]
conference paper
Parameterized Algorithms for Diversity of Networks with Ecological Dependencies
Proceedings of the 20th International Symposium on Parameterized and Exact Computation (IPEC 2025):11:1-11:21, 2025.
For a phylogenetic tree, the phylogenetic diversity of a set \(A\) of taxa is the total weight of edges on paths to \(A\). Finding small sets of maximal diversity is crucial for conservation planning, as it indicates where limited resources can be invested most efficiently. In recent years, efficient algorithms have been developed to find sets of taxa that maximize phylogenetic diversity either in a phylogenetic network or in a phylogenetic tree subject to ecological constraints, such as a food web. However, these aspects have mostly been studied independently. Since both factors are biologically important, it seems natural to consider them together. In this paper, we introduce decision problems where, given a phylogenetic network, a food web, and integers \(k\) and \(D\), the task is to find a set of \(k\) taxa with phylogenetic diversity of at least \(D\) under the maximize-all-paths measure while also satisfying viability conditions within the food web. We investigate the parameterized complexity of these problems and present several fixed-parameter tractable algorithms. Specifically, we provide a complete complexity dichotomy characterizing which combinations of parameters lead to W[1]-hardness and which admit fixed-parameter tractable algorithms. Our primary methodological contribution is a novel algorithmic framework for solving phylogenetic diversity problems in networks where dependencies impose an order, using a color-coding approach.
@inproceedings{jones2025parameterized,
title = {{Parameterized Algorithms for Diversity of Networks with Ecological Dependencies}},
author = {Jones, Mark and Schestag, Jannik},
booktitle = {Proceedings of the 20th International Symposium on Parameterized and Exact Computation (IPEC 2025)},
pages = {11:1--11:21},
year = {2025},
organization = {Schloss-Dagstuhl-Leibniz Zentrum f{"u}r Informatik},
doi = {10.4230/LIPIcs.IPEC.2025.11}
}
[4]
conference paper
Average-Tree Phylogenetic Diversity of Networks
Proceedings of the 25th International Workshop on Algorithms in Bioinformatics (WABI 2025):14:1--14:21, 2025.
Phylogenetic diversity is a measure used to quantify the biodiversity of a set of species. Here, we introduce the average-tree phylogenetic diversity score in rooted binary phylogenetic networks and consider algorithms for computing and maximizing the score on a given network. Basically, the score is the weighted average of the phylogenetic diversity scores in all trees displayed by the network, with the weights determined by the inheritance probabilities on the reticulation edges used in the embeddings. We show that computing the score of a given set of taxa in a given network is #P-hard, directly implying #P-hardness of finding a subset of \(k\) taxa achieving maximum diversity score and thereby ruling out polynomial-time algorithms for these problems unless the polynomial hierarchy collapses. However, we show that both problems can be solved efficiently if the input network is close to being a tree in the sense that its reticulation number is small. More precisely, we prove that we can solve the optimization problem in networks with \(n\) leaves and \(r\) reticulations in \(2^{\mathcal{O}(r)} \cdot n \cdot k\) time. Using experiments on data produced by simulating a reticulate-evolution process, we show that our algorithm runs efficiently on networks with hundreds of taxa and tens of reticulations.
@inproceedings{vanIersel2025averagetree,
title = {{Average-Tree Phylogenetic Diversity of Networks}},
author = {van Iersel, Leo and Jones, Mark and Schestag, Jannik and Scornavacca, Celine and Weller, Mathias},
booktitle = {Proceedings of the 25th International Workshop on Algorithms in Bioinformatics (WABI 2025)},
pages = {14:1--14:21},
year = {2025},
organization = {Schloss Dagstuhl--Leibniz-Zentrum f{"u}r Informatik},
doi = {10.4230/LIPIcs.WABI.2025.15}
}
[3]
conference paper
Phylogenetic Network Diversity Parameterized by Reticulation Number and Beyond
Proceedings of the 22nd RECOMB International Workshop on Comparative Genomics (RECOMB-CG 2025):107-130, 2025.
Network Phylogenetic Diversity (Network-PD) is a measure for the diversity of a set of species based on a rooted phylogenetic network (with branch lengths and inheritance probabilities on the reticulation edges) describing the evolution of those species. We consider the Maximize Network-PD problem: Given such a network, find \(k\) species with maximum Network-PD score. We show that this problem is fixed-parameter tractable (FPT) for binary networks, by describing an optimal algorithm running in \(\mathcal{O}(2^r \log(k)(n+r))\) time, with \(n\) the total number of species in the network and \(r\) its reticulation number. Furthermore, we show that Maximize Network-PD is NP-hard for level-1 networks, proving that, unless P=NP, the FPT approach cannot be extended by using the level as parameter instead of the reticulation number.
@inproceedings{vanIersel2025phylogenetic,
title = {{Phylogenetic Network Diversity Parameterized by Reticulation Number and Beyond}},
author = {van Iersel, Leo and Jones, Mark and Schestag, Jannik and Scornavacca, Celine and Weller, Mathias},
booktitle = {Proceedings of the 22nd RECOMB International Workshop on Comparative Genomics (RECOMB-CG 2025)},
pages = {107--130},
year = {2025},
organization = {Springer},
doi = {10.1007/978-3-031-94928-9_7}
}
2023
[2]
conference paper
Finding Degree-Constrained Acyclic Orientations
Proceedings of the 18th International Symposium on Parameterized and Exact Computation (IPEC 2023):19:1--19:14, 2023.
We consider the problem of orienting a given undirected graph into an acyclic directed graph such that the in-degree of each vertex \(v\) is contained in a prescribed list \(\lambda(v)\). Variants of this problem have been studied for a long time and with various applications, but mostly without the requirement for acyclicity. Without this requirement, the problem is closely related to the classical General Factor problem. In contrast, we show that deciding whether an acyclic orientation exists is NP-hard even in the absence of large gaps in the degree lists. On the positive side, we design parameterized algorithms for various natural parameterizations. A special case of the problem recently arose in reconstructing evolutionary histories via phylogenetic networks. Exploiting this structure, we show fixed-parameter tractability when parameterized by the treewidth of the graph, by the number of vertices with multiple allowed indegrees, and by the number of vertices whose maximum allowed indegree is at least two.
@inproceedings{garvardt2023finding,
title = {{Finding Degree-Constrained Acyclic Orientations}},
author = {Garvardt, Jaroslav and Renken, Malte and Schestag, Jannik and Weller, Mathias},
booktitle = {Proceedings of the 18th International Symposium on Parameterized and Exact Computation (IPEC 2023)},
pages = {19:1--19:14},
year = {2023},
organization = {Schloss-Dagstuhl-Leibniz Zentrum f{\"u}r Informatik},
doi = {10.4230/LIPIcs.IPEC.2023.19}
}
[1]
conference paper
How can We Maximize Phylogenetic Diversity? Parameterized Approaches for Networks
Proceedings of the 18th International Symposium on Parameterized and Exact Computation (IPEC 2023):30:1-30:12, 2023.
Phylogenetic Diversity (PD) is a measure of the overall biodiversity of a set of present-day species (taxa) within a phylogenetic tree. We consider an extension of PD to phylogenetic networks. Given a phylogenetic network with weighted edges and a subset \(S\) of leaves, the all-paths phylogenetic diversity of \(S\) is the summed weight of all edges on a path from the root to some leaf in \(S\). The problem of finding a bounded-size set \(S\) that maximizes this measure is polynomial-time solvable on trees, but NP-hard on networks. We study the latter from a parameterized perspective. While this problem is W[2]-hard with respect to the size of \(S\) (and W[1]-hard with respect to the size of the complement of \(S\)), we show that it is fixed-parameter tractable with respect to several other parameters, including the phylogenetic diversity of \(S\), the acceptable loss of phylogenetic diversity, the number of reticulations in the network, and the treewidth of the underlying graph.
@inproceedings{jones2023maximize,
title = {{How Can We Maximize Phylogenetic Diversity? Parameterized Approaches for Networks}},
author = {Jones, Mark and Schestag, Jannik},
booktitle = {Proceedings of the 18th International Symposium on Parameterized and Exact Computation (IPEC 2023)},
pages = {30:1--30:12},
year = {2023},
organization = {Schloss-Dagstuhl-Leibniz Zentrum f{"u}r Informatik},
doi = {10.4230/LIPIcs.IPEC.2023.30}
}