New articles on Quantitative Biology


[1] 2601.16384

Modeling tumor progression in heterogeneous microenvironments: A cellular automata approach

Understanding how microenvironmental heterogeneity influences tumor progression is essential for advancing both cancer biology and therapeutic strategies. In this study, we develop a cellular automata (CA) model to simulate tumor growth under varying microenvironmental conditions and genetic mutation rates, addressing a gap in existing studies that rarely integrate these two factors to explain tumor dynamics. The model explicitly incorporates the cellular heterogeneity of stem and non-stem cells, dynamic cell-cell interactions, and tumor-microenvironment crosstalk. Using computational simulations, we examine the synergistic effects of gene mutation rate, initial tumor burden, and microenvironmental state on tumor progression. Our results demonstrate that lowering the mutation rate significantly mitigates tumor expansion and preserves microenvironmental integrity. Interestingly, the initial tumor burden has a limited impact, whereas the initial condition of the microenvironment critically shapes tumor dynamics. A supportive microenvironment promotes proliferation and spatial invasion, while inhibitory conditions suppress tumor growth. These findings highlight the key role of microenvironmental modulation in tumor evolution and provide computational insights that may inform more effective cancer therapies.
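
As a rough illustration of the kind of cellular automaton described above, the following minimal sketch grows a tumor from a single stem cell on a square grid. Everything in it is an assumption made for illustration and is not taken from the paper: the names `simulate`, `support`, and `p_stem`, the von Neumann neighborhood, and the single scalar `support` standing in for the microenvironmental state. Mutation and tumor-microenvironment feedback are not modeled.

```python
import random

EMPTY, STEM, NONSTEM = 0, 1, 2


def neighbors(r, c, n):
    """Yield the von Neumann neighbors of (r, c) on an n x n grid (no wraparound)."""
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < n and 0 <= cc < n:
            yield rr, cc


def step(grid, support, p_stem, rng):
    """One update: each tumor cell tries to divide into a random empty neighbor."""
    n = len(grid)
    cells = [(r, c) for r in range(n) for c in range(n) if grid[r][c] != EMPTY]
    rng.shuffle(cells)  # random update order avoids a directional bias
    for r, c in cells:
        # Hypothetical rule: division probability equals the support level.
        if rng.random() >= support:
            continue
        free = [(rr, cc) for rr, cc in neighbors(r, c, n) if grid[rr][cc] == EMPTY]
        if not free:
            continue
        rr, cc = rng.choice(free)
        # Stem cells self-renew with probability p_stem; otherwise the
        # daughter is a non-stem cell. Non-stem cells only make non-stem cells.
        if grid[r][c] == STEM and rng.random() < p_stem:
            grid[rr][cc] = STEM
        else:
            grid[rr][cc] = NONSTEM


def simulate(support, steps=30, n=41, p_stem=0.2, seed=0):
    """Return the number of tumor cells after `steps` updates."""
    rng = random.Random(seed)
    grid = [[EMPTY] * n for _ in range(n)]
    grid[n // 2][n // 2] = STEM  # single seeded stem cell
    for _ in range(steps):
        step(grid, support, p_stem, rng)
    return sum(cell != EMPTY for row in grid for cell in row)


if __name__ == "__main__":
    for level in (0.0, 0.5, 1.0):
        print(level, simulate(level))
```

With `support=0.0` no cell ever divides, while `support=1.0` lets every cell with a free neighbor divide at each step, which loosely mirrors the inhibitory versus supportive contrast in the abstract.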


[2] 2601.16593

Adaptive dynamics of eco-evolutionary repeated games: Effect of reward and punishment

Long-term evolutionary processes can strongly influence common-pool resource conservation by generating new traits or behaviours that modify the feedback between population strategies and the resource state. Here we develop an eco-evolutionary framework in which individuals repeatedly interact with the same opponent and follow direct reciprocity through reactive strategies. The strategic dynamics are coupled to a renewable common resource and analyzed using adaptive dynamics. After an exhaustive non-linear dynamical analysis of $2\times2$ strategic games, we focus on the comparative and combined usefulness of institutional incentives, in the form of rewards and punishments, in preventing the Tragedy of the Commons even when defection dominates in the replete resource state. We also report the possibility of robust stable oscillations, emerging via a Hopf bifurcation, in the resource state and population strategies.


[3] 2601.16689

Neural Agonist-Antagonist Coupling in the Absence of Mechanical Coupling after Targeted Muscle Reinnervation

Following limb amputation and targeted muscle reinnervation (TMR), nerves supplying agonist and antagonist muscles are rerouted into separate targeted muscles, disrupting natural neuromechanical coupling between muscle groups. Using high-density intramuscular microelectrode arrays in reinnervated muscles, we show that neural signals for agonist and antagonist tasks remain functionally coupled: motor units active during agonist tasks were also recruited during corresponding antagonist tasks, despite no visual feedback on coactivation being provided.


[4] 2601.16378

Cognitively-Inspired Tokens Overcome Egocentric Bias in Multimodal Models

Multimodal language models (MLMs) perform well on semantic vision-language tasks but fail at spatial reasoning that requires adopting another agent's visual perspective. These errors reflect a persistent egocentric bias and raise questions about whether current models support allocentric reasoning. Inspired by human spatial cognition, we introduce perspective tokens, specialized embeddings that encode orientation through either (1) embodied body-keypoint cues or (2) abstract representations supporting mental rotation. Integrating these tokens into LLaVA-1.5-13B yields improved performance on level-2 visual perspective-taking tasks. Across synthetic and naturalistic benchmarks (Isle Bricks V2, COCO, 3DSRBench), perspective tokens improve accuracy, with rotation-based tokens generalizing to non-human reference agents. Representational analyses reveal that fine-tuning enhances latent orientation sensitivity already present in the base model, suggesting that MLMs contain precursors of allocentric reasoning but lack appropriate internal structure. Overall, embedding cognitively grounded spatial structure directly into token space provides a lightweight, model-agnostic mechanism for perspective-taking and more human-like spatial reasoning.


[5] 2106.07292

Ultrafast topological data analysis reveals pandemic-scale dynamics of convergent evolution

Genome variants which re-occur independently across evolutionary lineages are key molecular signatures of adaptation. Inferring the dynamics of such genetic changes from pandemic-scale genomic datasets is now possible, which opens up unprecedented insight into evolutionary processes. However, existing approaches depend on the construction of accurate phylogenetic trees, which remains challenging at scale. Here we present EVOtRec, an organism-agnostic, fast and scalable Topological Data Analysis approach that enables the inference of convergently evolving genomic variants over time directly from topological patterns in the dataset, without requiring the construction of a phylogenetic tree. Using data from both simulations and published experiments, we show that EVOtRec can robustly identify variants under positive selection and performs orders of magnitude faster than state-of-the-art phylogeny-based approaches, with comparable results. We apply EVOtRec to three large viral genome datasets: SARS-CoV-2, influenza virus A subtype H5N1 and HIV-1. We identify key convergent genome variants and demonstrate how EVOtRec facilitates the real-time tracking of high fitness variants in large datasets with millions of genomes, including effects modulated by varying genomic backgrounds. We envision our Topological Data Analysis approach as a new framework for efficient comparative genomics.


[6] 2309.15566

Simultaneity of consciousness with physical reality: the key that unlocks the mind-matter problem

The problem of explaining the relationship between subjective experience and physical reality remains difficult and unresolved. In most explanations, consciousness is epiphenomenal, without causal power. The most notable exception is Integrated Information Theory (IIT), which provides a causal explanation for consciousness. However, IIT relies on an identity between subjectivity and a particular type of physical structure, namely an information structure that has intrinsic causal power greater than the sum of its parts. Any theory that relies on a psycho-physical identity must eventually appeal to panpsychism, which undermines that theory's claim to be fundamental. IIT has recently pivoted towards a strong version of causal emergence, but macroscopic causal structures cannot be causally stronger than their microscopic parts without some new physical law or governing principle. The approach taken here is designed to uncover such a principle. The decisive argument is entirely deductive from initial premises that are phenomenologically certain. If correct, the arguments prove that conscious experience is sufficient to create additional degrees of causal freedom independently of the content of experience, and in a manner that is unpredictable and unobservable by any temporally sequential means. This provides a fundamental principle about consciousness, and a conceptual bridge between it and the physics describing what is experienced. The principle makes testable predictions about brain function, with notable differences from IIT, some of which are also empirically testable.


[7] 2410.18024

A mathematical framework to study organising principles in graphical representations of biochemical processes

The complexity of molecular and cellular processes forces experimental studies to focus on subsystems. To study the functioning of biological systems across levels of structural and functional organisation, we require tools to compose and organise networks with different levels of detail and abstraction. Systems Biology Graphical Notation (SBGN) is a standardised notational system that visualises biochemical processes as networks. Despite their widespread adoption, SBGN languages remain purely visual and lack an underlying mathematical framework, limiting their compositional analysis, abstraction, and integration with formal modelling approaches. SBGN comprises three complementary visual languages, Process Description (SBGN-PD), Activity Flow (SBGN-AF), and Entity Relationship (SBGN-ER), each operating at a different level of abstraction. In this manuscript, we introduce a category-theoretic formalism for SBGN-PD, a visual language to describe biochemical processes as biochemical reaction networks. Using the theory of structured cospans, we construct a symmetric monoidal double category whose horizontal 1-morphisms correspond to SBGN-PD diagrams. We also analyse how a designated subnetwork influences the surrounding network and how external entities, in turn, affect the internal reactions of the subnetwork. Our work addresses a key gap between biological visualisation and mathematical structure. It provides precise organising principles for SBGN-PD, including compositionality, enabling the construction of large biochemical reaction networks from smaller ones, and zooming out, allowing the abstraction of detailed biochemical mechanisms while preserving their functional interfaces. Throughout the paper, the proposed framework is illustrated using standard SBGN-PD examples, demonstrating its applicability to large-scale biochemical reaction networks.


[8] 2507.13638

State Space Models Naturally Produce Time Cell and Oscillatory Behaviors and Scale to Abstract Cognitive Functions

A grand challenge in modern neuroscience is to bridge the gap between the detailed mapping of microscale neural circuits and mechanistic understanding of cognitive functions. While extensive knowledge exists about neuronal connectivity and biophysics, how these low-level phenomena eventually produce abstract behaviors remains largely unresolved. Here, we propose that a framework based on State Space Models, an emerging class of deep learning architectures, can help bridge this gap. We suggest that the differential equations governing elements in a State Space Model are conceptually consistent with the dynamics of biophysical processes, while the model offers a scalable framework to build on the dynamics to produce emergent behaviors observed in experimental neuroscience. We test this framework by training a model employing a diagonal state transition matrix on temporal discrimination tasks with reinforcement learning. Our results suggest that neural behaviors such as time cells naturally emerge from two fundamental principles: optimal pre-configuration and rotational dynamics. These features are shown mathematically to optimize history compression, and naturally generate structured temporal dynamics even prior to training, mirroring recent findings in biological circuits. We show that learning acts primarily as a selection mechanism that fine-tunes these pre-configured oscillatory modes, rather than constructing temporal codes de novo. The model can be readily scaled to abstract cognitive functions such as event counting, supporting the use of State Space Models as a computationally tractable framework for understanding neural activities.
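
To make the idea of rotational dynamics under a diagonal state transition matrix concrete, here is a minimal, untrained sketch. It assumes a discrete-time linear recurrence with hand-picked complex eigenvalues and reads out the imaginary part of each state; the function names, the readout choice, and the parameter values are illustrative assumptions, not the paper's model or training setup.

```python
import cmath


def run_diagonal_ssm(inputs, radii, angles):
    """Simulate x_k[t+1] = lambda_k * x_k[t] + u[t] with lambda_k = r_k * exp(i * theta_k).

    Because the transition matrix is diagonal, each state evolves independently
    as a decaying rotation in the complex plane.
    Returns, for every time step, the imaginary part of each state.
    """
    lambdas = [r * cmath.exp(1j * th) for r, th in zip(radii, angles)]
    state = [0j] * len(lambdas)
    trajectory = []
    for u in inputs:
        state = [lam * x + u for lam, x in zip(lambdas, state)]
        trajectory.append([x.imag for x in state])
    return trajectory


def peak_times(trajectory):
    """Index of the maximum response of each unit."""
    n_units = len(trajectory[0])
    return [
        max(range(len(trajectory)), key=lambda t: trajectory[t][k])
        for k in range(n_units)
    ]


if __name__ == "__main__":
    impulse = [1.0] + [0.0] * 59          # a single cue at t = 0
    radii = [0.98, 0.98, 0.98]            # slow decay (assumed values)
    angles = [0.3, 0.2, 0.1]              # rotation per step (assumed values)
    traj = run_diagonal_ssm(impulse, radii, angles)
    print(peak_times(traj))               # slower rotations peak later
```

After a single cue, units with smaller rotation angles reach their maximum later, so a bank of such modes tiles the delay interval in a way loosely reminiscent of time cells, even though nothing here has been trained.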


[9] 2508.09871

Inference of germinal center evolutionary dynamics via simulation-based deep learning

B cells and the antibodies they produce are vital to health and survival, motivating research on the details of the mutational and evolutionary processes in the germinal centers (GC) from which mature B cells arise. It is known that B cells with higher affinity for their cognate antigen (Ag) will, on average, tend to have more offspring. However, the exact form of this relationship between affinity and fecundity, which we call the "affinity-fitness response function", is not known. Here we use deep learning and simulation-based inference to learn this function from a unique experiment that replays a particular combination of GC conditions many times. All code is freely available at this https URL, while datasets and inference results can be found at this https URL.


[10] 2511.18142

SEIR models with host heterogeneity: theoretical aspects and applications to seasonal influenza dynamics

Population heterogeneity is a key factor in epidemic dynamics, influencing both transmission and final epidemic size. While heterogeneity is often modelled through age structure, spatial location, or contact patterns, differences in host susceptibility have recently gained attention, particularly during the COVID-19 pandemic. Building on the framework of Diekmann and Inaba (Journal of Mathematical Biology, 2023), we focus on the special case of SEIR epidemic models, assuming that there is no pre-existing immunity at the start of the epidemic. Under two distinct assumptions linking susceptibility and infectiousness, one obtains a closed system of 3 ODEs, which can be easily simulated and for which some analytical results are obtained. In particular, we prove that heterogeneity in susceptibility reduces the epidemic final size compared to homogeneous models with the same basic reproduction number $R_0$. We then specialise to the case where susceptibility is distributed according to a gamma or extended Beta distribution, showing how the epidemic final size depends on the variance of the distribution. In the case of a gamma-distributed susceptibility, the resulting model consists of a system of ODEs with just one parameter more than the classical SEIR model; this makes it practical for fitting epidemic data. We illustrate its use by fitting data on seasonal influenza in Italy, and comparing the results to those obtained with simple SEIR models with pre-existing immunity.
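
The final-size effect can be checked numerically with a small sketch. It assumes the textbook reduction for gamma-distributed susceptibility with mean 1 and shape $k$, with infectiousness independent of susceptibility, in which the infection term $\beta I S$ becomes $\beta I S^{1+1/k}$; this may or may not match either of the paper's two assumptions. The function names, parameter values, and the hand-written RK4 integrator are illustrative choices.

```python
import math


def seir_gamma_rhs(y, beta, sigma, gamma, k):
    """Right-hand side of the reduced SEIR system (fractions of the population).

    k is the gamma shape parameter (variance of susceptibility = 1/k);
    k = math.inf recovers the classical homogeneous SEIR model.
    """
    s, e, i = y
    new_infections = beta * i * s ** (1.0 + 1.0 / k)
    return (-new_infections, new_infections - sigma * e, sigma * e - gamma * i)


def final_size(r0, k, sigma=0.5, gamma=0.25, i0=1e-4, t_end=1000.0, dt=0.1):
    """Integrate with classical RK4 and return 1 - S at the end of the run."""
    beta = r0 * gamma
    y = (1.0 - i0, 0.0, i0)
    for _ in range(round(t_end / dt)):
        k1 = seir_gamma_rhs(y, beta, sigma, gamma, k)
        y2 = tuple(a + 0.5 * dt * b for a, b in zip(y, k1))
        k2 = seir_gamma_rhs(y2, beta, sigma, gamma, k)
        y3 = tuple(a + 0.5 * dt * b for a, b in zip(y, k2))
        k3 = seir_gamma_rhs(y3, beta, sigma, gamma, k)
        y4 = tuple(a + dt * b for a, b in zip(y, k3))
        k4 = seir_gamma_rhs(y4, beta, sigma, gamma, k)
        y = tuple(
            a + dt / 6.0 * (b1 + 2.0 * b2 + 2.0 * b3 + b4)
            for a, b1, b2, b3, b4 in zip(y, k1, k2, k3, k4)
        )
    return 1.0 - y[0]


if __name__ == "__main__":
    print("homogeneous :", final_size(2.0, math.inf))
    print("gamma, k = 1:", final_size(2.0, 1.0))
```

Under these assumptions and $R_0 = 2$, the homogeneous model infects close to 80% of the population, while $k = 1$ gives a final size near 50%, in line with the qualitative claim that heterogeneity in susceptibility reduces the final size at fixed $R_0$.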


[11] 2601.12054

Automated Place Preference Paradigm for Optogenetic Stimulation of the Pedunculopontine Nucleus Reveals Motor Arrest-Linked Preference Behavior

Understanding how the brain integrates motor suppression with motivational processes remains a fundamental question in neuroscience. The rostral pedunculopontine nucleus (PPN), a brainstem structure involved in motor control, has been shown to induce transient motor arrest upon optogenetic or electrical stimulation. However, its potential role in linking motor suppression with motivational or reinforcement-related processes remains poorly understood. To further explore the effects of PPN stimulation and to infer the mechanism underlying its role in both motor and emotional regulation, we developed a fully automated, low-cost system combining real-time animal tracking with closed-loop optogenetic stimulation, using the OpenMV Cam H7 Plus and embedded neural network models. The system autonomously detects the rat's position and triggers optical stimulation upon entry into a predefined region of interest, enabling unbiased, unsupervised behavioral assays. Optogenetic activation of CaMKIIa-expressing neurons in the rostral PPN reliably induced transient motor arrest. When motor arrest was spatially paired with a defined region of interest, rats developed a robust place preference after limited training. These results suggest that rostral PPN activation can couple motor inhibition with reinforcement-related behavioral circuitry. Together, our work provides both a technical framework for scalable closed-loop neuroscience experiments and preliminary evidence that the rostral PPN may participate in coordinating motor suppression with motivational processes.


[12] 2504.03732

SAGe: A Lightweight Algorithm-Architecture Co-Design for Mitigating the Data Preparation Bottleneck in Large-Scale Genome Sequence Analysis

Genome sequence analysis, which examines the DNA sequences of organisms, drives advances in many critical medical and biotechnological fields. Given its importance and the exponentially growing volumes of genomic sequence data, there are extensive efforts to accelerate genome sequence analysis. In this work, we demonstrate a major bottleneck that greatly limits and diminishes the benefits of state-of-the-art genome sequence analysis accelerators: the data preparation bottleneck, where genomic sequence data is stored in compressed form and needs to be first decompressed and formatted before an accelerator can operate on it. To mitigate this bottleneck, we propose SAGe, an algorithm-architecture co-design for highly-compressed storage and high-performance access of large-scale genomic sequence data. The key challenge is to improve data preparation performance while maintaining high compression ratios (comparable to genomic-specific compression algorithms) at low hardware cost. We address this challenge by leveraging key properties of genomic datasets to co-design (i) a lossless (de)compression algorithm, (ii) hardware that decompresses data with lightweight operations and efficient streaming accesses, (iii) storage data layout, and (iv) interface commands to access data. SAGe is highly versatile, as it supports datasets from different sequencing technologies and species. Due to its lightweight design, SAGe can be seamlessly integrated with a broad range of hardware accelerators for genome sequence analysis to mitigate their data preparation bottlenecks. Our results demonstrate that SAGe improves the average end-to-end performance and energy efficiency of two state-of-the-art genome sequence analysis accelerators by 3.0x-32.1x and 13.0x-34.0x, respectively, compared to when the accelerators rely on state-of-the-art software and hardware decompression tools.


[13] 2506.22633

Optimizing information transmission in optogenetic Wnt signaling

Populations of cells regulate gene expression in response to external signals, but their ability to make reliable collective decisions is limited by both intrinsic noise in molecular signaling and variability between individual cells. In this work, we use optogenetic control of the canonical Wnt pathway as an example to study how reliably information about an external signal is transmitted to a population of cells, and determine an optimal encoding strategy to maximize information transmission from Wnt signals to gene expression. We find that it is possible to reach an information capacity beyond 1 bit only through an appropriate, discrete encoding of signals: using either no Wnt, a short Wnt pulse, or a sustained Wnt signal. By averaging over an increasing number of outputs, we systematically vary the effective noise in the pathway. As the effective noise decreases, the optimal encoding comprises more discrete input signals. These signals do not need to be fine-tuned to achieve near-optimal information transmission. The optimal code transitions into a continuous code in the small-noise limit, which can be shown to be consistent with the Jeffreys prior. We visualize the performance of different signal encodings using decoding maps. Our results suggest optogenetic Wnt signaling allows for regulatory control beyond a simple binary switch, and provides a framework to apply ideas from information processing to single-cell in vitro experiments.
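
The notion of information capacity used above can be illustrated with the Blahut-Arimoto algorithm, a standard way to compute the capacity of a discrete memoryless channel together with its optimal input distribution. The abstract does not say which numerical method the authors use, so this is a generic sketch; the function name and the toy channels are assumptions for illustration.

```python
import math


def blahut_arimoto(channel, iters=200):
    """Estimate capacity (in bits) of a discrete channel.

    channel[x][y] = P(output y | input x); each row must sum to 1.
    Returns (capacity_bits, optimal_input_distribution).
    """
    n_in, n_out = len(channel), len(channel[0])
    p = [1.0 / n_in] * n_in  # start from a uniform input distribution

    def output_marginal(p_in):
        return [sum(p_in[x] * channel[x][y] for x in range(n_in)) for y in range(n_out)]

    def kl_rows(q):
        # KL divergence (nats) between each row of the channel and the marginal q.
        return [
            sum(
                channel[x][y] * math.log(channel[x][y] / q[y])
                for y in range(n_out)
                if channel[x][y] > 0.0
            )
            for x in range(n_in)
        ]

    for _ in range(iters):
        d = kl_rows(output_marginal(p))
        weights = [p[x] * math.exp(d[x]) for x in range(n_in)]
        total = sum(weights)
        p = [w / total for w in weights]

    d = kl_rows(output_marginal(p))
    capacity_nats = sum(p[x] * d[x] for x in range(n_in))
    return capacity_nats / math.log(2.0), p


if __name__ == "__main__":
    # Three perfectly distinguishable inputs (e.g. none / pulse / sustained).
    ideal = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    print(blahut_arimoto(ideal)[0])   # log2(3) bits
    # Binary symmetric channel with 10% error.
    bsc = [[0.9, 0.1], [0.1, 0.9]]
    print(blahut_arimoto(bsc)[0])
```

A noiseless binary channel cannot exceed 1 bit, whereas three perfectly distinguishable inputs reach $\log_2 3 \approx 1.58$ bits, which is the sense in which a third discrete signal can push capacity beyond 1 bit.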


[14] 2510.13018

Escaping Local Optima in the Waddington Landscape: A Two-Stage TRPO-PPO Approach for Single-Cell Perturbation Analysis

Modeling cellular responses to genetic and chemical perturbations remains a central challenge in single-cell biology. Existing data-driven frameworks have advanced perturbation prediction through variational autoencoders, chemically conditioned autoencoders, and large-scale transformer pretraining. However, most existing models rely exclusively on either in silico perturbation data or experimental perturbation data but rarely integrate both, limiting their ability to generalize and validate predictions across simulated and real biological contexts in a digital twin system. Moreover, the models are prone to local optima in the nonconvex Waddington landscape of cell fate decisions, where poor initialization can trap trajectories in spurious lineages. In this work, we introduce a two-stage reinforcement learning algorithm for modeling single-cell perturbation. We first compute an explicit natural gradient update using Fisher-vector products and a conjugate gradient solver, scaled by a KL trust-region constraint to provide a safe, curvature-aware first step for the policy. Starting with these preconditioned parameters, we then apply a second phase of proximal policy optimization (PPO) with a KL penalty, exploiting minibatch efficiency to refine the policy. We demonstrate that this initialization strategy substantially improves generalization on single-cell RNA sequencing (scRNA-seq) perturbation analysis in a digital twin system.
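
The first stage described above, a natural gradient step obtained from Fisher-vector products and conjugate gradient and then scaled to a KL trust region, can be sketched in a few lines. This is a generic TRPO-style illustration under stated assumptions, not the authors' implementation: the Fisher matrix is a small explicit matrix standing in for an autodiff Fisher-vector product, the KL divergence is approximated by its quadratic form, and the names `conjugate_gradient`, `natural_gradient_step`, and `max_kl` are hypothetical. The PPO refinement stage is not shown.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def conjugate_gradient(fvp, g, iters=10, tol=1e-10):
    """Approximately solve F x = g using only Fisher-vector products fvp(v) = F v."""
    x = [0.0] * len(g)
    r = list(g)   # residual g - F x (x starts at zero)
    p = list(g)   # search direction
    rs = dot(r, r)
    for _ in range(iters):
        if rs < tol:
            break
        fp = fvp(p)
        alpha = rs / dot(p, fp)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * fpi for ri, fpi in zip(r, fp)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x


def natural_gradient_step(fvp, g, max_kl=0.01):
    """Scale the natural gradient direction so that 0.5 * step^T F step = max_kl."""
    direction = conjugate_gradient(fvp, g)
    quad = dot(direction, fvp(direction))
    scale = math.sqrt(2.0 * max_kl / quad)
    return [scale * d for d in direction]


if __name__ == "__main__":
    fisher = [[2.0, 0.0], [0.0, 0.5]]   # toy stand-in for the Fisher matrix

    def fvp(v):
        return [dot(row, v) for row in fisher]

    grad = [1.0, 1.0]
    print(conjugate_gradient(fvp, grad))      # close to F^{-1} g
    print(natural_gradient_step(fvp, grad))
```

The resulting step moves further along low-curvature directions than a plain gradient step would, while the quadratic KL estimate stays at the chosen `max_kl`.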


[15] 2601.02530

Multi-scale Graph Autoregressive Modeling: Molecular Property Prediction via Next Token Prediction

We present Connection-Aware Motif Sequencing (CamS), a graph-to-sequence representation that enables decoder-only Transformers to learn molecular graphs via standard next-token prediction (NTP). For molecular property prediction, SMILES-based NTP scales well but lacks explicit topology, whereas graph-native masked modeling captures connectivity but risks disrupting the pivotal chemical details (e.g., activity cliffs). CamS bridges this gap by serializing molecular graphs into structure-rich causal sequences. CamS first mines data-driven connection-aware motifs. It then serializes motifs via scaffold-rooted breadth-first search (BFS) to establish a stable core-to-periphery order. Crucially, CamS enables hierarchical modeling by concatenating sequences from fine to coarse motif scales, allowing the model to condition global scaffolds on dense, uncorrupted local structural evidence. We instantiate CamS-LLaMA by pre-training a vanilla LLaMA backbone on CamS sequences. It achieves state-of-the-art performance on MoleculeNet and the activity-cliff benchmark MoleculeACE, outperforming both SMILES-based language models and strong graph baselines. Interpretability analysis confirms that our multi-scale causal serialization effectively drives attention toward cliff-determining differences.
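
The core-to-periphery ordering can be illustrated with a plain breadth-first traversal of a motif graph rooted at the scaffold. The sketch below is a hypothetical simplification: the motif names, the adjacency-dictionary input, and the alphabetical tie-breaking among neighbors are assumptions for illustration, and the motif mining and multi-scale concatenation steps of CamS are not shown.

```python
from collections import deque


def bfs_serialize(adjacency, root):
    """Return motif identifiers in breadth-first order starting from `root`.

    adjacency maps each motif to the motifs it is connected to.
    Neighbors are visited in sorted order so the output is deterministic.
    """
    order = []
    seen = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in sorted(adjacency.get(node, ())):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return order


if __name__ == "__main__":
    # Hypothetical motif graph: a scaffold with two branches.
    motif_graph = {
        "scaffold": ["ring_A", "linker"],
        "ring_A": ["scaffold", "OH"],
        "linker": ["scaffold", "ring_B"],
        "ring_B": ["linker"],
        "OH": ["ring_A"],
    }
    print(bfs_serialize(motif_graph, "scaffold"))
```

Because the traversal proceeds level by level from the root, motifs adjacent to the scaffold always precede peripheral ones, which gives a stable sequence for a decoder-only model to consume.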