New articles on Quantitative Finance


[1] 2605.13998

Synthetic American Option Pricing via Jump-HMM-Driven Heston Implied Volatility

Generating realistic synthetic option prices requires implied volatility as an input, yet implied volatility is itself derived from observed option prices, creating a circular dependency that limits synthetic data for machine-learning and risk-analysis applications. We break this circularity with a pipeline in which implied volatility emerges as an output of a structural model of equity returns. A Jump Hidden Markov Model produces multi-asset price paths with realistic stylized facts and cross-asset tail dependence; a modified Heston variance process, whose mean-reversion target depends on regime state, days to expiration, moneyness, and a market-mood indicator, converts those paths into implied-volatility paths; and a recombining binomial lattice prices American options from the resulting surface. Initializing variance at its mean-reversion target for each strike-expiration pair lets smile, skew, and term structure emerge without external calibration. We calibrate the shape function through a hierarchy spanning a parametric baseline, a globally shared neural surrogate, and a sector-specific neural surrogate fit to a multi-ticker, multi-sector option ladder. A temporal holdout on a multi-day capture isolates scheduled corporate events as the dominant source of test-time generalization error, and calendar-derived earnings-distance and same-sector peer-coupling features recover the anticipatory portion of that signal. We then apply the framework as a synthetic-data generator on real near-the-money put and call contracts, forward-simulating price paths and recovering path-conditional implied volatility, finite-difference American Greeks, and terminal short-premium profit and loss from one coherent simulation, and we confirm cross-ticker robustness by re-running on a second underlying from a different sector and volatility regime. The framework is released as an open-source Julia package.
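
As a rough sketch of the lattice step described above, the following Python function implements the textbook Cox-Ross-Rubinstein recombining binomial pricer for an American put; the paper's released package is in Julia, and this minimal version (with an assumed constant implied volatility in place of the paper's path-conditional surface) is ours, not the authors' code.

```python
import numpy as np

def american_put_crr(S0, K, T, r, sigma, n=200):
    """Textbook CRR recombining lattice for an American put. The paper's
    pipeline would feed in a path-conditional implied vol; here sigma is
    an assumed constant (illustrative sketch only)."""
    dt = T / n
    u = np.exp(sigma * np.sqrt(dt))
    d = 1.0 / u
    p = (np.exp(r * dt) - d) / (u - d)            # risk-neutral up probability
    disc = np.exp(-r * dt)
    S = S0 * u ** np.arange(n, -1, -1) * d ** np.arange(0, n + 1)  # terminal prices
    V = np.maximum(K - S, 0.0)                     # terminal payoffs
    for step in range(n, 0, -1):
        S = S[:step] * d                           # asset prices one layer back
        cont = disc * (p * V[:step] + (1 - p) * V[1:step + 1])
        V = np.maximum(K - S, cont)                # early-exercise check
    return V[0]

print(american_put_crr(S0=100.0, K=100.0, T=0.5, r=0.03, sigma=0.25))
```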


[2] 2605.14493

Deep Learning for Solving and Estimating Dynamic Models in Economics and Finance

These lecture notes offer an implementation-oriented introduction to deep learning methods for solving and estimating high-dimensional dynamic stochastic models in economics and finance. Their starting point is the curse of dimensionality: heterogeneous-agent economies, overlapping-generations models with aggregate risk, continuous-time models with occasionally binding constraints, climate-economy models, and macro-finance environments with many assets and frictions generate state and parameter spaces that strain classical tensor-product grid methods. The exposition is organized around four complementary methodologies. Deep Equilibrium Nets embed discrete-time equilibrium conditions into neural-network loss functions. Physics-Informed Neural Networks approximate continuous-time Hamilton--Jacobi--Bellman, Kolmogorov forward, and related partial differential equations. Deep surrogate models provide fast, differentiable approximations to expensive structural models, while Gaussian processes add a probabilistic layer that quantifies approximation uncertainty; together they support estimation, sensitivity analysis, and constrained policy design. Gaussian-process-based dynamic programming, combined with active learning and dimension reduction, extends value-function iteration to very large continuous state spaces. Applications span representative-agent and international real business cycle models, overlapping-generations and heterogeneous-agent economies, continuous-time macro-finance, structural estimation by simulated method of moments, and climate economics under uncertainty. Companion notebooks in TensorFlow and PyTorch invite hands-on experimentation. These notes are a deliberately subjective and inevitably incomplete snapshot of a rapidly evolving field, aimed at equipping PhD students and researchers to engage with this frontier.
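
To make the Deep Equilibrium Nets idea concrete, here is a minimal PyTorch sketch of the pattern the notes describe: a policy network trained by minimizing squared equilibrium-condition residuals over sampled states. The toy deterministic consumption-savings model, the log-utility Euler equation, and all parameter values are our illustrative assumptions, not an example taken from the notes.

```python
import torch

# Toy deterministic consumption-savings problem (illustrative assumptions):
beta, R = 0.96, 1.03                      # discount factor and gross return
net = torch.nn.Sequential(                # wealth -> consumption share in (0, 1)
    torch.nn.Linear(1, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 1), torch.nn.Sigmoid())
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(2000):
    w = torch.rand(256, 1) * 10 + 0.1     # sample wealth states
    c = net(w) * w                        # consumption implied by the policy
    w_next = R * (w - c)                  # budget constraint
    c_next = net(w_next) * w_next
    # Log-utility Euler equation residual: 1/c - beta * R / c' = 0 in equilibrium
    resid = 1.0 / c - beta * R / c_next
    loss = (resid ** 2).mean()            # equilibrium conditions become the loss
    opt.zero_grad(); loss.backward(); opt.step()
```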


[3] 2605.14575

The Asset Price Channel of Monetary Policy: Evidence from Regional Stock-Market Developments in the Successor States of Former Yugoslavia

The aim of this study is to empirically investigate the existence of a sectoral asset price channel of monetary policy in the region of the six republics of former Yugoslavia. The study constructs sectoral indices for the entire region, building on the idea that one regional stock exchange may provide more efficiency for the listed companies in the region, while the relevance of monetary policy for it may be sector-specific. We employ a panel vector autoregressive model to observe impulse responses of sectoral indices to innovations in monetary policy, and then disentangle the long-run from the short-run relationships per index through a Pooled Mean Group estimation. Overall, we document the presence of the asset price channel in the finance and telecom sectors, likely driven by the established multinational corporate networks fostering sub-market regionalization. This is not the case for the manufacturing and electricity sectors, which may imply that local stock markets are still too fragmented and that space certainly exists for a more efficient regional stock market, either in the true sense of the word or, more realistically, through enhanced regional cooperation of the stock exchanges.


[4] 2605.12508

Interoperability Effects: Extending DeFi Lending Risk Models to Multi-Chain Environments

On-chain lending has expanded across multiple distributed ledgers as DeFi becomes increasingly multi-chain. This environment introduces novel technical and financial mechanisms, particularly cross-blockchain communication and asset-transfer protocols, yet cross-chain elements remain understudied in lending-protocol risk management. To address this gap, we apply panel fixed-effects regression and OLS models to empirically analyze cross-blockchain interoperability solutions, using total value locked (TVL) and total revenue as performance proxies from October 2022 to January 2025. Our data set covers 15 decentralized lending protocols and 53 cross-chain bridges across 9 EVM-compatible blockchains, categorized as Ethereum, alternative layer-1s, and Ethereum layer-2 networks. Results reveal that cross-chain activity affects protocol performance. Bridge volume emerges as a critical driver, exerting a significant effect on TVL and revenue across categories, though the direction of this effect varies. Increased bridge integrations are associated with decreased TVL and protocol revenue across categories, indicating that liquidity escapes those lending ecosystems. Liquidations produce heterogeneous effects across categories. New network launches show no comparably significant relationship with TVL and revenue, while bridge hacks show a significant and positive relationship. High R-squared values confirm meaningful explanatory power. We further show that Ethereum attracts large depositors, while layer-2s skew toward retail participation. We conclude that effective DeFi risk models should incorporate cross-chain metrics and adopt a layer-aware approach to accurately reflect the evolving multi-chain landscape.
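
For readers who want the estimation pattern, a minimal two-way fixed-effects regression with protocol-clustered standard errors might look like the Python sketch below; the file name, column names, and specification are hypothetical stand-ins, not the paper's actual data or model.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical panel: one row per (protocol, month). All column names are
# illustrative stand-ins for the paper's variables, not its actual data.
df = pd.read_csv("lending_panel.csv")
df["log_tvl"] = np.log(df["tvl"])
df["log_bridge_volume"] = np.log(df["bridge_volume"])

# Two-way fixed effects via dummy absorption: C() expands protocol and
# month dummies; standard errors are clustered at the protocol level.
fe = smf.ols("log_tvl ~ log_bridge_volume + liquidations + C(protocol) + C(month)",
             data=df).fit(cov_type="cluster", cov_kwds={"groups": df["protocol"]})
print(fe.params.filter(like="log_bridge_volume"))
```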


[5] 2605.13866

AI Alignment Amplifies the Role of Race, Gender, and Disability in Hiring Decisions

Humans increasingly delegate decisions to language models, yet whether these systems reproduce or reshape human patterns of discrimination remains unclear. Here we run a large-scale study to analyse whether language models use demographic information in hiring decisions. We show, across 27 models and 177 occupations, that language models give female and Black candidates hiring advantages relative to otherwise-comparable male and white candidates, while giving disabled candidates disadvantages. The differences are meaningful in magnitude: the role of race, gender, and disability status is comparable to six months to one year of additional education. Post-training alignment is the primary driver: relative to matched pre-trained models, alignment amplifies advantages for female and Black candidates by 325% and 330%, respectively, and disadvantages for disabled candidates by 171%. Compared with previous human correspondence studies, language models reverse the direction of racial discrimination, attenuate the disability penalty, and amplify the female advantage by 190%. Alignment changes how models use qualification signals: it increases returns to skills and work experience overall, but relatively more so for female and Black candidates. Meanwhile, the absence of qualification signals harms marginalised groups more, particularly disabled candidates; these differences may explain the asymmetry of alignment effects across groups that we observe.


[6] 2605.14976

Multi-regime Markov-switching models with time-varying transition probabilities: An application to U.S. Treasury yields

This paper studies Markov-switching (MS) models with time-varying transition probabilities (TVTP) under various specifications of the transition probability matrix. In particular, we extend the two-regime common-variance setting of the Generalized Autoregressive Score (GAS) model of Bazzi et al. (2017) to the general $K$-regime case with regime-specific means and variances. We conduct comprehensive Monte Carlo simulations and develop an open-source R package, \texttt{multiregimeTVTP}, for data simulation and parameter estimation. We find that the regime means, variances, and transition probabilities are reliably recovered, whereas the TVTP driving coefficients are harder to identify. We also find that the GAS score coefficient appears to be statistically non-identifiable, owing to a ridge in the joint likelihood surface over $(\sigma^2, A)$. In addition, one-step point forecasts are remarkably robust to TVTP misspecification, but filtered regime probabilities are not, so correct specification matters more for characterizing regime dynamics than for short-horizon forecasting. An empirical application to U.S. Treasury zero-coupon yield changes at four maturities (1961-2024) shows that an exogenous specification driven by the lagged yield level dominates the constant and lagged-change models in fit, while the GAS specification fails to converge, with $\hat{A}$ collapsing to zero, reflecting the same identifiability issue observed in simulation.
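
A minimal sketch of the data-simulation side (the paper's package is in R; this Python version is ours) for two regimes, with the probability of staying in the current regime following a logistic link in a lagged exogenous driver:

```python
import numpy as np

def simulate_tvtp_ms(T, mu, sigma, a, b, z, seed=0):
    """Simulate a 2-regime Markov-switching series where the staying
    probability follows a logistic link in a lagged exogenous driver z.
    Illustrative parametrization; the paper's R package multiregimeTVTP
    handles the general K-regime case and other TVTP specifications."""
    rng = np.random.default_rng(seed)
    s = np.zeros(T, dtype=int)
    y = np.zeros(T)
    y[0] = mu[0] + sigma[0] * rng.standard_normal()
    for t in range(1, T):
        # time-varying probability of staying in the current regime
        p_stay = 1.0 / (1.0 + np.exp(-(a[s[t - 1]] + b[s[t - 1]] * z[t - 1])))
        s[t] = s[t - 1] if rng.random() < p_stay else 1 - s[t - 1]
        y[t] = mu[s[t]] + sigma[s[t]] * rng.standard_normal()
    return y, s

z = np.random.default_rng(1).standard_normal(500)      # exogenous TVTP driver
y, s = simulate_tvtp_ms(500, mu=[0.0, 1.0], sigma=[0.5, 2.0],
                        a=[2.0, 2.0], b=[0.8, -0.8], z=z)
```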


[7] 2407.15536

Calibrating the Heston model with deep differential networks

We propose a gradient-based deep learning framework to calibrate the Heston option pricing model (Heston, 1993). Our neural network, henceforth deep differential network (DDN), learns both the Heston pricing formula for plain-vanilla options and the partial derivatives with respect to the model parameters. The price sensitivities estimated by the DDN are not subject to the numerical issues that can be encountered in computing the gradient of the Heston pricing function. Thus, our network is an excellent pricing engine for fast gradient-based calibrations. Extensive tests on selected equity markets show that the DDN significantly outperforms non-differential feedforward neural networks in terms of calibration accuracy. In addition, it dramatically reduces the computational time with respect to global optimizers that do not use gradient information.
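
The training idea can be sketched as a Sobolev-style loss that penalizes errors in both prices and parameter sensitivities, with the network's input gradients obtained by automatic differentiation. The PyTorch fragment below is a minimal illustration under assumed shapes (five inputs standing in for Heston parameters plus contract terms, with synthetic targets); it is not the authors' architecture.

```python
import torch

net = torch.nn.Sequential(torch.nn.Linear(5, 64), torch.nn.Tanh(),
                          torch.nn.Linear(64, 64), torch.nn.Tanh(),
                          torch.nn.Linear(64, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def train_step(theta, price, dprice_dtheta, lam=1.0):
    """One differential-training step: fit prices and their gradients with
    respect to the inputs (Heston parameters plus contract terms)."""
    theta = theta.clone().requires_grad_(True)
    pred = net(theta)                               # (batch, 1) model prices
    grad = torch.autograd.grad(pred.sum(), theta, create_graph=True)[0]
    loss = ((pred.squeeze(-1) - price) ** 2).mean() \
         + lam * ((grad - dprice_dtheta) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

# Hypothetical batch: 128 samples with synthetic price and gradient targets.
train_step(torch.rand(128, 5), torch.rand(128), torch.rand(128, 5))
```

The point of fitting gradients as well as prices is that a calibrator can then use exact network sensitivities rather than finite differences, which is what makes the network a fast gradient-based pricing engine.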


[8] 2508.16919

Combining a Large Pool of Forecasts of Value-at-Risk and Expected Shortfall

We consider the combination of value-at-risk (VaR) and expected shortfall (ES) forecasts when a large pool of candidate forecasts is available. Given the limited literature in this area, we implement a variety of new combining methods. Among simple methods, in addition to the mean, we consider the median and mode. As a complement to the previously proposed performance-based weighted combinations, we use regularisation to reduce overfitting in the presence of many weights. Treating VaR and ES forecasts jointly as interval forecasts allows the application of adapted interval forecast combination methods, including trimmed means and a mixtures approach based on inferred probability distributions. In an empirical study involving 90 forecasting methods, trimmed mean combinations, the mixtures method, and performance-based weighting delivered particularly strong results. However, greater forecasting accuracy resulted from a pool of just six methods, chosen to ensure diversity, with performance-based weighting producing the best overall performance.
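
As an illustration of one of the combination schemes, a symmetric trimmed mean over a pool of VaR forecasts can be written in a few lines of Python; the trimming fraction and the example pool below are assumptions for illustration, not the paper's settings.

```python
import numpy as np

def trimmed_mean_combine(forecasts, trim=0.2):
    """Combine a pool of VaR (or ES) forecasts by symmetric trimming:
    drop the most extreme trim/2 fraction on each side, average the rest.
    Illustrative sketch; the paper evaluates several trimming variants."""
    f = np.sort(np.asarray(forecasts, dtype=float))
    k = int(len(f) * trim / 2)                 # forecasts dropped per tail
    return f[k:len(f) - k].mean() if len(f) > 2 * k else f.mean()

pool = [-2.1, -1.9, -2.4, -3.5, -1.2, -2.0]    # hypothetical 1% VaR forecasts
print(trimmed_mean_combine(pool, trim=1/3))    # trims one forecast per tail
```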


[9] 2604.24366

The Anatomy of a Decentralized Prediction Market: Microstructure Evidence from the Polymarket Order Book

We study the microstructure of Polymarket, the largest on-chain prediction market, using a continuous tick-level archive of the public order-book feed (30 billion events over 52 days) joined to the authoritative on-chain trade record. On a pre-registered stratified panel of 600 markets we report eight stylized facts: a longshot spread premium; a depth profile closer to uniform than to top-of-book; a null block-clock alignment effect; broad maker-wallet diversity with a concentrated tail; category-conditional effective-spread differences; a sub-50 ms median archive-ingestion delay with a multi-second tail; a self-counterparty wash share with a median of 1% and a 22% upper tail (well below the 25-70% reported by Cong et al. (2023) for unregulated crypto venues -- a sanity bound, not an apples-to-apples reference); and a cross-sectional depth profile explained by market duration, price level, and volume, with no residual time-to-close effect. The paper also contributes a measurement result: trade direction inferred from Polymarket's public order-book feed agrees with on-chain ground truth on only ~59% of buckets (panel mean 0.615, 95% CI [0.58, 0.65]), well below the ~80% Lee-Ready accuracy on Nasdaq. The effective half-spread changes sign between feed-based and on-chain trade directions on 67%/50% of markets across two 7-day windows; Kyle's lambda does so on 60%/43%. Microstructure work on Polymarket therefore needs to source trade direction from on-chain OrderFilled events; we release a replication package that performs the join.
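
For context on the ~80% benchmark, the classic Lee-Ready rule infers trade direction from the prevailing quote midpoint with a tick-rule tie-breaker. A minimal Python sketch follows (simplified: no zero-tick refinement); the paper's point is that this style of feed-based inference performs much worse on Polymarket than on Nasdaq.

```python
def lee_ready(trade_px, mid_px, prev_px):
    """Classic Lee-Ready classification: quote rule first (trade above or
    below the prevailing midpoint), tick rule as a tie-breaker at the
    midpoint. Simplified sketch: equal prices at the midpoint are treated
    as seller-initiated rather than applying the zero-tick refinement."""
    if trade_px > mid_px:
        return +1                               # buyer-initiated
    if trade_px < mid_px:
        return -1                               # seller-initiated
    return +1 if trade_px > prev_px else -1     # tick rule at the midpoint

print(lee_ready(trade_px=0.52, mid_px=0.50, prev_px=0.49))  # -> +1 (buy)
```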


[10] 2605.00493

ForesightFlow: An Information Leakage Score Framework for Prediction Markets

ForesightFlow is an Information Leakage Score (ILS) framework for detecting informed trading on decentralized prediction markets. For an event-resolved binary market, the score quantifies the fraction of the terminal information move priced in before the public news event. Three operational scope conditions (edge effect, non-trivial total move, anchor sensitivity) are stated as preconditions for interpretation. The score admits a Murphy-decomposition reading that connects label generation to the proper-scoring-rule literature. A pilot empirical evaluation surfaces three findings. First, a resolution-anchored proxy for the public-event timestamp fails to separate event-resolved markets from a matched control population in the expected direction (Mann-Whitney p = 1e-6, separation reversed), demonstrating that proxy quality is itself a binding constraint. Second, the article-derived timestamp on a single high-stakes case shifts the score by 0.444 in magnitude relative to the proxy and lies on the opposite side of zero. Third, an audit of the publicly documented Polymarket insider record reveals that documented cases are systematically deadline-resolved, falling outside the original ILS scope (0 of 24 FFIC inventory markets satisfy the original scope conditions). This last finding motivates a deadline-ILS extension introduced in Section 7, anchored at the public-event timestamp rather than the news timestamp, and equipped with a per-category exponential hazard baseline for the time-to-event distribution. The extension closes the gap between the methodology and the population in which insider trading has been empirically documented. An end-to-end evaluation of the extension on the 2026 U.S.-Iran conflict cluster is reported in a companion paper. We release the FFIC inventory, the resolution-typology classification of the 911,237-market corpus, and all code at this http URL.
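
The abstract's definition suggests one natural reading of the score, sketched below; the formula, function name, and the epsilon guard for the non-trivial-total-move scope condition are our reconstruction from the abstract, not the paper's exact definition.

```python
def ils(p0, p_event, p_terminal, eps=1e-6):
    """One plausible reading of the Information Leakage Score: the share of
    the total anchor-to-terminal price move already priced in at the
    public-event timestamp. Reconstruction from the abstract, not the
    paper's definition; p0 is the anchor price, p_event the price just
    before the public news event, p_terminal the resolution price."""
    total_move = p_terminal - p0
    if abs(total_move) < eps:       # scope condition: non-trivial total move
        return float("nan")
    return (p_event - p0) / total_move

print(ils(p0=0.10, p_event=0.55, p_terminal=1.0))  # hypothetical: 0.5 leaked
```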


[11] 2605.02286

Empirical Evaluation of Deadline-Resolved Information Leakage on Documented Polymarket Insider Cases

This paper reports an end-to-end empirical evaluation of the deadline-Information Leakage Score (ILS-dl) extension introduced in the companion methodology paper. The deadline-ILS extends the original ILS to deadline-resolved prediction-market contracts, the dominant structural form of publicly documented insider trading on Polymarket. We anchor the evaluation in the 2026 U.S.-Iran conflict cluster of the ForesightFlow Insider Cases (FFIC) inventory, the largest documented deadline cluster. The evaluation has four parts: per-category exponential-hazard estimation, a single-case ILS-dl computation, cross-market wallet analysis, and methodological refinements. Hazard-rate estimation produces an adequate exponential fit for military-geopolitics markets (KS p = 0.426, half-life 2.9 days, n = 18) and a preliminary fit for corporate-disclosure markets (n = 5). The regulatory-decision category is rejected as bimodal (p = 0.023). On the largest applicable FFIC contract ("US forces enter Iran by April 30," $269M volume), the article-derived public-event timestamp yields ILS-dl = +0.113 versus a resolution-anchored proxy value of -0.331: a shift of 0.444 in magnitude, with the two values on opposite sides of zero, demonstrating that the extension distinguishes signal from proxy artefact. Pre-event drift is mild, and short-window variants (30-min, 2-hour) are exactly zero. Cross-market wallet analysis identifies 332 wallets active in both major Iran-cluster markets, but the available trade history covers only the resolution-settlement window. v2 (May 2026) corrects the hazard fit to the full Tier-3 population; the v1 estimate lies inside the v2 95% CI.
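
The per-category hazard step is standard: for an exponential baseline, the MLE rate is the reciprocal of the mean duration, the half-life is ln(2)/rate, and a KS test gauges fit. A minimal Python sketch (ours, with the usual caveat that a KS test against parameters estimated from the same data is only approximate):

```python
import numpy as np
from scipy import stats

def fit_exponential_hazard(durations):
    """MLE for an exponential time-to-event baseline: rate = 1 / mean.
    Returns (rate, half-life, KS p-value). Caveat: a KS test against
    parameters estimated from the same sample is approximate."""
    d = np.asarray(durations, dtype=float)
    rate = 1.0 / d.mean()                      # exponential MLE
    half_life = np.log(2.0) / rate             # cf. the paper's 2.9 days
    ks = stats.kstest(d, "expon", args=(0.0, 1.0 / rate))
    return rate, half_life, ks.pvalue

# Hypothetical pre-deadline lead times (days) for one market category:
print(fit_exponential_hazard([0.5, 1.2, 2.0, 3.1, 4.8, 6.5, 9.0]))
```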


[12] 2605.02287

Per-Market Information Leakage and Order-Flow Skill: Two Methodological Lenses on Informed Trading in Decentralized Prediction Markets

April 2026 saw notable methodological convergence in the academic study of informed trading on decentralized prediction markets. Three approaches surfaced almost simultaneously: Mitts and Ofir (2026) apply a composite screen to over 210,000 wallet-market pairs; Gomez-Cram et al. (2026) apply an event-level sign-randomization test to Polymarket's complete transaction history, classifying 3.14% of accounts as "skilled winners" and separately flagging 1,950 accounts as "insiders" via a lifecycle heuristic; Nechepurenko (2026) develops the Information Leakage Score (ILS) framework, which quantifies per-market information front-loading at an article-derived public-event timestamp. This paper provides a methodological comparison. The central claim is that these are three distinct layers of detection, not competing methods on a single layer. Sign-randomization is best understood as an account-level test of persistent directional skill conditional on opportunity selection -- not a direct test of insider trading, and not a per-market measure. The heuristic insider flag is separate from the skill classifier, applies to a population the classifier excludes by design, and has unknown precision. The Polymarket sample pools politics, sports, crypto, and other categories with different information technologies, so a platform-wide "skilled winner" classification is mechanism-ambiguous. The January 2026 U.S.-Venezuela operation cluster, where the DOJ indictment of Master Sergeant Gannon Van Dyke provides a rare external enforcement benchmark, illustrates how the layers stack: lifecycle heuristics identify suspicious accounts; legal investigation addresses non-public-information possession; per-market scoring would quantify how much information was leaked into each contract. A combined pipeline gains in precision because each layer filters a different dimension.


[13] 2605.10060

Skill Premia and Pre-Marital Investments in Marriage Markets

I study a decentralized marriage market with search frictions, costly pre-marital skill investments, and non-transferable utility. Despite a symmetric environment, the market can exhibit asymmetric equilibria, with one gender investing more in skills than the other; in some environments, the asymmetric equilibrium is unique. A microfounded model of household utility maximization shows that this transition from a unique symmetric equilibrium to a unique asymmetric equilibrium can be driven by rising labor-market wages for high-skilled workers: as the skill premium rises, one gender ends up fully investing while the other invests substantially less.


[14] 2605.12698

Optimal Investment and Pension Policy in Pay-As-You-Go Systems under Forward Utility and an Ageing Population

This paper investigates optimal investment and pension policies in a Pay-As-You-Go (PAYG) system supplemented by a buffer fund used as an intergenerational risk-sharing mechanism. The social planner's preference criterion is represented by non-zero volatility forward Constant Relative Risk Aversion (CRRA) utilities, and explicitly accounts for both sustainability and adequacy constraints. The optimal policies are characterized in closed form, and an in-depth analysis of the impact of preference sensitivities on the pension scheme is conducted. A detailed numerical analysis is performed to evaluate the sustainability and benefit adequacy of this hybrid PAYG buffer fund arrangement under a range of demographic, financial, and macroeconomic scenarios.


[15] 2410.02091

The Impact of Generative AI on Collaborative Open-Source Software Development: Evidence from GitHub Copilot

Generative artificial intelligence (AI) facilitates content production and enhances ideation capabilities, which can significantly influence developer productivity and participation in software development. To explore its impact on collaborative open-source software (OSS) development, we investigate the role of GitHub Copilot, a generative AI pair programmer, in OSS development where multiple distributed developers voluntarily collaborate. Using GitHub's proprietary Copilot usage data, combined with public OSS project data obtained from GitHub, we find that Copilot use increases project-level code contributions by 5.9%. This gain is driven by a 3.4% rise in developer coding participation and a 2.1% increase in individual productivity. However, Copilot use also leads to an increase in coordination time by 8% due to more code discussions. This reveals an important tradeoff: While AI expands who can contribute and how much they contribute, it slows coordination in collective development efforts. Despite this tension, the combined effect of these two competing forces remains positive, indicating a net gain in the timely merging of project-level code contributions when AI pair programmers are used. Interestingly, we also find that the effects differ across developer roles. Peripheral developers show relatively smaller increases in project-level code contributions and experience larger increases in coordination time than core developers. In summary, our study underscores the dual role of AI pair programmers in affecting project-level code contributions and coordination time in OSS development. Our findings on the differential effects between core and peripheral developers also provide important implications for the structure of OSS communities in the long run.


[16] 2603.06875

Stochastic Attention via Langevin Dynamics on the Modern Hopfield Energy

Attention heads retrieve: given a query, they return a weighted average of stored values. We show that this computation is one step of gradient descent on the modern Hopfield energy, and that Langevin sampling from the corresponding Boltzmann distribution yields stochastic attention, a training-free sampler controlled by a single temperature parameter. Lowering the temperature gives exact retrieval; raising it gives open-ended generation. Because the energy gradient equals the attention map, no score network, training loop, or learned model is required, making the approach particularly suited to the low-data regime where learned generative models are starved of training signal. We derive an entropy inflection condition that identifies the retrieval-to-generation transition temperature for any memory geometry and validate the sampler on five domains spanning two orders of magnitude in dimension. A single Boolean mask on the attention softmax, identical to the causal mask used in transformers but applied along the memory axis rather than the sequence axis, turns the sampler into a zero-shot class-conditional generator on Olivetti faces with no retraining and no learned classifier. On MNIST digit images, stochastic attention produces samples that are markedly more novel and more diverse than the best learned baseline while matching a Metropolis-corrected gold standard. On protein sequences from a small Pfam family, the generation regime preserves amino acid composition far more faithfully than a variational autoencoder at matched novelty, indicating that the training-free score function retains family-level fidelity that learned models lose. A denoising diffusion baseline fails across all memory sizes tested, producing samples indistinguishable from isotropic noise. The approach requires no architectural changes to the underlying attention mechanism.
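
Under standard modern-Hopfield conventions (a Ramsauer et al.-style energy), the sampler the abstract describes can be sketched in a few lines: the energy gradient is the query minus its attention readout, and Langevin dynamics adds temperature-scaled noise. The parametrization and step sizes below are our assumptions, not the paper's exact settings.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def langevin_attention(X, q0, beta=4.0, temp=0.1, eps=0.05, n_steps=500, seed=0):
    """Langevin sampling on a modern Hopfield energy of the form
        E(q) = -(1/beta) * logsumexp(beta * X @ q) + 0.5 * ||q||^2,
    whose gradient q - X.T @ softmax(beta * X @ q) is exactly the attention
    retrieval map. Sketch under standard conventions; the paper's energy
    and step sizes may differ. Low temp -> retrieval of a stored pattern;
    higher temp -> open-ended generation."""
    rng = np.random.default_rng(seed)
    q = q0.copy()
    for _ in range(n_steps):
        attn = softmax(beta * X @ q)           # attention weights over memories
        grad = q - X.T @ attn                  # energy gradient
        q = q - eps * grad + np.sqrt(2 * eps * temp) * rng.standard_normal(q.shape)
    return q

X = np.random.default_rng(1).standard_normal((10, 16))     # 10 stored patterns
sample = langevin_attention(X, q0=np.zeros(16), temp=0.01)  # near-retrieval regime
```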