Commit 9c733f3

added manenti2025beyond
1 parent f2d9ef1 commit 9c733f3

1 file changed: 20 additions & 0 deletions

_data/publications.yaml
@@ -1,4 +1,23 @@
 ---
+- title: "Beyond Softmax: A Natural Parameterization for Categorical Random Variables"
+  venue: Preprint
+  year: 2025
+  authors:
+    - id:amanenti
+    - id:calippi
+  keywords:
+    - probabilistic modeling
+    - gradient-based optimization
+    - graph structure learning
+    - latent random variables
+  abstract: Latent categorical variables are frequently found in deep learning architectures. They can model actions in discrete reinforcement-learning environments, represent categories in latent-variable models, or express relations in graph neural networks. Despite their widespread use, their discrete nature poses significant challenges to gradient-descent learning algorithms. While a substantial body of work has offered improved gradient estimation techniques, we take a complementary approach. Specifically, we 1) revisit the ubiquitous softmax function and demonstrate its limitations from an information-geometric perspective; 2) replace the softmax with the catnat function, a function composed of a sequence of hierarchical binary splits; we prove that this choice offers significant advantages to gradient descent due to the resulting diagonal Fisher Information Matrix. A rich set of experiments, including graph structure learning, variational autoencoders, and reinforcement learning, empirically shows that the proposed function improves learning efficiency and yields models with consistently higher test performance. Catnat is simple to implement and integrates seamlessly into existing codebases. Moreover, it remains compatible with standard training stabilization techniques and, as such, offers a better alternative to the softmax function.
+  bibtex: >
+    @article{manenti2025beyond,
+    title={Beyond Softmax: A Natural Parameterization for Categorical Random Variables},
+    author={Alessandro Manenti and Cesare Alippi},
+    journal={arXiv preprint arXiv:2509.24728},
+    year={2025},
+    }
 - title: "Equilibrium Policy Generalization: A Reinforcement Learning Framework for Cross-Graph Zero-Shot Generalization in Pursuit-Evasion Games"
   venue: To appear in Advances in Neural Information Processing Systems
   year: 2025
@@ -138,6 +157,7 @@
     - graph structure learning
     - graph neural networks
     - model calibration
+    - probabilistic modeling
   abstract: Within a prediction task, Graph Neural Networks (GNNs) use relational information as an inductive bias to enhance the model's accuracy. As task-relevant relations might be unknown, graph structure learning approaches have been proposed to learn them while solving the downstream prediction task. In this paper, we demonstrate that minimization of a point-prediction loss function, e.g., the mean absolute error, does not guarantee proper learning of the latent relational information and its associated uncertainty. Conversely, we prove that a suitable loss function on the stochastic model outputs simultaneously grants (i) learning of the latent distribution of the unknown adjacency matrix and (ii) optimal performance on the prediction task. Finally, we propose a sampling-based method that solves this joint learning task. Empirical results validate our theoretical claims and demonstrate the effectiveness of the proposed approach.
   bibtex: >
     @inproceedings{manenti2025learning,
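The abstract of the new entry describes catnat as a function "composed of a sequence of hierarchical binary splits" that replaces the softmax. The paper's exact definition is not part of this diff, so the following is only an illustrative sketch of one such parameterization, assuming a balanced split tree and a class count that is a power of two; the function name `catnat_probs` is hypothetical, not from the paper.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def catnat_probs(logits):
    """Map K-1 real parameters to K class probabilities via a balanced
    tree of binary splits (K must be a power of two).

    Illustrative sketch only -- not the paper's exact catnat definition.
    Each parameter controls one sigmoid split that divides the remaining
    probability mass between two subtrees.
    """
    probs = [1.0]            # root: all probability mass
    it = iter(logits)
    # Keep splitting until the K-1 parameters are exhausted.
    while len(probs) * 2 <= len(logits) + 1:
        next_level = []
        for p in probs:
            s = sigmoid(next(it))          # fraction sent to the left child
            next_level.append(p * s)       # left subtree mass
            next_level.append(p * (1.0 - s))  # right subtree mass
        probs = next_level
    return probs
```

Because each of the K-1 parameters reallocates mass only between the two subtrees of its own split, perturbing one parameter leaves the other conditional splits unchanged, which is consistent with the abstract's claim that the parameterization yields a diagonal Fisher Information Matrix.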
