Overview
This project investigates generative monoculture in large language models — the phenomenon where model outputs become systematically less diverse than the underlying training data, raising concerns about alignment and equitable representation.
Key Contributions
- Formalized diversity collapse using distributional dispersion metrics, providing a rigorous framework for measuring output homogenization across model generations (a minimal dispersion sketch follows this list).
- Proposed a group-aware fairness definition to detect when diversity loss disproportionately affects certain demographic or cultural groups (see the per-group gap sketch after this list).
- Analyzed the implications of monoculture for alignment and representation in generative models, highlighting risks in downstream applications where output variety matters.
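
To make the dispersion framing concrete, here is a minimal Python sketch, assuming the formalization resembles mean pairwise distance over a sample of generations. The project's actual metric is not reproduced here; mean pairwise Jaccard distance over token sets, the function names, and the toy strings are all illustrative stand-ins.

```python
# Minimal sketch of a dispersion-style diversity metric. Mean pairwise
# Jaccard distance over token sets is an illustrative stand-in for the
# project's formal metric; the toy strings are hypothetical.
from itertools import combinations


def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B|; 0.0 means identical token sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def mean_pairwise_jaccard(texts):
    """Average pairwise distance across a sample of texts.
    Lower dispersion => more homogeneous (monocultural) output."""
    token_sets = [set(t.lower().split()) for t in texts]
    pairs = list(combinations(token_sets, 2))
    if not pairs:
        return 0.0
    return sum(jaccard_distance(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    model_outputs = [  # toy generations from one prompt
        "the movie was great and fun",
        "the movie was great and exciting",
        "the film was great overall",
    ]
    training_samples = [  # toy samples from the underlying data
        "a slow burn but a rewarding ending",
        "terrible pacing ruined the film for me",
        "visually stunning with a weak plot",
    ]
    # Diversity collapse: model dispersion falls well below data dispersion.
    print(f"model dispersion: {mean_pairwise_jaccard(model_outputs):.3f}")
    print(f"data dispersion:  {mean_pairwise_jaccard(training_samples):.3f}")
```

Comparing the two numbers operationalizes the "less diverse than the training data" claim: collapse shows up as the model's dispersion sitting well below the data's dispersion on matched samples.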
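The group-aware definition can be illustrated the same way by comparing per-group dispersion. This is a hedged sketch, not the project's actual definition: `group_diversity_gap`, the stand-in `distinct_ratio` metric, the group labels, and the tolerance `epsilon` are all assumptions made for illustration.

```python
# Hedged sketch of a group-aware diversity check, assuming the fairness
# definition compares per-group output dispersion. The group labels, the
# distinct_ratio stand-in metric, and the tolerance epsilon are all
# illustrative assumptions.
from collections import defaultdict


def distinct_ratio(texts):
    """Stand-in dispersion: unique tokens / total tokens (distinct-1)."""
    tokens = [tok for text in texts for tok in text.lower().split()]
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def group_diversity_gap(generations, groups, dispersion=distinct_ratio):
    """Per-group dispersion and the max-min gap across groups.
    A large gap means diversity loss falls disproportionately on
    some groups, even if aggregate diversity looks acceptable."""
    by_group = defaultdict(list)
    for text, group in zip(generations, groups):
        by_group[group].append(text)
    scores = {g: dispersion(texts) for g, texts in by_group.items()}
    return max(scores.values()) - min(scores.values()), scores


if __name__ == "__main__":
    outputs = [
        "doctors are dedicated and skilled",
        "doctors are dedicated and caring",
        "nurses come from many walks of life",
        "nurses balance night shifts, research, and family",
    ]
    labels = ["group_a", "group_a", "group_b", "group_b"]
    gap, per_group = group_diversity_gap(outputs, labels)
    print(per_group)
    print("disparate homogenization:", gap > 0.1)  # epsilon = 0.1, illustrative
```

The gap-over-threshold form mirrors standard group-fairness checks: aggregate diversity can look healthy while one group's outputs have quietly collapsed.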