Historical Trends in Gender Representation

Female representation among authors has roughly quadrupled over eight decades, rising from approximately 10% in 1945 to 40.7% in 2024. However, the gap in female share between first and last authorship (the 'leaky pipeline') has remained roughly constant at approximately 10 percentage points.

% Female authors per year
Methodology & formula
What it shows: female author percentage per year (1945-2024), calculated separately for each selectable LLM. Each author-article pair from article_authors is linked to its publication year via the deduplicated pmid_year table.
Formula: % F = COUNT(gender='f') / COUNT(gender IN ('m','f','other')) × 100
SELECT py.year, aa."[llm_column]", COUNT(*)
FROM article_authors aa
JOIN pmid_year py ON aa.pmid = py.pmid
WHERE aa."[llm_column]" IS NOT NULL
GROUP BY py.year, aa."[llm_column]"
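The aggregation step that turns the grouped counts into a yearly percentage could look roughly like the following. This is a minimal sketch, not the dashboard's actual code: the function name `percent_female_by_year` and the `(year, gender, count)` row shape are assumptions, chosen to match the columns the query above returns. The denominator follows the stated formula (labels 'm', 'f' and 'other'); any other label is skipped.

```python
from collections import defaultdict


def percent_female_by_year(rows):
    """Compute % F per year from (year, gender, count) rows.

    Hypothetical helper: `rows` is assumed to be the result set of the
    grouped query, one row per (year, gender label) pair.
    """
    female = defaultdict(int)
    total = defaultdict(int)
    for year, gender, count in rows:
        # Denominator per the stated formula: only 'm', 'f' and 'other' count.
        if gender not in ("m", "f", "other"):
            continue
        total[year] += count
        if gender == "f":
            female[year] += count
    # Years with an empty denominator are left out rather than reported as 0%.
    return {
        year: 100.0 * female[year] / total[year]
        for year in sorted(total)
        if total[year] > 0
    }
```

Skipping years with an empty denominator, instead of reporting them as 0%, is a design choice of this sketch and not something the methodology above specifies.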
Authorship positions: first vs last author
Methodology & formula
What it shows: female percentage by authorship position (first, last, all) over time. For each article, author_order = 1 is the first author, author_order = MAX is the last author. Single-author articles are classified as "solo".
Leaky pipeline: the gap between first and last author quantifies female attrition in senior positions.
SELECT aa.pmid, py.year, aa.author_order, aa."gender"
FROM article_authors aa
JOIN pmid_year py ON aa.pmid = py.pmid
-- Grouped by pmid in Python to determine max_order per article
-- Position = 'first' if order=1, 'last' if order=max, else 'middle'
-- % female = COUNT(gender='f') / COUNT(gender IN ('m','f')) × 100
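The Python-side grouping described in the comments above might be sketched as follows. All function names here (`classify_positions`, `percent_female_by_position`, `leaky_pipeline_gap`) are hypothetical, and the input is assumed to be `(pmid, year, author_order, gender)` tuples as returned by the query. The sketch also assumes that author_order starts at 1, that an article with exactly one author row counts as "solo", and that the gap is the first-author percentage minus the last-author percentage.

```python
from collections import defaultdict


def classify_positions(rows):
    """Label each author row as 'solo', 'first', 'last' or 'middle'.

    `rows` is assumed to hold (pmid, year, author_order, gender) tuples.
    """
    by_pmid = defaultdict(list)
    for pmid, year, order, gender in rows:
        by_pmid[pmid].append((year, order, gender))

    labelled = []
    for authors in by_pmid.values():
        max_order = max(order for _, order, _ in authors)
        for year, order, gender in authors:
            if len(authors) == 1:
                position = "solo"  # single-author article
            elif order == 1:
                position = "first"
            elif order == max_order:
                position = "last"
            else:
                position = "middle"
            labelled.append((year, position, gender))
    return labelled


def percent_female_by_position(rows):
    """Return {(year, position): % female}, counting only 'm' and 'f'."""
    female = defaultdict(int)
    total = defaultdict(int)
    for year, position, gender in classify_positions(rows):
        if gender not in ("m", "f"):
            continue
        total[(year, position)] += 1
        if gender == "f":
            female[(year, position)] += 1
    return {key: 100.0 * female[key] / total[key] for key in total}


def leaky_pipeline_gap(percentages, year):
    """Gap in percentage points: first-author % F minus last-author % F."""
    return percentages[(year, "first")] - percentages[(year, "last")]
```

Grouping in memory keeps the SQL simple, at the cost of loading every author row; a window function such as MAX(author_order) OVER (PARTITION BY pmid) would be one alternative if the database supports it.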