ICLR, NeurIPS & ICML Conference Analysis: Authors and Citations by Acceptance Type

Analysis Date: December 12, 2025
Data Sources: ICLR, NeurIPS, and ICML paper lists

This analysis examines three of the most prestigious machine learning conferences—ICLR (International Conference on Learning Representations), NeurIPS (Neural Information Processing Systems), and ICML (International Conference on Machine Learning)—to understand how author counts and citation patterns vary across different acceptance types (Oral, Spotlight, Poster, and Rejected papers).

1. Author Count Analysis

We first examine how the number of authors varies across different acceptance types. This can reveal interesting patterns about collaboration in accepted vs. rejected papers.

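Key Finding: Mean author counts fall in a narrow band, from 4.18 to 5.47, across every conference and acceptance type. At ICLR, rejected papers average about one author fewer than Orals and Spotlights (4.46 vs. 5.46 and 5.47). At NeurIPS and ICML, rejected papers average slightly more authors than accepted ones (NeurIPS: 5.07 vs. 4.47 to 5.00; ICML: 4.59 vs. 4.18 to 4.53), although the rejected samples there are small (1,213 and 185 papers). No consistent link between team size and acceptance emerges.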

1.1 Median Author Count Analysis

While mean author counts provide one perspective, median values can be more robust to outliers and better represent typical collaboration patterns. Below we examine median author counts across conferences and acceptance types.

Key Finding: The median is 4 authors for every conference and acceptance type except ICLR Oral and Spotlight papers, where it is 5. Every mean exceeds the corresponding median, which indicates a right-skewed distribution: a small number of very large author teams pull the averages up, and the median gives a more typical picture of team size.
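To make the mean/median distinction concrete, here is a minimal sketch using Python's standard statistics module. The author counts are made up for illustration and are not drawn from the dataset; the point is only that a single very large team raises the mean while leaving the median unchanged.

```python
from statistics import mean, median

# Hypothetical author counts for eight papers; the last is a large consortium paper.
author_counts = [3, 4, 4, 4, 5, 5, 6, 41]

avg = mean(author_counts)    # pulled upward by the 41-author outlier
mid = median(author_counts)  # midpoint of the sorted values; the outlier's size does not matter

print(f"mean={avg:.2f}, median={mid}")
```

Replacing 41 with 7 in this list would lower the mean considerably while the median stays the same, which is the behavior the comparison above relies on.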

ICLR Statistics

Status	Count	Avg Authors	Median Authors	Avg Citations	Median Citations	Max Citations
Oral	490	5.46	5.0	452.9	23.5	62,867
Spotlight	1,143	5.47	5.0	84.0	15.0	8,832
Poster	9,290	5.01	4.0	121.6	12.0	47,498
Reject	17,379	4.46	4.0	23.7	1.0	19,629

Additional Insights for ICLR:
• 244 papers (1.1%) have 1,000+ citations
• 511 papers (2.3%) have 500+ citations
• 2,262 papers (10.2%) have 100+ citations
• Average authors per paper: 5.01
• Papers span years 2013 to 2026
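Threshold shares like the ones listed above could be computed along the following lines. This is a sketch under assumptions: the report does not state which denominator it uses, so the helper (a hypothetical name, share_at_or_above) excludes negative values on the assumption that they mark papers without citation data. The citation list is illustrative only.

```python
def share_at_or_above(citations, threshold):
    """Return (count, percent) of papers with at least `threshold` citations."""
    # Assumption: negative values mean "no citation data" and are left out of the denominator.
    valid = [c for c in citations if c >= 0]
    count = sum(1 for c in valid if c >= threshold)
    percent = 100.0 * count / len(valid) if valid else 0.0
    return count, percent

# Hypothetical citation counts, for illustration only.
citations = [0, 3, 12, 150, 620, 1200, -1, 45, 99, 100, 8]

for threshold in (1000, 500, 100):
    count, percent = share_at_or_above(citations, threshold)
    print(f"{count} papers ({percent:.1f}%) have {threshold:,}+ citations")
```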

NeurIPS Statistics

Status	Count	Avg Authors	Median Authors	Avg Citations	Median Citations	Max Citations
Oral	610	5.00	4.0	390.4	54.0	47,439
Spotlight	2,685	4.83	4.0	192.9	13.0	184,033
Poster	27,550	4.47	4.0	109.8	14.0	82,368
Reject	1,213	5.07	4.0	10.9	0.0	1,293

Additional Insights for NeurIPS:
• 532 papers (1.7%) have 1,000+ citations
• 1,095 papers (3.5%) have 500+ citations
• 5,093 papers (16.5%) have 100+ citations
• Average authors per paper: 4.50
• Papers span years 1987 to 2025

ICML Statistics

Status	Count	Avg Authors	Median Authors	Avg Citations	Median Citations	Max Citations
Oral	2,098	4.18	4.0	213.1	48.0	35,305
Spotlight	2,548	4.40	4.0	66.8	24.0	8,716
Poster	9,684	4.53	4.0	90.1	6.0	62,227
Reject	185	4.59	4.0	-0.3	0.0	0

Additional Insights for ICML:
• 217 papers (1.5%) have 1,000+ citations
• 451 papers (3.1%) have 500+ citations
• 2,097 papers (14.6%) have 100+ citations
• Average authors per paper: 4.45
• Papers span years 2013 to 2025
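The ICML Reject row reports a negative average citation count, which is only possible if negative values enter the mean. One plausible explanation, stated here as an assumption rather than a confirmed detail of the pipeline, is that -1 is used as a placeholder for missing citation data. The sketch below, on made-up values, shows how such placeholders can push a mean below zero and how filtering them out changes the result.

```python
from statistics import mean

# Hypothetical citation values for ten rejected papers; -1 is assumed to mean "no data".
raw = [0, 0, -1, 0, -1, 0, 0, -1, 0, 0]

naive_avg = mean(raw)                       # treats -1 as a real citation count
known = [c for c in raw if c >= 0]          # keep only papers with citation data
clean_avg = mean(known) if known else None  # None when no paper has data

print(naive_avg, clean_avg)
```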

2. Citation Analysis

Citation counts are a key metric of research impact. We analyze how citations vary by acceptance type, which can provide insights into the relationship between peer review decisions and long-term impact.

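Key Finding: Oral papers have the highest mean and median citations at all three conferences. The order of Spotlight and Poster depends on the statistic: by mean, Posters outrank Spotlights at ICLR (121.6 vs. 84.0) and ICML (90.1 vs. 66.8); by median, Spotlights lead at ICLR (15.0 vs. 12.0) and ICML (24.0 vs. 6.0), while at NeurIPS the two are nearly tied (13.0 vs. 14.0). Means are roughly 3 to 19 times larger than medians in every accepted category, so a handful of very highly cited papers dominates the averages, and all counts keep changing as papers accumulate citations.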
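A per-status summary of this kind can be sketched with the standard library as follows. The records and their (status, citations) layout are hypothetical; the example is built so that one heavily cited Poster lifts the Poster mean above the Oral mean while the Poster median stays low, which is the kind of mean/median divergence seen in the tables.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical records: (status, citations). The field layout is assumed for illustration.
papers = [
    ("Oral", 900), ("Oral", 40), ("Oral", 20),
    ("Spotlight", 60), ("Spotlight", 15), ("Spotlight", 9),
    ("Poster", 2000), ("Poster", 12), ("Poster", 5), ("Poster", 3),
]

# Collect citation counts under each acceptance status.
by_status = defaultdict(list)
for status, citations in papers:
    by_status[status].append(citations)

# One row of summary statistics per status.
summary = {
    status: {"count": len(vals), "avg": mean(vals), "median": median(vals), "max": max(vals)}
    for status, vals in by_status.items()
}

for status, stats in summary.items():
    print(f"{status}\t{stats['count']}\t{stats['avg']:.1f}\t{stats['median']}\t{stats['max']:,}")
```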

3. Top Cited Papers by Year

Below we list the three most highly cited papers from each year at each of the three conferences. These papers represent some of the most influential work in machine learning and have had significant impact on the field.
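A per-year top-3 listing like the ones that follow could be produced roughly as shown here. The function name top_cited_by_year and the record fields are assumptions made for this sketch, and the titles and counts are placeholders. Because Python's sort is stable, papers with identical citation values keep their input order, so a year in which every paper has the same value would simply list the first three records.

```python
from collections import defaultdict

def top_cited_by_year(papers, n=3):
    """Group papers by year and keep the n most cited in each year, newest year first."""
    by_year = defaultdict(list)
    for paper in papers:
        by_year[paper["year"]].append(paper)
    return {
        year: sorted(by_year[year], key=lambda p: p["citations"], reverse=True)[:n]
        for year in sorted(by_year, reverse=True)
    }

# Hypothetical records; titles and counts are placeholders.
papers = [
    {"year": 2021, "title": "Paper A", "citations": 120},
    {"year": 2021, "title": "Paper B", "citations": 950},
    {"year": 2021, "title": "Paper C", "citations": 40},
    {"year": 2021, "title": "Paper D", "citations": 300},
    {"year": 2020, "title": "Paper E", "citations": 75},
]

for year, ranked in top_cited_by_year(papers).items():
    print(f"Year {year}")
    for rank, paper in enumerate(ranked, start=1):
        print(f"#{rank}: {paper['title']} | Citations: {paper['citations']:,}")
```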

ICLR - Top Cited Papers by Year

Year 2026

#1: CyberV: A Cybernetic Framework for Enhancing Logical Reasoning in Video Understanding
Authors: Jiahao Meng, Shuyang Sun, Tan Yue et al. | Citations: -1 | Status: Withdraw
#2: SMART-3D: Scaling Masked AutoRegressive Transformer for Efficient 3D Shape Generation
Authors: Shentong Mo, Yufei Guo | Citations: -1 | Status: Withdraw
#3: SSTP: Efficient Sample Selection for Trajectory Prediction
Authors: Ruining Yang, Yi Xu, Yun Fu et al. | Citations: -1 | Status: Withdraw

Year 2025

#1: KAN: Kolmogorov–Arnold Networks
Authors: Ziming Liu, Yixuan Wang, Sachin Vaidya et al. | Citations: 1,156 | Status: Oral [OpenReview]
#2: SAM 2: Segment Anything in Images and Videos
Authors: Nikhila Ravi, Valentin Gabeur, Yuan-Ting Hu et al. | Citations: 846 | Status: Oral [OpenReview]
#3: WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct
Authors: Haipeng Luo, Qingfeng Sun, Can Xu et al. | Citations: 414 | Status: Oral [OpenReview]

Year 2024

#1: YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications
Authors: Chuyi Li, Bo Zhang, Lulu Li et al. | Citations: 3,106 | Status: Withdraw
#2: MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models
Authors: Deyao Zhu, Jun Chen, Xiaoqian Shen et al. | Citations: 2,918 | Status: Poster [OpenReview]
#3: SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis
Authors: Dustin Podell, Zion English, Kyle Lacey et al. | Citations: 2,203 | Status: Spotlight [OpenReview]

Year 2023

#1: ReAct: Synergizing Reasoning and Acting in Language Models
Authors: Shunyu Yao, Jeffrey Zhao, Dian Yu et al. | Citations: 2,772 | Status: Top-5% [OpenReview]
#2: DreamFusion: Text-to-3D using 2D Diffusion
Authors: Ben Poole, Ajay Jain, Jonathan T. Barron et al. | Citations: 2,361 | Status: Top-5% [OpenReview]
#3: DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection
Authors: Hao Zhang, Feng Li, Shilong Liu et al. | Citations: 1,881 | Status: Poster [OpenReview]

Year 2022

#1: LoRA: Low-Rank Adaptation of Large Language Models
Authors: Edward J Hu, Yelong Shen, Phillip Wallis et al. | Citations: 14,155 | Status: Poster [OpenReview]
#2: Finetuned Language Models are Zero-Shot Learners
Authors: Jason Wei, Maarten Bosma, Vincent Zhao et al. | Citations: 3,956 | Status: Oral [OpenReview]
#3: BEiT: BERT Pre-Training of Image Transformers
Authors: Hangbo Bao, Li Dong, Songhao Piao et al. | Citations: 3,437 | Status: Oral [OpenReview]

Year 2021

#1: An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Authors: Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov et al. | Citations: 62,867 | Status: Oral [OpenReview]
#2: Denoising Diffusion Implicit Models
Authors: Jiaming Song, Chenlin Meng, Stefano Ermon | Citations: 8,188 | Status: Poster [OpenReview]
#3: Score-Based Generative Modeling through Stochastic Differential Equations
Authors: Yang Song, Jascha Sohl-Dickstein, Diederik P Kingma et al. | Citations: 7,224 | Status: Oral [OpenReview]

Year 2020

#1: ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
Authors: Zhenzhong Lan, Mingda Chen, Sebastian Goodman et al. | Citations: 8,832 | Status: Spotlight
#2: BERTScore: Evaluating Text Generation with BERT
Authors: Tianyi Zhang, Varsha Kishore, Felix Wu et al. | Citations: 6,747 | Status: Poster
#3: ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators
Authors: Kevin Clark, Minh-Thang Luong, Quoc V. Le et al. | Citations: 4,882 | Status: Poster

Year 2019

#1: Decoupled Weight Decay Regularization
Authors: Ilya Loshchilov, Frank Hutter | Citations: 27,890 | Status: Poster [OpenReview]
#2: How Powerful are Graph Neural Networks?
Authors: Keyulu Xu, Weihua Hu, Jure Leskovec et al. | Citations: 10,619 | Status: Oral [OpenReview]
#3: GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Authors: Alex Wang, Amanpreet Singh, Julian Michael et al. | Citations: 8,516 | Status: Poster [OpenReview]

Year 2018

#1: Towards Deep Learning Models Resistant to Adversarial Attacks
Authors: Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt et al. | Citations: 15,225 | Status: Poster [OpenReview]
#2: Graph Attention Networks
Authors: Petar Veličković, Guillem Cucurull, Arantxa Casanova et al. | Citations: 14,887 | Status: Poster [OpenReview]
#3: mixup: Beyond Empirical Risk Minimization
Authors: Hongyi Zhang, Moustapha Cisse, Yann N. Dauphin et al. | Citations: 12,891 | Status: Poster [OpenReview]

Year 2017

#1: Semi-Supervised Classification with Graph Convolutional Networks
Authors: Thomas N. Kipf, Max Welling | Citations: 45,220 | Status: Poster
#2: SGDR: Stochastic Gradient Descent with Warm Restarts
Authors: Ilya Loshchilov, Frank Hutter | Citations: 10,671 | Status: Poster
#3: Adversarial examples in the physical world
Authors: Alexey Kurakin, Ian J. Goodfellow, Samy Bengio | Citations: 7,407 | Status: Workshop

Year 2014

#1: Auto-Encoding Variational Bayes
Authors: Diederik P. Kingma, Max Welling | Citations: 45,209 | Status: Poster
#2: Intriguing properties of neural networks
Authors: Joan Bruna, Christian Szegedy, Ilya Sutskever et al. | Citations: 19,316 | Status: Poster
#3: Network In Network
Authors: Min Lin, Qiang Chen, Shuicheng Yan | Citations: 10,288 | Status: Poster

Year 2013

#1: Efficient Estimation of Word Representations in Vector Space
Authors: Tomas Mikolov, Kai Chen, Greg Corrado et al. | Citations: 47,498 | Status: Poster
#2: Deep Learning for Detecting Robotic Grasps
Authors: Ian Lenz, Honglak Lee, Ashutosh Saxena | Citations: 2,071 | Status: Oral
#3: Zero-Shot Learning Through Cross-Modal Transfer
Authors: Richard Socher, Milind Ganjoo, Hamsa Sridhar et al. | Citations: 1,867 | Status: Oral

NeurIPS - Top Cited Papers by Year

Year 2025

#1: How Well Can Differential Privacy Be Audited in One Run?
Authors: Amit Keinan, Moshe Shenfeld, Katrina Ligett | Citations: -1 | Status: Spotlight
#2: Uncertainty-Sensitive Privileged Learning
Authors: Fan-Ming Luo, Lei Yuan, Yang Yu | Citations: -1 | Status: Poster
#3: KL Penalty Control via Perturbation for Direct Preference Optimization
Authors: Sangkyu Lee, Janghoon Han, Hosung Song et al. | Citations: -1 | Status: Poster

Year 2024

#1: YOLOv10: Real-Time End-to-End Object Detection
Authors: Ao Wang, Hui Chen, Lihao Liu et al. | Citations: 1,650 | Status: Poster [OpenReview]
#2: VMamba: Visual State Space Model
Authors: Yue Liu, Yunjie Tian, Yuzhong Zhao et al. | Citations: 1,477 | Status: Spotlight [OpenReview]
#3: CogVLM: Visual Expert for Pretrained Language Models
Authors: Weihan Wang, Qingsong Lv, Wenmeng Yu et al. | Citations: 713 | Status: Poster [OpenReview]

Year 2023

#1: Visual Instruction Tuning
Authors: Haotian Liu, Chunyuan Li, Qingyang Wu et al. | Citations: 6,760 | Status: Oral [OpenReview]
#2: Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
Authors: Lianmin Zheng, Wei-Lin Chiang, Ying Sheng et al. | Citations: 3,519 | Status: Poster [OpenReview]
#3: Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Authors: Rafael Rafailov, Archit Sharma, Eric Mitchell et al. | Citations: 3,284 | Status: Oral [OpenReview]

Year 2022

#1: Training language models to follow instructions with human feedback
Authors: Long Ouyang, Jeffrey Wu, Xu Jiang et al. | Citations: 13,970 | Status: Poster [OpenReview]
#2: Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Authors: Jason Wei, Xuezhi Wang, Dale Schuurmans et al. | Citations: 13,586 | Status: Poster [OpenReview]
#3: Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
Authors: Chitwan Saharia, William Chan, Saurabh Saxena et al. | Citations: 6,404 | Status: Poster [OpenReview]

Year 2021

#1: Diffusion Models Beat GANs on Image Synthesis
Authors: Prafulla Dhariwal, Alexander Quinn Nichol | Citations: 8,580 | Status: Spotlight [OpenReview]
#2: SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers
Authors: Enze Xie, Wenhai Wang, Zhiding Yu et al. | Citations: 6,146 | Status: Poster [OpenReview]
#3: MLP-Mixer: An all-MLP Architecture for Vision
Authors: Ilya Tolstikhin, Neil Houlsby, Alexander Kolesnikov et al. | Citations: 3,342 | Status: Poster [OpenReview]

Year 2020

#1: Language Models are Few-Shot Learners
Authors: Tom Brown, Benjamin Mann, Nick Ryder et al. | Citations: 47,439 | Status: Oral
#2: Denoising Diffusion Probabilistic Models
Authors: Jonathan Ho, Ajay N. Jain, Pieter Abbeel | Citations: 23,050 | Status: Poster
#3: Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
Authors: Patrick Lewis, Ethan Perez, Aleksandra Piktus et al. | Citations: 8,434 | Status: Poster

Year 2019

#1: PyTorch: An Imperative Style, High-Performance Deep Learning Library
Authors: Adam Paszke, Sam Gross, Francisco Massa et al. | Citations: 58,109 | Status: Poster
#2: XLNet: Generalized Autoregressive Pretraining for Language Understanding
Authors: Zhilin Yang, Zihang Dai, Yiming Yang et al. | Citations: 11,352 | Status: Oral
#3: Generative Modeling by Estimating Gradients of the Data Distribution
Authors: Yang Song, Stefano Ermon | Citations: 4,614 | Status: Oral

Year 2018

#1: Neural Ordinary Differential Equations
Authors: Ricky T. Q. Chen, Yulia Rubanova, Jesse Bettencourt et al. | Citations: 6,824 | Status: Oral
#2: CatBoost: unbiased boosting with categorical features
Authors: Liudmila Prokhorenkova, Gleb Gusev, Aleksandr Vorobev et al. | Citations: 5,655 | Status: Poster
#3: Neural Tangent Kernel: Convergence and Generalization in Neural Networks
Authors: Arthur Jacot, Franck Gabriel, Clement Hongler | Citations: 4,205 | Status: Spotlight

Year 2017

#1: Attention is All you Need
Authors: Ashish Vaswani, Noam Shazeer, Niki Parmar et al. | Citations: 184,033 | Status: Spotlight
#2: A Unified Approach to Interpreting Model Predictions
Authors: Scott M Lundberg, Su-In Lee | Citations: 36,005 | Status: Poster
#3: Inductive Representation Learning on Large Graphs
Authors: Will Hamilton, Zhitao Ying, Jure Leskovec | Citations: 20,519 | Status: Poster

Year 2016

#1: Improved Techniques for Training GANs
Authors: Tim Salimans, Ian Goodfellow, Wojciech Zaremba et al. | Citations: 12,136 | Status: Poster
#2: Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering
Authors: Michaël Defferrard, Xavier Bresson, Pierre Vandergheynst | Citations: 10,837 | Status: Poster
#3: Matching Networks for One Shot Learning
Authors: Oriol Vinyals, Charles Blundell, Timothy Lillicrap et al. | Citations: 9,403 | Status: Poster

Year 2015

#1: Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
Authors: Shaoqing Ren, Kaiming He, Ross Girshick et al. | Citations: 54,380 | Status: Poster
#2: Spatial Transformer Networks
Authors: Max Jaderberg, Karen Simonyan, Andrew Zisserman et al. | Citations: 10,066 | Status: Spotlight
#3: Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting
Authors: Xingjian Shi, Zhourong Chen, Hao Wang et al. | Citations: 9,108 | Status: Poster

Year 2014

#1: Generative Adversarial Nets
Authors: Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza et al. | Citations: 82,368 | Status: Poster
#2: Sequence to Sequence Learning with Neural Networks
Authors: Ilya Sutskever, Oriol Vinyals, Quoc V. Le | Citations: 29,204 | Status: Oral
#3: How transferable are features in deep neural networks?
Authors: Jason Yosinski, Jeff Clune, Yoshua Bengio et al. | Citations: 11,843 | Status: Oral

Year 2013

#1: Distributed Representations of Words and Phrases and their Compositionality
Authors: Tomas Mikolov, Ilya Sutskever, Kai Chen et al. | Citations: 47,769 | Status: Poster
#2: Translating Embeddings for Modeling Multi-relational Data
Authors: Antoine Bordes, Nicolas Usunier, Alberto Garcia-Duran et al. | Citations: 10,650 | Status: Poster
#3: Sinkhorn Distances: Lightspeed Computation of Optimal Transport
Authors: Marco Cuturi | Citations: 5,433 | Status: Spotlight

Year 2012

#1: ImageNet Classification with Deep Convolutional Neural Networks
Authors: Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton | Citations: 37,554 | Status: Poster
#2: Practical Bayesian Optimization of Machine Learning Algorithms
Authors: Jasper Snoek, Hugo Larochelle, Ryan P. Adams | Citations: 11,616 | Status: Poster
#3: Large Scale Distributed Deep Networks
Authors: Jeffrey Dean, Greg Corrado, Rajat Monga et al. | Citations: 5,206 | Status: Poster

Year 2011

#1: Algorithms for Hyper-Parameter Optimization
Authors: James S. Bergstra, Rémi Bardenet, Yoshua Bengio et al. | Citations: 7,578 | Status: Poster
#2: Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials
Authors: Philipp Krähenbühl, Vladlen Koltun | Citations: 4,311 | Status: Poster
#3: Hogwild!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent
Authors: Benjamin Recht, Christopher Re, Stephen Wright et al. | Citations: 2,860 | Status: Poster

Year 2010

#1: Efficient and Robust Feature Selection via Joint ℓ2,1-Norms Minimization
Authors: Feiping Nie, Heng Huang, Xiao Cai et al. | Citations: 2,605 | Status: Poster
#2: Double Q-learning
Authors: Hado van Hasselt | Citations: 2,469 | Status: Poster
#3: Online Learning for Latent Dirichlet Allocation
Authors: Matthew Hoffman, Francis R. Bach, David M. Blei | Citations: 2,378 | Status: Poster

Year 2009

#1: Reading Tea Leaves: How Humans Interpret Topic Models
Authors: Jonathan Chang, Sean Gerrish, Chong Wang et al. | Citations: 3,625 | Status: Poster
#2: Robust Principal Component Analysis: Exact Recovery of Corrupted Low-Rank Matrices via Convex Opt...
Authors: John Wright, Arvind Ganesh, Shankar Rao et al. | Citations: 2,272 | Status: Poster
#3: Fast Image Deconvolution using Hyper-Laplacian Priors
Authors: Dilip Krishnan, Rob Fergus | Citations: 1,804 | Status: Poster

Year 2008

#1: Spectral Hashing
Authors: Yair Weiss, Antonio Torralba, Rob Fergus | Citations: 3,380 | Status: Poster
#2: Mixed Membership Stochastic Blockmodels
Authors: Edoardo M. Airoldi, David M. Blei, Stephen E. Fienberg et al. | Citations: 2,757 | Status: Poster
#3: Near-optimal Regret Bounds for Reinforcement Learning
Authors: Peter Auer, Thomas Jaksch, Ronald Ortner | Citations: 1,685 | Status: Poster

Year 2007

#1: Probabilistic Matrix Factorization
Authors: Andriy Mnih, Ruslan Salakhutdinov | Citations: 5,774 | Status: Poster
#2: Random Features for Large-Scale Kernel Machines
Authors: Ali Rahimi, Benjamin Recht | Citations: 5,406 | Status: Poster
#3: Supervised Topic Models
Authors: Jon D. McAuliffe, David M. Blei | Citations: 2,369 | Status: Poster

Year 2006

#1: Greedy Layer-Wise Training of Deep Networks
Authors: Yoshua Bengio, Pascal Lamblin, Dan Popovici et al. | Citations: 7,432 | Status: Poster
#2: Graph-Based Visual Saliency
Authors: Jonathan Harel, Christof Koch, Pietro Perona | Citations: 4,852 | Status: Poster
#3: Efficient sparse coding algorithms
Authors: Honglak Lee, Alexis Battle, Rajat Raina et al. | Citations: 3,573 | Status: Poster

Year 2005

#1: Distance Metric Learning for Large Margin Nearest Neighbor Classification
Authors: Kilian Q. Weinberger, John Blitzer, Lawrence K. Saul | Citations: 5,111 | Status: Poster
#2: Laplacian Score for Feature Selection
Authors: Xiaofei He, Deng Cai, Partha Niyogi | Citations: 2,902 | Status: Poster
#3: Sparse Gaussian Processes using Pseudo-inputs
Authors: Edward Snelson, Zoubin Ghahramani | Citations: 2,419 | Status: Poster

Year 2004

#1: Sharing Clusters among Related Groups: Hierarchical Dirichlet Processes
Authors: Yee W. Teh, Michael I. Jordan, Matthew J. Beal et al. | Citations: 5,575 | Status: Poster
#2: Neighbourhood Components Analysis
Authors: Jacob Goldberger, Geoffrey E. Hinton, Sam T. Roweis et al. | Citations: 3,088 | Status: Poster
#3: Self-Tuning Spectral Clustering
Authors: Lihi Zelnik-Manor, Pietro Perona | Citations: 2,974 | Status: Poster

Year 2003

#1: Locality Preserving Projections
Authors: Xiaofei He, Partha Niyogi | Citations: 5,752 | Status: Poster
#2: Learning with Local and Global Consistency
Authors: Dengyong Zhou, Olivier Bousquet, Thomas N. Lal et al. | Citations: 5,697 | Status: Poster
#3: Online Passive-Aggressive Algorithms
Authors: Shai Shalev-Shwartz, Koby Crammer, Ofer Dekel et al. | Citations: 2,702 | Status: Poster

Year 2002

#1: Distance Metric Learning with Application to Clustering with Side-Information
Authors: Eric P. Xing, Michael I. Jordan, Stuart Russell et al. | Citations: 4,140 | Status: Poster
#2: Stochastic Neighbor Embedding
Authors: Geoffrey E. Hinton, Sam T. Roweis | Citations: 2,468 | Status: Poster
#3: Support Vector Machines for Multiple-Instance Learning
Authors: Stuart Andrews, Ioannis Tsochantaridis, Thomas Hofmann | Citations: 2,143 | Status: Poster

Year 2001

#1: Latent Dirichlet Allocation
Authors: David M. Blei, Andrew Y. Ng, Michael I. Jordan | Citations: 57,563 | Status: Poster
#2: On Spectral Clustering: Analysis and an algorithm
Authors: Andrew Y. Ng, Michael I. Jordan, Yair Weiss | Citations: 13,104 | Status: Poster
#3: Laplacian Eigenmaps and Spectral Techniques for Embedding and Clustering
Authors: Mikhail Belkin, Partha Niyogi | Citations: 6,423 | Status: Poster

Year 2000

#1: A Neural Probabilistic Language Model
Authors: Yoshua Bengio, Réjean Ducharme, Pascal Vincent | Citations: 12,414 | Status: Poster
#2: Algorithms for Non-negative Matrix Factorization
Authors: Daniel D. Lee, H. Sebastian Seung | Citations: 12,304 | Status: Poster
#3: Using the Nyström Method to Speed Up Kernel Machines
Authors: Christopher K. I. Williams, Matthias Seeger | Citations: 3,227 | Status: Poster

Year 1999

#1: Policy Gradient Methods for Reinforcement Learning with Function Approximation
Authors: Richard S. Sutton, David A. McAllester, Satinder P. Singh et al. | Citations: 9,674 | Status: Poster
#2: Actor-Critic Algorithms
Authors: Vijay R. Konda, John N. Tsitsiklis | Citations: 4,302 | Status: Poster
#3: Support Vector Method for Novelty Detection
Authors: Bernhard Schölkopf, Robert C. Williamson, Alex J. Smola et al. | Citations: 3,405 | Status: Poster

Year 1998

#1: Exploiting Generative Models in Discriminative Classifiers
Authors: Tommi Jaakkola, David Haussler | Citations: 2,117 | Status: Poster
#2: Kernel PCA and De-Noising in Feature Spaces
Authors: Sebastian Mika, Bernhard Schölkopf, Alex J. Smola et al. | Citations: 1,517 | Status: Poster
#3: Semi-Supervised Support Vector Machines
Authors: Kristin P. Bennett, Ayhan Demiriz | Citations: 1,327 | Status: Poster

Year 1997

#1: Classification by Pairwise Coupling
Authors: Trevor Hastie, Robert Tibshirani | Citations: 2,175 | Status: Poster
#2: A Framework for Multiple-Instance Learning
Authors: Oded Maron, Tomás Lozano-Pérez | Citations: 1,970 | Status: Poster
#3: EM Algorithms for PCA and SPCA
Authors: Sam T. Roweis | Citations: 1,482 | Status: Poster

Year 1996

#1: Support Vector Regression Machines
Authors: Harris Drucker, Christopher J. C. Burges, Linda Kaufman et al. | Citations: 7,656 | Status: Poster
#2: Support Vector Method for Function Approximation, Regression Estimation and Signal Processing
Authors: Vladimir Vapnik, Steven E. Golowich, Alex J. Smola | Citations: 4,569 | Status: Poster
#3: Analysis of Temporal-Difference Learning with Function Approximation
Authors: John N. Tsitsiklis, Benjamin Van Roy | Citations: 2,360 | Status: Poster

Year 1995

#1: A New Learning Algorithm for Blind Signal Separation
Authors: Shun-ichi Amari, Andrzej Cichocki, Howard Hua Yang | Citations: 3,114 | Status: Poster
#2: Independent Component Analysis of Electroencephalographic Data
Authors: Scott Makeig, Anthony J. Bell, Tzyy-Ping Jung et al. | Citations: 3,088 | Status: Poster
#3: Gaussian Processes for Regression
Authors: Christopher K. I. Williams, Carl Edward Rasmussen | Citations: 2,176 | Status: Poster

Year 1994

#1: Neural Network Ensembles, Cross Validation, and Active Learning
Authors: Anders Krogh, Jesper Vedelsby | Citations: 3,155 | Status: Poster
#2: A Growing Neural Gas Network Learns Topologies
Authors: Bernd Fritzke | Citations: 2,799 | Status: Poster
#3: Active Learning with Statistical Models
Authors: David A. Cohn, Zoubin Ghahramani, Michael I. Jordan | Citations: 2,672 | Status: Poster

Year 1993

#1: Signature Verification using a "Siamese" Time Delay Neural Network
Authors: Jane Bromley, Isabelle Guyon, Yann LeCun et al. | Citations: 5,632 | Status: Poster
#2: Autoencoders, Minimum Description Length and Helmholtz Free Energy
Authors: Geoffrey E. Hinton, Richard S. Zemel | Citations: 2,080 | Status: Poster
#3: Convergence of Stochastic Iterative Dynamic Programming Algorithms
Authors: Tommi Jaakkola, Michael I. Jordan, Satinder P. Singh | Citations: 1,427 | Status: Poster

Year 1992

#1: Second order derivatives for network pruning: Optimal Brain Surgeon
Authors: Babak Hassibi, David G. Stork | Citations: 2,686 | Status: Poster
#2: Feudal Reinforcement Learning
Authors: Peter Dayan, Geoffrey E. Hinton | Citations: 1,090 | Status: Poster
#3: An Information-Theoretic Approach to Deciphering the Hippocampal Code
Authors: William E. Skaggs, Bruce L. McNaughton, Katalin M. Gothard | Citations: 785 | Status: Poster

Year 1991

#1: A Simple Weight Decay Can Improve Generalization
Authors: Anders Krogh, John A. Hertz | Citations: 2,721 | Status: Poster
#2: Principles of Risk Minimization for Learning Theory
Authors: V. Vapnik | Citations: 1,601 | Status: Poster
#3: Practical Issues in Temporal Difference Learning
Authors: Gerald Tesauro | Citations: 1,497 | Status: Poster

Year 1990

#1: Generalization by Weight-Elimination with Application to Forecasting
Authors: Andreas S. Weigend, David E. Rumelhart, Bernardo A. Huberman | Citations: 1,052 | Status: Poster
#2: SEXNET: A Neural Network Identifies Sex From Human Faces
Authors: B.A. Golomb, D.T. Lawrence, T.J. Sejnowski | Citations: 770 | Status: Poster
#3: Back Propagation is Sensitive to Initial Conditions
Authors: John F. Kolen, Jordan B. Pollack | Citations: 568 | Status: Poster

Year 1989

#1: Handwritten Digit Recognition with a Back-Propagation Network
Authors: Yann LeCun, Bernhard E. Boser, John S. Denker et al. | Citations: 7,148 | Status: Poster
#2: Optimal Brain Damage
Authors: Yann LeCun, John S. Denker, Sara A. Solla | Citations: 6,639 | Status: Poster
#3: The Cascade-Correlation Learning Architecture
Authors: Scott E. Fahlman, Christian Lebiere | Citations: 4,783 | Status: Poster

Year 1988

#1: ALVINN: An Autonomous Land Vehicle in a Neural Network
Authors: Dean A. Pomerleau | Citations: 3,234 | Status: Poster
#2: What Size Net Gives Valid Generalization?
Authors: Eric B. Baum, David Haussler | Citations: 2,715 | Status: Poster
#3: Training a 3-Node Neural Network is NP-Complete
Authors: Avrim Blum, Ronald L. Rivest | Citations: 1,476 | Status: Poster

Year 1987

#1: Generalization of Back propagation to Recurrent and Higher Order Neural Networks
Authors: Fernando J. Pineda | Citations: 1,697 | Status: Poster
#2: How Neural Nets Work
Authors: Alan Lapedes, Robert Farber | Citations: 772 | Status: Poster
#3: Supervised Learning of Probability Distributions by Neural Networks
Authors: Eric B. Baum, Frank Wilczek | Citations: 374 | Status: Poster

ICML - Top Cited Papers by Year

Year 2025

#1: Learning to (Learn at Test Time): RNNs with Expressive Hidden States
Authors: Yu Sun, Xinhao Li, Karan Dalal et al. | Citations: 90 | Status: Spotlight [OpenReview]
#2: AdvPrompter: Fast Adaptive Adversarial Prompting for LLMs
Authors: Anselm Paulus, Arman Zharmagambetov, Chuan Guo et al. | Citations: 88 | Status: Poster [OpenReview]
#3: rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking
Authors: Xinyu Guan, Li Lyna Zhang, Yifei Liu et al. | Citations: 78 | Status: Oral [OpenReview]

Year 2024

#1: Scaling Rectified Flow Transformers for High-Resolution Image Synthesis
Authors: Patrick Esser, Sumith Kulal, Andreas Blattmann et al. | Citations: 1,056 | Status: Oral [OpenReview]
#2: MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities
Authors: Weihao Yu, Zhengyuan Yang, Linjie Li et al. | Citations: 666 | Status: Poster [OpenReview]
#3: Improving Factuality and Reasoning in Language Models through Multiagent Debate
Authors: Yilun Du, Shuang Li, Antonio Torralba et al. | Citations: 620 | Status: Poster [OpenReview]

Year 2023

#1: BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language M...
Authors: Junnan Li, Dongxu Li, Silvio Savarese et al. | Citations: 5,839 | Status: Poster [OpenReview]
#2: Robust Speech Recognition via Large-Scale Weak Supervision
Authors: Alec Radford, Jong Wook Kim, Tao Xu et al. | Citations: 4,570 | Status: Poster [OpenReview]
#3: PaLM-E: An Embodied Multimodal Language Model
Authors: Danny Driess, Fei Xia, Mehdi S. M. Sajjadi et al. | Citations: 1,902 | Status: Poster [OpenReview]

Year 2022

#1: BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Gen...
Authors: Junnan Li, Dongxu Li, Caiming Xiong et al. | Citations: 5,196 | Status: Spotlight
#2: GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models
Authors: Alexander Quinn Nichol, Prafulla Dhariwal, Aditya Ramesh et al. | Citations: 3,944 | Status: Spotlight
#3: FEDformer: Frequency Enhanced Decomposed Transformer for Long-term Series Forecasting
Authors: Tian Zhou, Ziqing Ma, Qingsong Wen et al. | Citations: 2,128 | Status: Spotlight

Year 2021

#1: Learning Transferable Visual Models From Natural Language Supervision
Authors: Alec Radford, Jong Wook Kim, Chris Hallacy et al. | Citations: 35,305 | Status: Oral
#2: Training data-efficient image transformers & distillation through attention
Authors: Hugo Touvron, Matthieu Cord, Matthijs Douze et al. | Citations: 8,716 | Status: Spotlight
#3: Zero-Shot Text-to-Image Generation
Authors: Aditya Ramesh, Mikhail Pavlov, Gabriel Goh et al. | Citations: 6,333 | Status: Spotlight

Year 2020

#1: A Simple Framework for Contrastive Learning of Visual Representations
Authors: Ting Chen, Simon Kornblith, Mohammad Norouzi et al. | Citations: 24,175 | Status: Poster
#2: SCAFFOLD: Stochastic Controlled Averaging for Federated Learning
Authors: Sai Praneeth Karimireddy, Satyen Kale, Mehryar Mohri et al. | Citations: 3,685 | Status: Poster
#3: PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization
Authors: Jingqing Zhang, Yao Zhao, Mohammad Saleh et al. | Citations: 2,551 | Status: Poster

Year 2019

#1: EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
Authors: Mingxing Tan, Quoc Le | Citations: 28,829 | Status: Oral
#2: Self-Attention Generative Adversarial Networks
Authors: Han Zhang, Ian Goodfellow, Dimitris Metaxas et al. | Citations: 5,274 | Status: Oral
#3: Parameter-Efficient Transfer Learning for NLP
Authors: Neil Houlsby, Andrei Giurgiu, Stanislaw Jastrzebski et al. | Citations: 5,271 | Status: Oral

Year 2018

#1: Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
Authors: Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel et al. | Citations: 11,352 | Status: Oral
#2: Addressing Function Approximation Error in Actor-Critic Methods
Authors: Scott Fujimoto, Herke van Hoof, David Meger | Citations: 7,345 | Status: Oral
#3: Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples
Authors: Anish Athalye, Nicholas Carlini, David Wagner | Citations: 3,860 | Status: Oral

Year 2017

#1: Wasserstein Generative Adversarial Networks
Authors: Martin Arjovsky, Soumith Chintala, Léon Bottou | Citations: 19,098 | Status: Poster
#2: Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
Authors: Chelsea Finn, Pieter Abbeel, Sergey Levine | Citations: 15,661 | Status: Poster
#3: Neural Message Passing for Quantum Chemistry
Authors: Justin Gilmer, Samuel S. Schoenholz, Patrick F. Riley et al. | Citations: 10,347 | Status: Poster

Year 2016

#1: Asynchronous Methods for Deep Reinforcement Learning
Authors: Volodymyr Mnih, Adria Puigdomenech Badia, Mehdi Mirza et al. | Citations: 13,016 | Status: Poster
#2: Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning
Authors: Yarin Gal, Zoubin Ghahramani | Citations: 12,567 | Status: Poster
#3: Dueling Network Architectures for Deep Reinforcement Learning
Authors: Ziyu Wang, Tom Schaul, Matteo Hessel et al. | Citations: 5,833 | Status: Poster

Year 2015

#1: Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Authors: Sergey Ioffe, Christian Szegedy | Citations: 62,227 | Status: Poster
#2: Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
Authors: Kelvin Xu, Jimmy Ba, Ryan Kiros et al. | Citations: 13,573 | Status: Poster
#3: Trust Region Policy Optimization
Authors: John Schulman, Sergey Levine, Pieter Abbeel et al. | Citations: 9,809 | Status: Poster

Year 2014

#1: Distributed Representations of Sentences and Documents
Authors: Quoc Le, Tomas Mikolov | Citations: 13,722 | Status: Poster
#2: Stochastic Backpropagation and Approximate Inference in Deep Generative Models
Authors: Danilo Jimenez Rezende, Shakir Mohamed, Daan Wierstra | Citations: 6,349 | Status: Poster
#3: DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition
Authors: Jeff Donahue, Yangqing Jia, Oriol Vinyals et al. | Citations: 6,288 | Status: Poster

Year 2013

#1: On the difficulty of training recurrent neural networks
Authors: Razvan Pascanu, Tomas Mikolov, Yoshua Bengio | Citations: 8,376 | Status: Poster
#2: On the importance of initialization and momentum in deep learning
Authors: Ilya Sutskever, James Martens, George Dahl et al. | Citations: 6,940 | Status: Poster
#3: Regularization of Neural Networks using DropConnect
Authors: Li Wan, Matthew Zeiler, Sixin Zhang et al. | Citations: 3,524 | Status: Poster

4. Combined Analysis

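Across ICLR, NeurIPS, and ICML, several patterns recur: Oral papers lead in both mean and median citations, rejected papers have median citations of 0 or 1, and the median team size is 4 to 5 authors. The patterns are not identical, however. The relative standing of Spotlight and Poster papers varies by conference and by statistic, and coverage differs widely: the NeurIPS data span 1987 to 2025, and rejected papers are sparse outside ICLR (1,213 at NeurIPS and 185 at ICML, versus 17,379 at ICLR). Findings about rejected papers therefore rest mainly on the ICLR data.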
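The cross-conference comparison can be checked directly from the median citation values in the three statistics tables. The short sketch below copies those medians and ranks the acceptance types within each conference; the helper name rank_statuses is introduced here for illustration.

```python
# Median citations per status, copied from the three statistics tables in this report.
median_citations = {
    "ICLR":    {"Oral": 23.5, "Spotlight": 15.0, "Poster": 12.0, "Reject": 1.0},
    "NeurIPS": {"Oral": 54.0, "Spotlight": 13.0, "Poster": 14.0, "Reject": 0.0},
    "ICML":    {"Oral": 48.0, "Spotlight": 24.0, "Poster": 6.0,  "Reject": 0.0},
}

def rank_statuses(medians):
    """Return status names ordered from highest to lowest median citations."""
    return sorted(medians, key=medians.get, reverse=True)

rankings = {conf: rank_statuses(m) for conf, m in median_citations.items()}
for conf, order in rankings.items():
    print(f"{conf}: {' > '.join(order)}")
```

Oral ranks first and Reject last in all three cases; only the Spotlight and Poster positions swap, at NeurIPS.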

Conclusion

This analysis shows how acceptance type relates to both collaboration (author counts) and impact (citations). Oral papers receive more citations on average, but Posters make up the majority of accepted papers and include many highly cited works. The most cited paper in each dataset has a different status: an Oral at ICLR (62,867 citations), a Spotlight at NeurIPS (184,033), and a Poster at ICML (62,227). Acceptance type is therefore not the sole determinant of future impact. Team sizes are similar across acceptance types, with a median of four to five authors in every category.

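Note: Citation counts are from Google Scholar and are cumulative up to the time of data collection. Older papers have had more time to accumulate citations, so counts are not directly comparable across years. A count of -1 in the listings appears to be a placeholder for papers with no citation data; if such values are included in the averages, they would explain the negative mean (-0.3) in the ICML Reject row.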
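One common way to reduce the age effect is to divide each count by the number of years since publication. The sketch below is not part of the original analysis; the function name citations_per_year, the choice of 2025 as the reference year (taken from the analysis date), and the treatment of negative counts as missing data are all assumptions.

```python
def citations_per_year(citations, pub_year, as_of_year=2025):
    """Average citations per year since publication; None if citation data is missing."""
    if citations < 0:                     # assumption: negative values mean "no data"
        return None
    age = max(as_of_year - pub_year, 1)   # papers from the reference year count as one year old
    return citations / age

# Citation counts and years taken from the listings in this report.
print(citations_per_year(184033, 2017))   # Attention is All you Need (NeurIPS 2017)
print(citations_per_year(14155, 2022))    # LoRA (ICLR 2022)
```

A per-year rate is still a rough correction, since citation rates are not constant over a paper's lifetime, but it makes papers from different years easier to compare than raw totals.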