ICLR, NeurIPS & ICML Conference Analysis: Authors and Citations by Acceptance Type

Analysis Date: December 12, 2025
Data Sources: ICLR, NeurIPS, and ICML paper lists

This analysis examines three of the most prestigious machine learning conferences—ICLR (International Conference on Learning Representations), NeurIPS (Neural Information Processing Systems), and ICML (International Conference on Machine Learning)—to understand how author counts and citation patterns vary across different acceptance types (Oral, Spotlight, Poster, and Rejected papers).

1. Author Count Analysis

We first examine how the number of authors varies across different acceptance types. This can reveal interesting patterns about collaboration in accepted vs. rejected papers.

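[Figure: Author Analysis]
Key Finding: Average author counts fall in a narrow range (4.18 to 5.47) across all conferences and acceptance types, so team size does not clearly separate accepted from rejected papers. At ICLR, rejected papers average fewer authors (4.46) than accepted papers (5.01 to 5.47). At NeurIPS and ICML, rejected papers average slightly more authors than posters (5.07 vs. 4.47 and 4.59 vs. 4.53).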

1.1 Median Author Count Analysis

While mean author counts provide one perspective, median values can be more robust to outliers and better represent typical collaboration patterns. Below we examine median author counts across conferences and acceptance types.

[Figure: Median Author Analysis]
Key Finding: The median is 4 authors for almost every conference and acceptance type (5 for ICLR Oral and Spotlight), while every mean is higher (4.18 to 5.47). A mean above the median indicates a right-skewed distribution: a minority of papers with very large author teams pulls the average up, so the median gives a more typical view of collaboration size.
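The mean-versus-median comparison can be sketched in a few lines of Python. This is only an illustration: the `papers` records, the `author_stats` helper, and the author counts below are hypothetical and are not taken from the conference dataset.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical records: (acceptance status, number of authors).
# Illustrative values only; they are not taken from the dataset.
papers = [
    ("Oral", 3), ("Oral", 5), ("Oral", 4),
    ("Poster", 2), ("Poster", 4), ("Poster", 4), ("Poster", 30),
    ("Reject", 1), ("Reject", 3), ("Reject", 4),
]

def author_stats(records):
    """Return {status: (mean_authors, median_authors)}."""
    by_status = defaultdict(list)
    for status, n_authors in records:
        by_status[status].append(n_authors)
    return {s: (mean(v), median(v)) for s, v in by_status.items()}

stats = author_stats(papers)
for status, (avg, med) in stats.items():
    print(f"{status:<8} mean={avg:.2f} median={med}")
```

In this toy sample a single 30-author paper lifts the Poster mean to 10 while the median stays at 4, which is the kind of gap the mean/median comparison is meant to expose.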

ICLR Statistics

Status      Count    Avg Authors  Median Authors  Avg Citations  Median Citations  Max Citations
Oral        490      5.46         5.0             452.9          23.5              62,867
Spotlight   1,143    5.47         5.0             84.0           15.0              8,832
Poster      9,290    5.01         4.0             121.6          12.0              47,498
Reject      17,379   4.46         4.0             23.7           1.0               19,629
Additional Insights for ICLR:
• 244 papers (1.1%) have 1,000+ citations
• 511 papers (2.3%) have 500+ citations
• 2,262 papers (10.2%) have 100+ citations
• Average authors per paper: 5.01
• Papers span years 2013 to 2026

NeurIPS Statistics

Status      Count    Avg Authors  Median Authors  Avg Citations  Median Citations  Max Citations
Oral        610      5.00         4.0             390.4          54.0              47,439
Spotlight   2,685    4.83         4.0             192.9          13.0              184,033
Poster      27,550   4.47         4.0             109.8          14.0              82,368
Reject      1,213    5.07         4.0             10.9           0.0               1,293
Additional Insights for NeurIPS:
• 532 papers (1.7%) have 1,000+ citations
• 1,095 papers (3.5%) have 500+ citations
• 5,093 papers (16.5%) have 100+ citations
• Average authors per paper: 4.50
• Papers span years 1987 to 2025

ICML Statistics

Status      Count    Avg Authors  Median Authors  Avg Citations  Median Citations  Max Citations
Oral        2,098    4.18         4.0             213.1          48.0              35,305
Spotlight   2,548    4.40         4.0             66.8           24.0              8,716
Poster      9,684    4.53         4.0             90.1           6.0               62,227
Reject      185      4.59         4.0             -0.3           0.0               0
Additional Insights for ICML:
• 217 papers (1.5%) have 1,000+ citations
• 451 papers (3.1%) have 500+ citations
• 2,097 papers (14.6%) have 100+ citations
• Average authors per paper: 4.45
• Papers span years 2013 to 2025
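The threshold bullets in the "Additional Insights" lists can be reproduced with a small counting routine. The sketch below is a guess at the computation, not the report's actual code: `share_at_or_above` and the `sample` data are hypothetical. The thresholds are treated as cumulative (a paper with 1,000+ citations also counts toward 500+ and 100+), and the denominator is assumed to be every paper passed in. The report does not say which papers its percentages are based on.

```python
def share_at_or_above(citations, thresholds=(1000, 500, 100)):
    """Return {threshold: (count, percent)} for papers with >= threshold citations."""
    total = len(citations)
    result = {}
    for t in thresholds:
        count = sum(1 for c in citations if c >= t)
        # Guard against an empty input list to avoid division by zero.
        result[t] = (count, 100.0 * count / total if total else 0.0)
    return result

# Hypothetical citation counts; not taken from the dataset.
sample = [0, 3, 12, 150, 620, 1200, 45, 99, 100, 8]
shares = share_at_or_above(sample)
for t, (count, pct) in shares.items():
    print(f"{count} papers ({pct:.1f}%) have {t:,}+ citations")
```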

2. Citation Analysis

Citation counts are a key metric of research impact. We analyze how citations vary by acceptance type, which can provide insights into the relationship between peer review decisions and long-term impact.

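[Figure: Citation Analysis]
Key Finding: Oral papers have the highest average and median citations at all three conferences, and rejected papers the lowest. The ordering of Spotlight and Poster is less consistent: Spotlight papers have the higher median at ICLR (15.0 vs. 12.0) and ICML (24.0 vs. 6.0), but Poster papers have the higher average at ICLR (121.6 vs. 84.0) and ICML (90.1 vs. 66.8) and a slightly higher median at NeurIPS (14.0 vs. 13.0). Citation patterns also evolve over time as papers accumulate citations.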
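Two details in the tables and lists suggest that a citation count of -1 marks missing data: several entries show "Citations: -1", and the ICML Reject row has a negative average (-0.3) with a maximum of 0. That reading is an inference, not something the report states. Under that assumption, the sketch below shows how per-status citation statistics change when the sentinel is dropped before aggregating. The `citation_stats` helper, the `MISSING` constant, and the records are hypothetical.

```python
from collections import defaultdict
from statistics import mean, median

MISSING = -1  # assumed sentinel for "no citation data" (not confirmed by the report)

def citation_stats(records, drop_missing=True):
    """Return {status: (mean, median, max)} of citation counts."""
    by_status = defaultdict(list)
    for status, citations in records:
        if drop_missing and citations == MISSING:
            continue  # skip papers without citation data
        by_status[status].append(citations)
    return {s: (mean(v), median(v), max(v)) for s, v in by_status.items()}

# Hypothetical records: (acceptance status, citation count).
records = [
    ("Oral", 900), ("Oral", 40), ("Oral", 20),
    ("Poster", 10), ("Poster", 2000), ("Poster", 6),
    ("Reject", 0), ("Reject", -1), ("Reject", 0), ("Reject", -1),
]

raw = citation_stats(records, drop_missing=False)
clean = citation_stats(records)
for status in clean:
    print(status, "raw:", raw[status], "clean:", clean[status])
```

In this toy data one heavily cited poster puts the Poster mean above the Oral mean even though the Oral median is higher, and the Reject mean is negative until the sentinel values are removed.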

3. Top Cited Papers by Year

Below we highlight the three most-cited papers from each year at each of the three conferences. These papers represent some of the most influential work in machine learning and have had significant impact on the field.
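A per-year ranking like the lists that follow can be produced by grouping papers by year and keeping the k most-cited entries in each group. The sketch below is one way to do it, using `heapq.nlargest` from the standard library. The record layout (`title`, `year`, `citations`), the `top_cited_by_year` helper, and the sample papers are hypothetical.

```python
import heapq
from collections import defaultdict

def top_cited_by_year(papers, k=3):
    """Return {year: [paper, ...]} with the k most-cited papers per year, newest year first."""
    by_year = defaultdict(list)
    for paper in papers:
        by_year[paper["year"]].append(paper)
    return {
        year: heapq.nlargest(k, by_year[year], key=lambda p: p["citations"])
        for year in sorted(by_year, reverse=True)
    }

# Hypothetical papers; titles and counts are placeholders.
papers = [
    {"title": "Paper A", "year": 2021, "citations": 50},
    {"title": "Paper B", "year": 2021, "citations": 700},
    {"title": "Paper C", "year": 2021, "citations": 5},
    {"title": "Paper D", "year": 2021, "citations": 120},
    {"title": "Paper E", "year": 2020, "citations": 30},
    {"title": "Paper F", "year": 2020, "citations": -1},
]

top = top_cited_by_year(papers)
for year, ranked in top.items():
    print(year, [(p["title"], p["citations"]) for p in ranked])
```

A ranking built this way still returns entries for years in which few or no papers have citation data, which would be consistent with the most recent years listing papers with "Citations: -1". Whether the report handles those years this way is an assumption.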

[Figure: Top Papers Timeline]

ICLR - Top Cited Papers by Year

Year 2026

  • #1: CyberV: A Cybernetic Framework for Enhancing Logical Reasoning in Video Understanding
    Authors: Jiahao Meng, Shuyang Sun, Tan Yue et al. | Citations: -1 | Status: Withdraw
  • #2: SMART-3D: Scaling Masked AutoRegressive Transformer for Efficient 3D Shape Generation
    Authors: Shentong Mo, Yufei Guo | Citations: -1 | Status: Withdraw
  • #3: SSTP: Efficient Sample Selection for Trajectory Prediction
    Authors: Ruining Yang, Yi Xu, Yun Fu et al. | Citations: -1 | Status: Withdraw

Year 2025

  • #1: KAN: Kolmogorov–Arnold Networks
    Authors: Ziming Liu, Yixuan Wang, Sachin Vaidya et al. | Citations: 1,156 | Status: Oral [OpenReview]
  • #2: SAM 2: Segment Anything in Images and Videos
    Authors: Nikhila Ravi, Valentin Gabeur, Yuan-Ting Hu et al. | Citations: 846 | Status: Oral [OpenReview]
  • #3: WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct
    Authors: Haipeng Luo, Qingfeng Sun, Can Xu et al. | Citations: 414 | Status: Oral [OpenReview]

Year 2024

  • #1: YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications
    Authors: Chuyi Li, Bo Zhang, Lulu Li et al. | Citations: 3,106 | Status: Withdraw
  • #2: MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models
    Authors: Deyao Zhu, Jun Chen, Xiaoqian Shen et al. | Citations: 2,918 | Status: Poster [OpenReview]
  • #3: SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis
    Authors: Dustin Podell, Zion English, Kyle Lacey et al. | Citations: 2,203 | Status: Spotlight [OpenReview]

Year 2023

  • #1: ReAct: Synergizing Reasoning and Acting in Language Models
    Authors: Shunyu Yao, Jeffrey Zhao, Dian Yu et al. | Citations: 2,772 | Status: Top-5% [OpenReview]
  • #2: DreamFusion: Text-to-3D using 2D Diffusion
    Authors: Ben Poole, Ajay Jain, Jonathan T. Barron et al. | Citations: 2,361 | Status: Top-5% [OpenReview]
  • #3: DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection
    Authors: Hao Zhang, Feng Li, Shilong Liu et al. | Citations: 1,881 | Status: Poster [OpenReview]

Year 2022

  • #1: LoRA: Low-Rank Adaptation of Large Language Models
    Authors: Edward J Hu, Yelong Shen, Phillip Wallis et al. | Citations: 14,155 | Status: Poster [OpenReview]
  • #2: Finetuned Language Models are Zero-Shot Learners
    Authors: Jason Wei, Maarten Bosma, Vincent Zhao et al. | Citations: 3,956 | Status: Oral [OpenReview]
  • #3: BEiT: BERT Pre-Training of Image Transformers
    Authors: Hangbo Bao, Li Dong, Songhao Piao et al. | Citations: 3,437 | Status: Oral [OpenReview]

Year 2021

  • #1: An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
    Authors: Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov et al. | Citations: 62,867 | Status: Oral [OpenReview]
  • #2: Denoising Diffusion Implicit Models
    Authors: Jiaming Song, Chenlin Meng, Stefano Ermon | Citations: 8,188 | Status: Poster [OpenReview]
  • #3: Score-Based Generative Modeling through Stochastic Differential Equations
    Authors: Yang Song, Jascha Sohl-Dickstein, Diederik P Kingma et al. | Citations: 7,224 | Status: Oral [OpenReview]

Year 2020

  • #1: ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
    Authors: Zhenzhong Lan, Mingda Chen, Sebastian Goodman et al. | Citations: 8,832 | Status: Spotlight
  • #2: BERTScore: Evaluating Text Generation with BERT
    Authors: Tianyi Zhang, Varsha Kishore, Felix Wu et al. | Citations: 6,747 | Status: Poster
  • #3: ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators
    Authors: Kevin Clark, Minh-Thang Luong, Quoc V. Le et al. | Citations: 4,882 | Status: Poster

Year 2019

  • #1: Decoupled Weight Decay Regularization
    Authors: Ilya Loshchilov, Frank Hutter | Citations: 27,890 | Status: Poster [OpenReview]
  • #2: How Powerful are Graph Neural Networks?
    Authors: Keyulu Xu, Weihua Hu, Jure Leskovec et al. | Citations: 10,619 | Status: Oral [OpenReview]
  • #3: GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
    Authors: Alex Wang, Amanpreet Singh, Julian Michael et al. | Citations: 8,516 | Status: Poster [OpenReview]

Year 2018

  • #1: Towards Deep Learning Models Resistant to Adversarial Attacks
    Authors: Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt et al. | Citations: 15,225 | Status: Poster [OpenReview]
  • #2: Graph Attention Networks
    Authors: Petar Veličković, Guillem Cucurull, Arantxa Casanova et al. | Citations: 14,887 | Status: Poster [OpenReview]
  • #3: mixup: Beyond Empirical Risk Minimization
    Authors: Hongyi Zhang, Moustapha Cisse, Yann N. Dauphin et al. | Citations: 12,891 | Status: Poster [OpenReview]

Year 2017

  • #1: Semi-Supervised Classification with Graph Convolutional Networks
    Authors: Thomas N. Kipf, Max Welling | Citations: 45,220 | Status: Poster
  • #2: SGDR: Stochastic Gradient Descent with Warm Restarts
    Authors: Ilya Loshchilov, Frank Hutter | Citations: 10,671 | Status: Poster
  • #3: Adversarial examples in the physical world
    Authors: Alexey Kurakin, Ian J. Goodfellow, Samy Bengio | Citations: 7,407 | Status: Workshop

Year 2014

  • #1: Auto-Encoding Variational Bayes
    Authors: Diederik P. Kingma, Max Welling | Citations: 45,209 | Status: Poster
  • #2: Intriguing properties of neural networks
    Authors: Joan Bruna, Christian Szegedy, Ilya Sutskever et al. | Citations: 19,316 | Status: Poster
  • #3: Network In Network
    Authors: Min Lin, Qiang Chen, Shuicheng Yan | Citations: 10,288 | Status: Poster

Year 2013

  • #1: Efficient Estimation of Word Representations in Vector Space
    Authors: Tomas Mikolov, Kai Chen, Greg Corrado et al. | Citations: 47,498 | Status: Poster
  • #2: Deep Learning for Detecting Robotic Grasps
    Authors: Ian Lenz, Honglak Lee, Ashutosh Saxena | Citations: 2,071 | Status: Oral
  • #3: Zero-Shot Learning Through Cross-Modal Transfer
    Authors: Richard Socher, Milind Ganjoo, Hamsa Sridhar et al. | Citations: 1,867 | Status: Oral

NeurIPS - Top Cited Papers by Year

Year 2025

  • #1: How Well Can Differential Privacy Be Audited in One Run?
    Authors: Amit Keinan, Moshe Shenfeld, Katrina Ligett | Citations: -1 | Status: Spotlight
  • #2: Uncertainty-Sensitive Privileged Learning
    Authors: Fan-Ming Luo, Lei Yuan, Yang Yu | Citations: -1 | Status: Poster
  • #3: KL Penalty Control via Perturbation for Direct Preference Optimization
    Authors: Sangkyu Lee, Janghoon Han, Hosung Song et al. | Citations: -1 | Status: Poster

Year 2024

  • #1: YOLOv10: Real-Time End-to-End Object Detection
    Authors: Ao Wang, Hui Chen, Lihao Liu et al. | Citations: 1,650 | Status: Poster [OpenReview]
  • #2: VMamba: Visual State Space Model
    Authors: Yue Liu, Yunjie Tian, Yuzhong Zhao et al. | Citations: 1,477 | Status: Spotlight [OpenReview]
  • #3: CogVLM: Visual Expert for Pretrained Language Models
    Authors: Weihan Wang, Qingsong Lv, Wenmeng Yu et al. | Citations: 713 | Status: Poster [OpenReview]

Year 2023

  • #1: Visual Instruction Tuning
    Authors: Haotian Liu, Chunyuan Li, Qingyang Wu et al. | Citations: 6,760 | Status: Oral [OpenReview]
  • #2: Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
    Authors: Lianmin Zheng, Wei-Lin Chiang, Ying Sheng et al. | Citations: 3,519 | Status: Poster [OpenReview]
  • #3: Direct Preference Optimization: Your Language Model is Secretly a Reward Model
    Authors: Rafael Rafailov, Archit Sharma, Eric Mitchell et al. | Citations: 3,284 | Status: Oral [OpenReview]

Year 2022

  • #1: Training language models to follow instructions with human feedback
    Authors: Long Ouyang, Jeffrey Wu, Xu Jiang et al. | Citations: 13,970 | Status: Poster [OpenReview]
  • #2: Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
    Authors: Jason Wei, Xuezhi Wang, Dale Schuurmans et al. | Citations: 13,586 | Status: Poster [OpenReview]
  • #3: Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
    Authors: Chitwan Saharia, William Chan, Saurabh Saxena et al. | Citations: 6,404 | Status: Poster [OpenReview]

Year 2021

  • #1: Diffusion Models Beat GANs on Image Synthesis
    Authors: Prafulla Dhariwal, Alexander Quinn Nichol | Citations: 8,580 | Status: Spotlight [OpenReview]
  • #2: SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers
    Authors: Enze Xie, Wenhai Wang, Zhiding Yu et al. | Citations: 6,146 | Status: Poster [OpenReview]
  • #3: MLP-Mixer: An all-MLP Architecture for Vision
    Authors: Ilya Tolstikhin, Neil Houlsby, Alexander Kolesnikov et al. | Citations: 3,342 | Status: Poster [OpenReview]

Year 2020

  • #1: Language Models are Few-Shot Learners
    Authors: Tom Brown, Benjamin Mann, Nick Ryder et al. | Citations: 47,439 | Status: Oral
  • #2: Denoising Diffusion Probabilistic Models
    Authors: Jonathan Ho, Ajay N. Jain, Pieter Abbeel | Citations: 23,050 | Status: Poster
  • #3: Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
    Authors: Patrick Lewis, Ethan Perez, Aleksandra Piktus et al. | Citations: 8,434 | Status: Poster

Year 2019

  • #1: PyTorch: An Imperative Style, High-Performance Deep Learning Library
    Authors: Adam Paszke, Sam Gross, Francisco Massa et al. | Citations: 58,109 | Status: Poster
  • #2: XLNet: Generalized Autoregressive Pretraining for Language Understanding
    Authors: Zhilin Yang, Zihang Dai, Yiming Yang et al. | Citations: 11,352 | Status: Oral
  • #3: Generative Modeling by Estimating Gradients of the Data Distribution
    Authors: Yang Song, Stefano Ermon | Citations: 4,614 | Status: Oral

Year 2018

  • #1: Neural Ordinary Differential Equations
    Authors: Ricky T. Q. Chen, Yulia Rubanova, Jesse Bettencourt et al. | Citations: 6,824 | Status: Oral
  • #2: CatBoost: unbiased boosting with categorical features
    Authors: Liudmila Prokhorenkova, Gleb Gusev, Aleksandr Vorobev et al. | Citations: 5,655 | Status: Poster
  • #3: Neural Tangent Kernel: Convergence and Generalization in Neural Networks
    Authors: Arthur Jacot, Franck Gabriel, Clement Hongler | Citations: 4,205 | Status: Spotlight

Year 2017

  • #1: Attention is All you Need
    Authors: Ashish Vaswani, Noam Shazeer, Niki Parmar et al. | Citations: 184,033 | Status: Spotlight
  • #2: A Unified Approach to Interpreting Model Predictions
    Authors: Scott M Lundberg, Su-In Lee | Citations: 36,005 | Status: Poster
  • #3: Inductive Representation Learning on Large Graphs
    Authors: Will Hamilton, Zhitao Ying, Jure Leskovec | Citations: 20,519 | Status: Poster

Year 2016

  • #1: Improved Techniques for Training GANs
    Authors: Tim Salimans, Ian Goodfellow, Wojciech Zaremba et al. | Citations: 12,136 | Status: Poster
  • #2: Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering
    Authors: Michaël Defferrard, Xavier Bresson, Pierre Vandergheynst | Citations: 10,837 | Status: Poster
  • #3: Matching Networks for One Shot Learning
    Authors: Oriol Vinyals, Charles Blundell, Timothy Lillicrap et al. | Citations: 9,403 | Status: Poster

Year 2015

  • #1: Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
    Authors: Shaoqing Ren, Kaiming He, Ross Girshick et al. | Citations: 54,380 | Status: Poster
  • #2: Spatial Transformer Networks
    Authors: Max Jaderberg, Karen Simonyan, Andrew Zisserman et al. | Citations: 10,066 | Status: Spotlight
  • #3: Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting
    Authors: Xingjian Shi, Zhourong Chen, Hao Wang et al. | Citations: 9,108 | Status: Poster

Year 2014

  • #1: Generative Adversarial Nets
    Authors: Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza et al. | Citations: 82,368 | Status: Poster
  • #2: Sequence to Sequence Learning with Neural Networks
    Authors: Ilya Sutskever, Oriol Vinyals, Quoc V. Le | Citations: 29,204 | Status: Oral
  • #3: How transferable are features in deep neural networks?
    Authors: Jason Yosinski, Jeff Clune, Yoshua Bengio et al. | Citations: 11,843 | Status: Oral

Year 2013

  • #1: Distributed Representations of Words and Phrases and their Compositionality
    Authors: Tomas Mikolov, Ilya Sutskever, Kai Chen et al. | Citations: 47,769 | Status: Poster
  • #2: Translating Embeddings for Modeling Multi-relational Data
    Authors: Antoine Bordes, Nicolas Usunier, Alberto Garcia-Duran et al. | Citations: 10,650 | Status: Poster
  • #3: Sinkhorn Distances: Lightspeed Computation of Optimal Transport
    Authors: Marco Cuturi | Citations: 5,433 | Status: Spotlight

Year 2012

  • #1: ImageNet Classification with Deep Convolutional Neural Networks
    Authors: Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton | Citations: 37,554 | Status: Poster
  • #2: Practical Bayesian Optimization of Machine Learning Algorithms
    Authors: Jasper Snoek, Hugo Larochelle, Ryan P. Adams | Citations: 11,616 | Status: Poster
  • #3: Large Scale Distributed Deep Networks
    Authors: Jeffrey Dean, Greg Corrado, Rajat Monga et al. | Citations: 5,206 | Status: Poster

Year 2011

  • #1: Algorithms for Hyper-Parameter Optimization
    Authors: James S. Bergstra, Rémi Bardenet, Yoshua Bengio et al. | Citations: 7,578 | Status: Poster
  • #2: Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials
    Authors: Philipp Krähenbühl, Vladlen Koltun | Citations: 4,311 | Status: Poster
  • #3: Hogwild!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent
    Authors: Benjamin Recht, Christopher Re, Stephen Wright et al. | Citations: 2,860 | Status: Poster

Year 2010

  • #1: Efficient and Robust Feature Selection via Joint ℓ2,1-Norms Minimization
    Authors: Feiping Nie, Heng Huang, Xiao Cai et al. | Citations: 2,605 | Status: Poster
  • #2: Double Q-learning
    Authors: Hado V. Hasselt | Citations: 2,469 | Status: Poster
  • #3: Online Learning for Latent Dirichlet Allocation
    Authors: Matthew Hoffman, Francis R. Bach, David M. Blei | Citations: 2,378 | Status: Poster

Year 2009

  • #1: Reading Tea Leaves: How Humans Interpret Topic Models
    Authors: Jonathan Chang, Sean Gerrish, Chong Wang et al. | Citations: 3,625 | Status: Poster
  • #2: Robust Principal Component Analysis: Exact Recovery of Corrupted Low-Rank Matrices via Convex Opt...
    Authors: John Wright, Arvind Ganesh, Shankar Rao et al. | Citations: 2,272 | Status: Poster
  • #3: Fast Image Deconvolution using Hyper-Laplacian Priors
    Authors: Dilip Krishnan, Rob Fergus | Citations: 1,804 | Status: Poster

Year 2008

  • #1: Spectral Hashing
    Authors: Yair Weiss, Antonio Torralba, Rob Fergus | Citations: 3,380 | Status: Poster
  • #2: Mixed Membership Stochastic Blockmodels
    Authors: Edoardo M. Airoldi, David M. Blei, Stephen E. Fienberg et al. | Citations: 2,757 | Status: Poster
  • #3: Near-optimal Regret Bounds for Reinforcement Learning
    Authors: Peter Auer, Thomas Jaksch, Ronald Ortner | Citations: 1,685 | Status: Poster

Year 2007

  • #1: Probabilistic Matrix Factorization
    Authors: Andriy Mnih, Ruslan Salakhutdinov | Citations: 5,774 | Status: Poster
  • #2: Random Features for Large-Scale Kernel Machines
    Authors: Ali Rahimi, Benjamin Recht | Citations: 5,406 | Status: Poster
  • #3: Supervised Topic Models
    Authors: Jon D. McAuliffe, David M. Blei | Citations: 2,369 | Status: Poster

Year 2006

  • #1: Greedy Layer-Wise Training of Deep Networks
    Authors: Yoshua Bengio, Pascal Lamblin, Dan Popovici et al. | Citations: 7,432 | Status: Poster
  • #2: Graph-Based Visual Saliency
    Authors: Jonathan Harel, Christof Koch, Pietro Perona | Citations: 4,852 | Status: Poster
  • #3: Efficient sparse coding algorithms
    Authors: Honglak Lee, Alexis Battle, Rajat Raina et al. | Citations: 3,573 | Status: Poster

Year 2005

  • #1: Distance Metric Learning for Large Margin Nearest Neighbor Classification
    Authors: Kilian Q. Weinberger, John Blitzer, Lawrence K. Saul | Citations: 5,111 | Status: Poster
  • #2: Laplacian Score for Feature Selection
    Authors: Xiaofei He, Deng Cai, Partha Niyogi | Citations: 2,902 | Status: Poster
  • #3: Sparse Gaussian Processes using Pseudo-inputs
    Authors: Edward Snelson, Zoubin Ghahramani | Citations: 2,419 | Status: Poster

Year 2004

  • #1: Sharing Clusters among Related Groups: Hierarchical Dirichlet Processes
    Authors: Yee W. Teh, Michael I. Jordan, Matthew J. Beal et al. | Citations: 5,575 | Status: Poster
  • #2: Neighbourhood Components Analysis
    Authors: Jacob Goldberger, Geoffrey E. Hinton, Sam T. Roweis et al. | Citations: 3,088 | Status: Poster
  • #3: Self-Tuning Spectral Clustering
    Authors: Lihi Zelnik-Manor, Pietro Perona | Citations: 2,974 | Status: Poster

Year 2003

  • #1: Locality Preserving Projections
    Authors: Xiaofei He, Partha Niyogi | Citations: 5,752 | Status: Poster
  • #2: Learning with Local and Global Consistency
    Authors: Dengyong Zhou, Olivier Bousquet, Thomas N. Lal et al. | Citations: 5,697 | Status: Poster
  • #3: Online Passive-Aggressive Algorithms
    Authors: Shai Shalev-Shwartz, Koby Crammer, Ofer Dekel et al. | Citations: 2,702 | Status: Poster

Year 2002

  • #1: Distance Metric Learning with Application to Clustering with Side-Information
    Authors: Eric P. Xing, Michael I. Jordan, Stuart Russell et al. | Citations: 4,140 | Status: Poster
  • #2: Stochastic Neighbor Embedding
    Authors: Geoffrey E. Hinton, Sam T. Roweis | Citations: 2,468 | Status: Poster
  • #3: Support Vector Machines for Multiple-Instance Learning
    Authors: Stuart Andrews, Ioannis Tsochantaridis, Thomas Hofmann | Citations: 2,143 | Status: Poster

Year 2001

  • #1: Latent Dirichlet Allocation
    Authors: David M. Blei, Andrew Y. Ng, Michael I. Jordan | Citations: 57,563 | Status: Poster
  • #2: On Spectral Clustering: Analysis and an algorithm
    Authors: Andrew Y. Ng, Michael I. Jordan, Yair Weiss | Citations: 13,104 | Status: Poster
  • #3: Laplacian Eigenmaps and Spectral Techniques for Embedding and Clustering
    Authors: Mikhail Belkin, Partha Niyogi | Citations: 6,423 | Status: Poster

Year 2000

  • #1: A Neural Probabilistic Language Model
    Authors: Yoshua Bengio, Réjean Ducharme, Pascal Vincent | Citations: 12,414 | Status: Poster
  • #2: Algorithms for Non-negative Matrix Factorization
    Authors: Daniel D. Lee, H. Sebastian Seung | Citations: 12,304 | Status: Poster
  • #3: Using the Nyström Method to Speed Up Kernel Machines
    Authors: Christopher K. I. Williams, Matthias Seeger | Citations: 3,227 | Status: Poster

Year 1999

  • #1: Policy Gradient Methods for Reinforcement Learning with Function Approximation
    Authors: Richard S. Sutton, David A. McAllester, Satinder P. Singh et al. | Citations: 9,674 | Status: Poster
  • #2: Actor-Critic Algorithms
    Authors: Vijay R. Konda, John N. Tsitsiklis | Citations: 4,302 | Status: Poster
  • #3: Support Vector Method for Novelty Detection
    Authors: Bernhard Schölkopf, Robert C. Williamson, Alex J. Smola et al. | Citations: 3,405 | Status: Poster

Year 1998

  • #1: Exploiting Generative Models in Discriminative Classifiers
    Authors: Tommi Jaakkola, David Haussler | Citations: 2,117 | Status: Poster
  • #2: Kernel PCA and De-Noising in Feature Spaces
    Authors: Sebastian Mika, Bernhard Schölkopf, Alex J. Smola et al. | Citations: 1,517 | Status: Poster
  • #3: Semi-Supervised Support Vector Machines
    Authors: Kristin P. Bennett, Ayhan Demiriz | Citations: 1,327 | Status: Poster

Year 1997

  • #1: Classification by Pairwise Coupling
    Authors: Trevor Hastie, Robert Tibshirani | Citations: 2,175 | Status: Poster
  • #2: A Framework for Multiple-Instance Learning
    Authors: Oded Maron, Tomás Lozano-Pérez | Citations: 1,970 | Status: Poster
  • #3: EM Algorithms for PCA and SPCA
    Authors: Sam T. Roweis | Citations: 1,482 | Status: Poster

Year 1996

  • #1: Support Vector Regression Machines
    Authors: Harris Drucker, Christopher J. C. Burges, Linda Kaufman et al. | Citations: 7,656 | Status: Poster
  • #2: Support Vector Method for Function Approximation, Regression Estimation and Signal Processing
    Authors: Vladimir Vapnik, Steven E. Golowich, Alex J. Smola | Citations: 4,569 | Status: Poster
  • #3: Analysis of Temporal-Difference Learning with Function Approximation
    Authors: John N. Tsitsiklis, Benjamin Van Roy | Citations: 2,360 | Status: Poster

Year 1995

  • #1: A New Learning Algorithm for Blind Signal Separation
    Authors: Shun-ichi Amari, Andrzej Cichocki, Howard Hua Yang | Citations: 3,114 | Status: Poster
  • #2: Independent Component Analysis of Electroencephalographic Data
    Authors: Scott Makeig, Anthony J. Bell, Tzyy-Ping Jung et al. | Citations: 3,088 | Status: Poster
  • #3: Gaussian Processes for Regression
    Authors: Christopher K. I. Williams, Carl Edward Rasmussen | Citations: 2,176 | Status: Poster

Year 1994

  • #1: Neural Network Ensembles, Cross Validation, and Active Learning
    Authors: Anders Krogh, Jesper Vedelsby | Citations: 3,155 | Status: Poster
  • #2: A Growing Neural Gas Network Learns Topologies
    Authors: Bernd Fritzke | Citations: 2,799 | Status: Poster
  • #3: Active Learning with Statistical Models
    Authors: David A. Cohn, Zoubin Ghahramani, Michael I. Jordan | Citations: 2,672 | Status: Poster

Year 1993

  • #1: Signature Verification using a "Siamese" Time Delay Neural Network
    Authors: Jane Bromley, Isabelle Guyon, Yann LeCun et al. | Citations: 5,632 | Status: Poster
  • #2: Autoencoders, Minimum Description Length and Helmholtz Free Energy
    Authors: Geoffrey E. Hinton, Richard S. Zemel | Citations: 2,080 | Status: Poster
  • #3: Convergence of Stochastic Iterative Dynamic Programming Algorithms
    Authors: Tommi Jaakkola, Michael I. Jordan, Satinder P. Singh | Citations: 1,427 | Status: Poster

Year 1992

  • #1: Second order derivatives for network pruning: Optimal Brain Surgeon
    Authors: Babak Hassibi, David G. Stork | Citations: 2,686 | Status: Poster
  • #2: Feudal Reinforcement Learning
    Authors: Peter Dayan, Geoffrey E. Hinton | Citations: 1,090 | Status: Poster
  • #3: An Information-Theoretic Approach to Deciphering the Hippocampal Code
    Authors: William E. Skaggs, Bruce L. McNaughton, Katalin M. Gothard | Citations: 785 | Status: Poster

Year 1991

  • #1: A Simple Weight Decay Can Improve Generalization
    Authors: Anders Krogh, John A. Hertz | Citations: 2,721 | Status: Poster
  • #2: Principles of Risk Minimization for Learning Theory
    Authors: V. Vapnik | Citations: 1,601 | Status: Poster
  • #3: Practical Issues in Temporal Difference Learning
    Authors: Gerald Tesauro | Citations: 1,497 | Status: Poster

Year 1990

  • #1: Generalization by Weight-Elimination with Application to Forecasting
    Authors: Andreas S. Weigend, David E. Rumelhart, Bernardo A. Huberman | Citations: 1,052 | Status: Poster
  • #2: SEXNET: A Neural Network Identifies Sex From Human Faces
    Authors: B.A. Golomb, D.T. Lawrence, T.J. Sejnowski | Citations: 770 | Status: Poster
  • #3: Back Propagation is Sensitive to Initial Conditions
    Authors: John F. Kolen, Jordan B. Pollack | Citations: 568 | Status: Poster

Year 1989

  • #1: Handwritten Digit Recognition with a Back-Propagation Network
    Authors: Yann LeCun, Bernhard E. Boser, John S. Denker et al. | Citations: 7,148 | Status: Poster
  • #2: Optimal Brain Damage
    Authors: Yann LeCun, John S. Denker, Sara A. Solla | Citations: 6,639 | Status: Poster
  • #3: The Cascade-Correlation Learning Architecture
    Authors: Scott E. Fahlman, Christian Lebiere | Citations: 4,783 | Status: Poster

Year 1988

  • #1: ALVINN: An Autonomous Land Vehicle in a Neural Network
    Authors: Dean A. Pomerleau | Citations: 3,234 | Status: Poster
  • #2: What Size Net Gives Valid Generalization?
    Authors: Eric B. Baum, David Haussler | Citations: 2,715 | Status: Poster
  • #3: Training a 3-Node Neural Network is NP-Complete
    Authors: Avrim Blum, Ronald L. Rivest | Citations: 1,476 | Status: Poster

Year 1987

  • #1: Generalization of Back propagation to Recurrent and Higher Order Neural Networks
    Authors: Fernando J. Pineda | Citations: 1,697 | Status: Poster
  • #2: How Neural Nets Work
    Authors: Alan Lapedes, Robert Farber | Citations: 772 | Status: Poster
  • #3: Supervised Learning of Probability Distributions by Neural Networks
    Authors: Eric B. Baum, Frank Wilczek | Citations: 374 | Status: Poster

ICML - Top Cited Papers by Year

Year 2025

  • #1: Learning to (Learn at Test Time): RNNs with Expressive Hidden States
    Authors: Yu Sun, Xinhao Li, Karan Dalal et al. | Citations: 90 | Status: Spotlight [OpenReview]
  • #2: AdvPrompter: Fast Adaptive Adversarial Prompting for LLMs
    Authors: Anselm Paulus, Arman Zharmagambetov, Chuan Guo et al. | Citations: 88 | Status: Poster [OpenReview]
  • #3: rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking
    Authors: Xinyu Guan, Li Lyna Zhang, Yifei Liu et al. | Citations: 78 | Status: Oral [OpenReview]

Year 2024

  • #1: Scaling Rectified Flow Transformers for High-Resolution Image Synthesis
    Authors: Patrick Esser, Sumith Kulal, Andreas Blattmann et al. | Citations: 1,056 | Status: Oral [OpenReview]
  • #2: MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities
    Authors: Weihao Yu, Zhengyuan Yang, Linjie Li et al. | Citations: 666 | Status: Poster [OpenReview]
  • #3: Improving Factuality and Reasoning in Language Models through Multiagent Debate
    Authors: Yilun Du, Shuang Li, Antonio Torralba et al. | Citations: 620 | Status: Poster [OpenReview]

Year 2023

  • #1: BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language M...
    Authors: Junnan Li, Dongxu Li, Silvio Savarese et al. | Citations: 5,839 | Status: Poster [OpenReview]
  • #2: Robust Speech Recognition via Large-Scale Weak Supervision
    Authors: Alec Radford, Jong Wook Kim, Tao Xu et al. | Citations: 4,570 | Status: Poster [OpenReview]
  • #3: PaLM-E: An Embodied Multimodal Language Model
    Authors: Danny Driess, Fei Xia, Mehdi S. M. Sajjadi et al. | Citations: 1,902 | Status: Poster [OpenReview]

Year 2022

  • #1: BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Gen...
    Authors: Junnan Li, Dongxu Li, Caiming Xiong et al. | Citations: 5,196 | Status: Spotlight
  • #2: GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models
    Authors: Alexander Quinn Nichol, Prafulla Dhariwal, Aditya Ramesh et al. | Citations: 3,944 | Status: Spotlight
  • #3: FEDformer: Frequency Enhanced Decomposed Transformer for Long-term Series Forecasting
    Authors: Tian Zhou, Ziqing Ma, Qingsong Wen et al. | Citations: 2,128 | Status: Spotlight

Year 2021

  • #1: Learning Transferable Visual Models From Natural Language Supervision
    Authors: Alec Radford, Jong Wook Kim, Chris Hallacy et al. | Citations: 35,305 | Status: Oral
  • #2: Training data-efficient image transformers & distillation through attention
    Authors: Hugo Touvron, Matthieu Cord, Matthijs Douze et al. | Citations: 8,716 | Status: Spotlight
  • #3: Zero-Shot Text-to-Image Generation
    Authors: Aditya Ramesh, Mikhail Pavlov, Gabriel Goh et al. | Citations: 6,333 | Status: Spotlight

Year 2020

  • #1: A Simple Framework for Contrastive Learning of Visual Representations
    Authors: Ting Chen, Simon Kornblith, Mohammad Norouzi et al. | Citations: 24,175 | Status: Poster
  • #2: SCAFFOLD: Stochastic Controlled Averaging for Federated Learning
    Authors: Sai Praneeth Karimireddy, Satyen Kale, Mehryar Mohri et al. | Citations: 3,685 | Status: Poster
  • #3: PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization
    Authors: Jingqing Zhang, Yao Zhao, Mohammad Saleh et al. | Citations: 2,551 | Status: Poster

Year 2019

  • #1: EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
    Authors: Mingxing Tan, Quoc Le | Citations: 28,829 | Status: Oral
  • #2: Self-Attention Generative Adversarial Networks
    Authors: Han Zhang, Ian Goodfellow, Dimitris Metaxas et al. | Citations: 5,274 | Status: Oral
  • #3: Parameter-Efficient Transfer Learning for NLP
    Authors: Neil Houlsby, Andrei Giurgiu, Stanislaw Jastrzebski et al. | Citations: 5,271 | Status: Oral

Year 2018

  • #1: Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
    Authors: Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel et al. | Citations: 11,352 | Status: Oral
  • #2: Addressing Function Approximation Error in Actor-Critic Methods
    Authors: Scott Fujimoto, Herke van Hoof, David Meger | Citations: 7,345 | Status: Oral
  • #3: Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples
    Authors: Anish Athalye, Nicholas Carlini, David Wagner | Citations: 3,860 | Status: Oral

Year 2017

  • #1: Wasserstein Generative Adversarial Networks
    Authors: Martin Arjovsky, Soumith Chintala, Léon Bottou | Citations: 19,098 | Status: Poster
  • #2: Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
    Authors: Chelsea Finn, Pieter Abbeel, Sergey Levine | Citations: 15,661 | Status: Poster
  • #3: Neural Message Passing for Quantum Chemistry
    Authors: Justin Gilmer, Samuel S. Schoenholz, Patrick F. Riley et al. | Citations: 10,347 | Status: Poster

Year 2016

  • #1: Asynchronous Methods for Deep Reinforcement Learning
    Authors: Volodymyr Mnih, Adria Puigdomenech Badia, Mehdi Mirza et al. | Citations: 13,016 | Status: Poster
  • #2: Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning
    Authors: Yarin Gal, Zoubin Ghahramani | Citations: 12,567 | Status: Poster
  • #3: Dueling Network Architectures for Deep Reinforcement Learning
    Authors: Ziyu Wang, Tom Schaul, Matteo Hessel et al. | Citations: 5,833 | Status: Poster

Year 2015

  • #1: Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
    Authors: Sergey Ioffe, Christian Szegedy | Citations: 62,227 | Status: Poster
  • #2: Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
    Authors: Kelvin Xu, Jimmy Ba, Ryan Kiros et al. | Citations: 13,573 | Status: Poster
  • #3: Trust Region Policy Optimization
    Authors: John Schulman, Sergey Levine, Pieter Abbeel et al. | Citations: 9,809 | Status: Poster

Year 2014

  • #1: Distributed Representations of Sentences and Documents
    Authors: Quoc Le, Tomas Mikolov | Citations: 13,722 | Status: Poster
  • #2: Stochastic Backpropagation and Approximate Inference in Deep Generative Models
    Authors: Danilo Jimenez Rezende, Shakir Mohamed, Daan Wierstra | Citations: 6,349 | Status: Poster
  • #3: DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition
    Authors: Jeff Donahue, Yangqing Jia, Oriol Vinyals et al. | Citations: 6,288 | Status: Poster

Year 2013

  • #1: On the difficulty of training recurrent neural networks
    Authors: Razvan Pascanu, Tomas Mikolov, Yoshua Bengio | Citations: 8,376 | Status: Poster
  • #2: On the importance of initialization and momentum in deep learning
    Authors: Ilya Sutskever, James Martens, George Dahl et al. | Citations: 6,940 | Status: Poster
  • #3: Regularization of Neural Networks using DropConnect
    Authors: Li Wan, Matthew Zeiler, Sixin Zhang et al. | Citations: 3,524 | Status: Poster
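
The statuses in the per-year lists above can be tallied directly. The sketch below is a minimal illustration, not part of the original analysis pipeline: it hard-codes the Status field of the 30 entries shown (the three entries listed just before Year 2021, then Year 2021 through Year 2013) and counts them. The variable names are illustrative assumptions.

```python
from collections import Counter

# Status of each top-3 entry in the per-year lists above, newest first.
# Values are copied by hand from the "Status:" field of each entry.
statuses = (
    ["Spotlight"] * 3                      # three entries listed before Year 2021
    + ["Oral", "Spotlight", "Spotlight"]   # 2021
    + ["Poster"] * 3                       # 2020
    + ["Oral"] * 6                         # 2019 and 2018
    + ["Poster"] * 15                      # 2017 through 2013
)

counts = Counter(statuses)
total = sum(counts.values())

# Print each status with its share of the listed entries.
for status, n in counts.most_common():
    print(f"{status}: {n} of {total} ({n / total:.0%})")
```

Posters account for 18 of the 30 listed entries, Orals for 7, and Spotlights for 5. Every entry from 2013 through 2017 is labeled Poster, so part of this share may reflect how acceptance types were recorded in earlier years rather than a difference in impact.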

4. Combined Analysis

Comparing ICLR, NeurIPS, and ICML side by side shows broader trends in the machine learning research community. All three conferences show similar patterns in author counts and citations across acceptance types, which suggests that the findings may generalize across top-tier ML venues.

Conclusion

This analysis shows how acceptance type relates to both collaboration (author counts) and impact (citations). Oral presentations tend to receive more citations on average, but poster presentations make up the majority of accepted papers and include many highly cited works. The data suggests that acceptance type is not the sole determinant of future impact, and that multi-author collaboration is common across all acceptance types.

Note: Citation counts are from Google Scholar and are cumulative up to the time of data collection. Older papers have had more time to accumulate citations, so raw counts are not directly comparable across publication years.
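
One common way to reduce this age effect is to divide each count by the number of years since publication. The sketch below is a rough illustration under stated assumptions, not the method used in this analysis: it takes 2025 as the reference year (from the analysis date at the top of this report), uses three entries copied from the per-year lists above, and defines a hypothetical helper named citations_per_year.

```python
ANALYSIS_YEAR = 2025  # assumption: taken from the stated analysis date

# (title, publication year, citations) copied from the per-year lists above.
papers = [
    ("Batch Normalization", 2015, 62_227),
    ("Learning Transferable Visual Models From Natural Language Supervision", 2021, 35_305),
    ("EfficientNet", 2019, 28_829),
]

def citations_per_year(year, citations, analysis_year=ANALYSIS_YEAR):
    """Average citations per year since publication (hypothetical helper)."""
    age = max(analysis_year - year, 1)  # avoid dividing by zero for same-year papers
    return citations / age

# Rank by the age-normalized rate instead of the raw count.
ranked = sorted(papers, key=lambda p: citations_per_year(p[1], p[2]), reverse=True)
for title, year, cites in ranked:
    print(f"{year}  {citations_per_year(year, cites):>8.1f}/yr  {title}")
```

Under this normalization the 2021 paper ranks above the 2015 paper even though its raw count is lower. A simple per-year average still ignores how citation rates change over a paper's lifetime, so it should be read as a coarse adjustment only.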