To read next:
- Becoming self-instruct: introducing early stopping criteria for minimal instruct tuning
- Benchmarking Large Language Model Capabilities for Conditional Generation
- Generating Benchmarks for Factuality Evaluation of Language Models
- Copy Is All You Need
- TableGPT: Towards Unifying Tables, Nature Language and Commands into One GPT
- AlpaGasus: Training A Better Alpaca with Fewer Data
- ARB: Advanced Reasoning Benchmark for Large Language Models
- Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
- Skeleton-of-Thought: Large Language Models Can Do Parallel Decoding
- Skills-in-Context Prompting: Unlocking Compositionality in Large Language Models
- SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning
- Don’t Generate, Discriminate: A Proposal for Grounding Language Models to Real-World Environments
- Tree of Thoughts: Deliberate Problem Solving with Large Language Models
- LLM+P: Empowering Large Language Models with Optimal Planning Proficiency
- Language Models can Solve Computer Tasks
- Memory Augmented Large Language Models are Computationally Universal
https://docs.google.com/spreadsheets/d/1kT4or6b0Fedd-W_jMwYpb63e1ZR3aePczz3zlbJW-Y4
Compression for AGI - Jack Rae https://www.youtube.com/watch?v=dO4TPJkeaaU
- the architecture does not adapt in any way to the information content of its input
- minimum description length (MDL) / Hutter Prize / bellard.org/nncp / Shannon introduced the language model in 1948 (see the sketch below)
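To make the compression connection in these notes concrete: under a model p, an arithmetic coder needs roughly -Σ log2 p(x_i | x_<i) bits to encode a sequence, so a lower language-modeling loss directly means a shorter description length. The toy bigram model below is only an illustrative sketch of that identity, not anything from the talk itself.

```python
# Sketch: language modeling as compression. The ideal code length of a text under a
# model is -sum_i log2 p(x_i | x_<i) bits, so better next-token prediction = shorter codes.
# The bigram model here is purely illustrative (an assumption, not Jack Rae's setup).

import math
from collections import Counter, defaultdict

def train_bigram(text):
    """Estimate p(next_char | char) with add-one smoothing over the observed vocabulary."""
    counts = defaultdict(Counter)
    for a, b in zip(text, text[1:]):
        counts[a][b] += 1
    vocab = sorted(set(text))
    probs = {}
    for a in vocab:
        total = sum(counts[a].values()) + len(vocab)  # add-one smoothing
        probs[a] = {b: (counts[a][b] + 1) / total for b in vocab}
    return probs

def description_length_bits(text, probs):
    """Ideal code length of `text` under the model: -sum log2 p(x_i | x_{i-1})."""
    return sum(-math.log2(probs[a][b]) for a, b in zip(text, text[1:]))

if __name__ == "__main__":
    data = "the quick brown fox jumps over the lazy dog " * 50
    probs = train_bigram(data)
    bits = description_length_bits(data, probs)
    raw_bits = 8 * (len(data) - 1)  # naive 1 byte per character
    print(f"model code length: {bits:.0f} bits ({bits / (len(data) - 1):.2f} bits/char)")
    print(f"raw encoding:      {raw_bits} bits (8.00 bits/char)")
```

A stronger predictive model would drive the bits/char figure down further, which is exactly the MDL / Hutter Prize framing of progress in language modeling.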
https://bigscience.huggingface.co/
Evaluation harness: https://github.com/EleutherAI/lm-evaluation-harness
Generation evaluation API (Critique: A Simple Evaluation API for Text): https://docs.inspiredco.ai/critique/
Hugging Face blog posts:
- https://huggingface.co/blog/red-teaming
- https://huggingface.co/blog/dialog-agents
- https://huggingface.co/blog/rlhf
Blog by Jan Leike (OpenAI alignment lead): https://aligned.substack.com/p/ai-assisted-human-feedback
[The flip side of emergent abilities, and the limitations of RLHF] Jacob Steinhardt - Aligning ML Systems with Human Intent https://www.youtube.com/watch?v=uPH1xIiGZ4o
Emergent Deception and Emergent Optimization https://bounded-regret.ghost.io/emergent-deception-optimization/
https://jingfengyang.github.io/gpt
https://bounded-regret.ghost.io/more-is-different-for-ai/
CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis https://arxiv.org/abs/2203.13474
Surface Form Competition: Why the Highest Probability Answer Isn’t Always Right https://arxiv.org/pdf/2104.08315.pdf
Self-critiquing models for assisting human evaluators https://arxiv.org/pdf/2206.05802.pdf
How Many Data Points is a Prompt Worth? https://arxiv.org/pdf/2103.08493.pdf
Yao Fu's three-part article series:
- https://yaofu.notion.site/e1cd16d1fae84f87aeddf872c838e07c
- https://yaofu.notion.site/How-does-GPT-Obtain-its-Ability-Tracing-Emergent-Abilities-of-Language-Models-to-their-Sources-b9a57ac0fcf74f30a1ab9e3e36fa1dc1
- https://yaofu.notion.site/GPT-3-5-360081d91ec245f29029d37b54573756
Yoav Goldberg's remarks: https://gist.github.com/yoavg/59d174608e92e845c8994ac2e234c8a9 and https://gist.github.com/yoavg/001cca8ab6de3f20650192da17117292
Talking About Large Language Models https://arxiv.org/pdf/2212.03551.pdf
Teaching Models to Express Their Uncertainty in Words https://openreview.net/forum?id=8s8K2UZGTZ
Unified-IO: A Unified Model for Vision, Language, and Multi-Modal Tasks https://openreview.net/forum?id=Gn49ByHQG9
Hierarchical Transformers Are More Efficient Language Models https://openreview.net/forum?id=7gBvhddzs2L
Large-Scale Study of Curiosity-Driven Learning https://arxiv.org/abs/1808.04355
Quantifying Uncertainty in Foundation Models via Ensembles https://openreview.net/forum?id=LpBlkATV24M
Decision Transformer: Reinforcement Learning via Sequence Modeling https://openreview.net/forum?id=gaCGNwsWITG
GPT-NeoX-20B: An Open-Source Autoregressive Language Model https://openreview.net/forum?id=HL7IhzS8W5
Analysis and Mitigation of Dataset Artifacts in OpenAI GPT-3 https://openreview.net/forum?id=uUZBsl5U4up
Regularizing Black-box Models for Improved Interpretability https://openreview.net/forum?id=S1xCuTNYDr
Pre-Trained Language Models for Interactive Decision-Making https://openreview.net/forum?id=FWMQYjFso-a
Exploring Length Generalization in Large Language Models https://openreview.net/forum?id=zSkYVeX7bC4
Solving Quantitative Reasoning Problems with Language Models https://openreview.net/forum?id=IFXTZERXdM7
Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small https://openreview.net/forum?id=rvi3Wa768B-
Are Neurons Actually Collapsed? On the Fine-Grained Structure in Neural Representations https://openreview.net/forum?id=nntzzF9KolL
Characteristics of Harmful Text: Towards Rigorous Benchmarking of Language Models https://openreview.net/forum?id=u46CbCaLufp
Red Teaming Language Models with Language Models https://openreview.net/forum?id=S5ZMSUX_gq
Language Models (Mostly) Know What They Know https://arxiv.org/pdf/2207.05221.pdf
Emergent Abilities of Large Language Models https://arxiv.org/abs/2206.07682
Reference articles:
Memorizing Transformers https://arxiv.org/abs/2203.08913
Surface Form Competition: Why the Highest Probability Answer Isn’t Always Right https://aclanthology.org/2021.emnlp-main.564/
Frequency Effects on Syntactic Rule Learning in Transformers https://aclanthology.org/2021.emnlp-main.72/
A Mathematical Exploration of Why Language Models Help Solve Downstream Tasks https://arxiv.org/abs/2010.03648
Improving language models by retrieving from trillions of tokens https://arxiv.org/abs/2112.04426
Semi-Supervised Learning for Natural Language https://www-cs.stanford.edu/~pliang/papers/meng-thesis.pdf
Phase transitions in artificial intelligence systems https://www.sciencedirect.com/science/article/abs/pii/0004370287900336