GPT Training
by Jerry Su, posted on Feb 20, 2024
Teacher Forcing
Exposure bias
- Optimal path: one wrong step makes every subsequent step wrong.
Scheduled sampling
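- Enables error correction. At the start, the ground truth is more likely to be chosen as the target; over time, the model's predicted result is chosen as the target with increasing probability, eventually gradually making …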
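The schedule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the excerpt does not say which decay is used, so the inverse-sigmoid form here is one common choice, and the names `teacher_forcing_prob`, `choose_token`, and the constant `k` are hypothetical.

```python
import math
import random


def teacher_forcing_prob(step, k=100.0):
    """Probability of picking the ground truth at a given training step.

    Assumed inverse-sigmoid decay: close to 1.0 early, approaching 0.0 later.
    The exponent is clamped so very large steps do not overflow math.exp.
    """
    return k / (k + math.exp(min(step / k, 700.0)))


def choose_token(ground_truth, predicted, step, rng, k=100.0):
    """Pick the ground truth with probability p(step), else the model's prediction."""
    if rng.random() < teacher_forcing_prob(step, k):
        return ground_truth
    return predicted


rng = random.Random(0)
early = [choose_token("gt", "pred", step=0, rng=rng) for _ in range(1000)]
late = [choose_token("gt", "pred", step=2000, rng=rng) for _ in range(1000)]
print(early.count("gt"), late.count("gt"))
```

With these assumed settings, almost every early choice is the ground truth and almost every late choice is the model's prediction, which mirrors the gradual shift the bullet describes.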
Nucleus Sampling (Top-p Sampling)
by Jerry Su, posted on Feb 20, 2024
1. Temperature Scaling
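- To adjust the "sharpness" of the probability distribution, a temperature parameter can be introduced. At higher temperatures the distribution becomes flatter, raising the low-probability …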
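A minimal sketch of temperature scaling, assuming the usual formulation in which logits are divided by the temperature before the softmax. The function name `softmax_with_temperature` and the example logits are hypothetical and chosen only for illustration.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Softmax over logits / temperature.

    temperature > 1 flattens the distribution; temperature < 1 sharpens it.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum for numerical stability; the result is unchanged.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate tokens
sharp = softmax_with_temperature(logits, temperature=0.5)
flat = softmax_with_temperature(logits, temperature=2.0)
print([round(p, 3) for p in sharp])
print([round(p, 3) for p in flat])
```

Comparing the two printed lists shows the effect: at the higher temperature the top token's probability drops and the lowest-scoring token's probability rises.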
SELF-INSTRUCT: Aligning Language Model with Self Generated Instructions
by Jerry Su, posted on Apr 29, 2023
Self-Instruct is a framework that helps language models improve their ability to follow natural language instructions. It does this by using the model’s own generations to create a large collection of instructional data. With Self-Instruct, it is possible to improve the instruction-following capabilities of language models without relying on …