view article Article Cosmopedia: how to create large-scale synthetic data for pre-training Large Language Models Mar 20, 2024 • 81
DolphinLabeled Datasets Collection Eric Hartford has added labels to help you filter datasets, for your pleasure. • 5 items • Updated Jan 6 • 14
AgentInstruct: Toward Generative Teaching with Agentic Flows Paper • 2407.03502 • Published Jul 3, 2024 • 50