matlok's Collections
Papers - Text - Research
An Interdisciplinary Comparison of Sequence Modeling Methods for Next-Element Prediction
Paper • 1811.00062 • Published • 2
mT5: A massively multilingual pre-trained text-to-text transformer
Paper • 2010.11934 • Published • 4
Bootstrap Your Own Skills: Learning to Solve New Tasks with Large Language Model Guidance
Paper • 2310.10021 • Published • 2
Gemma: Open Models Based on Gemini Research and Technology
Paper • 2403.08295 • Published • 47
Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer
Paper • 2305.16380 • Published • 4
Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning
Paper • 2310.20587 • Published • 16
Structural Similarities Between Language Models and Neural Response Measurements
Paper • 2306.01930 • Published • 2
Contrastive Decoding Improves Reasoning in Large Language Models
Paper • 2309.09117 • Published • 37
A Thorough Examination of Decoding Methods in the Era of LLMs
Paper • 2402.06925 • Published • 1
In-context Vectors: Making In Context Learning More Effective and Controllable Through Latent Space Steering
Paper • 2311.06668 • Published • 5