Luca Soldaini
soldni
AI & ML interests
question answering, information retrieval, scientific document processing
soldni's activity
Add proper library name
#3 opened 16 days ago
by
osanseviero
accidentally released?
1
#1 opened 26 days ago
by
Fizzarolli
What is the total number of tokens after applying the sampling proportions: 1.7T or 1.65T?
3
#36 opened 4 months ago
by
ivanzhouyq
v1_7 update
#28 opened 5 months ago
by
kylel
Are allenai/c4 and the C4 subset in allenai/dolma the same dataset?
4
#10 opened 6 months ago
by
speiqin
Can't download two files
1
#19 opened 8 months ago
by
mrgorjan
Prompting to OLMo
2
#8 opened 8 months ago
by
herambpatil2004
Update README.md
#10 opened 11 months ago
by
Muennighoff
Add download instructions
#8 opened 11 months ago
by
Muennighoff
Fix size
#9 opened 11 months ago
by
Muennighoff
sample for analysis?
1
#1 opened about 1 year ago
by
KnutJaegersberg
Semantic Scholar API metadata for this dataset?
2
#1 opened about 1 year ago
by
MicPie
How to generate one token after the other with Scibert?
1
#4 opened over 1 year ago
by
junoriosity
Update README.md
#6 opened almost 2 years ago
by
thefuzz
updated dataset_infos for version 0.3.0
#2 opened almost 2 years ago
by
soldni
updated to dataset v 0.3 + added test split
2
#1 opened almost 2 years ago
by
soldni
Update config.json
1
#3 opened about 2 years ago
by
johngiorgi
Licensing
1
#1 opened about 2 years ago
by
Jacobw