Re-Punctuate:

Re-Punctuate is a T5 model that attempts to correct capitalization and punctuation in sentences.

Dataset:

The model was fine-tuned on the DialogSum dataset (115,056 records) for punctuation and capitalization correction.
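
The card does not describe how the fine-tuning pairs were built from DialogSum. One common approach for this kind of task, shown here purely as an assumption, is to take each well-formed sentence as the target and derive the input by lowercasing it and stripping punctuation. The helper names `strip_formatting` and `make_pair` are hypothetical, and the real preprocessing may differ (for example, in how apostrophes are handled).

```python
import string
from typing import Tuple

# Task prefix taken from the Usage snippet in this card.
PREFIX = "punctuate: "


def strip_formatting(sentence: str) -> str:
    """Lowercase a sentence and remove ASCII punctuation.

    Assumption: this mirrors the kind of input the model expects.
    Note that apostrophes are removed too, so "don't" becomes "dont".
    """
    table = str.maketrans("", "", string.punctuation)
    # split()/join collapses any doubled spaces left behind.
    return " ".join(sentence.translate(table).lower().split())


def make_pair(sentence: str) -> Tuple[str, str]:
    """Return a hypothetical (model_input, target) training pair."""
    return PREFIX + strip_formatting(sentence), sentence


source, target = make_pair("Hello, Mr. Smith. How are you today?")
print(source)
print(target)
```

Applied to the sentence in the Example section of this card, `strip_formatting` reproduces the unpunctuated input shown there.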

Usage:

from transformers import T5Tokenizer, TFT5ForConditionalGeneration

# Load the tokenizer and the TensorFlow T5 model from the Hugging Face Hub
tokenizer = T5Tokenizer.from_pretrained('SJ-Ray/Re-Punctuate')
model = TFT5ForConditionalGeneration.from_pretrained('SJ-Ray/Re-Punctuate')

# Unpunctuated, lowercase input; the "punctuate: " task prefix is required
input_text = 'the story of this brave brilliant athlete whose very being was questioned so publicly is one that still captures the imagination'
inputs = tokenizer.encode("punctuate: " + input_text, return_tensors="tf")
result = model.generate(inputs)

# Convert the generated token IDs back to text, dropping special tokens
decoded_output = tokenizer.decode(result[0], skip_special_tokens=True)
print(decoded_output)

Example:

Input: the story of this brave brilliant athlete whose very being was questioned so publicly is one that still captures the imagination
Output: The story of this brave, brilliant athlete, whose very being was questioned so publicly, is one that still captures the imagination.
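
The example above is a single sentence. For longer text, one option is to split the input into fixed-size word chunks, punctuate each chunk separately, and join the results. The sketch below is a minimal illustration of that idea, not part of the model's own API: `chunk_words`, `punctuate_long`, and the `max_words=40` default are hypothetical, and the model call is passed in as a plain function so that the chunking logic can be tried without downloading the model.

```python
from typing import Callable, List

# Task prefix taken from the Usage snippet in this card.
PREFIX = "punctuate: "


def chunk_words(text: str, max_words: int = 40) -> List[str]:
    """Split text into consecutive chunks of at most max_words words."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def punctuate_long(
    text: str,
    punctuate_fn: Callable[[str], str],
    max_words: int = 40,
) -> str:
    """Run punctuate_fn on each prefixed chunk and join the outputs.

    punctuate_fn is assumed to take the prefixed string and return the
    corrected text, e.g. a wrapper around tokenizer.encode,
    model.generate and tokenizer.decode as in the Usage section.
    """
    pieces = [punctuate_fn(PREFIX + chunk) for chunk in chunk_words(text, max_words)]
    return " ".join(pieces)


# Stand-in for the real model so the sketch runs on its own:
# it only strips the prefix and capitalizes the first letter.
def fake_punctuate(prefixed: str) -> str:
    return prefixed[len(PREFIX):].capitalize()


print(punctuate_long("one two three four five", fake_punctuate, max_words=2))
```

Fixed-size chunks can cut a sentence in half, so the model may place a sentence break at a chunk boundary where none belongs. Splitting on pauses or other natural boundaries, where they are available, would likely give better results.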

Connect on LinkedIn: Suraj Kumar
