---
title: thread-gpt
app_file: app.py
sdk: gradio
sdk_version: 4.4.1
---
# ThreadGPT
Struggling to keep up with the latest AI research papers? ThreadGPT is here to help. It seamlessly transforms complex academic papers into concise, easy-to-understand threads. Not only does it summarize the text, it also includes relevant figures, tables, and visuals from the papers directly in the threads. 🧵✨
*Gradio App UI*

*Examples of threads generated by ThreadGPT (@paper_threadoor)*
## 🛠️ Installation
### Clone the repo

```bash
git clone https://github.com/wiskojo/thread-gpt
```
### Install dependencies

```bash
# Install PyTorch and torchvision
# Refer to the official PyTorch website (https://pytorch.org) for the installation
# command that matches your system. Example:
pip install torch==2.0.0 torchvision==0.15.1

# Install all other dependencies
pip install -r requirements.txt
```
### Configure environment variables

Copy the `.env.template` file and fill in your `OPENAI_API_KEY`.

```bash
cp .env.template .env
```
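If you want to sanity-check that the key is picked up before making any paid API calls, a minimal check like the sketch below works, assuming the `python-dotenv` package is available in your environment (install it with `pip install python-dotenv` if it is not); the file name `check_env.py` is just a hypothetical example:

```python
# check_env.py - hypothetical helper, not part of this repo
import os

from dotenv import load_dotenv  # assumes python-dotenv is installed

load_dotenv()  # reads the .env file from the current working directory

if not os.getenv("OPENAI_API_KEY"):
    raise SystemExit("OPENAI_API_KEY is not set - check your .env file")
print("OPENAI_API_KEY loaded")
```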
## 🚀 Getting Started
Before proceeding, please ensure that all the installation steps have been successfully completed.
### 🚨 Cost Warning

Please be aware that using GPT-4 with the Assistants API can incur high costs. Make sure to monitor your usage and understand the pricing details provided by OpenAI before proceeding.
### Gradio

```bash
python app.py
```
### CLI

#### 🧵 Create Thread
To create a thread, provide either a URL to a PDF or a path to a local file. Use the following commands:
```bash
# For a URL
python thread.py <URL_TO_PDF>

# For a local file
python thread.py <LOCAL_PATH_TO_PDF>
```
By default, you will find all outputs under `./data/<PDF_NAME>`. It will have the following structure:
```
./data/<PDF_NAME>/
├── figures/
│   ├── <figure_1_name>.jpg
│   ├── <figure_2_name>.png
│   └── ...
├── <PDF_NAME>.pdf
├── results.json
├── thread.json
├── processed_thread.json
└── processed_thread.md
```
The final output for user consumption is located at `./data/<PDF_NAME>/processed_thread.md`. This file is formatted in Markdown and can be conveniently viewed using any Markdown editor.
**All Contents**

- `figures/`: This directory contains all the figures, tables, and visuals that have been extracted from the paper.
- `<PDF_NAME>.pdf`: This is the original PDF file.
- `results.json`: This file contains the results of the layout parsing. It includes an index of all figures, their paths, and captions that were passed to OpenAI.
- `thread.json`: This file contains the raw thread that was generated by OpenAI before any post-processing was done.
- `processed_thread.json`: This file is a post-processed version of `thread.json`. The post-processing includes steps such as removing source annotations and duplicate figures.
- `processed_thread.md`: This is a Markdown version of `processed_thread.json`. It is the final output provided for user consumption.
#### 📨 Share Thread
To actually share the thread on X/Twitter, you need to set up the credentials in the `.env` file. This requires creating a developer account and filling in your `CONSUMER_KEY`, `CONSUMER_SECRET`, `ACCESS_KEY`, and `ACCESS_SECRET`. Then run this command on the created JSON file:
```bash
python tweet.py ./data/<PDF_NAME>/processed_thread.json
```
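For reference, the sketch below shows one way those four credentials could be used to post a short thread with the `tweepy` library. It is illustrative only, not the repo's actual `tweet.py`, and the tweet texts are placeholders since the structure of `processed_thread.json` is not documented here:

```python
# post_thread_sketch.py - illustrative only, not the repo's tweet.py
import os

import tweepy
from dotenv import load_dotenv

load_dotenv()

# OAuth 1.0a user-context client built from the .env credentials
client = tweepy.Client(
    consumer_key=os.environ["CONSUMER_KEY"],
    consumer_secret=os.environ["CONSUMER_SECRET"],
    access_token=os.environ["ACCESS_KEY"],
    access_token_secret=os.environ["ACCESS_SECRET"],
)

# Post the first tweet, then chain each subsequent tweet as a reply
tweets = ["1/ Placeholder first tweet", "2/ Placeholder follow-up tweet"]
previous_id = None
for text in tweets:
    response = client.create_tweet(text=text, in_reply_to_tweet_id=previous_id)
    previous_id = response.data["id"]
```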
## 🔧 Customize Assistant

ThreadGPT utilizes OpenAI's Assistants API. To customize the assistant's behavior, modify the `create_assistant.py` file. This script has defaults for the prompt, name, tools, and model (`gpt-4-1106-preview`). You can customize these parameters to your liking.
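For orientation, the sketch below shows the general shape of an assistant-creation call with the openai v1 Python SDK. The name, instructions, and tool choice here are placeholders and assumptions, not the repo's defaults, which live in `create_assistant.py` (note that newer API versions replace the `retrieval` tool type with `file_search`):

```python
# create_assistant_sketch.py - illustrative only; see create_assistant.py for the real defaults
import os

from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

assistant = client.beta.assistants.create(
    name="Paper Threadoor",  # placeholder name
    instructions="Summarize the attached paper as a concise thread.",  # placeholder prompt
    tools=[{"type": "retrieval"}],  # assumed tool; lets the assistant read uploaded files
    model="gpt-4-1106-preview",
)
print(assistant.id)
```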