Pandalyst: A large language model for mastering data analysis using pandas

🐱 GitHub Repo

What is Pandalyst

  • Pandalyst is a general large language model specifically trained to process and analyze data using the pandas library.

How is Pandalyst

  • Pandalyst generalizes well across data tables from different fields and across different data analysis needs.

Why is Pandalyst

  • Pandalyst is open source and free to use, and its small parameter size (7B/13B) makes it easy to deploy on a local PC.
  • Pandalyst can handle complex data tables (multiple columns and multiple rows), so we can provide enough context to describe a table in detail.
  • Pandalyst is highly competitive: it significantly outperforms models of the same size and even outperforms some of the strongest closed-source models (see News for results on PandaTest_V1.0).
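
The "enough context" mentioned in the second point can be pictured with a small sketch. The helper `describe_table` below is hypothetical and not part of Pandalyst; it simply assumes that a table description consists of the table shape plus each column's name, dtype, and a few sample values. The prompt format Pandalyst actually expects is documented in the GitHub repo, not here.

```python
import pandas as pd


def describe_table(df: pd.DataFrame, n_samples: int = 3) -> str:
    """Build a plain-text description of a DataFrame (hypothetical helper).

    Assumption: a useful description lists the table shape plus each
    column's name, dtype, and a few distinct sample values. This is an
    illustration only, not Pandalyst's official prompt format.
    """
    lines = [f"Table with {df.shape[0]} rows and {df.shape[1]} columns:"]
    for name in df.columns:
        # Take up to n_samples distinct non-null values as examples.
        samples = df[name].dropna().drop_duplicates().head(n_samples).tolist()
        lines.append(f"- {name} ({df[name].dtype}): e.g. {samples}")
    return "\n".join(lines)


# Toy table used only for demonstration.
df = pd.DataFrame(
    {
        "city": ["Paris", "Lima", "Oslo", "Lima"],
        "population_m": [2.1, 9.7, 0.7, 9.7],
    }
)
table_context = describe_table(df)
print(table_context)
```

A description like this could be placed ahead of the analysis question in a prompt, so the model sees the column names and value types before it writes pandas code.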

News

  • 🔥[2023/10/15] Pandalyst can now plot 📈 and is much more powerful! We released Pandalyst-7B-V1.2, which was trained on CodeLlama-7b-Python and surpasses ChatGPT-3.5 (2023/06/13), Pandalyst-7B-V1.1, and WizardCoder-Python-13B-V1.0 on our PandaTest_V1.0.
  • 🤖️[2023/09/30] We released Pandalyst-7B-V1.1, which was trained on CodeLlama-7b-Python and achieves 76.1 exec@1 on our PandaTest_V1.0, surpassing WizardCoder-Python-13B-V1.0 and ChatGPT-3.5 (2023/06/13).
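
The exec@1 figure above is not defined in this card. The sketch below assumes it means the percentage of test questions for which the model's single generated program runs without error and produces the expected answer. The function `exec_at_1`, the convention that generated code stores its answer in `result`, and the toy cases are all hypothetical; they are not PandaTest_V1.0 itself.

```python
import pandas as pd


def exec_at_1(cases: list[dict]) -> float:
    """Percentage of cases whose single generated program runs and
    returns the expected answer (assumed definition of exec@1)."""
    passed = 0
    for case in cases:
        namespace = {"pd": pd, "df": case["df"]}
        try:
            # Run the generated code; by convention in this sketch it
            # must store its answer in a variable named `result`.
            exec(case["code"], namespace)
            if namespace.get("result") == case["expected"]:
                passed += 1
        except Exception:
            # Any runtime error counts as a failure.
            pass
    return 100.0 * passed / len(cases)


# Toy cases standing in for model outputs (hypothetical).
df = pd.DataFrame({"item": ["a", "b", "c"], "price": [3, 5, 4]})
cases = [
    {"df": df, "code": "result = int(df['price'].max())", "expected": 5},
    # This program references a missing column and raises KeyError.
    {"df": df, "code": "result = df['cost'].sum()", "expected": 12},
]
score = exec_at_1(cases)
print(f"exec@1 = {score:.1f}")
```

Running model-generated code with `exec` is unsafe outside a sandbox; a real evaluation harness would isolate each program in a separate process or container.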

Model | Checkpoint | Support plot | License
🔥Pandalyst-7B-V1.2 | 🤗 HF Link | ✅ | Llama2
Pandalyst-7B-V1.1 | 🤗 HF Link | ❌ | Llama2

Usage and Human evaluation

Please refer to the GitHub repo.
