---
extra_gated_prompt: |-
  By accessing TabPFN, you agree to:
  1. Not use the model in ways that could harm individuals or communities
  2. Comply with all applicable laws and regulations
  3. Properly cite the model and its creators in any resulting publications
  4. Report any discovered vulnerabilities or safety concerns to Prior Labs
extra_gated_fields:
  Organization:
    type: text
    required: true
    description: Company or institution you represent
  Role:
    type: text
    required: true
    description: Your role in the organization
  Country:
    type: country
    required: true
    description: Country where you or your organization is based
  Intended Use:
    type: select
    required: true
    options:
      - Academic Research
      - Education/Teaching
      - Commercial Evaluation
      - Non-profit Use
      - Personal Learning
      - label: Other
        value: other
    description: Primary intended use of TabPFN
  Industry:
    type: select
    required: true
    options:
      - Healthcare/Life Sciences
      - Financial Services
      - Technology
      - Education
      - Manufacturing
      - Research Institution
      - label: Other
        value: other
    description: Your industry sector
  Dataset Size:
    type: select
    required: true
    options:
      - <1000 rows
      - 1000-10000 rows
      - 10000-100000 rows
      - '>100000 rows'
    description: Typical size of datasets you plan to use
  License Agreement:
    type: checkbox
    required: true
    label: >-
      I agree to the terms of the non-commercial license for research and
      evaluation
  Contact Permission:
    type: checkbox
    required: false
    label: Prior Labs may contact me about my use case and provide support (optional)
pipeline_tag: tabular-classification
---
# Model Card for TabPFN-v2
TabPFN (a prior-data fitted network) is a transformer-based foundation model for tabular data. It relies on prior-data based learning to achieve strong performance on small tabular datasets without task-specific training.
## Model Details
### Model Description
TabPFN is a novel approach to tabular data modeling that uses transformer architectures combined with prior knowledge injection to create a foundation model specifically designed for tabular data tasks.
- **Developed by:** Prior Labs
- **Model type:** Transformer-based foundation model for tabular data
- **Language(s):** Python
- **License:** Dual licensing; open source for research and non-commercial use
- **Finetuned from model:** Not applicable (custom architecture, trained from scratch)
### Model Sources
- **Repository:** https://github.com/priorlabs/tabpfn
- **Paper:** [More Information Needed]
- **Demo:** Available via API access
## Uses
### Direct Use
TabPFN can be directly used for:
- Classification tasks on small to medium-sized tabular datasets
- Automated machine learning workflows
- Quick prototyping and baseline model creation
- Transfer learning applications for tabular data
### Downstream Use
The model can be used as:
- A feature extractor for downstream tasks
- A foundation for transfer learning on domain-specific tabular data
- A component in automated ML pipelines
- A baseline model for benchmarking
### Out-of-Scope Use
The model is not designed for:
- Very large datasets (currently optimized for smaller datasets)
- Non-tabular data formats
- Time series forecasting
- Direct regression tasks
## Bias, Risks, and Limitations
- Performance may vary based on dataset size and characteristics
- Model behavior heavily depends on the quality and representativeness of training data
- May not perform optimally on highly imbalanced datasets
- Resource intensive for very large datasets
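The limitations above can be checked before fitting. The following is a minimal sketch in plain Python, not part of the TabPFN package: `preflight_report` is a hypothetical helper, and the `max_rows` and `min_minority_share` thresholds are illustrative assumptions rather than documented limits.

```python
from collections import Counter

def preflight_report(labels, max_rows=10_000, min_minority_share=0.1):
    """Flag datasets that may run into the limitations listed above.

    max_rows and min_minority_share are illustrative assumptions,
    not limits documented by TabPFN.
    """
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    n_rows = len(labels)
    minority_share = min(counts.values()) / n_rows
    warnings = []
    if n_rows > max_rows:
        warnings.append(f"{n_rows} rows exceeds the assumed limit of {max_rows}")
    if minority_share < min_minority_share:
        warnings.append(f"minority class share is only {minority_share:.1%}")
    return {
        "class_counts": dict(counts),
        "minority_share": minority_share,
        "warnings": warnings,
    }

# Example: 100 rows with a 95/5 class split
labels = ["a"] * 95 + ["b"] * 5
report = preflight_report(labels)
for message in report["warnings"]:
    print("warning:", message)
```

A report with an empty `warnings` list does not guarantee good performance; it only means that neither assumed threshold was crossed.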
### Recommendations
- Use on datasets with clear structure and well-defined features
- Validate model outputs especially for sensitive applications
- Consider dataset size limitations when applying the model
- Monitor performance across different subgroups in the data
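Monitoring performance across subgroups can be as simple as computing a metric per group. The sketch below uses plain Python and a hypothetical helper, `accuracy_by_group`; the labels, predictions, and group values are made-up illustration data.

```python
from collections import defaultdict

def accuracy_by_group(y_true, y_pred, groups):
    """Return {group: accuracy} for each distinct value in groups."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {group: correct[group] / total[group] for group in total}

# Made-up labels, predictions, and a subgroup attribute per row
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 1]
groups = ["A", "A", "A", "B", "B", "B"]

per_group = accuracy_by_group(y_true, y_pred, groups)
for group, acc in sorted(per_group.items()):
    print(f"group {group}: accuracy {acc:.2f}")
```

A large gap between groups is a signal to inspect the data for that subgroup before relying on the model's predictions there.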
## How to Get Started with the Model
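```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from tabpfn import TabPFNClassifier
# Load a small example dataset and hold out a test split
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
# Initialize model
classifier = TabPFNClassifier()
# Fit and predict
classifier.fit(X_train, y_train)
predictions = classifier.predict(X_test)
```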
## Training Details
### Training Data
[More Information Needed]
### Training Procedure
#### Training Hyperparameters
- **Training regime:** Mixed precision training
## Evaluation
### Testing Data, Factors & Metrics
#### Metrics
- Classification accuracy
- F1 score
- ROC-AUC
- Precision-Recall curves
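For reference, the first three metrics can be computed for a binary task as sketched below. This is a plain-Python illustration with hypothetical helper names and made-up scores, not the evaluation code used for TabPFN; in practice `sklearn.metrics` (`accuracy_score`, `f1_score`, `roc_auc_score`) is the usual choice. Precision-recall curves are not covered here.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_binary(y_true, y_pred, positive=1):
    """F1 score for the positive class: 2*TP / (2*TP + FP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def roc_auc_binary(y_true, scores, positive=1):
    """ROC-AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half. Quadratic in the number of samples, so this is
    only suitable for small illustrations.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up true labels and predicted probabilities for the positive class
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
y_pred = [int(s >= 0.5) for s in scores]  # threshold at 0.5

acc = accuracy(y_true, y_pred)
f1 = f1_binary(y_true, y_pred)
auc = roc_auc_binary(y_true, scores)
print(f"accuracy={acc:.2f} f1={f1:.2f} roc_auc={auc:.2f}")
```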
### Results
[More Information Needed]
## Environmental Impact
- **Hardware Type:** [More Information Needed]
- **Hours used:** [More Information Needed]
- **Cloud Provider:** [More Information Needed]
- **Compute Region:** [More Information Needed]
- **Carbon Emitted:** [More Information Needed]
## Technical Specifications
### Model Architecture and Objective
TabPFN uses a transformer-based architecture specifically designed for tabular data processing, with modifications to handle varying input sizes and feature types.
### Compute Infrastructure
#### Hardware
Recommended minimum specifications:
- CPU: Modern multi-core processor
- RAM: 16GB+
- GPU: Optional, CPU inference supported
#### Software
- Python 3.7+
- Key dependencies: PyTorch, NumPy, Pandas
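The software requirements above can be verified with a short standard-library check. This is a minimal sketch: `check_environment` is a hypothetical helper, and the import names `torch`, `numpy`, and `pandas` are assumed to correspond to the listed dependencies.

```python
import sys
from importlib.util import find_spec

MIN_PYTHON = (3, 7)  # minimum version listed above
REQUIRED_MODULES = ["torch", "numpy", "pandas"]  # assumed import names

def check_environment():
    """Report whether the interpreter and key dependencies are available."""
    return {
        "python_ok": sys.version_info >= MIN_PYTHON,
        # find_spec returns None when a top-level module cannot be located
        "missing": [name for name in REQUIRED_MODULES if find_spec(name) is None],
    }

status = check_environment()
print("Python version OK:", status["python_ok"])
print("Missing modules:", status["missing"] or "none")
```

The check only confirms that the modules can be located; it does not import them or validate their versions.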
## Model Card Contact
For more information, contact Prior Labs.