Fine-Tuning Flux.1 Using AI Toolkit

#196
by exnrt - opened

Prerequisites:

  • Hardware: An NVIDIA GPU with ample VRAM, such as an A100 or RTX 4090. The train_lora_flux_24gb configuration used in step 3 is named for 24 GB cards.
  • Software:
    • Python 3.10 or later
    • PyTorch 2.0 or later
    • Hugging Face Transformers
    • AI Toolkit (available on GitHub)
  • Dataset: A collection of images and corresponding text prompts that align with your desired fine-tuning goals.
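
Before installing anything, it can help to confirm what the machine actually provides. The following is a minimal sketch, not part of AI Toolkit: the helper names (python_at_least, bytes_to_gib, describe_gpu) are made up for illustration, and the GPU query assumes PyTorch is already installed and returns None otherwise.

```python
import sys


def python_at_least(major, minor, version=None):
    """Return True if the interpreter version is at least major.minor."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= (major, minor)


def bytes_to_gib(num_bytes):
    """Convert a byte count to GiB (1 GiB = 1024**3 bytes)."""
    return num_bytes / (1024 ** 3)


def describe_gpu():
    """Return (name, total_vram_gib) for GPU 0, or None if no GPU is usable."""
    try:
        import torch  # imported lazily so the script runs without PyTorch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    props = torch.cuda.get_device_properties(0)
    return props.name, bytes_to_gib(props.total_memory)


if __name__ == "__main__":
    print("Python:", ".".join(map(str, sys.version_info[:3])))
    print("GPU:", describe_gpu() or "no CUDA device visible to PyTorch")
```

Note that a card sold as "24 GB" usually reports slightly less than 24 GiB, so treat the printed number as a rough guide rather than a pass/fail threshold.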

For Details Check:

A Detailed Guide on Fine-tune FLUX.1 using AI Toolkit

Steps:

  1. Set Up the Environment:

    • Clone the AI Toolkit repository from GitHub:
      git clone https://github.com/ostris/ai-toolkit
      
    • Install required dependencies:
      cd ai-toolkit
      pip install -r requirements.txt
      
  2. Prepare the Dataset:

    • Organize your images and captions in a single directory. AI Toolkit reads each caption from a .txt file stored next to its image.
    • Ensure that each caption file has the same base name as its image (for example, image2.jpg and image2.txt).
    • Consider using a data augmentation technique to increase the diversity of your training data.
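
A quick way to check the filename matching described above is to scan the dataset directory for images that lack a caption. This is a minimal sketch under stated assumptions: captions live in a .txt file sharing the image's base name, and both the function name find_caption_problems and the list of image extensions are chosen here for illustration.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match your dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}


def find_caption_problems(dataset_dir, caption_ext=".txt"):
    """Return (missing, empty): image names without a caption file,
    and image names whose caption file contains only whitespace."""
    missing, empty = [], []
    for image in sorted(Path(dataset_dir).iterdir()):
        if image.suffix.lower() not in IMAGE_EXTENSIONS:
            continue  # skip captions and any other non-image files
        caption = image.with_suffix(caption_ext)
        if not caption.is_file():
            missing.append(image.name)
        elif not caption.read_text(encoding="utf-8").strip():
            empty.append(image.name)
    return missing, empty


if __name__ == "__main__":
    missing, empty = find_caption_problems("path/to/dataset")  # placeholder path
    print("Images without a caption:", missing)
    print("Images with an empty caption:", empty)
```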
  3. Configure the Training Script:

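    • AI Toolkit is configured through a YAML file rather than a Python script. Copy the example config config/examples/train_lora_flux_24gb.yaml into the config folder and open the copy in your preferred text editor.
    • Modify the following parameters (the names below are descriptive; check the example config for the exact YAML keys):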
      • model_path: Path to the pre-trained Flux.1 model.
      • data_path: Path to your dataset.
      • output_dir: Directory where the fine-tuned model will be saved.
      • train_batch_size: Batch size for training.
      • eval_batch_size: Batch size for evaluation.
      • num_epochs: Number of training epochs.
      • learning_rate: Learning rate.
      • lora_rank: Rank of the LoRA layers.
      • lora_alpha: Scaling factor for the LoRA layers.
      • save_every: Frequency of saving checkpoints.
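
Several of these parameters interact, and a little arithmetic shows how. The sketch below uses the descriptive names from the list above and rests on three assumptions: one optimizer step per batch with no gradient accumulation, save_every counted in steps, and the conventional LoRA scaling of lora_alpha / lora_rank. The function names are illustrative only.

```python
import math


def total_steps(num_images, train_batch_size, num_epochs):
    """Optimizer steps for the run, assuming one step per batch."""
    steps_per_epoch = math.ceil(num_images / train_batch_size)
    return steps_per_epoch * num_epochs


def lora_scale(lora_alpha, lora_rank):
    """Factor applied to the LoRA update under the usual alpha / rank rule."""
    return lora_alpha / lora_rank


def checkpoint_steps(steps, save_every):
    """Step numbers at which a checkpoint would be written."""
    return list(range(save_every, steps + 1, save_every))


if __name__ == "__main__":
    steps = total_steps(num_images=15, train_batch_size=1, num_epochs=100)
    print("total steps:", steps)
    print("LoRA scale:", lora_scale(lora_alpha=16, lora_rank=16))
    print("checkpoints at:", checkpoint_steps(steps, save_every=250))
```

With lora_alpha equal to lora_rank the scale is 1.0; raising the rank without raising alpha shrinks the effective update, which is worth keeping in mind when comparing runs.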
  4. Start the Training:

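    • Run AI Toolkit's run.py with the path to your config file (here assumed to be a copy of the example config saved in the config folder):
      python run.py config/train_lora_flux_24gb.yaml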
      
    • The training process may take several hours or even days, depending on the size of your dataset and the hardware you're using.
  5. Evaluate the Fine-Tuned Model:

    • Generate images using the fine-tuned model and compare the results to the original model.
    • Assess the quality of the generated images based on your specific requirements.
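
One way to make the comparison fair is to render the same prompt with the same seed before and after applying the LoRA. The sketch below uses the Hugging Face diffusers library, which is an assumption: it is not mentioned in the steps above, and AI Toolkit has its own sampling during training. The base model ID, the placeholder paths, and the defaults of 20 steps with guidance scale 4 are also assumptions that you should adjust to your setup.

```python
from pathlib import Path


def compare(prompt, lora_file, seed=42, steps=20, guidance_scale=4.0,
            base_model="black-forest-labs/FLUX.1-dev"):
    """Return (before, after) images for one prompt, without and with the LoRA."""
    # Heavy imports are kept inside the function so this file can be
    # imported on a machine without a GPU.
    import torch
    from diffusers import FluxPipeline

    pipe = FluxPipeline.from_pretrained(base_model, torch_dtype=torch.bfloat16)
    pipe.enable_model_cpu_offload()  # trades speed for lower VRAM use

    def render():
        # A fresh generator with the same seed keeps both runs comparable.
        generator = torch.Generator("cpu").manual_seed(seed)
        result = pipe(prompt, num_inference_steps=steps,
                      guidance_scale=guidance_scale, generator=generator)
        return result.images[0]

    before = render()
    lora_file = Path(lora_file)
    pipe.load_lora_weights(str(lora_file.parent), weight_name=lora_file.name)
    after = render()
    return before, after


if __name__ == "__main__":
    # Placeholder prompt and path; point lora_file at your saved .safetensors file.
    before, after = compare("a portrait photo of a person",
                            "path/to/output/my_lora.safetensors")
    before.save("base.png")
    after.save("finetuned.png")
```

Loading the LoRA on top of the base pipeline this way means no retraining is needed to generate new images once the checkpoint exists.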

Additional Tips:

  • Experiment with different hyperparameters: Adjust the learning rate, batch size, and other parameters to optimize the training process.
  • Consider using a learning rate scheduler: Warming up and then decaying the learning rate can stabilize training and improve convergence.
  • Monitor the training progress: Keep an eye on the loss function and evaluation metrics to ensure that the model is learning effectively.
  • Share your fine-tuned model: Contribute to the community by sharing your trained model with others.
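
To make the learning rate scheduler tip concrete, here is a minimal sketch of one common schedule: linear warmup followed by cosine decay. It is written in plain Python for illustration and is not AI Toolkit's scheduler; the function name lr_at_step and the defaults (base rate 1e-4, 100 warmup steps) are assumptions.

```python
import math


def lr_at_step(step, total_steps, base_lr=1e-4, warmup_steps=100, min_lr=0.0):
    """Learning rate for a zero-based step: linear warmup, then cosine decay."""
    if step < warmup_steps:
        # Ramp up linearly, reaching base_lr on the last warmup step.
        return base_lr * (step + 1) / warmup_steps
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (base_lr - min_lr) * cosine


if __name__ == "__main__":
    for step in (0, 50, 100, 550, 1000):
        print(step, lr_at_step(step, total_steps=1000))
```

Plotting or printing the schedule next to the training loss makes it easier to tell whether a loss plateau comes from the model or simply from the learning rate having decayed.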

By following these steps and incorporating the additional tips, you can effectively fine-tune Flux.1 using the AI Toolkit to achieve your desired image generation goals.

What is the recommended dataset size for fine-tuning Flux.1 effectively?

The recommended dataset size for fine-tuning Flux.1 can vary based on the complexity and goals of the project. However, starting with a small, high-quality dataset of 10 to 15 well-prepared images is often enough for simpler tasks like personalized image generation. This dataset should contain diverse images (varied angles, lighting, and expressions) to ensure better fine-tuning results.

For more complex models or broader applications, a larger dataset may be required to capture a wider range of features.

hey @exnrt - would love to get your feedback on our fine-tuning on astria.ai - i think the results are really good

How do I load the Flux model after fine-tuning with ai-toolkit? In other words, how do I merge the LoRA checkpoint into the original Flux dev model?

I have fine-tuned the Flux model on Google Colab. Now I am wondering how to generate output without fine-tuning it again.

It only ever generates GAUSSIAN-BLURRED images for me :(

@clui are you doing it correctly?

I guess so. flux-schnell works great, but dev still generates only Gaussian-blurred images.
Maybe I should have trained it longer, but I suppose recognizable images should be generated from the very beginning.
Everything else is the same; only the guidance scale and sample steps changed (1 -> 4 and 4 -> 20).
Any ideas?
