# Text-to-Video with LTX-Video LoRA Model (Pixel Art Style)

This document is a step-by-step guide to generating videos from text prompts with the `LTX-Video` model, loaded through Hugging Face's `diffusers` library. The example applies LoRA weights fine-tuned for a "Pixel Art" style.

## Dataset

The LoRA weights were fine-tuned on the following dataset:

https://huggingface.co/datasets/svjack/test-HunyuanVideo-pixelart-videos

## Installation

First, install the required libraries with pip:

```bash
pip install torch diffusers transformers sentencepiece safetensors peft imageio imageio-ffmpeg
```
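
As a quick sanity check after installation, a short script along the following lines can report which packages the import system cannot find. This is a minimal sketch using only the standard library; the helper name `missing_packages` and the `REQUIRED` list are illustrative and not part of any of the libraries.

```python
from importlib.util import find_spec

# Core packages from the pip command above; extend the list as needed.
# For these packages the import name matches the pip name.
REQUIRED = ["torch", "diffusers", "safetensors", "peft"]


def missing_packages(names):
    """Return the names that the import system cannot locate."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

The check only confirms that each package can be located; it does not import them or verify versions.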

## Usage

Below is a complete example of generating a video from a text prompt using the `LTX-Video` model with the "Pixel Art" style.

### Step 1: Import Required Libraries

```python
import torch
from diffusers import LTXPipeline
from diffusers.utils import export_to_video
```

### Step 2: Load the Model and LoRA Weights

```python
# Load the LTX-Video model with bfloat16 precision
pipe = LTXPipeline.from_pretrained("Lightricks/LTX-Video", torch_dtype=torch.bfloat16)

# Load the "Pixel Art" LoRA weights from a local file and register them under the adapter name "pixel"
pipe.load_lora_weights("ltx_pixel_pytorch_lora_weights.safetensors", "pixel")

# Activate the "pixel" adapter with a strength of 2.0
pipe.set_adapters("pixel", 2.0)

# Move the model to the GPU for faster inference
pipe.to("cuda")
```
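
For intuition about the strength passed to `set_adapters`: a LoRA adapter stores a low-rank update to a weight matrix, and the strength scales that update before it is added to the base weights. The toy example below illustrates the idea with plain Python lists. It is a conceptual sketch, not how `diffusers` or `peft` implement it internally, and the function names `matmul` and `apply_lora` are made up for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(base, lora_b, lora_a, strength):
    """Return base + strength * (lora_b @ lora_a), element by element."""
    delta = matmul(lora_b, lora_a)
    return [
        [w + strength * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(base, delta)
    ]


base = [[1.0, 0.0], [0.0, 1.0]]  # toy 2x2 base weight
lora_b = [[1.0], [2.0]]          # 2x1 low-rank factor
lora_a = [[0.5, 0.5]]            # 1x2 low-rank factor (rank 1)

print(apply_lora(base, lora_b, lora_a, 1.0))  # [[1.5, 0.5], [1.0, 2.0]]
print(apply_lora(base, lora_b, lora_a, 2.0))  # [[2.0, 1.0], [2.0, 3.0]]
```

Doubling the strength doubles the adapter's contribution, so a value of 2.0 pushes the output further toward the fine-tuned style than 1.0 would.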

### Step 3: Define the Prompt and Generate the Video

```python
# Define the text prompt
prompt = "In the style of Pixel, Golden light filters through the canopy, illuminating soft moss and fallen leaves. Wildflowers bloom nearby, and glowing fireflies hover in the air. A gentle stream flows in the background, its murmur blending with birdsong. The scene radiates tranquility and natural charm."

# Define the negative prompt to avoid undesirable qualities
negative_prompt = "worst quality, inconsistent motion, blurry, jittery, distorted"

# Generate the video
video = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=704,
    height=480,
    num_frames=161,
    num_inference_steps=50,
).frames[0]

# Export the video to a file
export_to_video(video, "output.mp4", fps=24)
```
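
The values used above are not arbitrary. As far as the LTX-Video model card describes, the model works best with a width and height divisible by 32 and a frame count of the form 8k + 1. The sketch below checks those conditions and computes the clip length; the helper names are hypothetical and not part of `diffusers`.

```python
def clip_seconds(num_frames, fps):
    """Playback duration of the exported clip in seconds."""
    return num_frames / fps


def is_ltx_friendly(width, height, num_frames):
    """True if the resolution is divisible by 32 and num_frames is 8k + 1."""
    return width % 32 == 0 and height % 32 == 0 and num_frames % 8 == 1


def nearest_valid_frames(target_seconds, fps):
    """Frame count of the form 8k + 1 closest to target_seconds * fps."""
    k = max(round((target_seconds * fps - 1) / 8), 0)
    return 8 * k + 1


print(round(clip_seconds(161, 24), 2))  # 6.71
print(is_ltx_friendly(704, 480, 161))   # True
print(nearest_valid_frames(5, 24))      # 121
```

With `num_frames=161` and `fps=24`, the exported clip runs for roughly 6.7 seconds. For a clip of about 5 seconds at the same frame rate, 121 frames would be the closest frame count of that form.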

### Step 4: Display the Generated Video

```python
# Display the generated video in a Jupyter notebook
from IPython import display
display.Video("output.mp4")
```

## Example Prompts

### LoRA Prefix

Start each prompt with this prefix, as all the examples in this guide do:
```txt
In the style of Pixel,
```

Here are three example prompts that you can use to generate different videos:

1. **Forest Scene:**
```python
prompt = "In the style of Pixel, Golden light filters through the canopy, illuminating soft moss and fallen leaves. Wildflowers bloom nearby, and glowing fireflies hover in the air. A gentle stream flows in the background, its murmur blending with birdsong. The scene radiates tranquility and natural charm."
```

<video controls autoplay src="https://cdn-uploads.huggingface.co/production/uploads/634dffc49b777beec3bc6448/ZxYuud6JxlZTwDRRjVkIh.mp4"></video>

2. **Castle Scene:**
```python
prompt = "In the style of Pixel, the video shifts to a majestic castle under a starry sky. Silvery moonlight bathes the ancient stone walls, casting soft shadows on the surrounding landscape. Towering spires rise into the night, their peaks adorned with glowing orbs that mimic the stars above. A tranquil moat reflects the shimmering heavens, its surface rippling gently in the cool breeze. Fireflies dance around the castle’s ivy-covered arches, adding a touch of magic to the scene. In the distance, a faint aurora paints the horizon with hues of green and purple, blending seamlessly with the celestial tapestry. The scene exudes an aura of timeless wonder and serene beauty."
```

<video controls autoplay src="https://cdn-uploads.huggingface.co/production/uploads/634dffc49b777beec3bc6448/0GLhV64zxj2rYkq06cE3S.mp4"></video>

3. **Urban Scene:**
```python
prompt = "In the style of Pixel, the video showcases a vibrant urban landscape. The city skyline is dominated by towering skyscrapers, their glass facades reflecting the sunlight. The streets are bustling with activity, filled with cars, buses, and pedestrians. Parks and green spaces are scattered throughout, offering a refreshing contrast to the concrete jungle. The architecture is a mix of modern and historic buildings, each telling a story of the city's evolution. The overall scene is alive with energy, capturing the essence of urban life."
```

<video controls autoplay src="https://cdn-uploads.huggingface.co/production/uploads/634dffc49b777beec3bc6448/yV68f1k9rVXRnyUj5a7u6.mp4"></video>
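
All three prompts above begin with the same prefix, so when writing new prompts it may be convenient to add it programmatically. The helper below is a small sketch of that idea; `with_lora_prefix` and the sample descriptions are hypothetical and not part of the original example.

```python
LORA_PREFIX = "In the style of Pixel, "


def with_lora_prefix(description, prefix=LORA_PREFIX):
    """Prepend the LoRA prefix unless the description already starts with it."""
    text = description.strip()
    if text.lower().startswith(prefix.strip().lower()):
        return text
    return prefix + text


# Hypothetical scene descriptions, used only to illustrate the helper.
descriptions = [
    "a quiet harbor at dawn, fishing boats rocking on gentle waves.",
    "In the style of Pixel, a snowy mountain village at night.",
]

for description in descriptions:
    print(with_lora_prefix(description))
```

Each resulting string can be passed as `prompt` to the pipeline call from Step 3.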

## Conclusion

This guide demonstrates how to generate videos from text prompts using the `LTX-Video` model with the "Pixel Art" LoRA. By adjusting the prompt and the generation parameters, you can create a wide variety of pixel art videos tailored to your needs.