Update README.md
README.md
@@ -7,4 +7,33 @@ sdk: static
pinned: false
---
## What is this?

As you can see, this place is called `CyberHarem`, a centralized repository for anime waifu image datasets and LoRA models.

It's an interesting experiment in which **all the datasets, models, model previews, and models published to [Civitai](https://civitai.com) are fully auto-generated without any human intervention**. To make this possible, we've done a lot of technical and data preparation, which you can find in our [Organization - DeepGHS](https://huggingface.co/deepghs) and in the code on [GitHub - DeepGHS](https://github.com/deepghs).

Currently, we have collected character databases for several popular mobile games (see [Supported Games of GChar Library](https://narugo1992.github.io/gchar/main/best_practice/supported/index.html#supported-games)) and crawled image datasets of the female characters from these games for training. In the future, we may include more characters, not only from mobile games but also from anime series.

## Where does the dataset come from? What's the format?

* The datasets are automatically crawled from major image websites such as [ZeroChan](https://zerochan.net), [Anime-Pictures](https://anime-pictures.net/), [Danbooru](https://danbooru.donmai.us/), and [Rule34](https://rule34.xxx/) (see [Supported Sites of GChar Library](https://narugo1992.github.io/gchar/main/best_practice/supported/index.html#supported-sites)).
* Each dataset repository contains both the original data packs and images resized and aligned to a uniform size, along with image tags generated by the [SmilingWolf/wd-v1-4-convnextv2-tagger-v2](https://huggingface.co/SmilingWolf/wd-v1-4-convnextv2-tagger-v2) model.
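
Once a data pack has been downloaded and extracted, the images and their tags can be paired up with a few lines of Python. The sketch below is only an illustration: it assumes each image has a sibling `.txt` file holding comma-separated tags, which is a common layout for LoRA training data but is not confirmed here, and `load_tagged_images` is a hypothetical helper name.

```python
from pathlib import Path

# Assumption: images use one of these extensions.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def load_tagged_images(dataset_dir):
    """Return a list of (image_path, tags) pairs found under dataset_dir.

    Assumption: the tags for `foo.png` live in `foo.txt` next to it,
    written as a single comma-separated line.
    """
    pairs = []
    for image_path in sorted(Path(dataset_dir).rglob("*")):
        if image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        tag_file = image_path.with_suffix(".txt")
        if not tag_file.is_file():
            continue  # skip images that have no tag file
        raw = tag_file.read_text(encoding="utf-8")
        tags = [t.strip() for t in raw.split(",") if t.strip()]
        pairs.append((image_path, tags))
    return pairs
```

If the extracted pack uses a different layout, only the `tag_file` lookup and the split on `,` should need to change.
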
## How are the models trained? What's the format?

LoRA models are trained in batches on their corresponding datasets, using [7eu7d7](https://github.com/7eu7d7)'s [HCP-Diffusion](https://github.com/7eu7d7/HCP-Diffusion) training framework.

## How to use AUTOMATIC1111's WebUI to generate images of anime waifus?

1. Go to the model repository.
2. Check the Model Card and choose a training step whose preview images look good.
3. Click Download on the right side to download the model package. The package contains two files: a `.pt` file and a LoRA file in `.safetensors` format.
4. **You need to use both of these files together. Put the `.pt` file in the `embeddings` directory and load the `.safetensors` file as a LoRA.**
5. Use the trigger words (provided in the Model Card) together with your prompt text to generate images.
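
Steps 3 to 5 can be sketched in Python. This is a minimal illustration, not an official installer: it assumes a standard WebUI checkout where embeddings live in `embeddings/` and LoRA files in `models/Lora/`, and that the prompt refers to the embedding by its file name and to the LoRA with the `<lora:name:weight>` syntax. The helper names, the file name `example_character`, and the default weight of `0.8` are hypothetical.

```python
import shutil
from pathlib import Path


def install_model_package(package_dir, webui_root):
    """Copy the .pt embedding and the .safetensors LoRA into a WebUI install.

    Assumption: webui_root is a standard AUTOMATIC1111 WebUI directory.
    Returns the list of files that were copied.
    """
    package_dir, webui_root = Path(package_dir), Path(webui_root)
    targets = {
        ".pt": webui_root / "embeddings",
        ".safetensors": webui_root / "models" / "Lora",
    }
    installed = []
    for src in sorted(package_dir.iterdir()):
        dest_dir = targets.get(src.suffix.lower())
        if dest_dir is None or not src.is_file():
            continue  # ignore anything that is not one of the two model files
        dest_dir.mkdir(parents=True, exist_ok=True)
        installed.append(Path(shutil.copy2(src, dest_dir)))
    return installed


def build_prompt(embedding_name, lora_name, trigger_words, lora_weight=0.8):
    """Combine the embedding name, trigger words, and LoRA tag into one prompt."""
    parts = [embedding_name, *trigger_words, f"<lora:{lora_name}:{lora_weight}>"]
    return ", ".join(parts)
```

For example, `build_prompt("example_character", "example_character", ["trigger_word"])` produces a prompt that uses both files at once. The real trigger words should be taken from the Model Card.
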
## Why do some preview images not look very much like the original characters?

The prompts used for the preview images are **automatically generated** by clustering algorithms from features extracted from the training dataset. The seeds used for generation are also random, and **the images are not selected or modified** in any way, so some previews will inevitably miss the character.

In practice, according to our internal tests, most models with this issue perform better in actual use than their preview images suggest. **The only thing you might need to do is adjust the tags you use a bit.**
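
To give a rough idea of how prompts can be derived from a dataset by clustering, here is a small sketch. It is not the project's actual pipeline: as a stand-in it groups images greedily by the Jaccard similarity of their tag sets and then uses the most frequent tags of each group as a prompt. The function names, the `threshold`, and `top_k` are all hypothetical.

```python
from collections import Counter


def jaccard(a, b):
    """Jaccard similarity between two tag sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster_tag_sets(tag_sets, threshold=0.5):
    """Greedy clustering: put each image in the first cluster whose
    seed (first member) is at least `threshold` similar to it."""
    clusters = []
    for tags in tag_sets:
        tags = set(tags)
        for cluster in clusters:
            if jaccard(tags, cluster[0]) >= threshold:
                cluster.append(tags)
                break
        else:
            clusters.append([tags])  # no match, so start a new cluster
    return clusters


def cluster_prompt(cluster, top_k=5):
    """Join the top_k most frequent tags of a cluster into a prompt string.
    Ties are broken alphabetically so the result is deterministic."""
    counts = Counter(tag for tags in cluster for tag in tags)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ", ".join(tag for tag, _ in ranked[:top_k])
```

Because prompts produced this way reflect whatever tags dominate a cluster, a preview can emphasize an outfit or scene rather than the character, which is why adjusting the tags by hand often helps.
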