narugo committed on
Commit
3b40049
·
1 Parent(s): 2a4040c

Update README.md

Files changed (1)
  1. README.md +30 -1
README.md CHANGED
@@ -7,4 +7,33 @@ sdk: static
7
  pinned: false
8
  ---
9
 
10
- Edit this `README.md` markdown file to author your organization card.
10
+ ## What is this?
11
+
12
+ This organization, `CyberHarem`, is a centralized repository of anime waifu image datasets and LoRA models.
13
+
14
+ It's an interesting experiment where **all the datasets, models, model previews, and models published to [civitai](https://civitai.com) are fully auto-generated without any human intervention**. To make this possible, we have done a great deal of technical and data preparation, which you can find in our [Organization - DeepGHS](https://huggingface.co/deepghs) and in the code on [GitHub - DeepGHS](https://github.com/deepghs).
15
+
16
+ Currently, we have collected character databases for several popular mobile games (see [Supported Games of GChar Library](https://narugo1992.github.io/gchar/main/best_practice/supported/index.html#supported-games)) and crawled datasets of female characters from these games for training. In the future, we may add more characters, not only from mobile games but also from anime series.
17
+
18
+ ## Where does the dataset come from? What's the format?
19
+
20
+ * The dataset is automatically crawled from various major image websites like [ZeroChan](https://zerochan.net), [Anime-Pictures](https://anime-pictures.net/), [Danbooru](https://danbooru.donmai.us/), [Rule34](https://rule34.xxx/), etc. (see [Supported Sites of GChar Library](https://narugo1992.github.io/gchar/main/best_practice/supported/index.html#supported-sites))
21
+ * In each dataset repository, there are both original data packs and images resized and aligned to a uniform size, along with image tags generated using the [SmilingWolf/wd-v1-4-convnextv2-tagger-v2](https://huggingface.co/SmilingWolf/wd-v1-4-convnextv2-tagger-v2) model.
22
+
23
+ ## How are the models trained? What's the format?
24
+
25
+ LoRA models are trained in batches, each on its corresponding dataset. We use [7eu7d7](https://github.com/7eu7d7)'s [HCP-Diffusion](https://github.com/7eu7d7/HCP-Diffusion) training framework for the process.
26
+
27
+ ## How to use a1111's WebUI to generate images of anime waifus?
28
+
29
+ 1. Go to the model repository.
30
+ 2. Check the Model Card and choose a training step whose preview images look good.
31
+ 3. Click Download on the right side to download the model package. The package contains two files: a `.pt` file and a `.safetensors` LoRA file.
32
+ 4. **You need to use both of these files together. Put the `.pt` file in the `embeddings` directory and load the `.safetensors` file as a LoRA.**
33
+ 5. Use the trigger words (provided in the Model Card) and prompt text to generate images.
34
+
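The steps above can be sketched in Python. This is a minimal illustration, not part of the project's tooling: it assumes the default AUTOMATIC1111 WebUI layout (`embeddings/` for `.pt` files, `models/Lora/` for `.safetensors` files) and the WebUI's `<lora:name:weight>` prompt syntax. The function names, the default weight of `0.8`, and the example tags are hypothetical; take the real trigger words from the Model Card.

```python
import shutil
from pathlib import Path


def install_model_package(package_dir, webui_root):
    """Copy the two files of a downloaded model package into a WebUI install.

    Assumption: default AUTOMATIC1111 layout, i.e. `embeddings/` for the
    `.pt` file and `models/Lora/` for the `.safetensors` file.
    """
    package_dir = Path(package_dir)
    webui_root = Path(webui_root)
    targets = {
        ".pt": webui_root / "embeddings",
        ".safetensors": webui_root / "models" / "Lora",
    }
    installed = []
    for src in sorted(package_dir.iterdir()):
        dest_dir = targets.get(src.suffix.lower())
        if dest_dir is None or not src.is_file():
            continue  # skip anything that is not one of the two model files
        dest_dir.mkdir(parents=True, exist_ok=True)
        installed.append(Path(shutil.copy2(src, dest_dir)))
    return installed


def build_prompt(trigger_word, lora_name, weight=0.8, extra_tags=()):
    """Join the trigger word, extra tags, and a LoRA tag into one prompt.

    Assumption: `weight=0.8` is only an illustrative starting value.
    """
    parts = [trigger_word, *extra_tags, f"<lora:{lora_name}:{weight}>"]
    return ", ".join(parts)
```

For example, `install_model_package("downloads/some_character", "stable-diffusion-webui")` would place both files, and `build_prompt("some_character", "some_character", extra_tags=["1girl", "solo"])` would give a starting prompt to paste into the WebUI. Both paths and names here are placeholders.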
35
+ ## Why do some preview images not look very much like the original characters?
36
+
37
+ The prompt texts used for the preview images are **automatically generated** by clustering algorithms applied to feature information extracted from the training dataset. The seed for each image is also random, and **the images are not selected or modified** in any way, so some previews may not resemble the character closely.
38
+
39
+ In practice, according to our internal tests, most models that have this issue perform better in actual use than their preview images suggest. **The only thing you might need to do is adjust the tags you use a bit.**