Recommended Training Parameters
November 15, 2023 2:13 PM (GMT+8)

1 Epochs and Repeats

1.1 Epochs: Number of training cycles over the dataset. We suggest 10 for beginners. The value can be raised if training seems insufficient due to a small dataset, or lowered if the dataset is huge.

1.2 Repeats: Number of times each image is learned per epoch. Higher values lead to stronger effects and more complex image compositions, but setting it too high increases the risk of overfitting. We therefore suggest 10 to achieve good training results while minimizing the chance of overfitting.

Note: You can increase the epochs and repeats if the training results do not resemble the training images.

2 Learning Rate and Optimizer

2.1 learning_rate (Overall Learning Rate): How much the weights change at each update step. Higher values mean faster learning but may cause the model to collapse or fail to converge; lower values mean slower learning but may reach a more optimal state. This value is ignored once separate learning rates are set for the U-Net and Text Encoder.
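As a quick sanity check on how the epoch and repeat values in section 1 combine, the total number of optimizer steps can be estimated as images × repeats × epochs ÷ batch size. A minimal sketch, assuming a hypothetical 30-image dataset and a batch size of 2:

```shell
# Hypothetical example: estimate total optimizer steps for a LoRA run.
# The dataset size and batch size below are assumptions; substitute your own.
images=30      # images in the training set (assumed)
repeats=10     # suggested repeats
epochs=10      # suggested epochs for beginners
batch_size=2   # assumed batch size

# Each epoch sees every image `repeats` times.
total_steps=$(( images * repeats * epochs / batch_size ))
echo "$total_steps"   # 1500
```

If the result is only a few hundred steps, that is one sign the dataset may be too small and epochs or repeats could be raised, as the note above suggests.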

2.2 unet_lr (U-Net Learning Rate): The U-Net guides noise images generated from random seeds, determining the denoising direction, finding areas that need change, and providing the required data. Higher values mean faster fitting but risk missing details, while values that are too low cause underfitting, with generated images bearing no resemblance to the training material. Set this value according to the model type and dataset; we suggest 0.0002 for character training.

2.3 text_encoder_lr (Text Encoder Learning Rate): The text encoder converts tags into embeddings the U-Net can understand. Since the text encoder of SDXL is already well trained, there is usually no need to train it further, and the default value is fine unless there are special needs.

2.4 Optimizer: An algorithm in deep learning that adjusts model parameters to minimize the loss function. During neural network training, the optimizer updates the model's weights based on the gradient of the loss function so the model can better fit the training data. The default optimizer, AdamW, works for SDXL training; other optimizers, such as the easy-to-use Prodigy with its adaptive learning rate, can also be chosen based on specific requirements.

2.5 lr_scheduler (Learning Rate Scheduler): A strategy or algorithm for dynamically adjusting the learning rate during training. Choosing Constant is sufficient under normal circumstances.
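The parameter names in this section match kohya-ss sd-scripts command-line flags of the same names. A hedged sketch of how they might appear in a launch command — the model path is a placeholder, required dataset flags are omitted, and the text-encoder rate shown is only an illustrative low value (the guidance above is to leave it at the default):

```shell
# Hypothetical launch fragment for kohya-ss sd-scripts (SDXL LoRA training).
# Flag names mirror the parameters discussed above; the model path is a
# placeholder and other required dataset/output flags are omitted.
accelerate launch sdxl_train_network.py \
  --pretrained_model_name_or_path "sd_xl_base_1.0.safetensors" \
  --network_module networks.lora \
  --unet_lr 0.0002 \
  --text_encoder_lr 1e-5 \
  --optimizer_type AdamW \
  --lr_scheduler constant
```

Note that once --unet_lr and --text_encoder_lr are given, the overall --learning_rate is superseded, as described in 2.1.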

https://media.discordapp.net/attachments/1174196809497849966/1174197353037713429/image.png?ex=65826700&is=656ff200&hm=69a1ae84035fca2f96d5b4777a59bf566622a061af93cf0dad89c26a3766892b&=&format=webp&quality=lossless&width=550&height=260

3 Network Settings

3.1 network_dim (Network Dimension): Closely related to the file size of the trained LoRA. For SDXL, a 32-dim LoRA is about 200 MB, a 16-dim LoRA about 100 MB, and an 8-dim LoRA about 50 MB. For characters, 8 dim is sufficient.

3.2 network_alpha: Typically set to half or a quarter of the dim value. If dim is set to 8, then alpha can be set to 4.
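Following the half-of-dim rule of thumb in 3.2, alpha can be derived mechanically. A trivial sketch (variable names are illustrative; they match the sd-scripts flag names):

```shell
# Derive network_alpha as half of network_dim, per the guideline above.
network_dim=8
network_alpha=$(( network_dim / 2 ))
echo "--network_dim $network_dim --network_alpha $network_alpha"
# -> --network_dim 8 --network_alpha 4
```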

https://media.discordapp.net/attachments/1174196809497849966/1174197402232688660/image.png?ex=6582670c&is=656ff20c&hm=b4bb928d66a25e20c4c78edfa67e9f75898f65b08abeadd820d05afe67cae66e&=&format=webp&quality=lossless&width=550&height=131

4 Other Settings

4.1 Resolution: The training resolution can be non-square but must be a multiple of 64 on each side. For SDXL, we suggest 1024×1024 or 1024×768.

4.2 enable_bucket (Bucketing): If the images' resolutions are not unified, turn on this parameter. Before training starts, it automatically classifies the training set by resolution and creates a bucket for each resolution (or group of similar resolutions), saving the time of unifying resolutions beforehand. If the images' resolutions have already been unified, there is no need to turn it on.

4.3 noise_offset and multires_noise_iterations: Both are noise-offset techniques that mitigate generated images coming out overly bright or too dark. If there are no excessively bright or dark images in the training set, they can be turned off. If turned on, we suggest using multires_noise_iterations with a value of 6-10.

4.4 multires_noise_discount: Must be enabled together with multires_noise_iterations above; a value of 0.3-0.8 is recommended.

4.5 clip_skip: Specifies which text encoder layer's output to use, counting from the last. Usually, the default value is fine.
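These settings likewise correspond to sd-scripts flags of the same names. A hedged fragment reflecting the suggestions in this section (bucketing on, multires noise iterations at the low end of 6-10, discount at the low end of 0.3-0.8; model and dataset flags are omitted as placeholders):

```shell
# Hypothetical sd-scripts flag fragment for the settings in this section.
# Values follow the suggestions above; model/dataset flags are omitted.
accelerate launch sdxl_train_network.py \
  --resolution 1024,1024 \
  --enable_bucket \
  --multires_noise_iterations 6 \
  --multires_noise_discount 0.3
```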

https://media.discordapp.net/attachments/1174196809497849966/1174197454070087711/image.png?ex=65826718&is=656ff218&hm=90cf2518f26bdcb2965e7f469970516d005fc85c921025548a2e0379a1a36d52&=&format=webp&quality=lossless&width=550&height=236