FormAI Mixer Training Tips (Private Beta)

Mixers allow you to train FormAI based on your own images. The better the source images, the better the Mixer will perform.

Mixers are FormAI's approach to enhancing a base AI model with additional information. Mixers are tailored for specific use cases, such as generating images that capture a particular individual's face, a unique object, or a distinct aesthetic style not present in the original model.

Only train Mixers with images you own. Training with copyrighted material may result in deletion of your Mixer. AI Credits will not be refunded for deleted Mixers.


Currently, the Mixer API provides three distinct modes: Face, Object, and Style. Each has its own guidelines and recommendations to ensure optimal results. The golden rule of Mixer training is garbage in, garbage out: if you upload low-resolution, low-quality images, the output will also be low quality.

For all Mixer training, the resolution of uploaded images should exceed 1024 x 1024 pixels, with the face or object area occupying at least 512 x 512 pixels. The system will accept smaller images, but they will yield poor results.

  • Avoid duplicates or resized versions of the same image.
  • Don't train with AI-generated images.
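
The resolution thresholds and the duplicate rule above can be checked before uploading. The following is a minimal Python sketch and is not part of FormAI. The function names check_resolution and find_exact_duplicates are hypothetical, an image of exactly 1024 pixels per side is treated as acceptable (an assumption), and the duplicate check only catches byte-identical files, so resized copies still need a visual check.

```python
import hashlib

# Thresholds taken from the guideline above.
MIN_IMAGE_SIDE = 1024    # assumption: exactly 1024 px per side is accepted
MIN_SUBJECT_SIDE = 512   # face or object area, in pixels


def check_resolution(width, height, subject_width, subject_height):
    """Return a list of problems for one image; an empty list means it passes."""
    problems = []
    if width < MIN_IMAGE_SIDE or height < MIN_IMAGE_SIDE:
        problems.append(f"image is {width} x {height}, below 1024 x 1024")
    if subject_width < MIN_SUBJECT_SIDE or subject_height < MIN_SUBJECT_SIDE:
        problems.append(
            f"subject area is {subject_width} x {subject_height}, below 512 x 512"
        )
    return problems


def find_exact_duplicates(files):
    """Group file names whose contents are byte-identical.

    `files` maps a file name to its raw bytes. Resized or re-encoded
    copies have different bytes and are NOT detected here.
    """
    by_digest = {}
    for name, data in files.items():
        digest = hashlib.sha256(data).hexdigest()
        by_digest.setdefault(digest, []).append(name)
    return [sorted(names) for names in by_digest.values() if len(names) > 1]
```

The subject width and height are supplied by you, for example by measuring the face or object in an image editor. Any image that returns a non-empty list is a candidate for replacement before training.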

Face Mixer Training


Face Mixers are designed to integrate a specific person's face into a base model, enabling image generation featuring that individual.

Requirements:
  • Face Mixers are specifically developed for human faces. For non-human faces, try our Object Mixer Mode.
  • Upload photos showing only one face to avoid confusion during training. Uploading more than one face will likely cause very poor outputs.
  • Photos should be upright with a tilt angle under 45 degrees.
  • Clear facial features are crucial. Ensure good lighting and visibility of the face, preferably forward-facing or slightly turned with both eyes visible. Avoid blurry, low-quality images.
  • A minimum of four photos is required to train a Face Mixer, but we suggest at least 15 images.
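
The one-face, tilt, and photo-count requirements above can be applied as a quick screening pass over a candidate set. This is a minimal Python sketch under stated assumptions: FacePhoto and review_face_dataset are hypothetical names, and the face count and tilt angle for each photo are values you supply yourself (for example from a face detection tool of your choice). It does not call any FormAI API.

```python
from dataclasses import dataclass

# Numbers taken from the requirements above.
MIN_PHOTOS = 4
SUGGESTED_PHOTOS = 15
MAX_TILT_DEGREES = 45


@dataclass
class FacePhoto:
    name: str
    face_count: int       # faces visible in the photo (supplied by you)
    tilt_degrees: float   # tilt from upright, in degrees (supplied by you)


def review_face_dataset(photos):
    """Split photos into usable and rejected, then rate the usable count."""
    usable, rejected = [], []
    for photo in photos:
        one_face = photo.face_count == 1
        upright = abs(photo.tilt_degrees) < MAX_TILT_DEGREES
        (usable if one_face and upright else rejected).append(photo.name)

    if len(usable) < MIN_PHOTOS:
        status = "too few photos"
    elif len(usable) < SUGGESTED_PHOTOS:
        status = "trainable, below suggested count"
    else:
        status = "meets suggested count"
    return status, usable, rejected
```

The sketch only covers the requirements that can be expressed as numbers. Lighting, sharpness, and visibility of both eyes still need to be judged by eye.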

Want the best possible quality for your Mixer? Have your headshots done with professional lighting. Shoot every image on a different background with a different outfit so that the end result is at least 15 images, all with slightly different angles/


Suggestions:
  • Avoid repetitive expressions and exaggerated facial expressions.
  • Exclude photos where the face is obscured by objects like glasses (unless desired), hands, or other items.
  • Diverse clothing, hairstyles, and subtle expressions enhance the Mixer's versatility.
  • Varied, contrasting backgrounds in your dataset improve subject-background differentiation.

Object Mixer Training


This mode allows you to introduce a new object into a Stable Diffusion base model for image generation.

Requirements:
  • Focus on a single object in your uploaded images. Do not combine multiple concepts in your training.
  • Use straightforward, concrete terms for your object prompts (e.g., "cat" or "car").
  • Ensure the object contrasts well against the background.

Suggestions:
  • More high-quality images enhance the final model. Consider uploading six to ten or more images.
  • Include variations in lighting and angle to increase model flexibility.
  • Use contrasting backgrounds to distinguish the object from its surroundings.

Style Mixer Training


This mode enables the introduction of a new aesthetic style into the base Felli model, allowing for style-specific image generation.

Requirements:
  • At least ten example images are necessary to define a style.
  • Images may be center cropped during training; central placement of important elements is recommended.
  • Choose aesthetic styles that can be generalized across various subjects, like oil painting or line art.
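
Because images may be center cropped, it can help to preview which region is likely to survive. The sketch below is illustrative Python, not FormAI's actual cropping code: it assumes the crop is the largest centered square, which the guideline above does not state, and the function names center_square_box and box_contains are hypothetical.

```python
def center_square_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square.

    Assumption: training crops to a centered square. The real crop
    shape used during training is not documented here.
    """
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


def box_contains(outer, inner):
    """True if the `inner` box lies fully inside the `outer` box."""
    return (
        inner[0] >= outer[0]
        and inner[1] >= outer[1]
        and inner[2] <= outer[2]
        and inner[3] <= outer[3]
    )
```

For a 1600 x 1200 image, center_square_box returns (200, 0, 1400, 1200), so anything in the outer 200 pixels on the left or right could be lost under this assumption.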

Suggestions:
  • Use high-quality, high dynamic range images for training.
  • Include a variety of subjects in your style dataset to prevent a narrow focus (e.g., training with mostly bikes may bias the model towards bike images).
  • Larger datasets yield better results. Aim for 20-30 images for specific styles like oil painting and 50-60 for broader concepts like 'dark and moody'.
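
The dataset-size and subject-variety suggestions above can be turned into a simple self-check. This is a minimal Python sketch with several assumptions: review_style_dataset is a hypothetical name, the subject label for each image is something you assign yourself, and the 50% cutoff for a dominant subject is an illustrative reading of "mostly", not a FormAI rule.

```python
from collections import Counter

# Numbers taken from the Style Mixer guidance above.
MIN_STYLE_IMAGES = 10
SUGGESTED_SPECIFIC = 20   # lower end of 20-30 for specific styles
SUGGESTED_BROAD = 50      # lower end of 50-60 for broader concepts
MAX_SUBJECT_SHARE = 0.5   # assumption: over half counts as "mostly"


def review_style_dataset(subjects, broad_concept=False):
    """Return a list of notes about a style dataset.

    `subjects` holds one label per image, such as "bike" or "portrait".
    An empty result means no issue was found by these checks.
    """
    notes = []
    total = len(subjects)
    suggested = SUGGESTED_BROAD if broad_concept else SUGGESTED_SPECIFIC

    if total < MIN_STYLE_IMAGES:
        notes.append(f"only {total} images; at least {MIN_STYLE_IMAGES} are required")
    elif total < suggested:
        notes.append(f"{total} images; {suggested} or more are suggested")

    if total:
        subject, count = Counter(subjects).most_common(1)[0]
        if count / total > MAX_SUBJECT_SHARE:
            notes.append(f"'{subject}' makes up {count} of {total} images; add other subjects")
    return notes
```

For example, a 25-image set labeled with 15 bikes, 5 trees, and 5 dogs passes the size check for a specific style but is flagged for leaning on bikes.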