Exploring Flux.1-dev on Google Colab: Free AI Image Generation
Stuck behind the paywall? Read it for free.
Before we dive into this exciting journey, I’d like to give credit to the brilliant minds behind the code that powers this project. Special thanks to camenduru for providing and publishing this amazing code on Google Colab, which makes it so accessible to all of us.
Now, imagine being able to bring vivid imagery to life just by describing it in words. Generative AI, powered by advanced diffusion models, has made this dream a reality. Whether it's a serene landscape, a futuristic cityscape, or an emotional human moment, these tools let anyone create breathtaking visuals with minimal effort. In this guide, we explore the power of Generative AI by crafting various artworks for free using Google Colab. Let's dive into this fascinating journey of art and AI!
If you’re excited by what you’re reading, go ahead and clap 50 times 👏 to show some love! Also, don’t forget to follow me on Medium for more exciting content on Generative AI and creative tech. Let’s explore the endless possibilities together! 🎨✨
Let’s Begin with a Few Samples to Spark Your Curiosity!
Anime-themed image regenerated using the Flux.1-dev model on Google Colab, free of cost! (Not sure where my specs have gone?)
Image Generated using Flux.1-dev Model with the following prompt: A beautiful Indian girl getting married to the man of her dreams.
Another image generated using the Flux.1-dev model with the same prompt: A beautiful Indian girl getting married to the man of her dreams. (Trying to gauge the model's consistency.)
Image generated using the Flux.1-dev model with the following prompt: Futuristic cityscapes

What do you think about the images above? I know, it's astonishing 😍 This is the power of Generative AI, which is revolutionizing creativity by enabling users to generate high-quality images from textual prompts. One of the most exciting advancements in this space is the Flux model, a powerful tool for generating anime-style images using a combination of state-of-the-art AI techniques like diffusion models, a VAE, and CLIP.
In this article, I’ll guide you through running the Flux Model on Google Colab — completely free of cost! By the end of this tutorial, you’ll have the tools to generate your own masterpieces.
Target of this article
- MidJourney vs. Flux
- Overview of available Flux models in the market
- The purpose of components like UNET, VAE, CLIP, and the T5 tokenizer
- How to set up the Flux model on Google Colab
- Running the image regeneration pipeline
- Running the image generation pipeline

MidJourney vs. Flux
Flux AI is praised for its flexibility, especially for users who prefer a high level of control over image generation and customization. It also benefits from being open source, which makes it an attractive option for developers and technical users looking for more control over their workflows.
On the other hand, Midjourney offers a more streamlined, user-friendly experience with powerful outputs for artistic and creative purposes. It’s best suited for users who want a quick, high-quality result without needing to fine-tune parameters.
Both models are strong contenders, but your choice depends on your specific needs: Flux for flexibility and photorealism, or Midjourney for ease of use and artistic flair.
Image: Detailed comparison of MidJourney vs. Flux

Overview of Available Flux Models in the Market
Flux AI models are known for their ability to generate high-quality, photorealistic images with a focus on detail and realism. Flux.1 Dev is great for those needing high fidelity, while Flux.1 Pro targets professionals with added features for advanced projects. Flux.1 Schnell offers a faster alternative with a slight quality trade-off, ideal for users who prioritize speed over ultra-high fidelity. These models are ideal for creating realistic, detailed images for a variety of applications.
Image: Detailed comparison between all the Flux models available in the market

The Purpose of Components like UNET, VAE, CLIP, and the T5 Tokenizer
Image: I tried explaining all these major components in the table above (let me know if you'd like a separate deep-dive article on these concepts).

How to Set Up the Flux Model on Google Colab
Step 1: Setting Up the Environment on Google Colab
1. Open Google Colab and create a new notebook.
2. Change the runtime type to T4 GPU (a very crucial step!).
Image: Change the runtime type to T4 GPU

3. Copy and paste the code below into a code cell.
# Change the working directory to '/content' in the Colab environment
%cd /content

# Clone the TotoroUI repository from GitHub, switching to the 'totoro3' branch
!git clone -b totoro3 https://github.com/camenduru/ComfyUI /content/TotoroUI

# Navigate to the cloned TotoroUI directory
%cd /content/TotoroUI

# Install the required Python libraries for the model, including:
# - torchsde: stochastic differential equations in PyTorch
# - einops: tensor rearrangement operations
# - diffusers: Hugging Face library for diffusion models
# - accelerate: efficient training and inference
# - xformers: memory-efficient attention implementation
!pip install -q torchsde einops diffusers accelerate xformers==0.0.28.post2

# Install the aria2 utility, a command-line download manager
!apt -y install -qq aria2

# Download the UNET model weights for the Flux model from Hugging Face into the appropriate directory
!aria2c --console-log-level=error -c -x 16 -s 16 -k 1M https://huggingface.co/camenduru/FLUX.1-dev/resolve/main/flux1-dev-fp8.safetensors -d /content/TotoroUI/models/unet -o flux1-dev-fp8.safetensors

# Download the VAE model weights from Hugging Face into the appropriate directory
!aria2c --console-log-level=error -c -x 16 -s 16 -k 1M https://huggingface.co/camenduru/FLUX.1-dev/resolve/main/ae.sft -d /content/TotoroUI/models/vae -o ae.sft

# Download the CLIP model weights from Hugging Face into the appropriate directory
!aria2c --console-log-level=error -c -x 16 -s 16 -k 1M https://huggingface.co/camenduru/FLUX.1-dev/resolve/main/clip_l.safetensors -d /content/TotoroUI/models/clip -o clip_l.safetensors

# Download the T5 text-encoder weights from Hugging Face into the appropriate directory
!aria2c --console-log-level=error -c -x 16 -s 16 -k 1M https://huggingface.co/camenduru/FLUX.1-dev/resolve/main/t5xxl_fp8_e4m3fn.safetensors -d /content/TotoroUI/models/clip -o t5xxl_fp8_e4m3fn.safetensors

# Download a sample test image from Hugging Face and save it as 'test.png'
!wget https://huggingface.co/camenduru/FLUX.1-dev/resolve/main/test.png -O /content/test.png
# Import the Python libraries needed for processing and inference
import random
import torch
import numpy as np
from PIL import Image  # For image manipulation

import nodes  # Custom modules for handling pipeline nodes
from nodes import NODE_CLASS_MAPPINGS  # Map for accessing different model components
from totoro_extras import nodes_custom_sampler  # Additional sampling utilities
from totoro_extras import nodes_post_processing  # Post-processing utilities
from totoro import model_management  # Utilities for managing model memory and resources
# Load the different components of the pipeline via the NODE_CLASS_MAPPINGS dictionaries
DualCLIPLoader = NODE_CLASS_MAPPINGS["DualCLIPLoader"]()  # Loader for the CLIP/T5 text encoders
UNETLoader = NODE_CLASS_MAPPINGS["UNETLoader"]()  # Loader for the UNET model
RandomNoise = nodes_custom_sampler.NODE_CLASS_MAPPINGS["RandomNoise"]()  # Noise generator
BasicGuider = nodes_custom_sampler.NODE_CLASS_MAPPINGS["BasicGuider"]()  # Model guider for conditioning
KSamplerSelect = nodes_custom_sampler.NODE_CLASS_MAPPINGS["KSamplerSelect"]()  # Sampler selector
BasicScheduler = nodes_custom_sampler.NODE_CLASS_MAPPINGS["BasicScheduler"]()  # Scheduler for diffusion steps
SamplerCustomAdvanced = nodes_custom_sampler.NODE_CLASS_MAPPINGS["SamplerCustomAdvanced"]()  # Advanced sampler
VAELoader = NODE_CLASS_MAPPINGS["VAELoader"]()  # Loader for the VAE model
VAEDecode = NODE_CLASS_MAPPINGS["VAEDecode"]()  # Decoder for the VAE
VAEEncode = NODE_CLASS_MAPPINGS["VAEEncode"]()  # Encoder for the VAE
EmptyLatentImage = NODE_CLASS_MAPPINGS["EmptyLatentImage"]()  # Placeholder for empty latent images
ImageScaleToTotalPixels = nodes_post_processing.NODE_CLASS_MAPPINGS["ImageScaleToTotalPixels"]()  # Rescaling utility
# Perform inference in no-grad mode for efficiency
with torch.inference_mode():
    # Load the CLIP/T5 text encoders, the UNET, and the VAE with pre-trained weights
    clip = DualCLIPLoader.load_clip("t5xxl_fp8_e4m3fn.safetensors", "clip_l.safetensors", "flux")[0]
    unet = UNETLoader.load_unet("flux1-dev-fp8.safetensors", "fp8_e4m3fn")[0]
    vae = VAELoader.load_vae("ae.sft")[0]
# Function to find the number closest to n that is divisible by m
def closestNumber(n, m):
    q = int(n / m)  # Calculate the quotient
    n1 = m * q  # Closest multiple of m at or below n (for positive n and m)
    # Calculate the other candidate based on the signs of n and m
    if (n * m) > 0:
        n2 = m * (q + 1)  # Candidate above n
    else:
        n2 = m * (q - 1)  # Candidate below n
    # Return whichever candidate is closer to n
    if abs(n - n1) < abs(n - n2):
        return n1
    return n2

Run the cell. This will set up your environment, clone the necessary repository, and download all the required model weights. Once the cell finishes, you should see output similar to the following:
/content Cloning into '/content/TotoroUI'... remote: Enumerating objects: 14652, done. remote: Counting objects: 100% (2642/2642), done. remote: Compressing objects: 100% (179/179), done. remote: Total 14652 (delta 2549), reused 2463 (delta 2463), pack-reused 12010 (from 1) Receiving objects: 100% (14652/14652), 22.89 MiB | 16.18 MiB/s, done. Resolving deltas: 100% (9901/9901), done. /content/TotoroUI ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 16.7/16.7 MB 88.5 MB/s eta 0:00:00 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 906.4/906.4 MB 747.9 kB/s eta 0:00:00 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 363.4/363.4 MB 4.3 MB/s eta 0:00:00 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 13.8/13.8 MB 98.5 MB/s eta 0:00:00 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 24.6/24.6 MB 78.5 MB/s eta 0:00:00 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 883.7/883.7 kB 53.4 MB/s eta 0:00:00 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 664.8/664.8 MB 3.2 MB/s eta 0:00:00 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 211.5/211.5 MB 5.6 MB/s eta 0:00:00 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 56.3/56.3 MB 15.5 MB/s eta 0:00:00 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 127.9/127.9 MB 7.4 MB/s eta 0:00:00 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 207.5/207.5 MB 5.9 MB/s eta 0:00:00 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 188.7/188.7 MB 6.0 MB/s eta 0:00:00 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 21.1/21.1 MB 84.0 MB/s eta 0:00:00 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 99.1/99.1 kB 8.6 MB/s eta 0:00:00 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 209.5/209.5 MB 5.7 MB/s eta 0:00:00 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 61.2/61.2 kB 4.5 MB/s eta 0:00:00 ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts. torchaudio 2.5.1+cu121 requires torch==2.5.1, but you have torch 2.5.0 which is incompatible. torchvision 0.20.1+cu121 requires torch==2.5.1, but you have torch 2.5.0 which is incompatible. The following additional packages will be installed: libaria2-0 libc-ares2 The following NEW packages will be installed: aria2 libaria2-0 libc-ares2 0 upgraded, 3 newly installed, 0 to remove and 49 not upgraded. Need to get 1,513 kB of archives. After this operation, 5,441 kB of additional disk space will be used. Selecting previously unselected package libc-ares2:amd64. (Reading database ... 123630 files and directories currently installed.) Preparing to unpack .../libc-ares2_1.18.1-1ubuntu0.22.04.3_amd64.deb ... Unpacking libc-ares2:amd64 (1.18.1-1ubuntu0.22.04.3) ... Selecting previously unselected package libaria2-0:amd64. Preparing to unpack .../libaria2-0_1.36.0-1_amd64.deb ... Unpacking libaria2-0:amd64 (1.36.0-1) ... Selecting previously unselected package aria2. Preparing to unpack .../aria2_1.36.0-1_amd64.deb ... Unpacking aria2 (1.36.0-1) ... Setting up libc-ares2:amd64 (1.18.1-1ubuntu0.22.04.3) ... Setting up libaria2-0:amd64 (1.36.0-1) ... Setting up aria2 (1.36.0-1) ... Processing triggers for man-db (2.10.2-1) ... Processing triggers for libc-bin (2.35-0ubuntu3.4) ... /sbin/ldconfig.real: /usr/local/lib/libtbb.so.12 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libumf.so.0 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libtbbmalloc.so.2 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libur_loader.so.0 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libtbbbind_2_5.so.3 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libtcm_debug.so.1 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libur_adapter_level_zero.so.0 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libur_adapter_opencl.so.0 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libtbbbind_2_0.so.3 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libtbbmalloc_proxy.so.2 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libtcm.so.1 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libtbbbind.so.3 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libhwloc.so.15 is not a symbolic link
*** Download Progress Summary as of Wed Nov 27 17:42:18 2024 ***
[#85ff61 10GiB/11GiB(95%) CN:16 DL:139MiB ETA:4s] FILE: /content/TotoroUI/models/unet/flux1-dev-fp8.safetensors
Download Results: gid |stat|avg speed |path/URI ======+====+===========+======================================================= 85ff61|OK | 178MiB/s|/content/TotoroUI/models/unet/flux1-dev-fp8.safetensors
Status Legend: (OK):download completed.
Download Results: gid |stat|avg speed |path/URI ======+====+===========+======================================================= d72a8b|OK | 220MiB/s|/content/TotoroUI/models/vae/ae.sft
Status Legend: (OK):download completed.
Download Results: gid |stat|avg speed |path/URI ======+====+===========+======================================================= f3e14c|OK | 250MiB/s|/content/TotoroUI/models/clip/clip_l.safetensors
Status Legend: (OK):download completed.
Download Results: gid |stat|avg speed |path/URI ======+====+===========+======================================================= d2efce|OK | 131MiB/s|/content/TotoroUI/models/clip/t5xxl_fp8_e4m3fn.safetensors
Status Legend: (OK):download completed. --2024-11-27 17:43:02-- https://huggingface.co/camenduru/FLUX.1-dev/resolve/main/test.png Resolving huggingface.co (huggingface.co)... 3.165.160.11, 3.165.160.61, 3.165.160.12, ... Connecting to huggingface.co (huggingface.co)|3.165.160.11|:443... connected. HTTP request sent, awaiting response... 302 Found Location: https://cdn-lfs-us-1.hf.co/repos/2d/6c/2d6cde08f0ddbdceafbcce501fdc08ef2283be0fcd41e4159060967c30c68d8f/e9588b88713b367b6b1e55dc8476052937af442427eec9eb2e9e25b39f4bd780?response-content-disposition=inline%3B+filename*%3DUTF-8''test.png%3B+filename%3D"test.png"%3B&response-content-type=image%2Fpng&Expires=1732988582&Policy=eyJTdGF0ZW1lbnQiOlt7IkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTczMjk4ODU4Mn19LCJSZXNvdXJjZSI6Imh0dHBzOi8vY2RuLWxmcy11cy0xLmhmLmNvL3JlcG9zLzJkLzZjLzJkNmNkZTA4ZjBkZGJkY2VhZmJjY2U1MDFmZGMwOGVmMjI4M2JlMGZjZDQxZTQxNTkwNjA5NjdjMzBjNjhkOGYvZTk1ODhiODg3MTNiMzY3YjZiMWU1NWRjODQ3NjA1MjkzN2FmNDQyNDI3ZWVjOWViMmU5ZTI1YjM5ZjRiZDc4MD9yZXNwb25zZS1jb250ZW50LWRpc3Bvc2l0aW9uPSomcmVzcG9uc2UtY29udGVudC10eXBlPSoifV19&Signature=PCL8XiIklkihiN13Z5WMTwcrJegjkzNdNhMJRP2eN1eAE0AVgLC02AH~O1s5267kWZ42gGkv6qOUnUo80ftLQVUDwLiyKapH7t9ljTJ8AYfLgeF6-roDDO00LJ1br2NFPBau5p0Gz-cWEjUGXr4WkxwWvjj-kNhfVFGjXxTY5f4hZ4IVyK3MTXCvChy5Plo~SUX9Ay0P1XOHGiVyiPIuigH7J64IxRSrGaftbAXlANsX-vMCkD5ChUNTFU3~8wqWBba1tnOkVPJ0JAU6WzwQAF4dGFWcbFQRIcUiCq8ZC4oUrFIOqkZ-HyJYAp-xnvL70WqBN4Vrf3ZTG-EZ8D2Rog__&Key-Pair-Id=K24J24Z295AEI9 [following] --2024-11-27 17:43:02-- https://cdn-lfs-us-1.hf.co/repos/2d/6c/2d6cde08f0ddbdceafbcce501fdc08ef2283be0fcd41e4159060967c30c68d8f/e9588b88713b367b6b1e55dc8476052937af442427eec9eb2e9e25b39f4bd780?response-content-disposition=inline%3B+filename*%3DUTF-8''test.png%3B+filename%3D"test.png"%3B&response-content-type=image%2Fpng&Expires=1732988582&Policy=eyJTdGF0ZW1lbnQiOlt7IkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTczMjk4ODU4Mn19LCJSZXNvdXJjZSI6Imh0dHBzOi8vY2RuLWxmcy11cy0xLmhmLmNvL3JlcG9zLzJkLzZjLzJkNmNkZTA4ZjBkZGJkY2VhZmJjY2U1MDFmZGMwOGVmMjI4M2JlMGZjZDQxZTQxNTkwNjA5NjdjMzBjNjhkOGYvZTk1ODhiODg3MTNiMzY3YjZiMWU1NWRjODQ3NjA1MjkzN2FmNDQyNDI3ZWVjOWViMmU5ZTI1YjM5ZjRiZDc4MD9yZXNwb25zZS1jb250ZW50LWRpc3Bvc2l0aW9uPSomcmVzcG9uc2UtY29udGVudC10eXBlPSoifV19&Signature=PCL8XiIklkihiN13Z5WMTwcrJegjkzNdNhMJRP2eN1eAE0AVgLC02AH~O1s5267kWZ42gGkv6qOUnUo80ftLQVUDwLiyKapH7t9ljTJ8AYfLgeF6-roDDO00LJ1br2NFPBau5p0Gz-cWEjUGXr4WkxwWvjj-kNhfVFGjXxTY5f4hZ4IVyK3MTXCvChy5Plo~SUX9Ay0P1XOHGiVyiPIuigH7J64IxRSrGaftbAXlANsX-vMCkD5ChUNTFU3~8wqWBba1tnOkVPJ0JAU6WzwQAF4dGFWcbFQRIcUiCq8ZC4oUrFIOqkZ-HyJYAp-xnvL70WqBN4Vrf3ZTG-EZ8D2Rog__&Key-Pair-Id=K24J24Z295AEI9 Resolving cdn-lfs-us-1.hf.co (cdn-lfs-us-1.hf.co)... 3.165.160.38, 3.165.160.3, 3.165.160.77, ... Connecting to cdn-lfs-us-1.hf.co (cdn-lfs-us-1.hf.co)|3.165.160.38|:443... connected. HTTP request sent, awaiting response... 200 OK Length: 1021777 (998K) [image/png] Saving to: '/content/test.png'
/content/test.png 100%[===================>] 997.83K --.-KB/s in 0.05s
2024-11-27 17:43:02 (20.1 MB/s) - '/content/test.png' saved [1021777/1021777]
WARNING:root:clip missing: ['text_projection.weight']

Once the status legend shows (OK): download completed for every file, you are good to go.
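Before moving on, you can optionally confirm that all four weight files landed where the pipeline expects them. This is just a convenience check I like to add; it is not part of the original notebook:

# Optional sanity check: verify the downloaded weights exist and have a non-trivial size
import os
weight_files = [
    "/content/TotoroUI/models/unet/flux1-dev-fp8.safetensors",
    "/content/TotoroUI/models/vae/ae.sft",
    "/content/TotoroUI/models/clip/clip_l.safetensors",
    "/content/TotoroUI/models/clip/t5xxl_fp8_e4m3fn.safetensors",
]
for path in weight_files:
    size_gb = os.path.getsize(path) / 1e9 if os.path.exists(path) else 0.0
    print(f"{path}: {'OK' if size_gb > 0 else 'MISSING'} ({size_gb:.2f} GB)")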
Step 2: Running the Image Regeneration Pipeline
The code below generates a high-quality, anime-style image based on a textual prompt using a diffusion-based generative AI model. It begins by preparing the necessary parameters, such as the prompt, image dimensions, and sampling method, and generates random noise as the starting point for image generation.
Please make sure the dimensions of your input image are 640 x 960 pixels.
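If your photo has different dimensions, you can resize it to 640 x 960 with PIL before running the cell. Here is a minimal sketch; the input path /content/my_photo.jpg is a placeholder for your own upload, and it saves to the filename that the cell below loads:

# Resize an uploaded photo to the expected 640 x 960 portrait resolution (placeholder input path)
from PIL import Image

img = Image.open("/content/my_photo.jpg").convert("RGB")
img = img.resize((640, 960), Image.LANCZOS)
img.save("/content/shobhit.jpg")  # The regeneration cell below loads this path
print(img.size)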
with torch.inference_mode():  # Disable gradient calculations for faster inference
    positive_prompt = "anime style"  # The text prompt that guides the image generation
    width = 1024  # Width of the output image in pixels
    height = 1024  # Height of the output image in pixels
    seed = 0  # Seed for random number generation; 0 means it will be randomized
    steps = 20  # Number of steps for the diffusion process (higher = better quality)
    sampler_name = "euler"  # Name of the sampling method used during image generation
    scheduler = "simple"  # Type of scheduler used for controlling noise levels
# If seed is 0, generate a random seed to ensure variation in outputs
if seed == 0:
seed = random.randint(0, 27391293129231)
print(seed) # Print the seed for reproducibility
# Encode the text prompt into latent space using the CLIP model
cond, pooled = clip.encode_from_tokens(clip.tokenize(positive_prompt), return_pooled=True)
cond = [[cond, {"pooled_output": pooled}]] # Format the conditioning for the model
# Generate a noise tensor based on the seed, which serves as the starting point for diffusion
noise = RandomNoise.get_noise(seed)[0]
# Initialize the guidance model using the UNET and the encoded prompt
guider = BasicGuider.get_guider(unet, cond)[0]
# Select the specified sampler to perform the denoising steps
sampler = KSamplerSelect.get_sampler(sampler_name)[0]
# Generate the schedule for noise levels during the denoising process
sigmas = BasicScheduler.get_sigmas(unet, scheduler, steps, 0.75)[0]
# Load an input image from a file (used for reference or transformation)
image = nodes.LoadImage().load_image("/content/shobhit.jpg")[0]
# Resize the input image to the target resolution while maintaining quality
latent_image = ImageScaleToTotalPixels.upscale(image, "lanczos", 1.0)[0]
# Encode the resized image into the latent space using the VAE
latent_image = VAEEncode.encode(vae, latent_image)[0]
# Perform the denoising and sampling process to generate the final latent image
sample, sample_denoised = SamplerCustomAdvanced.sample(noise, guider, sampler, sigmas, latent_image)
# Free up unused memory from the models
model_management.soft_empty_cache()
# Decode the latent image back into pixel space using the VAE
decoded = VAEDecode.decode(vae, sample)[0].detach()
# Save the generated image to a file
Image.fromarray(np.array(decoded * 255, dtype=np.uint8)[0]).save("/content/anime.png")
# Display the generated image inline in the notebook
Image.fromarray(np.array(decoded * 255, dtype=np.uint8)[0])

The output of the above code is attached in the samples section at the beginning of this article.
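One knob worth knowing about in the cell above is the last argument to BasicScheduler.get_sigmas (0.75 here, versus 1.0 in the full generation pipeline of Step 3). In ComfyUI-style pipelines such as TotoroUI, this is the denoise strength: lower values keep more of the input photo, higher values let the prompt reshape it more aggressively. A subtler restyle might look like this (0.55 is just an illustrative value):

# Lower denoise strength preserves more of the original photo (illustrative value)
sigmas = BasicScheduler.get_sigmas(unet, scheduler, steps, 0.55)[0]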
Here, you can modify the positive_prompt variable in the code to try different prompts like:
"fantasy landscape" "cyberpunk" "magical" "cartoon" Re-run the cell to generate unique outputs.
Step 3: Running the Image Generation Pipeline
Instead of restyling an existing image, if you want to generate a whole new image from scratch, you can do that too! Below is the code for it:
# Disable gradient calculations for faster inference, as this is an inference-only process
with torch.inference_mode():
    # Define the positive prompt to guide image generation
    positive_prompt = "A beautiful Indian girl getting married to the man of her dreams!"
    # Set the desired width and height of the generated image
    width = 1024
    height = 1024
    # Seed for reproducibility; 0 indicates a randomized seed
    seed = 0
    # Number of denoising steps in the diffusion process (higher = better quality, slower)
    steps = 20
    # Specify the sampling method used during image generation
    sampler_name = "euler"
    # Type of scheduler for noise control in the denoising process
    scheduler = "simple"
# If seed is 0, generate a random seed to ensure output variation
if seed == 0:
seed = random.randint(0, 18999829382)
# Print the seed for reproducibility in case you want to recreate the same image
print(seed)
# Encode the textual prompt into latent space using the CLIP model
cond, pooled = clip.encode_from_tokens(clip.tokenize(positive_prompt), return_pooled=True)
cond = [[cond, {"pooled_output": pooled}]] # Format the conditioning information for the model
# Generate a noise tensor based on the seed; this serves as the starting point for the diffusion process
noise = RandomNoise.get_noise(seed)[0]
# Create a guidance model using the UNET and encoded prompt for refining the noisy latent representation
guider = BasicGuider.get_guider(unet, cond)[0]
# Select the sampler to perform the iterative denoising steps
sampler = KSamplerSelect.get_sampler(sampler_name)[0]
# Generate a schedule for noise levels during the denoising process
sigmas = BasicScheduler.get_sigmas(unet, scheduler, steps, 1.0)[0]
# Create an empty latent image with dimensions adjusted to the model's requirements (multiples of 16)
latent_image = EmptyLatentImage.generate(closestNumber(width, 16), closestNumber(height, 16))[0]
# Perform the denoising process to generate the final latent image
sample, sample_denoised = SamplerCustomAdvanced.sample(noise, guider, sampler, sigmas, latent_image)
# Clear unused memory from the models to optimize performance
model_management.soft_empty_cache()
# Decode the latent image into pixel space using the VAE
decoded = VAEDecode.decode(vae, sample)[0].detach()
# Save the generated image to a file
Image.fromarray(np.array(decoded * 255, dtype=np.uint8)[0]).save("/content/indian_girl_marriage.png")
# Display the generated image inline in the notebook
Image.fromarray(np.array(decoded * 255, dtype=np.uint8)[0])

The output of the above code is attached in the samples section at the beginning of this article.
Re-run the cell to generate unique outputs; below are a few of the prompts I tried.
I set positive_prompt = "selfie.IMG", and here is an awesome output 😀
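If you would rather try several prompts in one go than edit the cell each time, you can wrap the Step 3 logic in a small helper and loop over a prompt list. This is only a convenience sketch of mine: it reuses the models and node classes already loaded in Step 1, and the helper name and output paths are my own choices.

# Convenience sketch: batch-generate images for several prompts with the already-loaded models
def generate(prompt, width=1024, height=1024, steps=20):
    with torch.inference_mode():
        seed = random.randint(0, 18999829382)
        cond, pooled = clip.encode_from_tokens(clip.tokenize(prompt), return_pooled=True)
        cond = [[cond, {"pooled_output": pooled}]]
        noise = RandomNoise.get_noise(seed)[0]
        guider = BasicGuider.get_guider(unet, cond)[0]
        sampler = KSamplerSelect.get_sampler("euler")[0]
        sigmas = BasicScheduler.get_sigmas(unet, "simple", steps, 1.0)[0]
        latent = EmptyLatentImage.generate(closestNumber(width, 16), closestNumber(height, 16))[0]
        sample, _ = SamplerCustomAdvanced.sample(noise, guider, sampler, sigmas, latent)
        model_management.soft_empty_cache()
        decoded = VAEDecode.decode(vae, sample)[0].detach()
        out_path = f"/content/flux_{seed}.png"  # Seed in the filename so runs do not overwrite each other
        Image.fromarray(np.array(decoded * 255, dtype=np.uint8)[0]).save(out_path)
        return out_path

for prompt in ["fantasy landscape", "cyberpunk", "selfie.IMG"]:
    print(prompt, "->", generate(prompt))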
Image generated using Flux.1-dev on Google Colab, free of cost!

Since this runs on Colab's free T4 GPU, speed is something you will have to compromise on.
Step 4: Download Your Generated Images
Simply right-click on the generated image and download it locally.
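If right-clicking feels fiddly, you can also trigger downloads programmatically with Colab's files helper (optional; adjust the paths to whatever filenames you saved):

# Programmatic alternative: download generated files straight from the Colab runtime
from google.colab import files
files.download("/content/anime.png")
files.download("/content/indian_girl_marriage.png")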
Image: Download the Flux.1-dev generated image locally

As we marvel at the images we just created, it's astonishing to realize how far technology has come in making creativity accessible to all. Generative AI not only transforms how we approach art but also bridges the gap between imagination and execution. By running this code, you've unlocked the potential to craft unique visual stories, limited only by the bounds of your imagination. Now the question is: what masterpiece will you create next? The canvas is yours to explore; do share your experiences in the comments section.
If you found this guide helpful and inspiring, don’t forget to clap (up to 50 times! 👏) and follow me for more exciting content on Generative AI and creative technology. Let’s keep building and exploring together! 🎨✨
This story is published on Generative AI. Connect with us on LinkedIn and follow Zeniteq to stay in the loop with the latest AI stories.
Subscribe to our newsletter and YouTube channel to stay updated with the latest news and updates on generative AI. Let’s shape the future of AI together!