[AAAI 2025] Click2Mask: Local Editing with Dynamic Mask Generation

Official PyTorch Implementation for "Click2Mask: Local Editing with Dynamic Mask Generation".

Website · arXiv · Paper PDF · YouTube Video · Hugging Face Demo · Colab

Omer Regev, Omri Avrahami, Dani Lischinski

Given an image, a click, and a prompt for an added object, a mask is generated dynamically, simultaneously with the object generation, throughout the diffusion process.

Current methods rely on existing objects/segments, or on user effort (masks or detailed text), to localize object additions. Our approach enables free-form editing, where the manipulated area is not well defined, using just a click for localization.

πŸš€ Try Click2Mask Online

πŸ€— Hugging Face Demo

Try it instantly in your browser, no setup required.
Launch Demo →

Google Colab Demo

Includes both Gradio interface and command line for advanced usage.
Open in Colab →

Results

Output Examples

Each example shows an input image with a click, followed by outputs corresponding to the prompts below.

Qualitative Comparisons with SoTA Methods

A brief glimpse of the qualitative comparison between SoTA methods (Emu Edit, MagicBrush, and InstructPix2Pix) and our model, Click2Mask.
The upper prompts were given to the baselines, and the lower (shorter) ones to Click2Mask. Inputs show the click given to Click2Mask.

Installation

πŸ’‘Check out our Hugging Face Demo or Google Colab Demo for instant access without installing.

Clone the Repository

git clone https://github.com/omeregev/click2mask.git
cd click2mask

Install Dependencies

Option 1: Using pip (Recommended)

pip install -r requirements.txt

Option 2: Using Conda
If you prefer conda or need a more isolated environment (note: uses older PyTorch version):

conda env create -f environment.yml
conda activate c2m

Download Model Checkpoint

Download the Alpha-CLIP checkpoint (1.2 GB):

mkdir checkpoints
wget -P checkpoints https://huggingface.co/omeregev/click2mask/resolve/main/clip_l14_336_grit1m_fultune_8xe.pth

If the above link is broken, you can use this Google Drive mirror.
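Downloads of a file this large occasionally truncate. The sketch below is a hypothetical sanity check (the helper name and size threshold are illustrative, not part of the repository) that verifies the checkpoint exists and looks complete before you run the model:

```python
import os

# Hypothetical helper: verify the Alpha-CLIP checkpoint downloaded
# completely (the file is ~1.2 GB). The threshold is an assumption.
def checkpoint_ok(path, min_bytes=1_000_000_000):
    """Return True if the checkpoint exists and is at least min_bytes."""
    return os.path.isfile(path) and os.path.getsize(path) >= min_bytes

ckpt = "checkpoints/clip_l14_336_grit1m_fultune_8xe.pth"
if not checkpoint_ok(ckpt):
    print(f"Checkpoint missing or truncated: {ckpt}. Re-run the wget command above.")
```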

Usage

Gradio Interface

Launch the interactive web interface:

python app.py

Then open the URL printed in the terminal in your browser.

Command Line Interface

1. Run:

python scripts/text_editing_click2mask.py --image_path "<path/to/input/image>" --prompt "<the prompt>" --output_dir "<path/to/output/directory>"

For example:

python scripts/text_editing_click2mask.py --image_path "examples/example1/img1.jpg" --prompt "a sea monster" --output_dir "outputs"

2. A window will pop up so you can click a point on the input image. After clicking with the mouse, press "Enter".

3. The clicked point is saved in the input directory as "path/to/input/image_click.jpg" for future use. For example:

python scripts/text_editing_click2mask.py --image_path "examples/example2_existing_click/img2.jpg" --prompt "a sea monster" --output_dir "outputs"

4. To change the clicked point in a future run, delete that file or add the "--refresh_click" argument:

python scripts/text_editing_click2mask.py --image_path "examples/example1/img1.jpg" --refresh_click --prompt "a sea monster" --output_dir "outputs"
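When editing the same image with several prompts, retyping the command gets tedious. A minimal sketch of a batch wrapper, assuming only the flags documented above (the helper name `build_cmd` is hypothetical, not part of the repository):

```python
import shlex

# Hypothetical wrapper assembling the command line for
# scripts/text_editing_click2mask.py from the flags shown in this README.
def build_cmd(image_path, prompt, output_dir="outputs", refresh_click=False):
    cmd = [
        "python", "scripts/text_editing_click2mask.py",
        "--image_path", image_path,
        "--prompt", prompt,
        "--output_dir", output_dir,
    ]
    if refresh_click:
        cmd.append("--refresh_click")
    return cmd

# Print a ready-to-paste command for each prompt:
for p in ["a sea monster", "a pirate ship"]:
    print(shlex.join(build_cmd("examples/example1/img1.jpg", p)))
```

Each list can also be passed directly to `subprocess.run(...)` to execute the edits in sequence; since the click is saved after the first run, later runs reuse it without reopening the click window.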

Evaluating Edited Regions in Maskless Methods

We introduce Edited Alpha-CLIP to evaluate mask-free methods by extracting a mask of the edited region and using Alpha-CLIP to assess its alignment with the prompt.
Examples of mask extractions: outputs are on the left, extracted masks (green overlay) on the right.

To run Edited Alpha-CLIP similarity tests for methods comparison, here is a usage example (see documentation in script):

from scripts.similarity_tests.edited_alpha_clip import EditedAlphaCLip
edited_ac = EditedAlphaCLip()
image_in_p = "examples/edited_alpha_clip/input.png"
image_out_p = "examples/edited_alpha_clip/magic_brush.jpg"
prompt = "A bench"
save_outs = "outputs/edited_alpha_clip/bench_mb"
similarity = edited_ac.edited_alpha_clip_sim(image_in_p, image_out_p, prompt, save_outs)

A higher score indicates better alignment between the edited region and the prompt.
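To compare several mask-free methods on the same edit, you could collect one score per method and rank them. A small sketch (the scores below are placeholders, not real results; in practice each value would come from `edited_alpha_clip_sim(...)`):

```python
# Hypothetical comparison sketch for Edited Alpha-CLIP scores.
def rank_methods(scores):
    """Sort methods best-first: higher Edited Alpha-CLIP similarity is better."""
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Placeholder values for illustration only:
scores = {"method_a": 0.21, "method_b": 0.34, "method_c": 0.28}
for name, s in rank_methods(scores):
    print(f"{name}: {s:.2f}")
```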

Citation

If you find this work helpful for your research, please cite:

@inproceedings{regev2025click2mask,
    title={Click2Mask: Local Editing with Dynamic Mask Generation},
    author={Regev, Omer and Avrahami, Omri and Lischinski, Dani},
    booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
    volume={39},
    number={7},
    pages={6713--6721},
    year={2025},
    url={https://arxiv.org/abs/2409.08272},
    note={Full version with appendices available on arXiv}
}

Acknowledgements

This code is based on Blended Latent Diffusion and Stable Diffusion, and utilizes AlphaCLIP.
