Visual ChatGPT

Visual ChatGPT connects ChatGPT with a series of Visual Foundation Models so that users can send and receive images while chatting.

Paper: Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models

Download the model files:

bash download.sh

Run the code:

python visual_chatgpt.py

Quick Start

# create a new environment
conda create -n visgpt python=3.8

# activate the new environment
conda activate visgpt

# install the required dependencies
pip install -r requirement.txt

# download the visual foundation models
bash download.sh

# set your private OpenAI API key
export OPENAI_API_KEY={Your_Private_Openai_Key}

# create a folder to save images
mkdir ./image

# start Visual ChatGPT
python visual_chatgpt.py
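
Two of the steps above (exporting OPENAI_API_KEY and creating ./image) are easy to forget. The following is a minimal sketch of a check that could be run before launching; the function name `preflight` and its parameters are hypothetical and not part of visual_chatgpt.py.

```python
import os
from pathlib import Path


def preflight(env=None, image_dir="./image"):
    """Return a list of setup problems; an empty list means ready to start.

    Hypothetical helper, not part of the repository. It only mirrors the
    Quick Start steps: OPENAI_API_KEY must be set and ./image must exist.
    """
    env = os.environ if env is None else env
    problems = []
    # The Quick Start exports OPENAI_API_KEY before launching.
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    # Equivalent to `mkdir ./image`, but does not fail if the folder exists.
    Path(image_dir).mkdir(parents=True, exist_ok=True)
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("setup problem:", problem)
```

If the returned list is empty, both prerequisites are in place and `python visual_chatgpt.py` can be started.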

GPU memory usage

The table below lists the GPU memory usage of each visual foundation model. To save GPU memory, modify self.tools so that it loads fewer visual foundation models.

Foundation Model	Memory Usage (MB)
ImageEditing	6667
ImageCaption	1755
T2I	6677
canny2image	5540
line2image	6679
hed2image	6679
scribble2image	6679
pose2image	6681
BLIPVQA	2709
seg2image	5540
depth2image	6677
normal2image	3974
InstructPix2Pix	2795
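
To decide which models to keep in self.tools, it can help to add up the figures from the table for a candidate subset. The sketch below does that; it assumes the per-model figures simply add up, which the source does not state, so treat the result as a rough estimate. The names `MEMORY_MB`, `total_memory_mb`, and `fits_budget` are hypothetical and do not appear in visual_chatgpt.py.

```python
# GPU memory usage in MB, copied from the table above.
MEMORY_MB = {
    "ImageEditing": 6667,
    "ImageCaption": 1755,
    "T2I": 6677,
    "canny2image": 5540,
    "line2image": 6679,
    "hed2image": 6679,
    "scribble2image": 6679,
    "pose2image": 6681,
    "BLIPVQA": 2709,
    "seg2image": 5540,
    "depth2image": 6677,
    "normal2image": 3974,
    "InstructPix2Pix": 2795,
}


def total_memory_mb(models):
    """Sum the listed memory usage (MB) for the given model names."""
    unknown = [name for name in models if name not in MEMORY_MB]
    if unknown:
        raise KeyError(f"unknown model(s): {unknown}")
    # Assumption: usage is additive across models.
    return sum(MEMORY_MB[name] for name in models)


def fits_budget(models, budget_mb):
    """Return True if the estimated total stays within budget_mb."""
    return total_memory_mb(models) <= budget_mb


if __name__ == "__main__":
    subset = ["ImageCaption", "BLIPVQA", "InstructPix2Pix"]
    print(total_memory_mb(subset))     # 7259
    print(fits_budget(subset, 8000))   # True
```

Under the same additive assumption, loading all thirteen models would need about 69052 MB, which is why trimming self.tools matters on a single GPU.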

Acknowledgement

We appreciate the following open-source projects:

Hugging Face, LangChain, Stable Diffusion, ControlNet, InstructPix2Pix, CLIPSeg, BLIP
