Spaces:
Runtime error
Runtime error
| title: GPT4V-Image-Captioner | |
| app_file: gpt-caption.py | |
| sdk: gradio | |
| sdk_version: 4.21.0 | |
| # GPT4V-Image-Captioner / GPT4V图像打标器 | |
| [中文版说明](https://github.com/jiayev/GPT4V-Image-Captioner/blob/main/README-CN.md) | |
| We now have [sd-webui-GPT4V-Image-Captioner](https://github.com/SleeeepyZhou/sd-webui-GPT4V-Image-Captioner) for SD WebUI | |
| This is a multifunctional image processing toolbox built with Gradio, capable of tagging images using the GPT-4-vision or Claude 3 API, the [cogVLM](https://github.com/THUDM/CogVLM) model, [Qwen-VL](https://huggingface.co/Qwen)(Alibaba Cloud), the [Moondream](https://github.com/vikhyat/moondream) model. | |
| Key features include: | |
| - One-click installation and use | |
| - Single image and multi-image batch tagging | |
| - Choice of online GPT4V or Claude 3 or [Qwen-VL](https://huggingface.co/Qwen)(Alibaba Cloud) & local CogVLM and Moondream models | |
| - Visual tag analysis and processing | |
| - Image pre-compression | |
| - Keyword filtering and watermark image recognition | |
| Developers: [Jiaye](https://civitai.com/user/jiayev1), [LEOSAM是只兔狲](https://civitai.com/user/LEOSAM), [SleeeepyZhou](https://civitai.com/user/SleeeepyZhou), [Fok](https://civitai.com/user/fok3827), GPT4. Welcome everyone to add more new features to this project. | |
|  | |
| ### Please note that the Claude 3 feature is not finished yet. | |
| To use Claude 3, simply replace the API key and URL with the Claude 3 API key and URL (/v1/messages), and changing the model name to "claude-3-opus" (or sonnet). | |
| # Installation and Startup Guide | |
| ### Windows (If the automatic installation fails, please refer to the [Manual Installation Instructions](#windows-manual-installation-instructions)) | |
| 1. Open Command Prompt as administrator and navigate to the directory where you want to clone the repository. | |
| 2. Clone the repository using the following command: | |
| ``` | |
| git clone https://github.com/jiayev/GPT4V-Image-Captioner | |
| ``` | |
| 3. Double-click `install_windows.bat` to run and install all necessary dependencies. | |
| 4. After the installation is complete, you can launch the GPT4V-Image-Captioner by double-clicking `start_windows.bat`. | |
| 5. Hold down Ctrl and click on the URL in the terminal (or copy the URL to your browser), which will open the Gradio app interface in your default browser. | |
| 6. Enter the official OpenAI or third-party GPT-4V API Key and API Url at the top of the interface. After setting the image address, you can start tagging the image. | |
| ### Linux / macOS | |
| 1. Open a terminal and navigate to the directory where you want to clone the repository. | |
| 2. Clone the repository using the following command: | |
| ``` | |
| git clone https://github.com/jiayev/GPT4V-Image-Captioner | |
| ``` | |
| 3. Navigate to the cloned directory: | |
| ``` | |
| cd GPT4V-Image-Captioner | |
| ``` | |
| 4. Make the install and start scripts executable with the following command: | |
| ``` | |
| chmod +x install_linux_mac.sh; chmod +x start_linux_mac.sh | |
| ``` | |
| 5. Execute the install script: | |
| ``` | |
| ./install_linux_mac.sh | |
| ``` | |
| 6. Launch the GPT4V-Image-Captioner in the terminal by executing the launch script: | |
| ``` | |
| ./start_linux_mac.sh | |
| ``` | |
| 7. Copy the URL displayed in the terminal and open it in your browser to access the Gradio app interface. | |
| 8. Enter the official OpenAI or third-party GPT-4V API Key and API Url at the top of the interface. After setting the image address, you can start tagging the image. | |
| ### Windows Manual Installation Instructions | |
| 1. Open the Command Prompt by pressing `Win + R`, typing `cmd`, and then pressing `Enter`. | |
| 2. Clone the repository to your local machine using the following command: | |
| ``` | |
| git clone https://github.com/jiayev/GPT4V-Image-Captioner | |
| ``` | |
| 3. Once cloning is complete, navigate to the cloned directory: | |
| ``` | |
| cd GPT4V-Image-Captioner | |
| ``` | |
| 4. Before installing any dependencies, make sure that Python is installed on your system. Check for Python's presence by typing the following command and pressing `Enter` in the Command Prompt: | |
| ``` | |
| python --version | |
| ``` | |
| If Python is not installed, you will get an error message. In that case, please visit the [Python official download page](https://www.python.org/downloads/) and follow the instructions to install it. | |
| 5. Create a virtual environment named `myenv` to avoid contaminating the global Python environment: | |
| ``` | |
| python -m venv myenv | |
| ``` | |
| 6. Activate the virtual environment you just created: | |
| ``` | |
| myenv\Scripts\activate | |
| ``` | |
| 7. Update `pip` to date: | |
| ``` | |
| python -m pip install --upgrade pip | |
| ``` | |
| 8. Install libraries within the virtual environment: | |
| ``` | |
| pip install scipy networkx wordcloud matplotlib Pillow tqdm gradio requests | |
| ``` | |
| 9. After completing the steps above, you can start GPT4V-Image-Captioner by double-clicking the `start_windows.bat` file. | |