
Commit 6a98ced

burtenshaw, pcuenca, and Vaibhavs10 authored
[inference providers] tutorial for image editing with inference providers (#1880)
* tutorial using qwen image edit and inference providers
* add to toc
* add flux kontext as an example
* Update docs/inference-providers/guides/image-editor.md (Co-authored-by: Pedro Cuenca <pedro@huggingface.co>)
* add requirements txt
* Apply suggestions from code review (Co-authored-by: Pedro Cuenca <pedro@huggingface.co>)
* Apply suggestions from code review (Co-authored-by: vb <vaibhavs10@gmail.com>)
* add tldr
* format tldr

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: vb <vaibhavs10@gmail.com>
1 parent cdf0995 commit 6a98ced

2 files changed: +350 −0 lines changed

docs/inference-providers/_toctree.yml

Lines changed: 2 additions & 0 deletions
```diff
@@ -54,6 +54,8 @@
       title: Function Calling
     - local: guides/gpt-oss
       title: How to use OpenAI gpt-oss
+    - local: guides/image-editor
+      title: Build an Image Editor

   - title: API Reference
```
docs/inference-providers/guides/image-editor.md

Lines changed: 348 additions & 0 deletions

# Building an AI Image Editor with Gradio and Inference Providers

In this guide, we'll build an AI-powered image editor that lets users upload images and edit them using natural language prompts. This project demonstrates how to combine Inference Providers with image-to-image models like [Qwen's Image Edit](https://huggingface.co/Qwen/Qwen-Image-Edit) and [Black Forest Labs' FLUX.1 Kontext](https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev).

Our app will:

1. **Accept image uploads** through a web interface
2. **Process natural language editing instructions** like "Turn the cat into a tiger"
3. **Transform images** using Qwen Image Edit or FLUX.1 Kontext
4. **Display results** in a Gradio interface

<Tip>

TL;DR: this guide will show you how to build an AI image editor with Gradio and Inference Providers, just like [this one](https://huggingface.co/spaces/Qwen/Qwen-Image-Edit).

</Tip>

## Step 1: Set Up Authentication

Before we start coding, authenticate with Hugging Face using your token:

```bash
# Get your token from https://huggingface.co/settings/tokens
export HF_TOKEN="your_token_here"
```

<Tip>

This guide assumes you have a Hugging Face account. If you don't have one, you can create one for free at [huggingface.co](https://huggingface.co).

</Tip>

Setting this environment variable handles authentication automatically for all your inference calls. You can generate a token from [your settings page](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained).

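Optionally, you can confirm the token is picked up before going any further. Here's a minimal sketch using `huggingface_hub`'s `whoami` helper, assuming `HF_TOKEN` is exported in your shell:

```python
import os

from huggingface_hub import whoami

# whoami() falls back to the HF_TOKEN environment variable when no token is passed
assert os.environ.get("HF_TOKEN"), "HF_TOKEN is not set"
print(whoami()["name"])  # prints your Hugging Face username
```
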
## Step 2: Project Setup

Create a new project directory and initialize it with `uv`:

```bash
mkdir image-editor-app
cd image-editor-app
uv init
```

This creates a basic project structure with a `pyproject.toml` file. Now add the required dependencies (the version specifiers are quoted so your shell doesn't treat `>=` as a redirect):

```bash
uv add "huggingface-hub>=0.34.4" "gradio>=5.0.0" "pillow>=11.3.0"
```

The dependencies are now installed and ready to use! `uv` will also keep `pyproject.toml` up to date as you add dependencies.

<Tip>

We're using `uv` because it's a fast Python package manager that handles dependency resolution and virtual environment management automatically, and it's much faster than pip. If you're not familiar with `uv`, check it out [here](https://docs.astral.sh/uv/).

</Tip>

## Step 3: Build the Core Image Editing Function

Now let's create the main logic for our application: the image editing function that transforms images using AI.

Create `main.py`, then import the necessary libraries and instantiate the `InferenceClient`. We're using the `fal-ai` provider for fast image processing, but other providers like `replicate` are also available.

```python
import io
import os

import gradio as gr
from huggingface_hub import InferenceClient

# Initialize the client with the fal-ai provider for fast image processing
client = InferenceClient(
    provider="fal-ai",
    api_key=os.environ["HF_TOKEN"],
)
```

Now let's create the image editing function. It takes an input image and a prompt, and returns the edited image. We also handle errors gracefully, returning the original image if something goes wrong so the UI always shows something.

```python
def edit_image(input_image, prompt):
    """
    Edit an image using the given prompt.

    Args:
        input_image: PIL Image object from Gradio
        prompt: String prompt for image editing

    Returns:
        PIL Image object (edited image)
    """
    if input_image is None:
        return None

    if not prompt or prompt.strip() == "":
        return input_image

    try:
        # Convert the PIL Image to PNG bytes for the API call
        img_bytes = io.BytesIO()
        input_image.save(img_bytes, format="PNG")
        img_bytes = img_bytes.getvalue()

        # Use the image_to_image method with Qwen's image editing model
        edited_image = client.image_to_image(
            img_bytes,
            prompt=prompt.strip(),
            model="Qwen/Qwen-Image-Edit",
        )

        return edited_image

    except Exception as e:
        print(f"Error editing image: {e}")
        return input_image
```

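Before wiring up the UI, you can sanity-check the function from a Python shell. A quick sketch, assuming a local test image named `cat.png` (the same example image used later in this guide):

```python
from PIL import Image

# Quick manual test of edit_image before building the interface
test_image = Image.open("cat.png")  # any local image works here
result = edit_image(test_image, "Turn the cat into a tiger")
result.save("cat_edited.png")
```
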
<Tip>

We're using the `fal-ai` provider with the `Qwen/Qwen-Image-Edit` model. The `fal-ai` provider offers fast inference times, perfect for interactive applications.

However, you can experiment with different providers for various performance characteristics:

```python
client = InferenceClient(provider="replicate", api_key=os.environ["HF_TOKEN"])
client = InferenceClient(provider="auto", api_key=os.environ["HF_TOKEN"])  # Automatic selection
```

</Tip>

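The intro also mentioned [FLUX.1 Kontext](https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev) as an alternative editing model. Swapping it in only requires changing the `model` argument; a minimal sketch, assuming your chosen provider serves this model:

```python
# Same call as before, but routed to Black Forest Labs' FLUX.1 Kontext
edited_image = client.image_to_image(
    img_bytes,
    prompt=prompt.strip(),
    model="black-forest-labs/FLUX.1-Kontext-dev",
)
```
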
## Step 4: Create the Gradio Interface

Now let's build a simple, user-friendly interface using Gradio.

```python
# Create the Gradio interface
with gr.Blocks(title="Image Editor", theme=gr.themes.Soft()) as interface:
    gr.Markdown(
        """
        # 🎨 AI Image Editor
        Upload an image and describe how you want to edit it using natural language!
        """
    )

    with gr.Row():
        with gr.Column():
            input_image = gr.Image(label="Upload Image", type="pil", height=400)
            prompt = gr.Textbox(
                label="Edit Prompt",
                placeholder="Describe how you want to edit the image...",
                lines=2,
            )
            edit_btn = gr.Button("✨ Edit Image", variant="primary", size="lg")

        with gr.Column():
            output_image = gr.Image(label="Edited Image", type="pil", height=400)

    # Example images and prompts
    with gr.Row():
        gr.Examples(
            examples=[
                ["cat.png", "Turn the cat into a tiger"],
                ["cat.png", "Make it look like a watercolor painting"],
                ["cat.png", "Change the background to a forest"],
            ],
            inputs=[input_image, prompt],
            outputs=output_image,
            fn=edit_image,
            cache_examples=False,
        )

    # Event handlers
    edit_btn.click(fn=edit_image, inputs=[input_image, prompt], outputs=output_image)

    # Allow Enter key to trigger editing
    prompt.submit(fn=edit_image, inputs=[input_image, prompt], outputs=output_image)
```

In this app we use a few practical Gradio features to make it user-friendly:

- Blocks gives us a two-column layout, with the image upload next to the edited result.
- A bit of Markdown at the top explains what the app does.
- `gr.Examples` shows some example inputs to give the user inspiration.

Finally, add the launch configuration at the end of `main.py`:

```python
if __name__ == "__main__":
    interface.launch(
        share=True,  # Creates a public link
        server_name="0.0.0.0",  # Allow external access
        server_port=7860,  # Default Gradio port
        show_error=True,  # Show errors in the interface
    )
```

Now run your application:

```bash
python main.py
```

Your app will launch locally at `http://localhost:7860` and Gradio will also provide a public shareable link!

## Complete Working Code

<details>
<summary><strong>📋 Click to view the complete main.py file</strong></summary>

```python
import io
import os

import gradio as gr
from huggingface_hub import InferenceClient

# Initialize the client
client = InferenceClient(
    provider="fal-ai",
    api_key=os.environ["HF_TOKEN"],
)


def edit_image(input_image, prompt):
    """
    Edit an image using the given prompt.

    Args:
        input_image: PIL Image object from Gradio
        prompt: String prompt for image editing

    Returns:
        PIL Image object (edited image)
    """
    if input_image is None:
        return None

    if not prompt or prompt.strip() == "":
        return input_image

    try:
        # Convert the PIL Image to PNG bytes for the API call
        img_bytes = io.BytesIO()
        input_image.save(img_bytes, format="PNG")
        img_bytes = img_bytes.getvalue()

        # Use the image_to_image method
        edited_image = client.image_to_image(
            img_bytes,
            prompt=prompt.strip(),
            model="Qwen/Qwen-Image-Edit",
        )

        return edited_image

    except Exception as e:
        print(f"Error editing image: {e}")
        return input_image


# Create Gradio interface
with gr.Blocks(title="Image Editor", theme=gr.themes.Soft()) as interface:
    gr.Markdown(
        """
        # 🎨 AI Image Editor
        Upload an image and describe how you want to edit it using natural language!
        """
    )

    with gr.Row():
        with gr.Column():
            input_image = gr.Image(label="Upload Image", type="pil", height=400)
            prompt = gr.Textbox(
                label="Edit Prompt",
                placeholder="Describe how you want to edit the image...",
                lines=2,
            )
            edit_btn = gr.Button("✨ Edit Image", variant="primary", size="lg")

        with gr.Column():
            output_image = gr.Image(label="Edited Image", type="pil", height=400)

    # Example images and prompts
    with gr.Row():
        gr.Examples(
            examples=[
                ["cat.png", "Turn the cat into a tiger"],
                ["cat.png", "Make it look like a watercolor painting"],
                ["cat.png", "Change the background to a forest"],
            ],
            inputs=[input_image, prompt],
            outputs=output_image,
            fn=edit_image,
            cache_examples=False,
        )

    # Event handlers
    edit_btn.click(fn=edit_image, inputs=[input_image, prompt], outputs=output_image)

    # Allow Enter key to trigger editing
    prompt.submit(fn=edit_image, inputs=[input_image, prompt], outputs=output_image)

if __name__ == "__main__":
    interface.launch(
        share=True,  # Creates a public link
        server_name="0.0.0.0",  # Allow external access
        server_port=7860,  # Default Gradio port
        show_error=True,  # Show errors in the interface
    )
```

</details>

## Deploy on Hugging Face Spaces

Let's deploy our app to Hugging Face Spaces.

First, we'll export our dependencies to a requirements file:

```bash
uv export --format requirements-txt --output-file requirements.txt
```

This creates a `requirements.txt` file with all your project dependencies and their exact versions from the lockfile.

<Tip>

The `uv export` command ensures that your Space will use the exact same dependency versions that you tested locally, preventing deployment issues caused by version mismatches.

</Tip>

Now you can deploy to Spaces:

1. **Create a new Space**: Go to [huggingface.co/new-space](https://huggingface.co/new-space)
2. **Choose the Gradio SDK** and make the Space public
3. **Upload your files**: Upload `main.py`, `requirements.txt`, and any example images (or push them from the terminal, as shown below)
4. **Add your token**: In the Space settings, add `HF_TOKEN` as a secret
5. **Launch**: Your app will be live at `https://huggingface.co/spaces/your-username/your-space-name`

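If you prefer the terminal over the web UI for step 3, `huggingface_hub`'s `HfApi` can push the files programmatically. A minimal sketch, assuming the Space `your-username/your-space-name` already exists and `HF_TOKEN` is set:

```python
from huggingface_hub import HfApi

api = HfApi()  # picks up HF_TOKEN from the environment

# Upload the app files to the existing Gradio Space
for filename in ["main.py", "requirements.txt"]:
    api.upload_file(
        path_or_fileobj=filename,
        path_in_repo=filename,
        repo_id="your-username/your-space-name",  # placeholder Space name
        repo_type="space",
    )
```
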
## Next Steps

Congratulations! You've created a production-ready AI image editor. Now that you have a working app, here are some ideas to extend it:

- **Batch processing**: Edit multiple images at once (see the sketch after this list)
- **Object removal**: Remove unwanted objects from images
- **Provider comparison**: Benchmark different providers for your use case

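As a taste of the first idea, batch processing can reuse `edit_image` directly in a loop. A minimal sketch, assuming a local `images/` folder of PNGs and a single prompt applied to all of them:

```python
from pathlib import Path

from PIL import Image

# Apply the same edit to every PNG in a folder (hypothetical layout)
for path in Path("images").glob("*.png"):
    original = Image.open(path)
    edited = edit_image(original, "Make it look like a watercolor painting")
    edited.save(path.with_stem(path.stem + "_edited"))
```
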
Happy building! And remember to share your app with the community on the Hub.
