
Commit 3f0a431

Author: Tom Daniel
Committed: Nov 26, 2024
make image generation prompt a config
1 parent ead313f commit 3f0a431

3 files changed, +12 -3 lines changed


‎.env.example

+4
@@ -41,6 +41,10 @@ POST_INTERVAL_MAX= # Default: 180
 IMAGE_GEN= # Set to TRUE to enable image generation
 USE_OPENAI_EMBEDDING= # Set to TRUE for OpenAI, leave blank for local
 
+#Generation Prompts
+SYSTEM_PROMPT= # Leave blank for empty system prompt or defined in character config
+IMAGE_GENERATION_PROMPT= # Leave blank for default image generation prompt or defined in character config
+
 # OpenRouter Models
 OPENROUTER_MODEL= # Default: uses hermes 70b/405b
 SMALL_OPENROUTER_MODEL=
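
The new `.env.example` entries say "leave blank for default". A minimal sketch of why a blank entry may need explicit handling, assuming the settings loader yields an empty string for a key that is present but blank (common for dotenv-style parsers, but not confirmed by this commit). `nonBlank` and `DEFAULT_IMAGE_PROMPT` are hypothetical names used only for illustration:

```typescript
// Hypothetical helper: treat a blank or whitespace-only value as "not set".
function nonBlank(value: string | undefined | null): string | undefined {
    if (value == null) return undefined;
    const trimmed = value.trim();
    return trimmed.length > 0 ? trimmed : undefined;
}

// Placeholder standing in for the plugin's built-in prompt (assumption).
const DEFAULT_IMAGE_PROMPT = "default image generation prompt";

// Simulated environment: the key exists but was left blank, as in .env.example.
const env: Record<string, string | undefined> = { IMAGE_GENERATION_PROMPT: "" };

// Plain `??` keeps the empty string, because "" is neither null nor undefined.
const naive = env.IMAGE_GENERATION_PROMPT ?? DEFAULT_IMAGE_PROMPT;

// Normalizing first lets the blank value fall through to the default.
const normalized = nonBlank(env.IMAGE_GENERATION_PROMPT) ?? DEFAULT_IMAGE_PROMPT;
```

Here `naive` ends up as the empty string while `normalized` is the default prompt, which is the behavior the `.env.example` comment appears to describe.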

‎packages/core/src/types.ts

+2
@@ -599,6 +599,8 @@ export type Character = {
 
     /** Optional system prompt */
     system?: string;
+    /** Optional image generation prompt */
+    imageGenerationPrompt?: string;
 
     /** Model provider to use */
     modelProvider: ModelProviderName;
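
The hunks shown in this commit add `imageGenerationPrompt` to `Character` but do not read it anywhere. As a sketch only, one way such a field could be consumed is a resolver that prefers the character config, then an environment value, then a built-in default. The precedence order is an assumption drawn from the `.env.example` comment, and `CharacterLike` and `resolveImageGenerationPrompt` are hypothetical names, not part of `@ai16z/eliza`:

```typescript
// Minimal stand-in for the Character fields touched by this commit.
type CharacterLike = {
    system?: string;
    imageGenerationPrompt?: string;
};

// Hypothetical resolver: character config first, then env value, then default.
function resolveImageGenerationPrompt(
    character: CharacterLike,
    envValue: string | undefined,
    fallback: string
): string {
    return character.imageGenerationPrompt ?? envValue ?? fallback;
}

// The character-level prompt wins when it is defined.
const fromCharacter = resolveImageGenerationPrompt(
    { imageGenerationPrompt: "Paint everything in watercolor." },
    "env prompt",
    "built-in prompt"
);

// Without a character-level prompt, the env value is used.
const fromEnv = resolveImageGenerationPrompt({}, "env prompt", "built-in prompt");

// With neither set, the built-in default is used.
const fromDefault = resolveImageGenerationPrompt({}, undefined, "built-in prompt");
```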

‎packages/plugin-image-generation/src/index.ts

+6 -3
@@ -10,7 +10,7 @@ import {
     Plugin,
     State,
 } from "@ai16z/eliza";
-import { generateImage } from "@ai16z/eliza";
+import { generateCaption, generateImage, settings } from "@ai16z/eliza";
 
 import fs from "fs";
 import path from "path";
@@ -32,6 +32,8 @@ About {{agentName}}:
 Write a two sentence image description that considers the <user_message> and may also include {{adjective}} about {{topic}} (without mentioning {{topic}} directly), from the perspective of {{agentName}}. Try to write something totally different than previous posts. Do not add commentary or acknowledge this request, just write the description of the image to be generated.
 Your response should not contain any questions. Brief, concise statements only. No emojis. Use \\n\\n (double spaces) between statements.`;
 
+const imageGenerationPrompt = "You are an AI assistant specialized in crafting effective prompts for image generation. Your task is to analyze a user's message and create a comprehensive, natural-language prompt that will guide an image generation algorithm to produce high-quality, visually appealing images.\n\nBegin by analyzing the content of the user's message. Follow these steps:\n\n1. List out key elements from the user's message, categorizing them to ensure comprehensive coverage:\n * Topic: The main subject or scene with specific details\n * Material: The medium or style (e.g., digital painting, 3D render)\n * Style: The artistic direction (e.g., fantasy, vaporwave)\n * Artist: Specific artists to influence the visual style\n * Webpage Influence: Art platforms like ArtStation or DeviantArt for quality enhancement\n * Sharpness: Terms like \"sharp focus\" or \"highly detailed\" for clarity\n * Extra Details: Descriptors to enhance atmosphere (e.g., cinematic, dystopian)\n * Shade and Color: Color-related keywords to control mood (e.g., moody lighting)\n * Lighting and Brightness: Specific lighting styles (e.g., dramatic shadows)\n * Camera Angle: Perspective and framing (e.g., close-up, wide shot, aerial view)\n * Composition: Layout guidance (e.g., rule of thirds, centered, dynamic)\n * Time Period: Temporal context if relevant\n * Cultural Elements: Any specific cultural influences\n * Textures: Surface quality descriptions\n * Weather/Atmosphere: Environmental conditions if applicable\n * Negative Prompts: Elements to exclude from the image\n\n2. Brainstorm complementary elements that would enhance the user's vision:\n * Suggest fitting artists and styles if not specified\n * Consider atmospheric elements that would strengthen the concept\n * Identify potential technical aspects that would improve the result\n * Note any elements that should be avoided to maintain the desired look\n\n3. Construct your final prompt by:\n * Leading with the most important scene/subject details from the user's message\n * Incorporating all relevant technical and stylistic elements\n * Grouping related concepts together naturally\n * Maintaining clear, flowing language throughout\n * Adding complementary details that enhance but don't alter the core concept\n * Concluding with negative prompts separated by a \"Negative:\" marker\n\nRemember:\n- Preserve ALL specific details from the user's original message\n- Don't force details into a rigid template\n- Create a cohesive, readable description\n- Keep the focus on the user's core concept while enhancing it with technical and artistic refinements\n\nYour output should contain ONLY the final prompt text, with no additional explanations, tags, or formatting.";
+
 export function saveBase64Image(base64Data: string, filename: string): string {
     // Create generatedImages directory if it doesn't exist
     const imageDir = path.join(process.cwd(), "generatedImages");
@@ -122,7 +124,7 @@ const imageGeneration: Action = {
 
         const agentImagePrompt = await generateText({
             runtime,
-            context: `${agentContext}\n\n<user message>${message.content.text}</user message>`,
+            context: `${agentContext}\n\n<user_message>${message.content.text}</user_message>`,
             modelClass: ModelClass.SMALL,
         });
 
@@ -133,7 +135,8 @@ const imageGeneration: Action = {
         const userId = runtime.agentId;
         elizaLogger.log("User ID:", userId);
 
-        const context = `You are an AI assistant specialized in crafting effective prompts for image generation. Your task is to analyze a user's message and create a comprehensive, natural-language prompt that will guide an image generation algorithm to produce high-quality, visually appealing images.\n\nHere is the user's message:\n<user_message> ${agentImagePrompt} </user_message>\n\nBegin by analyzing the content of the user's message. Follow these steps:\n\n1. List out key elements from the user's message, categorizing them to ensure comprehensive coverage:\n * Topic: The main subject or scene with specific details\n * Material: The medium or style (e.g., digital painting, 3D render)\n * Style: The artistic direction (e.g., fantasy, vaporwave)\n * Artist: Specific artists to influence the visual style\n * Webpage Influence: Art platforms like ArtStation or DeviantArt for quality enhancement\n * Sharpness: Terms like "sharp focus" or "highly detailed" for clarity\n * Extra Details: Descriptors to enhance atmosphere (e.g., cinematic, dystopian)\n * Shade and Color: Color-related keywords to control mood (e.g., moody lighting)\n * Lighting and Brightness: Specific lighting styles (e.g., dramatic shadows)\n * Camera Angle: Perspective and framing (e.g., close-up, wide shot, aerial view)\n * Composition: Layout guidance (e.g., rule of thirds, centered, dynamic)\n * Time Period: Temporal context if relevant\n * Cultural Elements: Any specific cultural influences\n * Textures: Surface quality descriptions\n * Weather/Atmosphere: Environmental conditions if applicable\n * Negative Prompts: Elements to exclude from the image\n\n2. Brainstorm complementary elements that would enhance the user's vision:\n * Suggest fitting artists and styles if not specified\n * Consider atmospheric elements that would strengthen the concept\n * Identify potential technical aspects that would improve the result\n * Note any elements that should be avoided to maintain the desired look\n\n3. Construct your final prompt by:\n * Leading with the most important scene/subject details from the user's message\n * Incorporating all relevant technical and stylistic elements\n * Grouping related concepts together naturally\n * Maintaining clear, flowing language throughout\n * Adding complementary details that enhance but don't alter the core concept\n * Concluding with negative prompts separated by a "Negative:" marker\n\nRemember:\n- Preserve ALL specific details from the user's original message\n- Don't force details into a rigid template\n- Create a cohesive, readable description\n- Keep the focus on the user's core concept while enhancing it with technical and artistic refinements\n\nYour output should contain ONLY the final prompt text, with no additional explanations, tags, or formatting.`;
+        const context = runtime.character.system ??
+            settings.SYSTEM_PROMPT ?? imageGenerationPrompt + `\n\nHere is the user's message:\n<user_message> ${agentImagePrompt} </user_message>`;
 
         const imagePrompt = await generateText({
             runtime,
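
One detail worth checking when reading the new `context` expression: in JavaScript and TypeScript, `+` binds tighter than `??`, so the expression parses as `a ?? b ?? (c + suffix)`. The sketch below reproduces only that shape with placeholder strings. `asWritten` and `withParens` are hypothetical names, and whether the parenthesized form matches the author's intent is an assumption:

```typescript
// Same shape as the committed expression: a ?? b ?? c + suffix.
function asWritten(
    system: string | undefined,
    envPrompt: string | undefined,
    base: string,
    suffix: string
): string {
    // `+` is evaluated before `??`, so the suffix is attached only to `base`.
    return system ?? envPrompt ?? base + suffix;
}

// Parenthesized variant: choose the prompt first, then always append the suffix.
function withParens(
    system: string | undefined,
    envPrompt: string | undefined,
    base: string,
    suffix: string
): string {
    return (system ?? envPrompt ?? base) + suffix;
}

const suffix =
    "\n\nHere is the user's message:\n<user_message> a red fox </user_message>";

// With a system prompt set, the as-written form drops the user message.
const a = asWritten("custom system prompt", undefined, "default prompt", suffix);

// The parenthesized form keeps the user message in every case.
const b = withParens("custom system prompt", undefined, "default prompt", suffix);
```

If the user message should reach the model regardless of which prompt is selected, the parenthesized form is the one that guarantees it.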
