Use Amazon Titan models for image generation, editing, and searching

Amazon Bedrock provides a broad range of high-performing foundation models from Amazon and other leading AI companies, including Anthropic, AI21, Meta, Cohere, and Stability AI, and covers a wide range of use cases, including text and image generation, searching, chat, reasoning and acting agents, and more.

The new Amazon Titan Image Generator model allows content creators to quickly generate high-quality, realistic images using simple English text prompts. The model understands complex instructions with multiple objects and returns studio-quality images suitable for advertising, ecommerce, and entertainment. Key features include the ability to refine images by iterating on prompts, automatic background editing, and generating multiple variations of the same scene. Creators can also customize the model with their own data to output on-brand images in a specific style. Importantly, Titan Image Generator has built-in safeguards, like invisible watermarks on all AI-generated images, to encourage responsible use and mitigate the spread of disinformation. This makes producing custom images in large volume for any industry more accessible and efficient.

The new Amazon Titan Multimodal Embeddings model helps build more accurate search and recommendations by understanding text, images, or both. It converts images and English text into semantic vectors, capturing meaning and relationships in your data. You can combine text and images, like product descriptions and photos, to identify items more effectively. The vectors power speedy, accurate search experiences. Titan Multimodal Embeddings is flexible in vector dimensions, enabling optimization for performance needs. An asynchronous API and Amazon OpenSearch Service connector make it easy to integrate the model into your neural search applications.

In this post, we walk through how to use the Titan Image Generator and Titan Multimodal Embeddings models via the AWS Python SDK.
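Before diving into image generation, here is a minimal sketch of what a Titan Multimodal Embeddings call looks like. The helper name is ours; the request shape (inputText, inputImage, and embeddingConfig keys, and the model ID amazon.titan-embed-image-v1) follows the model's documented synchronous API, and the supported output vector sizes are 256, 384, and 1024:

```python
import json


def build_embeddings_request(text=None, image_base64=None, dimensions=1024):
    """Build the JSON body for a Titan Multimodal Embeddings call.

    Supply text, a base64-encoded image, or both; supported vector
    sizes are 256, 384, and 1024.
    """
    if text is None and image_base64 is None:
        raise ValueError("Provide text, an image, or both")
    body = {"embeddingConfig": {"outputEmbeddingLength": dimensions}}
    if text is not None:
        body["inputText"] = text
    if image_base64 is not None:
        body["inputImage"] = image_base64
    return json.dumps(body)


# The body is then sent through the Bedrock runtime client, for example:
# import boto3
# client = boto3.client("bedrock-runtime")
# response = client.invoke_model(
#     body=build_embeddings_request(text="red leather handbag"),
#     modelId="amazon.titan-embed-image-v1",
#     accept="application/json",
#     contentType="application/json",
# )
# vector = json.loads(response["body"].read())["embedding"]
```

The returned vector can then be indexed into a vector store such as Amazon OpenSearch Service for neural search.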
Image generation and editing

In this section, we demonstrate the basic coding patterns for using the AWS SDK to generate new images and perform AI-powered edits on existing images. Code examples are provided in Python; JavaScript (Node.js) versions are also available in this GitHub repository.

Before you can write scripts that use the Amazon Bedrock API, you need to install the appropriate version of the AWS SDK in your environment. For Python scripts, you can use the AWS SDK for Python (Boto3). Python users may also want to install the Pillow module, which facilitates image operations like loading and saving images. For setup instructions, refer to the GitHub repository. Additionally, enable access to the Amazon Titan Image Generator and Titan Multimodal Embeddings models. For more information, refer to Model access.

Helper functions

The following function sets up the Amazon Bedrock Boto3 runtime client and generates images by taking payloads of different configurations (which we discuss later in this post):

```python
import base64
import io
import json
from random import randint

import boto3
from PIL import Image

bedrock_runtime_client = boto3.client("bedrock-runtime")


def titan_image(
    payload: dict,
    num_image: int = 2,
    cfg: float = 10.0,
    seed: int = None,
    modelId: str = "amazon.titan-image-generator-v1",
) -> list:
    # ImageGenerationConfig options:
    # - numberOfImages: number of images to be generated
    # - quality: quality of generated images, can be standard or premium
    # - height: height of output image(s)
    # - width: width of output image(s)
    # - cfgScale: scale for classifier-free guidance
    # - seed: the seed to use for reproducibility
    seed = seed if seed is not None else randint(0, 2147483647)
    body = json.dumps(
        {
            **payload,
            "imageGenerationConfig": {
                "numberOfImages": num_image,  # Range: 1 to 5
                "quality": "premium",  # Options: standard/premium
                "height": 1024,  # Supported height list above
                "width": 1024,  # Supported width list above
                "cfgScale": cfg,  # Range: 1.0 (exclusive) to 10.0
                "seed": seed,  # Range: 0 to 2147483647
            },
        }
    )
    response = bedrock_runtime_client.invoke_model(
        body=body,
        modelId=modelId,
        accept="application/json",
        contentType="application/json",
    )
    response_body = json.loads(response.get("body").read())
    images = [
        Image.open(io.BytesIO(base64.b64decode(base64_image)))
        for base64_image in response_body.get("images")
    ]
    return images
```

Generate images from text

Scripts that generate a new image from a text prompt follow this implementation pattern:

1. Configure a text prompt and optional negative text prompt.
2. Use the BedrockRuntime client to invoke the Titan Image Generator model.
3. Parse and decode the response.
4. Save the resulting images to disk.

Text-to-image

The following is a typical image generation script for the Titan Image Generator model:

```python
# Text-to-image
# textToImageParams options:
#   text: prompt to guide the model on how to generate the image
#   negativeText: prompts to guide the model on what you don't want in image
images = titan_image(
    {
        "taskType": "TEXT_IMAGE",
        "textToImageParams": {
            "text": "two dogs walking down an urban street, facing the camera",  # Required
            "negativeText": "cars",  # Optional
        },
    }
)
```

This will produce images similar to the following.

[Response Image 1] [Response Image 2]

Image variants

Image variation provides a way to generate subtle variants of an existing image.
The following code snippet uses one of the images generated in the previous example to create variant images:

```python
# Import an input image like this (only PNG/JPEG supported):
with open("<YOUR_IMAGE_FILE_PATH>", "rb") as image_file:
    input_image = base64.b64encode(image_file.read()).decode("utf8")

# Image variation
# imageVariationParams options:
#   text: prompt to guide the model on how to generate variations
#   negativeText: prompts to guide the model on what you don't want in image
#   images: base64 string representation of the input image, only 1 is supported
images = titan_image(
    {
        "taskType": "IMAGE_VARIATION",
        "imageVariationParams": {
            "text": "two dogs walking down an urban street, facing the camera",  # Required
            "images": [input_image],  # One image is required
            "negativeText": "cars",  # Optional
        },
    },
)
```

This will produce images similar to the following.

[Original Image] [Response Image 1] [Response Image 2]

Edit an existing image

The Titan Image Generator model allows you to add, remove, or replace elements or areas within an existing image. You specify which area to affect by providing one of the following:

- Mask image – A binary image in which the 0-value pixels represent the area you want to affect and the 255-value pixels represent the area that should remain unchanged.
- Mask prompt – A natural language text description of the elements you want to affect, which uses an in-house text-to-segmentation model.

For more information, refer to Prompt Engineering Guidelines.

Scripts that apply an edit to an image follow this implementation pattern:

1. Load the image to be edited from disk.
2. Convert the image to a base64-encoded string.
3. Configure the mask through one of the following methods:
   - Load a mask image from disk, encoding it as base64 and setting it as the maskImage parameter.
   - Set the maskText parameter to a text description of the elements to affect.
4. Specify the new content to be generated using one of the following options:
   - To add or replace an element, set the text parameter to a description of the new content.
   - To remove an element, omit the text parameter completely.
5. Use the BedrockRuntime client to invoke the Titan Image Generator model.
6. Parse and decode the response.
7. Save the resulting images to disk.

Object editing: Inpainting with a mask image

The following is a typical image editing script for the Titan Image Generator model using maskImage. We take one of the images generated earlier and provide a mask image, in which 0-value pixels are rendered as black and 255-value pixels as white. We also replace one of the dogs in the image with a cat using a text prompt.

```python
with open("<YOUR_MASK_IMAGE_FILE_PATH>", "rb") as image_file:
    mask_image = base64.b64encode(image_file.read()).decode("utf8")

# Import an input image like this (only PNG/JPEG supported):
with open("<YOUR_ORIGINAL_IMAGE_FILE_PATH>", "rb") as image_file:
    input_image = base64.b64encode(image_file.read()).decode("utf8")

# Inpainting
# inPaintingParams options:
#   text: prompt to guide inpainting
#   negativeText: prompts to guide the model on what you don't want in image
#   image: base64 string representation of the input image
#   maskImage: base64 string representation of the input mask image
#   maskPrompt: prompt used for auto editing to generate mask
images = titan_image(
    {
        "taskType": "INPAINTING",
        "inPaintingParams": {
            "text": "a cat",  # Optional
            "negativeText": "bad quality, low res",  # Optional
            "image": input_image,  # Required
            "maskImage": mask_image,
        },
    },
    num_image=3,
)
```

This will produce images similar to the following.

[Original Image] [Mask Image] [Edited Image]

Object removal: Inpainting with a mask prompt

In another example, we use maskPrompt to specify an object in the image, taken from the earlier steps, to edit.
When the text prompt is omitted, the object is removed:

```python
# Import an input image like this (only PNG/JPEG supported):
with open("<YOUR_IMAGE_FILE_PATH>", "rb") as image_file:
    input_image = base64.b64encode(image_file.read()).decode("utf8")

images = titan_image(
    {
        "taskType": "INPAINTING",
        "inPaintingParams": {
            "negativeText": "bad quality, low…
```
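The snippet above is truncated, but it follows the same inpainting pattern; per the steps listed earlier, leaving out the text parameter removes the masked object. A minimal sketch of just the payload construction (the helper name and example values are illustrative, not from the original post):

```python
def build_removal_payload(input_image, mask_prompt, negative_text=None):
    """Build an INPAINTING payload that removes the masked object.

    Omitting the "text" key tells the model to fill the masked
    region from the surrounding background instead of adding content.
    """
    params = {
        "image": input_image,       # base64-encoded source image (required)
        "maskPrompt": mask_prompt,  # e.g. "a dog": selects the object to remove
    }
    if negative_text is not None:
        params["negativeText"] = negative_text
    return {"taskType": "INPAINTING", "inPaintingParams": params}


# Usage with the titan_image helper defined earlier:
# images = titan_image(build_removal_payload(input_image, "a dog"))
```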

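The final step in every pattern above is saving the results to disk. titan_image returns Pillow Image objects, so a small helper (the function name and file-naming scheme are ours) can persist them:

```python
from pathlib import Path


def save_images(images, prefix="titan_output"):
    """Save a list of Pillow images as numbered PNG files and return the paths."""
    paths = []
    for i, image in enumerate(images, start=1):
        path = Path(f"{prefix}_{i}.png")
        image.save(path)
        paths.append(path)
    return paths


# Usage: save_images(titan_image(payload)) writes titan_output_1.png, titan_output_2.png, ...
```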