With the evolution of AI in recent years, various tools and services have emerged. As part of using these from Emacs, I previously implemented an Emacs extension to generate text using the OpenAI API. I still use it regularly, but it did not support image generation. However, the OpenAI API itself provides models and APIs for image generation. In this article, I will use that to further enrich my GNU Emacs experience.
OpenAI's image generation API creates images from given text. This time, we will use the "DALL-E 3" model. First, let’s check how the image generation API works.
Here’s an example of a request to call OpenAI's image generation endpoint.
:ORIGIN = https://api.openai.com
:API_KEY := openai-api-key

POST :ORIGIN/v1/images/generations
Content-Type: application/json
Authorization: Bearer :API_KEY

{
  "model": "dall-e-3",
  "prompt": "pretty cat",
  "n": 1,
  "size": "1024x1024"
}

{
  "created": 1729945494,
  "data": [
    {
      "revised_prompt": "Create an image",
      "url": "https://example.com/DUMMY/img-NIXf.png?st=DUMMY&se=DUMMY&sp=r&sv=DUMMY&sr=DUMMY&rscd=inline&rsct=image/png&skoid=DUMMY&sktid=DUMMY&skt=DUMMY&ske=DUMMY&sks=DUMMY&skv=DUMMY&sig=DUMMY"
    }
  ]
}
// POST DUMMY/v1/images/generations
// HTTP/1.1 200 OK
// Date: Sat, 26 Oct 2024 12:24:54 GMT
// Content-Type: application/json
// Content-Length: 1140
// Connection: keep-alive
// openai-version: 2020-10-01
// access-control-allow-origin: *
When this request is sent, the API returns a JSON response. Each element of the `data` array has a `url` field holding the URL of a generated image; fetching that URL downloads the image.
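To make the response shape concrete, here is a minimal sketch that pulls every image URL out of a response body. The helper name `my/openai-image-response-urls` is hypothetical and not part of the code in this article, and the sketch assumes an Emacs with native JSON support so that `json-parse-string` is available.

```emacs-lisp
;; Hypothetical helper for illustration; not part of openai-image.el.
(defun my/openai-image-response-urls (json-text)
  "Return the list of image URLs found in JSON-TEXT.
JSON-TEXT is the body of an images/generations response."
  (let ((resp (json-parse-string json-text)))
    ;; `data' is parsed as a vector of hash tables, one per image.
    (mapcar (lambda (item) (gethash "url" item))
            (gethash "data" resp))))

(my/openai-image-response-urls
 "{\"created\": 1, \"data\": [{\"url\": \"https://example.com/a.png\"}]}")
;; => ("https://example.com/a.png")
```

With `"n": 1` the list has a single element, which is why taking the first entry of `data` is enough for this article's use case.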
Let’s write Emacs Lisp to generate images using this API from GNU Emacs.
;;; openai-image.el --- OpenAI API Imaging Utility  -*- lexical-binding: t -*-
;; Copyright (C) 2024 TakesxiSximada
;; Author: TakesxiSximada <[email protected]>
;; Maintainer: TakesxiSximada <[email protected]>
;; Repository:
;; Version: 1
;; Package-Version: 20241027.0000
;; Package-Requires: ((emacs "29.1"))
;; Date: 2024-10-27
;; This file is not part of GNU Emacs.
;; This program is free software: you can redistribute it and/or modify
;; it under the terms of the GNU General Public License as published by
;; the Free Software Foundation, either version 3 of the License, or
;; (at your option) any later version.
;; This program is distributed in the hope that it will be useful,
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
;; GNU General Public License for more details.
;; You should have received a copy of the GNU General Public License
;; along with this program. If not, see <http://www.gnu.org/licenses/>.
;; https://platform.openai.com/docs/api-reference/images/create
;;
;; Request
;;
;; curl https://api.openai.com/v1/images/generations \
;; -H "Content-Type: application/json" \
;; -H "Authorization: Bearer $OPENAI_API_KEY" \
;; -d '{
;; "model": "dall-e-3",
;; "prompt": "A cute baby sea otter",
;; "n": 1,
;; "size": "1024x1024"
;; }'
;;
;; Response
;;
;; {
;; "created": 1589478378,
;; "data": [
;; {
;; "url": "https://..."
;; },
;; {
;; "url": "https://..."
;; }
;; ]
;; }
;;; Code:
(require 'json)
(require 'subr-x)

;; Holds the OpenAI API key; expected to be set elsewhere (e.g. in your init file).
(defvar openai-api-key)

(defvar openai-image-previous-ai-prompt ""
  "Prompt used by the previous call; offered as the default next time.")

;;;###autoload
(defun openai-image-create-and-view (ai-prompt)
  "Generate an image from AI-PROMPT with DALL-E 3 and open its URL in `eww'."
  (interactive
   (list (string-trim (read-string-from-buffer "Open AI Image"
                                               openai-image-previous-ai-prompt))))
  (unless (string-empty-p ai-prompt)
    (setq openai-image-previous-ai-prompt ai-prompt)
    (make-process
     :name "*Open AI Image*"
     :buffer (generate-new-buffer "*Open AI Image*")
     ;; --silent keeps curl's progress meter out of the response buffer.
     :command `("curl" "--silent" "https://api.openai.com/v1/images/generations"
                "-X" "POST"
                "-H" "Content-Type: application/json"
                "-H" ,(format "Authorization: Bearer %s" openai-api-key)
                "-d" ,(json-encode `(:model "dall-e-3" :n 1 :size "1024x1024"
                                     :prompt ,ai-prompt)))
     :sentinel (lambda (process event)
                 ;; Parse the JSON body once curl exits normally.
                 (when (string-equal event "finished\n")
                   (let ((resp (with-current-buffer (process-buffer process)
                                 (goto-char (point-min))
                                 (json-parse-buffer))))
                     (eww (gethash "url" (aref (gethash "data" resp) 0)))))))))

(provide 'openai-image)
;;; openai-image.el ends here
openai-image.el
In this code block, we generate an image through the OpenAI API and use `eww` to access the image URL included in the response.
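The sentinel assumes the request succeeded. Because `curl` exits normally even when the server answers with an HTTP error, a failed request would still reach the JSON parsing step but without a `data` array. The following is a minimal sketch of a more defensive lookup. The helper name `my/openai-image-first-url` is hypothetical, and the error shape `{"error": {"message": "..."}}` is an assumption about what the API returns on failure.

```emacs-lisp
;; Hypothetical helper for illustration; not part of openai-image.el.
(defun my/openai-image-first-url (resp)
  "Return the first image URL in RESP, a hash table from `json-parse-buffer'.
Signal an error when RESP carries an \"error\" object instead of \"data\"."
  (let ((err (gethash "error" resp))
        (data (gethash "data" resp)))
    (cond
     ;; Assumed error shape: {"error": {"message": "..."}}.
     ((hash-table-p err)
      (error "OpenAI API error: %s" (gethash "message" err)))
     ;; Normal case: `data' is a non-empty vector of image objects.
     ((and (vectorp data) (> (length data) 0))
      (gethash "url" (aref data 0)))
     (t (error "Unexpected response: no image data")))))
```

In the sentinel, `(eww (my/openai-image-first-url resp))` could then stand in for the direct `gethash`/`aref` chain, so that an API failure surfaces as a readable message rather than a type error.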
Emacs has several ways to send HTTP requests [1]. Here we use `make-process` to call `curl` as a subprocess. It’s a very primitive method, but since it relies only on Emacs’s basic functions, there’s less chance of getting lost in the maze of Emacs Lisp.
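For comparison, here is a rough sketch of the same request made with the built-in `url-retrieve` instead of `curl`. This is not the approach used in this article. The command name `my/openai-image-create-with-url-retrieve` is hypothetical, `openai-api-key` is assumed to be set elsewhere as before, and error handling is left out.

```emacs-lisp
(require 'json)
(require 'url)

;; Assumed to be set elsewhere, as in openai-image.el.
(defvar openai-api-key)

;; Hypothetical command for illustration; not part of openai-image.el.
(defun my/openai-image-create-with-url-retrieve (ai-prompt)
  "Request an image for AI-PROMPT via `url-retrieve' and open it in `eww'."
  (let ((url-request-method "POST")
        (url-request-extra-headers
         `(("Content-Type" . "application/json")
           ("Authorization" . ,(format "Bearer %s" openai-api-key))))
        ;; The request body must be a unibyte string, so encode it as UTF-8.
        (url-request-data
         (encode-coding-string
          (json-encode `(:model "dall-e-3" :n 1 :size "1024x1024"
                         :prompt ,ai-prompt))
          'utf-8)))
    (url-retrieve
     "https://api.openai.com/v1/images/generations"
     (lambda (_status)
       ;; The callback runs in the response buffer: headers, blank line, body.
       (goto-char (point-min))
       (re-search-forward "^\r?\n")
       (let ((resp (json-parse-buffer)))
         (eww (gethash "url" (aref (gethash "data" resp) 0))))))))
```

The trade-off is that this version needs no external program, but the request is configured through dynamically bound `url-request-*` variables and the response headers have to be skipped by hand.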
The prompt (string passed to the AI) is set to the previous prompt by default to allow multiple adjustments and retries for image generation. You can refine the generated image by modifying the prompt as needed.
Using OpenAI's image generation model "DALL-E 3", I generated images from GNU Emacs. Specifically, I implemented Emacs Lisp that calls the API through `curl` and opens the URL of the generated image in `eww`. The command also remembers the previous prompt and offers it as the default, so a prompt can be refined over several attempts. This has made creating various images easier. Once again, my GNU Emacs has become even more powerful.
[1] There are also `url-retrieve`, `request.el`, and `plz`, but `request.el` and `plz` are not part of the standard library.