How to build your own meme generator with machine learning

Generating captions

I use two different implementations of GPT to generate the captions. There’s the latest GPT-3 Da Vinci model from OpenAI, which does an excellent job, but you have to be enrolled in their beta program to use it. And there’s the open-source GPT-Neo model from EleutherAI. That model is a lot smaller, but it’s free to use.

GPT-3 Da Vinci

OpenAI’s GPT-3 Da Vinci is currently the largest AI model for Natural Language Processing. I’m using their latest “zero-shot” style of prompting with their new Da Vinci Instruct model. Instead of providing examples of what you’re asking the model to do, you can simply ask it what to do directly.

Here is the prompt that creates a caption for the apple pie picture.

Create a humorous caption for a brand new meme about apple pie. The background image is Easy and simple apple pie served with vanilla ice cream, on a gingham tablecloth in Lysekil, Sweden.

I pass the prompt into the call to OpenAI along with some additional parameters. Here’s the Python code.

import openai

# Ask the instruct model for a single caption completion.
response = openai.Completion.create(
    engine="davinci-instruct-beta",
    prompt=prompt,
    max_tokens=64,
    temperature=0.7,
    top_p=0.5,
    frequency_penalty=0.5,
    presence_penalty=0.5,
    best_of=1)

The max_tokens parameter indicates how long the response should be. The temperature and top_p parameters are similar in that they indicate the amount of variety in the response. The frequency_penalty and presence_penalty parameters are also similar in that they control how often there are new deviations and new topics in the response. If you want to know what all these parameters do, check out my article from last month.
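To make the roles of temperature and top_p a little more concrete, here is a minimal sketch of how these two settings are commonly defined, applied to a toy list of scores. It is not OpenAI’s implementation; the function names and the example scores are made up for illustration.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; a lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    highest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - highest) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their cumulative probability
    reaches top_p, then renormalize the survivors."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}

# Hypothetical scores for three candidate tokens.
toy_logits = [2.0, 1.0, 0.0]
probs = softmax_with_temperature(toy_logits, temperature=0.7)
print(top_p_filter(probs, top_p=0.5))
```

With a low top_p, only the most likely candidates survive, so the output varies less; with top_p set to 1.0, every candidate stays in play and temperature alone controls the variety.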

Before I show examples of the output from GPT-3, here is the legal disclaimer that OpenAI suggests I include, which is all true.

The author generated the following text in part with GPT-3, OpenAI’s large-scale language-generation model. Upon generating draft language, the author reviewed and revised the language to their own liking and takes ultimate responsibility for the content of this publication.

Running the code 10 times yields the following results, at a total cost of $0.03. Note that I formatted the text to be in uppercase.

1: THIS IS THE PERFECT WAY TO END A DAY OF APPLE PICKING
2: NO, IT’S NOT THAT EASY
3: I’LL TAKE THE ONE WITH THE VANILLA ICE CREAM, PLEASE
4: APPLE PIE IS THE BEST!
5: THIS APPLE PIE IS SO GOOD, I CAN’T EVEN!
6: YOU’RE NOT THE ONLY ONE WHO LOVES APPLE PIE
7: IF YOU CAN’T FIND THE RECIPE, JUST GOOGLE IT
8: THE PIE IS GOOD, BUT IT’S NOT AS GOOD AS MY MOM’S
9: I’LL HAVE A SLICE OF THAT APPLE PIE, PLEASE
10: WE’RE GOING TO NEED A BIGGER PIE
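The uppercase formatting mentioned above could be done with a small helper like the one below. This is only a sketch: format_caption is a hypothetical name, and it assumes the completion text has already been pulled out of the response as a plain string (for example from response.choices[0].text).

```python
def format_caption(raw_text):
    """Return the first non-empty line of a completion, trimmed and uppercased."""
    for line in raw_text.splitlines():
        # Drop surrounding whitespace and any quotation marks the model added.
        cleaned = line.strip().strip('"').strip()
        if cleaned:
            return cleaned.upper()
    return ""

# Example with a made-up completion string.
print(format_caption('\n  "We\'re going to need a bigger pie"\n'))
```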

OK, these are pretty good. One thing I learned is that GPT-3 Da Vinci can be funny! For example, caption number 2 seems to refer to the “easy as pie” idiom.

Note that GPT-3, like all AI models trained on a large corpus of text, will reflect societal biases. Occasionally the system will produce text that may be inappropriate or offensive. OpenAI has a feature to label generated text with one of three warning levels: 0 – the text is safe, 1 – the text is sensitive, or 2 – the text is unsafe. My code shows a warning for any generated caption that is flagged as sensitive or unsafe.
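As a rough illustration of that last step, the sketch below maps a warning level to a message. It assumes the label has already been obtained from OpenAI as an integer; how the label is fetched is not shown, and the names WARNING_LEVELS and caption_warning are hypothetical.

```python
# The three warning levels described above.
WARNING_LEVELS = {0: "safe", 1: "sensitive", 2: "unsafe"}

def caption_warning(caption, label):
    """Return a warning string for sensitive or unsafe captions, else None."""
    level = WARNING_LEVELS.get(label)
    if level is None:
        raise ValueError(f"unknown warning label: {label!r}")
    if label == 0:
        return None  # safe captions need no warning
    return f"Warning: caption flagged as {level}: {caption}"

# Example with a made-up label.
message = caption_warning("APPLE PIE IS THE BEST!", 0)
print(message if message else "No warning needed.")
```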

GPT-Neo

GPT-Neo is a transformer model created primarily by developers known as sdtblck and leogao2 on GitHub. The project is an implementation of “GPT-2 and GPT-3-style models using the mesh-tensorflow library.” So far, their system is the size of GPT-3 Ada, OpenAI’s smallest model. But GPT-Neo is available for free. I used the Huggingface Transformers interface to access GPT-Neo from my Python code.

Since GPT-Neo doesn’t have “instruct” versions of its pre-trained models, I had to write a “few-shot” prompt that uses examples to get the system to generate captions for memes. Here’s the prompt I wrote using the Disaster Girl and Grumpy Cat memes with example captions.

Create a humorous caption for a meme. 

Theme: disaster girl
Picture description: A picture of a girl looking at us as her house burns down
Caption: There was a spider. It’s gone now.

Theme: grumpy cat
Picture description: A face of a cat who looks unhappy
Caption: I don’t like Mondays.

Theme: apple pie.
Picture description: Easy and simple apple pie served with vanilla ice cream, on a gingham tablecloth in Lysekil, Sweden.
Caption:
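A prompt with this layout can also be assembled in code, which makes it easy to swap in a different theme or add more examples. The function below is a minimal sketch; build_few_shot_prompt and its argument names are hypothetical, and the labels simply mirror the prompt above.

```python
def build_few_shot_prompt(examples, theme, description):
    """Assemble an instruction, worked examples, and an open final entry."""
    parts = ["Create a humorous caption for a meme.", ""]
    for ex_theme, ex_description, ex_caption in examples:
        parts += [f"Theme: {ex_theme}",
                  f"Picture description: {ex_description}",
                  f"Caption: {ex_caption}",
                  ""]
    # Leave the last caption blank so the model fills it in.
    parts += [f"Theme: {theme}",
              f"Picture description: {description}",
              "Caption:"]
    return "\n".join(parts)

examples = [
    ("disaster girl",
     "A picture of a girl looking at us as her house burns down",
     "There was a spider. It's gone now."),
    ("grumpy cat",
     "A face of a cat who looks unhappy",
     "I don't like Mondays."),
]
prompt = build_few_shot_prompt(examples, "apple pie",
                               "Apple pie served with vanilla ice cream")
print(prompt)
```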

After setting the temperature parameter to 0.7 and top_p to 1.0, I pass the prompt into GPT-Neo to generate new captions. Here’s the code to generate a caption.

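from transformers import pipeline, AutoTokenizer

# Load the tokenizer so its end-of-sequence token can be used for padding.
gpt_neo_tokenizer = AutoTokenizer.from_pretrained('EleutherAI/gpt-neo-2.7B')

# Build a text-generation pipeline on the first GPU (device=0).
generator = pipeline('text-generation',
    device=0,
    model='EleutherAI/gpt-neo-2.7B')

# Sample a continuation of the few-shot prompt.
results = generator(prompt,
    do_sample=True,
    min_length=50,
    max_length=150,
    temperature=0.7,
    top_p=1.0,
    pad_token_id=gpt_neo_tokenizer.eos_token_id)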
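By default, the generated text returned by the pipeline includes the prompt followed by the continuation, and the model may keep going and invent further “Theme” entries. A minimal sketch of pulling out just the new caption might look like this; extract_caption is a hypothetical helper, and the sample strings are made up.

```python
def extract_caption(generated_text, prompt):
    """Drop the echoed prompt and keep the first non-empty line that follows."""
    if generated_text.startswith(prompt):
        continuation = generated_text[len(prompt):]
    else:
        continuation = generated_text
    for line in continuation.splitlines():
        if line.strip():
            return line.strip().upper()
    return ""

# Made-up example of a prompt and what a model might return.
sample_prompt = "Theme: apple pie\nCaption:"
sample_output = sample_prompt + " I love apple pie\n\nTheme: dog\nCaption: Woof"
print(extract_caption(sample_output, sample_prompt))
```

With the pipeline call above, the input would typically be results[0]['generated_text'].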

Here are the sample results.

1: I LOVE APPLE PIE
2: I CAN’T. I’M NOT ALLOWED
3: I LOVE THE SIMPLICITY OF AN APPLE PIE
4: APPLE PIE. THE ONLY THING BETTER THAN THIS IS A HOT BATH
5: I’M A PIE. YOU’RE A PIE
6: I LOVE PIE, AND THIS IS A GOOD ONE
7: I LOVE APPLES, BUT I’M NOT VERY GOOD AT BAKING
8: THE PIE IS DELICIOUS, BUT THE ICE CREAM IS NOT
9: I LOVE APPLE PIE. IT’S THE BEST
10: THE BEST FOOD IS WHEN YOU CAN TASTE THE DIFFERENCE BETWEEN THE FOOD AND THE TABLECLOTH

Hmmm. These are not as good as the GPT-3 captions. Most of them are quite simple and not very funny. Number 10 is just plain absurd. But number 4 seems to be OK. Let’s use this as our caption.

The final step is to compose the meme by writing the caption into the background image.

Meme by AI-Memer
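One small part of the composing step is breaking a long caption into lines that fit across the image. The sketch below uses Python’s standard textwrap module for that; the 24-character limit is an arbitrary assumption, wrap_caption is a hypothetical name, and the actual drawing of the text onto the picture (for example with an imaging library) is not shown.

```python
import textwrap

def wrap_caption(caption, max_chars_per_line=24):
    """Split a caption into lines of at most max_chars_per_line characters,
    breaking only between words where possible."""
    return textwrap.wrap(caption, width=max_chars_per_line)

caption = "APPLE PIE. THE ONLY THING BETTER THAN THIS IS A HOT BATH"
for line in wrap_caption(caption):
    print(line)
```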