Behind the Scenes: A Short AI Exploration for our 2023 Holiday Greeting Video

AI, General - December 19, 2023

At I72 we thought it was the season to infuse a bit of the latest cool tech into the holiday cheer. Since we used all sorts of neat AI tools in our work through the year, we figured, why not try some more for our holiday greeting? While playing with and editing our video we learned a few neat things and certainly met our share of surprises (both good and bad!).

When ChatGPT launched last year, we created a little game at the agency, challenging ourselves to come up with the quirkiest poems and limericks. Here is one winner:

Hot dogs and buns, oh what a sight

Sold in different numbers, what a plight

Why can’t they be sold, hand in hand

Together forever, in a single stand

But no, they must be sold apart

Leaving us with one bun, or a surplus of heart

The absurdity of it all, makes us want to shout

“Why can’t we just have the right amount, without a doubt?”

This time for the holiday we wanted to create a little poem about web design and development and pair it with a video.

First, we tried Google’s new Bard AI alongside ChatGPT to compare the two for writing the script. Some of our first attempts with Bard yielded good rhymes but were difficult to pair with visuals. We created this first draft:

This was a good start; however, we wanted to mention our clients in a familiar rhyme, and after a few tweaks we used this prompt in ChatGPT:

After a few requests to shorten the results (a prompt for 10 lines often yielded more) we got the following poem:

Now it was time to give the lines life with some voiceover. We tried Speechify and Natural Reader but found they lacked the ability to fine-tune the output.

We decided to use PlayHT instead. This AI tool allowed us to add “emotion” to individual lines and to generate multiple versions with the same voice. After a few regenerations and speed changes we heard our poem. Listen to just the audio here.

The imagery was the next step. Here we used ChatGPT to generate ideas for the individual lines. The first prompts created suggestions that were a bit too abstract.

With a few refinements to the prompts we got something a bit more concrete, though we ultimately deviated from it in many ways.

Now it was time to create the actual images. First we created the stills using Photoshop’s new Generative Fill tool. Generative Fill lets users add or remove content from images. It uses machine learning to analyze pixels and match color, lighting, and perspective, creating realistic results based on a prompt à la Dall-E 2.

We like that it creates three versions of your prompt to choose from right off the bat.

Photoshop performed very well and was much easier to use than the rudimentary Discord “layout” needed to use Midjourney.

To animate the static images into “3D” form we used RunwayML. Runway is an applied AI research company with various video editing and creation tools, such as text to video. We found their site intuitive and robust.

The text to video feature was easy to use and simply required uploading the image file we had made in PS. A 4-second video is generated in a couple of minutes, with an option to extend it in 4-second increments.

Animating people using RunwayML did not yield the best results when it came to anatomy 🫣. This seems to be a common problem with AI generators, but we have certainly seen improvement over time, from the rudimentary results of the AI tool Craiyon to the latest version of Dall-E.

However, RunwayML handled other things very well; for example, just from an image of a sparkler it intuited that this is something that sparkles. The hand here, though, does get garbled in the last frames:

We also had fun making some crazier clips:

Next it was time to splice the individual clips together into a single video. CapCut was our first choice. This web software has great quick features for generating videos for popular platforms in the right aspect ratios, with enticing effects and royalty-free tunes. We tried one out using mainly stills:

The music was generated automatically by AI, as were the transitions and visual effects. This was a cute test, but ultimately it was hard to edit, especially on the audio end.

CapCut has ready-made transitions and effects in the left side panel and the ability to shorten individual clips and audio, much like After Effects. Ultimately the AI features of CapCut were limited, so the video was stitched together manually with the effects and transitions plugged in. It was not clear whether the effects themselves were AI-generated or ready-made.

Lastly, for the start and end screens we created images with another generator – Adobe Firefly. Firefly has some neat features, such as a built-in “match image to style” option, unlike Dall-E. Firefly generates images but is special because it is trained on images from Creative Commons, the public domain, etc., so it is designed to avoid running afoul of copyright protections and the concerns of creatives in that respect.

To top it off we used the 3D text effects in PS to add blocky text on top alongside our logo:

And here is the final product!

In conclusion, we had a lot of fun using AI to make our greeting. While AI was certainly powerful and gave us plenty of shortcuts, it required a lot of human input for important edits and creative decisions, from the initial idea to poem editing, image creation, and so on.

Happy holidays!

The I72 Team