I’ve been extremely critical of neural net “AI” hype in the past. However, while there are not enough monkeys in the universe to recreate Shakespeare, if you’re willing to spam the create button over and over again, you may eventually luck out, and get the AI to make pretty images. These won’t necessarily be the images that you wanted, but they might at least be pretty standalones.
Neural nets are also pretty good at satire, since the images they create are inevitably wrong in some hilarious way. If that fits the tone that you’re going for, it’s no problem.

The hands add to the zaniness.
It’s no secret that my artistic talents are non-existent. Even for programmer art, this isn’t particularly competent. I’ve been reduced to creating enemies that are little more than geometric shapes, and I’m running out of shapes.
Hell, some programmers are able to create better art procedurally than I can in Aseprite, or any other art creation tool.

Yes, this is code. The article on DemoScene is coming later.
One of the main benefits of these AI art creation tools is, ostensibly, that they can create large amounts of relatively coherent art quickly. I don’t have access to a real artist, so I signed up for the free trials of five different AI art generation programs, and got to work.
The initial results were somewhat impressive. I wanted true top down artwork, in the style of Hotline:Miami. I clearly didn’t get that, but the pixel art that I did get would have been perfect for most isometric games.
The first few results were gave the wrong perspective, but I remained confident that I could simply tweak a few words here and there, or add another sentence specifying that the camera was to be looking directly downwards, and I’d get something more appropriate.
Unfortunately, I was wrong. Every single AI program I tried gave me blatantly incorrect artwork.
Me: Completely top down view.
Supersmart AI: Got it boss, from the side it is.
Me: The camera is directly overhead.
Genius 9000 IQ AI: No worries, we’ll place the camera directly to the side.
Me: Top down, Hotline Miami style.
Lovecraftian Eternal Intellect: It’s a plan. From the side, slightly underneath.
Getting an AI to do what you want is like trying to teach a retard how to do calculus. Or communicating with someone who isn’t an English speaker, but somehow thinks that they are. Incredibly frustrating at first, then eventually hilarious.
With a bunch of cajoling, I managed to create this.
Although the exact same prompt decided to create this the next time I used it, a top down image shot from underneath the waist of the subject.
I ran out of credits on most platforms, but had a few left on NightCafe Studio. I reached out to one of their help people.
Me: How on Earth do I get the AI to generate truly top down images for me? I just want prototypes for something that looks like Hotline:Miami. I specifically request true top down in various ways, including specifying that the camera is directly overhead, and all I get are these images where the camera is looking at the character from directly in front.
Sapphire (Helper): How much of the character are you describing? The AI doesn’t understand context, it is not as smart as most people expect it to be. It is just a mathematical algorithm that matches your prompt words to its training. This means that if you describe a bunch of details that would only really be able to be seen from the front of a character then that’s what the view is going to try to show you. But if you describe basically just “the view shows the top of their head” you are more likely to get a top down aerial view.
Me: Here’s an example of one of my prompts:
“Top-Down View (90*). Female video game protagonist. Camera directly overhead. Crown of head visible.”
Result is yet another image looking at a standing character from the front. Occasionally, I get a sort of 45* perspective from the top/front.
EDIT: I’m wondering if I can upload reference images on this site. Maybe if I fed it Hotline:Miami images for a while it would eventually get it.
Helper: Not exactly. The AI is not learning from us, it doesn’t remember images it has made previously. It was fully trained before it was made available to us. The exception is Lora fine-tuning models which allows pro-users to train the AI on a particular style or likeness, but this cannot be done using copyright material anyways.
Also, (90*) doesn’t mean what you want it to to the AI for the reasons I described above, it is not thinking and it doesn’t know what that number means.
…
This certainly isn’t the style you want, but here is an example using a simple prompt to get the top down look
Eventually, Sapphire directed me to an image she had created, which actually managed to show a character from a top down view. Unfortunately, the character is levitating, they are in the middle of a street, and are a double amputee with no legs.
However, this was the closest I’d been to what I wanted, so I figured that I’d try her prompt, and see what I could get. After all, we’re using the exact same prompt as her. Surely, we’re not going to get a shot from directly the side again.
I tried four more times and as you can see, none of these creations come remotely close to showing a top down image. Also, the girl on the bottom left has her breasts on backwards. Maybe even her entire torso.
Eventually, I decided to use the following prompt, which specifies many times over that the AI is to show the top of the characters head, to generate a bunch of images in various styles.
Topdown video game overhead street view, aerial view of Female video game protagonist. The view shows the top and back of their head. The camera is directly overhead. They are swinging a pipe.
The results weren’t useful, but they were funny.

Why is she gigantic?

The head is screwed on so badly, you might miss the claws for fingers.

How many elbows does she have?

Even ignoring the arms, this wasn’t the only image where the pipe mysteriously changed size halfway.

Also, she has two left thumbs.

I can’t even begin to understand why the AI wants to put the pipe through the characters head, but this happened twice.
No, none of the “how to use AI” tutorials helped. These programs are idiotic. If you want to make something that is even moderately unorthodox, such as true top down images, prepare for the AI to fail no matter how carefully you write your prompt.
Also, prepare for it to add extra fingers, characters, or appendages to your images for absolutely no reason.
We’re quite a ways from anything actually productive, since I’d need to clean up the images manually to create transparencies anyway, and ideally I’d get animations out of this. For now, the better prototype solution is just to get actual Hotline:Miami spritesheets off the internet and use those until they can be replaced with our own, handcrafted assets.
Anyway, that was my morning. If anyone has any help, I’d be glad to hear it.
I wonder how much it would cost to hire some Chinese high school kid with an aptitude for art to make what you want?
My experience with AI was using a search engine. I wanted to know what type of disc brake pads my bicycle uses (there are at least dozens of different shapes, all incompatible with each other, and the bike manual and manufacturer website have no information at all). I spent twenty minutes trying to get a straight answer out of the electronic bullshit artist, and never got the information I needed. In fairness, this is usually my experience with non-AI search engine searches, too: I search for something and get hundreds of pages of completely irrelevant bullshit instead of what I searched for.
It did, in the end, suggest I take it to a bike shop. This would likely have taken the same amount of time as my attempted AI search, but likely would have yielded an actual answer.
“The electronic bullshit artist.”
Part of the problem with neural net “AI”s is that they speak/act with such confidence that it’s fairly easy to erroneously assume they aren’t spouting total nonsense.
You’re doing it wrong. Easiest way to get what you want is to use neural style transfer and feed it reference material at the exact angle you want. Or do a proper LoRa pipeline.
Out of the box prompt to image generatori are useless.
I figured that I’d need some way of feeding the neural net reference photos. Unfortunately, that’s a premium feature for most of these companies, although if it actually works I’ll happily pay some small sum.
Had every single experience here myself a year ago lol.
But there’s a few prompt tricks you are lacking as i got very good concept art from ai years ago when i finally understood what makes it tick.
I worked the concept art over myself and id say it has given a good productivity boost.
Frankly i could probably get whatever you want pretty well if you just dm me whatever your need and ideas are
Alright, I’ll send you a small list of requests and see what we can get.