Turds in AI Generated Art (novalis.org)
82 points by luu on May 30, 2023 | 121 comments


I don't understand the apparent expectation that AI tools generate perfection, every time.

People don't create perfection. If you hired a graphic artist to create an image and gave them the same prompt, there would almost certainly be things that you'd want changed.

So far, AI seems to do very well with ambiguous prompts that give it enough latitude to "do what it does best". Areas of an image where there are a lot of source images to work from end up great, while areas where there are few - or where there is a lot of variability in the source images - don't do so well.

Maybe we'll get to the point where AI generates acceptable (or even excellent) results the first time, every time. Maybe it'll even be soon, but we're not there yet. For now, AI is a very useful tool that can multiply the productivity of professionals and let amateurs fake being entry-level.

For now at least, there is a huge gap between "entry level" and "professional". Because of that, we'll need professionals for the foreseeable future.


The article isn't really asking for perfection. I think the key point being made is that these are basic mistakes that even a non-artist can easily recognise (the hallucinated objects being a particularly bad example), and which are a big obstacle to taking AI-generated work seriously.

Who cares if the work that AI produces is great, or even good, if it can't stop generating nonsense at the same time? The article even mentions this - you can try to work out the turds, but then productivity takes a big hit, and you're still not guaranteed any consistency. So what's the benefit of the AI at that point, from the perspective of a professional, or even an entry-level one?


> I think the key point being made is that these are basic mistakes that even a non-artist can easily recognise

Sure, I can recognize them.

I can't draw at anywhere near the level of the output in the article, though.

AI (e.g., MidJourney) lets someone like me, who has neither the ability to create art on my own nor the time/motivation to acquire it, make whatever I need. If I see a "turd", I can regenerate that portion of the image until I'm happy.

To put it another way: AI makes different mistakes than an artist would, at any skill level. The strengths and weaknesses of humans and AIs seem to be similar in scale, but they're not in the same areas.


>If I see a "turd", I can regenerate that portion of the image until I'm happy.

Assuming the API is free, and you have unlimited time, sure. But if the API costs money for each generation, turd-free or full of turds, then at some point the cost of calling the API over and over and throwing away the turds starts to approach the cost -- both in time and money -- of just paying an artist to do it right in one iteration.


You're off by so many orders of magnitude that I'm forced to ask if you've looked into text-to-image generation prices versus artist commission prices, as well as completion timeframes.

There are a lot of things that AI text-to-image just can't do, like generate really novel or even uncommon configurations of objects that look like they're still obeying the laws of our universe. Plenty of use cases remain, for now, where it's true you do want a human artist.

...But underneath the (expanding) umbrella of things that you can do with AI, you simply cannot compare the costs. Humans are just absurdly expensive and slow compared to GPUs.


I see AI tools as being useful for getting a first draft of something (art, code, text, etc.) which the artist, writer, etc. can use to edit and refine into a product.

It can also be useful for experimenting with/iterating on concepts. For example, getting the tool to create 4-5 concepts you can use for inspiration.


Last time I tried GitHub Copilot to see if the "using AI to get a draft" workflow worked, I ended up with too many situations where I'd spend time reading and tweaking the generated code, to the point that I had essentially rewritten the entire thing. This wasted a lot of time compared to just writing it the way I wanted in the first place. I'm really not sure how to incorporate it into my workflow.

I really care about code being correct and looking similar in both style and thought process to the rest of the codebase. People have an easier time reading code that has internal consistency in the way it does things.

I wish it worked better, because it's already very good at picking things up from context.


Here are more nonsense shadows by AI. Oops, not AI, but famous paintings by famous artists.

https://thereader.mitpress.mit.edu/the-art-of-the-shadow-how...


The difference is that those people invented art; they didn't copy it without understanding what it is.

Also, Midjourney and Stable Diffusion can draw shadows because humans figured out how to do it first.

Humans, being humans, eventually figured out how shadows work. They didn't need to be shown thousands of examples of correct shadows before learning how to do it.


AI tools are in their infancy. People are so scared and threatened that they're calling AI outputs "turds".

What's been accomplished thus far has been mostly pure research. Now billions of dollars and human engineering hours are going into this problem to optimize for ease of use, speed, productivity, and utility.

In four years these people will all be using AI or they will be out of jobs. Period.

Don't use AI => Go extinct like a butter churner.


The article is not calling generative work as a whole "turds"; rather, it's being used as a punchy term to describe the tell-tale smells of AI-generated content. I didn't get the impression that the author wanted to throw it all out; on the contrary, they are trying to identify and give names to specific weirdness, precisely because AI is in its infancy and we don't have names for all this newly AI-generated content. The blog post is a little hand-wavey about it, but it's not meant to be a technical blog post either.

This kind of work, as fluffy as it feels, is important to the progress of new and not-fully-formed innovations. If the turds the article points out end up being a long-term problem for AI-generated art, then we already have a name for them. If they get stomped out once and for all in a future version and people figure out how to make this work in a general sense, then "turd" as a name for these artifacts just won't ever catch on, because it won't be needed.


Isn't there something about how getting the first 90% takes much less effort than the last 10%? I have a feeling AI will never become perfect; there will always be trivial mistakes such as these. There will be fewer of them for sure, and perhaps they'll become even less obvious as the technology matures, but there will always be mistakes made by AI which would make a professional deeply embarrassed.

Also, the notion of "use AI or lose your job" sounds kind of ridiculous when AI is considered a tool in the toolbox of a professional. It is like saying that cabinet makers should use routers or be out of a job, or statisticians should use Kalman filters or lose their jobs. Sure, there are times when an artist will use AI in their $dayjob, there probably already are, but predicting that artists who don't will lose their jobs is weird.


It's similar to a secretary who refuses to use a computer. Maybe you can find a few odd examples of that working out in the real world, but you're largely going to be out of a job if you don't minimally learn how to use a computer.

There are going to be many white collar jobs where if you refuse to use ai tooling, you're going to be an order of magnitude slower than your peers who do use it.


The job of a secretary does not offer nearly the same creative outlet as the job of a graphic designer or an illustrator, let alone an artist. Also a computer is a way more general purpose tool than AI generation.

The job of an illustrator is to return a finalized product; how you get there is less important, and sometimes the quality of the product, the unique style of the artist, etc. is way more important to a customer than the speed with which it is created. The work of a secretary does not have this aspect. I think the secretary analogy does not work in the same way as a woodworker analogy, a clothes-maker analogy, or even a programmer analogy.


Let's not pretend most white collar work is deeply creative or extremely logically rigorous. Sure, the top 1-10% of the field will still be doing deeply unique work that requires a lot of human input, but there's a ton of (code|graphic|marketing|etc.)monkey work out there, and that is what most white-collar professionals do. If you're one of these professionals, you'll most likely be forced to adapt to these new tools under the implicit understanding that those who don't will soon be far less valuable and as such get replaced by those who do (adapt).


> In four years these people will all be using AI or they will be out of jobs. Period.

> Don't use AI => Go extinct like a butter churner.

There's a great reason to be scared of this. With enough automation, you won't have a job. Without jobs, there can't be an economy. And therefore it's game over for society as a whole.


Surely you've seen the chart of radiologist jobs vs. time?

(Hint: still hiring)


A lot of people use it as an exploratory tool. The models are also quite good at generating base images which you can bring in as the bottommost layer in PS and start iterating on.

The author's main complaint, the "turds" (a rather crass descriptor), is really more about the visual coherence of the overall image, and many of those issues are easily fixable if you have even a bare modicum of design/tech skills.

The author also seems largely unfamiliar with the most important tools of generative art, which are image2image and inpainting. These tools make it relatively trivial to mask certain sections of an image and replace them with background or insert new objects; you can even use things like ControlNet to set up specific perspectives and angles using depth/segmentation maps.
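(If you want a feel for how mechanical this is, here's a minimal inpainting sketch using the open-source diffusers library. The checkpoint is the commonly used one, but the file names and prompt are made up for illustration:)

    import torch
    from PIL import Image
    from diffusers import StableDiffusionInpaintPipeline

    pipe = StableDiffusionInpaintPipeline.from_pretrained(
        "runwayml/stable-diffusion-inpainting",
        torch_dtype=torch.float16,
    ).to("cuda")

    image = Image.open("room.png")   # the generated scene
    mask = Image.open("mask.png")    # white where the "turd" is

    # Only the masked region is regenerated; the rest of the image is kept.
    fixed = pipe(prompt="empty wooden floor, soft ambient light",
                 image=image, mask_image=mask).images[0]
    fixed.save("room_fixed.png")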

You can do fast prototyping, storyboarding, conceptual art, mockups, etc. Most of my friends who work in the professional design world are already leveraging these tools.


>So what's the benefit of the AI at that point,

So really, we're going back to about 1750 at this point and asking the question "What is the point of making this machine that makes mistakes pretty often and is slower than a human?"

And then two centuries later we would answer the same question with "Why the hell would a human do that? A machine is a thousand times faster."

It's not going to take us another two centuries to go from the first machines to an automation powerhouse. Hell, I'm not sure if it's going to take us another two years to get past most of these problems.


If in 200 years we have no human artists left, we won't have any generated art either, because there won't be anything to learn from.


You've obviously never used the AI image generators :)

It takes me 2 minutes to generate a bunch of iterations of an AI image, masking out and replacing elements as I iterate on whatever might be undesired. The result is an image that would have taken weeks to make otherwise (or been impossible, if I lack the talent).


I hate the self assured smugness everyone in the AI art communities seem to operate with.

"You obviously never used the AI image generators passive aggressive smiley face"

Like, why are you guys so obnoxious when you talk about this crap?


Because there is a ton of control you have over the image and these tools are incredible.

It is ridiculous for someone to knock a tool if they barely know how to use it.

You see practically no Midjourney prompts using :: because most people haven't bothered to read the documentation.
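For anyone curious, :: splits a prompt into separately weighted concepts. Roughly the canonical example from the documentation:

    hot dog       one concept: a frankfurter
    hot:: dog     two concepts: a dog that is hot
    hot::2 dog    "hot" weighted twice as heavily as "dog"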


It's a legitimate call-out, if a bit blunt. You really have no business attempting to assess the value of a system that you haven't even bothered to interact with.


Just like how these AI companies built these systems without interacting with anyone who actually makes art.


Well, have you? Obviously we're both discussing this here and not in an "AI art community", and your tone comes off as trolling at best, both in this comment and your original comment. HN is about curiosity, and it's prudent to test and try stuff before you bash it. In this case, the AI art tools empower everybody, and they're free as well, so why not learn them (or not)?


Empowering everyone to produce crappy images to flood online spaces used by artists after stealing their work. Truly amazing product we have here.


[flagged]


You've unfortunately been breaking the site guidelines repeatedly and badly. This comment here was the most egregious of the ones I saw - we ban accounts that do personal attacks like this. But you've been breaking them in many other places too. That's not ok if you want to keep posting here.

Would you please review https://news.ycombinator.com/newsguidelines.html and stick to the rules when posting here from now on? We'd appreciate it.


Ah, yes, ad hominem


The other huge frustration with AI tools is that getting their output to a "professional" level requires extra professional-level time to fix those problems.

If those images were created with layers in PS, it is easy to modify and swap out and correct those issues.

With generative AI, you could in-paint until it generates a good enough version, but who has time for that? Recently, I generated an image with an extremely malformed hand. After 4 attempts to repaint it failed, I gave up and cropped it out of the image.


Photoshop supports non-destructive generative layering as part of their new beta, in conjunction with Adobe Firefly AI. Unfortunately, that means you have to install Adobe CC...


I think it just removes objects; it can't redraw a hand with 6 fingers, can it?


Generative Fill can also generate objects, etc... So you can use it to fix a hand.


I tried replacing the hand with a hand holding a bottle, so at least Stable Diffusion would need to draw fewer fingers, but it still looked off (just like the weird angles and lighting on the objects in the room in the blog above).


> I don't understand the apparent expectation that AI tools generate perfection, every time.

The more interesting part of this for me is the degree to which this intermediate conclusion has translated into: because it's not perfect now, it therefore cannot replace humans at doing X.

In fact, I expect LLMs, etc. to improve at a breathtaking pace, and whatever nitpicking "turds" people find with them now, will be trivially fixed.

I agree with your overall point that professionals will still be needed in some, perhaps changed, capacity for some time to come. But if you think your job, as it exists today, is safe because LLMs, etc. can't do it exactly perfectly right now, then there is probably a failure of imagination happening.


I can't help but feel we've moved too quickly from astonishment to disappointment.

Lots of people are queueing up to pour cold water on generative AI when it's still something so new that I have to pinch myself to remind myself it's actually real.

(And yeah. I'm not denying the drawbacks. It's functionally useless for many (or even most) proposed applications. But can we still pause for a moment to be impressed it actually exists?)


I think it's pushback due to how AI is being sold as the ultimate tool that will replace artists/programmers/writers/whomever.

Because the technology has been making big leaps over the past couple of years, the comparisons are now being made not with what used to be before, but with what is being promised. And in that regard, things do fall short.

Basically the hype builders hyped things up so much they started hurting the hype.


This is a false conclusion that so many seem to be falling for. YOU couldn't write or make art like a professional before, and now you can do 90% of it, with some caveats. That's a major headwind that pros will understandably try to deny. When the digital camera and then the iPhone came out and obliterated the vast majority of professional photography, pros nitpicked in forums like this till kingdom come, but it didn't change anything.


The problem is that customers did not necessarily appreciate the difference between their amateur creations and the professionals' enough to pay for it. I feel like there is going to be a rerun of the decimation of the professional photography market with designers, illustrators, writers, etc.


I agree entirely, and I weep for it; I was even laid off myself. But I'm not going to waste any more time than that, and am going to get on board with where things are going.


Which creative field did you work in, photography?


Interaction Designer @ Google, pretty senior too

And was a photographer long ago, double luck huh.


If I draw a picture of a fantasy-world bedroom, it might not be very detailed or well-shaped, because I am emphatically not a professional illustrator. But I can tell you what every single object is supposed to be, why I thought it belonged in a fantasy-world bedroom, and why I put it in its particular spot in that bedroom.


If I were to draw a fantasy bedroom, there's a pretty good chance some parts of the room would be squiggly doodles meant to be interpreted by the viewer's imagination rather than a specific object I had in mind.

(I'm also not an illustrator.)


>but I can tell you what every single object is supposed to be, why I thought it belonged in a fantasy-world bedroom, and why I put it in its particular spot in that bedroom.

I mean, post hoc, the machine could do that too. In fact, that is the only way your brain can explain anything.


Huh? Are you saying human artists don't consider composition while creating their works?


Not always, not every time.

I've created music and art in a visual medium and there's a lot of post hoc justification that your brain likes to present


Sure, and AI could draw a better picture without those things, and most people would prefer the better illustration to the conceptually flawless composition.


I think we're comparing apples (heh) and oranges a bit here. iPhone cameras are essentially the same thing as digital cameras, and now anyone can use them. Maybe amateurs are messing up their lighting or exposure or composition or whatever, but they are doing the same thing a professional photographer once did. It's also pretty difficult to tell professional photography from amateur stuff.

AI isn't drawing like a human artist does. It's a completely distinct process from someone putting marks on a page. The caveats, or turds as the author of the original article put it, are pretty difficult to fix without actual training and skills in art, and acquiring those skills is a lot harder than learning to take photos.


"We tend to overestimate the effect of a technology in the short run and underestimate the effect in the long run,"

-- Roy Amara


We also tend to underestimate AI hurdles in the short run, especially in other fields than our own.


As far as I can tell, AI is about 10 years ahead of where experts were forecasting in ~2015.


I don't think it's being sold that way.

It's a very complicated subject with several meta-conversations, and things being the way they are, a lot of people need to project arguments onto "They"

Here, we see a misinterpretation: it's not "it's going to replace the profession wholesale"; it's a simple causal chain: "hey, you do 30% more work with GPT" => then you need ~30% fewer workers => employment is affected.

And we could run out a centithread chastising any number of groups under the sun for maximalist claims. E.g., from here, we could spin directly into "well _some_ people argue for complete replacement of artists", spin that into AGI doomers, etc.


> AI is being sold as the ultimate tool that will replace artists/programmers/writers/whomever

I see a potential risk here.

This is HN, so let's consider programmers in isolation for a moment. Today, I can issue the following prompt to ChatGPT and get a solution that works:

> Using Python, generate an example unit test that uses monkey patching.

That gives me a runnable, valid example test. I can then ask it to refactor it to use pytest instead of unittest:

> regenerate the above code sample using pytest instead of unittest.

Here's the resultant code:

    # A simple class that we want to test
    class Calculator:
        def add(self, a, b):
            return a + b
    
    
    # Monkey patching the add method to subtract instead
    def mock_add(self, a, b):
        return a - b
    
    
    # Test the monkey patched method
    def test_add_with_monkey_patching():
        # Create an instance of the Calculator class
        calc = Calculator()
    
        # Monkey patch the add method of the Calculator instance
        calc.add = mock_add.__get__(calc, Calculator)
    
        # Test the monkey patched method
        result = calc.add(5, 3)
    
        # Assert that the result is what we expect
        assert result == 2
It _works_, but it completely misses the point. The test case is testing whether or not the monkey patching works as expected - not testing something using monkey patching to isolate a side effect in the method under test, which was my intention.
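For contrast, here is roughly what I had in mind: a sketch (the function and endpoint are made up for illustration) that uses pytest's monkeypatch fixture to isolate a side effect, so the actual logic under test gets exercised:

    import requests

    # Code under test: fetches a price from a (hypothetical) API and
    # applies a markup. The network call is the side effect to isolate.
    def get_marked_up_price(item_id):
        response = requests.get(f"https://example.com/prices/{item_id}")
        return response.json()["price"] * 1.2


    def test_get_marked_up_price(monkeypatch):
        class FakeResponse:
            def json(self):
                return {"price": 100}

        # Patch out the network call so the test is fast and deterministic;
        # we're testing the markup logic, not requests or the API.
        monkeypatch.setattr(requests, "get", lambda url: FakeResponse())

        assert get_marked_up_price("widget-1") == 120.0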

Here's where I think there is a potential major issue: As someone who is currently a "Senior Staff Software Engineer" and has been doing this sort of work for two decades now, if you had given me the same prompt my first response would have been "use monkey patching to do what?". ChatGPT doesn't do that. Instead, it gives you a response that best fits the question. It includes a half dozen paragraphs explaining the code, what it does, and what monkey patching is, sure... but it doesn't even touch on why you'd want to use monkey patching, when it is or is not the best approach, or any alternatives. In short, ChatGPT lacks the intuition that a human programmer will develop over the course of a career.

It's getting better. A few months ago, I could have asked ChatGPT the same question and probably only 30% of the time gotten a result that would even run. In the past couple of months I don't think I've run into that at all.

The "intuition" thing is a big deal, though. ChatGPT is a great tool in the hands of a competent engineer who has the experience to examine the output with a critical eye. Today, I could see it replacing a junior engineer in a professional setting. We should not use it in that way.

If we do, in a few years we'll get to the point where the current generation of senior engineers are retiring. If we've replaced the majority of our junior engineers with AI, we'll be left without people who can use the AI-generated results effectively.

There has to be a way to balance the savings represented by these tools (in terms of both time and money) with the fact that the people who are able to most effectively use it today only got to that point because they've spent much of their adult lives developing the exact skillset that we're proposing to replace with them.


>If we do, in a few years we'll get to the point where the current generation of senior engineers are retiring. If we've replaced the majority of our junior engineers with AI, we'll be left without people who can use the AI-generated results effectively.

I mean, you might not look at job postings for Jr. Engineers anymore, but they all ask for someone who can lead and mentor teammates less senior than they are, so it seems like there won't be many replacements for senior engineers retiring in the next decade regardless of AI's existence.


Jr level job requirements in the future "Must have at least 75,000 GPU hours of experience"


I told it that it was a senior engineer and I was a random client and asked the same thing. It explained what monkey patching was for and created an example where a function pulls from an API and we patch it. Depending on runs or prompts you may see more or less explanation. I like getting it to first reason about its role and task, so it has that in its context when replying.

Have you tried telling it who it is? You're coming at it with context about how it should act without telling it, and it can role-play a lot of things. I spoke to a teacher dealing with kids doing their homework too well with it, and showed that you can just say "write X as a student with middling English" and get realistic mistakes.
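For reference, "telling it who it is" is just a system message. A minimal sketch with the OpenAI Python library as it worked at the time (the model name and persona wording are just examples):

    import openai

    openai.api_key = "sk-..."  # your key

    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[
            # The persona lives in the system message; without it you
            # get the generic "best fit" answer described above.
            {"role": "system",
             "content": "You are a senior engineer advising a client. "
                        "Ask clarifying questions before writing code."},
            {"role": "user",
             "content": "Using Python, generate an example unit test "
                        "that uses monkey patching."},
        ],
    )
    print(response.choices[0].message.content)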


I hate that every time someone adds a nuanced view to these discussions, someone comes in to ask if they're "using it right" and asks them if they're typing words right as if that actually stops this thing from hallucinating bullshit all the time.


It's fine to not want to learn to use a tool, but it doesn't really add much to the conversation to get angry about people talking about how to use it.

> as if that actually stops this thing from hallucinating bullshit all the time.

This has nothing to do with hallucinations though. It's about giving something a poorly specified problem and then complaining that it gives a valid answer but doesn't act like a particular persona when that persona was not explained. This is a very general problem with a general approach to fixing it. Want it to challenge you on what you're asking it to do? Tell it that.


He never asked you to explain how to use the tool. You're assuming he doesn't know how to use a tool that requires approximately 10 minutes to learn. It's completely patronizing to ignore his point and continue with "Well have you tried.." in a way that fundamentally does not address the issue he's bringing up at all. Your tips and tricks add nothing to the conversation.


> You're assuming he doesn't know how to use a tool that requires approximately 10 minutes to learn

That raises the question of why they used it ineffectively in the first place then. They asked it to do something, it did, but they say it misses the point that a senior engineer would spot. My point is that if you want to judge how well it operates in replacing someone you have to actually specify who it's supposed to be imitating. It's not idle speculation, as I said I did this and it worked. Asking it for an example of monkeypatching gave back a proper example, along with a warning about maintainability.

Telling it I was a junior engineer and wanted to use monkeypatching did the same thing. Asking a followup of why I should be so careful with it and what I could do instead, it gives explanations of several problems and then a few other approaches. Telling it simply to be a senior while I'm a junior, and that it should come up with followup questions for me, it not only explains those problems but gives me the kind of followups a junior should ask but may not know to. Exactly the kind of intuitions about what's going on that are said to be missing here.

So I think the example used is significantly weaker than it appears, and the argument hinges on what GPT4 is lacking.

Now, a very solid response to the problem posed is that if it can approximate a more senior role for discussions, it can make junior engineers more effective and teach them those skills.


FYI: I’ve read the whole comment chain under this post before replying

> I told it that it was a senior engineer and I was a random client and asked the same thing.

I’m aware that ChatGPT is capable of performing the task I chose. I agree that it’s a relatively weak example, especially if it was my goal to show that AI can’t do this.

That’s not really what I was trying to show, though. My point is that you knew two things that many/most people probably wouldn’t:

1) That crafting your prompts to get the LLM to adopt a “persona” impacts the results significantly

2) That monkey patching has downsides, and that understanding when it is appropriate is important

I strongly suspect that you knew both of those things because you are a senior engineer yourself. My point is that some level of domain knowledge is necessary to get good results from it; my fear is that one day in the not too distant future, far fewer people will be around with that knowledge, because we’ve slowed the “senior engineer pipeline” down to a trickle through the use of these tools.


I disagree with those points. It brought up that monkey patching has downsides with an explanation, and elaborated when asked what they were.

Here's a relevant paragraph

> That said, it's crucial to understand that monkeypatching can make your tests harder to understand and maintain if used excessively or improperly. When you monkeypatch, you're essentially altering the normal behavior of your code. This can lead to situations where tests pass because of the specific setup in the test and not because the actual code is correct, making it less effective at catching real bugs.

More than that, I could instead tell it at the start to tell me the questions I should ask as followups. Here's the list from when I told it I think I need to use monkeypatching:

1. When is it appropriate to use monkeypatching and when should I avoid it?

2. What are some common pitfalls or mistakes to avoid when using monkeypatching?

3. Can you give me more examples of how to properly use monkeypatching?

4. Are there any other techniques or tools for mocking or stubbing that I should consider?

I could then just say "answer those". No prompting about assuming downsides or problems.

I agree on point one but it's a very easy lesson to learn - try telling it that it's a pirate. It's also not actually required because someone like me can create the persona prompting and then everyone else can just use it. I'm not subtly crafting a weird prompt, it's just telling it who to be for the most part.

> my fear is that one day in the not too distant future, far fewer people will be around with that knowledge, because we’ve slowed the “senior engineer pipeline” down to a trickle through the use of these tools.

On the other side, what we have here is potentially an infinitely patient, always available mentor. You don't have to crash through mistakes on your own to learn, though some lessons are learned harder because of that.


It's like when you hire a smooth talking guy who makes a great first impression but actually that's his main skill and he doesn't perform very well.

I'm a natural contrarian and always assume the worst first. I feel like I've warmed up a lot to generative AI (text particularly) but understand the limitations. People who came in from the other end, thinking it was more than it is, are naturally going to be let down when they see its performance in the sober light of day. I think more people are in this category, because a big strength I see is for generative AI to produce stuff that superficially looks really great, but doesn't hold up if you actually want to do your own thing with it.


A talking dog is very impressive when it first appears. When suddenly talking dogs are everywhere and drown out the people I want to talk to, they become annoying.

Moreover, it's not at all strange that people learned what generative AI looks like. When I see it "in the wild", it instantly becomes "generated thing" in my mind. And precisely because the images are created by a process of averaging, once I'm used to them they have a "nothing special or interesting here" feeling, which isn't a good feeling for many things, for example accompanying illustrations.


I don't think you can tell AI generated images from real images anywhere near as much as you think you can.

I'd bet good money you couldn't achieve more than say 60% in a properly controlled test


I think you're missing the direction this AI stuff is going.

I would claim to be able to tell an AI-generated image from an illustration hand-drawn by a person if that illustration had roughly the characteristics of standard AI-generated images, i.e., a sort of curvilinear magic-realism feel, crude knowledge of anatomy and perspective, uniform soft lighting, etc. And there are quite a few artists out there like that, and I shudder for their careers.

The problem is that the value of things that look like AI has been reduced because they're everywhere, and that includes a fair number of non-AI-generated things, sure.


It's trivially easy to generate AI images that aren't "curvilinear magic realism" etc.

I can't help but feel you're just betraying your limited exposure to what's currently possible.

You're confusing "what lots of people seem to want to generate" with "what the models are capable of"


It's usually pretty obvious.


It's often most definitely not.


Welcome to the hype cycle!

As I've gained more experience with it, I feel I've seen the boundaries of what it's good at, what it isn't good at, and what it isn't useful for. It's good at like 80% of something, but there are so many things in that other 20%.

In the right places it will make certain individuals more productive in many ways, but I don't feel the way I did 6 months ago.


It's a testament to how quickly humans can set a new baseline at their circumstances. I try to remember to say "WHAT THE FUCK THIS IS UNBELIEVABLE" at least a few times a day when using GPT and MidJourney, but it's much easier to be irritated that ChatGPT lost the context of the code it was helping me write than to be blown away that I can just casually use one of the greatest technical achievements in the history of mankind while I'm sitting here in my underwear.


> "...one of the greatest technical achievements in the history of mankind..."

... is doing a lot of work here.

This is why expectations are exceeding reality. That said, the reality of these generative AIs is pretty damn impressive. I just wouldn't put it up against computers, cars, planes, boats, nuclear energy, medicine, chemistry or any of the other technological advances that did change the world. Not until the grounding problem is solved. After that, maybe.


The first car didn’t change the world, nor did the first plane, etc. At least not directly.

But they did pave the way for the examples that did change the world.

No different here. Generative AI will change the world. Doesn’t mean these models will in and of themselves.


Most of us aren't playing with multimodal models yet. Text can describe any world, but when it comes to AI watching live video feeds for learning, can it ground itself in reality with that?


I'm reminded of Louis CK's rant "Everything Is Amazing and Nobody Is Happy": https://m.youtube.com/watch?v=kBLkX2VaQs4

Like, nothing like this was remotely possible a year ago. I’m sure we’ll hit a wall at some point, but you certainly shouldn’t dismiss things as not possible because they’re not currently possible.


It’s only disappointment now after evangelists promised astonishment in the first place - I’m no AI-naysayer, I’m just not a true believer and I have no interest in becoming one. It’s new, it’s cool, it’s not perfect, it’s not magic, it’s not a silver bullet. It is what it is.


It's at a point I never thought we would get to in my lifetime.

I remember the original AI winter and how implausible sci-fi depictions of AI seemed for decades.


This happens with every innovation. Once something becomes sufficiently commonplace, it becomes boring and a larger target for criticism.


Seems strange to me to prompt an AI to generate this kind of stylized isometric video game art that's clearly intended to look like it's made out of uniform, reusable "object assets"... but as a single-shot "gestalt" of each scene. (It's like doing an oil painting of a Minecraft world, and expecting the voxels to look voxel-y.)

For the sake of artistic coherence, wouldn't you rather have those same tables and chairs and beds be reused everywhere else you need a table/bed/chair in the game?

Which is to say: wouldn't it make more sense to ask the AI to generate individual objects; select the best ones, over many iterations; put those together into an asset library (i.e. a tileset/spritesheet); build some training-data tilemap "scenes" out of those objects for few-shot training; and then ask the AI to generate you more tilemaps, with the tileset provided as part of the input? (In other words, to do the same thing for levels that it's doing for music when it generates MIDI.)

(Heck, you might even be able to train it to do both steps at once — train it on screenshots of a tilemap-editor UX (e.g. RPG Maker or Tiled), where the tileset palette and the resulting tilemap are visible at the same time; and where the screenshots are composed such that the tilemap only uses tiles available in the tileset. You'd expect it to learn both to invent nice reusable tiles for you, and also to use those tiles to build you a scene.)
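(A toy sketch of what "generate tilemaps, not pixels" might mean as a target representation: a small palette of reusable tiles plus an index grid. All names here are made up for illustration.)

    # A scene as a tilemap: the model emits indices into a fixed
    # tileset instead of freeform pixels, so every bed/table/chair
    # is the same reusable asset.
    TILESET = ["floor", "wall", "bed", "table", "chair", "door"]

    bedroom = [
        [1, 1, 5, 1, 1],
        [1, 2, 0, 3, 1],
        [1, 0, 0, 4, 1],
        [1, 1, 1, 1, 1],
    ]

    def render(tilemap, tileset):
        # A real renderer would blit sprites; here we just resolve names.
        return [[tileset[i] for i in row] for row in tilemap]

    for row in render(bedroom, TILESET):
        print(" ".join(f"{name:>5}" for name in row))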


Better context for those questions from the creators of the game jam entry referenced in the post: https://blog.luden.io/generated-adventure-the-postmortem-of-...


I was looking for actual turds in the picture. Disappointed.


Yeah, I was looking at the bushes in the plant pots and thinking "are these the turds?" until I found out that the author is just being extremely picky. I do like to imagine a future dominated by AI where we get weird emergent patterns, though, like the AI just secretly including turds everywhere it can.


I don't think they're being picky at all. These are obvious mistakes.


You're right, once you look carefully. The reason I'm reluctant to swim with this article is that this is exactly what I expect from generative AI in the first place. The article's perspective seems to come from a place of naïvety.


Me too! Damn clickbait.


I want to emphasize something that other comments seem to have missed - when a human artist makes a mistake, I have a sense of sadness, or pity. But when an AI makes a mistake, the types of mistakes it makes are revolting. I can't watch that beer AI video because looking at it for a couple of seconds makes me want to throw up.

That's the real problem with the "turds" in AI art - they're just close enough to pass first inspection, but you only realize afterwards how messed up they really are.

I honestly think for now I'd rather just deal with my own inadequacies as an artist than try to bargain with the little freaky demon in the computer to draw actually nice images.


I find the messed-up imagery disturbing not necessarily because of what they depict but because it's such a convincing emulation of "visions" I've had when my brain is doing... not so great. Illness, fever dreams, and bad drug trips.

It looks like what most people have probably seen in some capacity but wouldn't or couldn't ever put down on paper. That's deeply fascinating in its own right.


This is just noise, similar to how earlier AI generators couldn't do fingers and piano (and computer) keyboards.


Hallucinations. Larger, better trained, multimodal models are needed. All current image generation models are too small to be usable in serious settings.

Until you have all that, you can at least minimize the hallucinations in the result. I have some tips to share (not for the music, though):

-1. You still have to know how to draw; otherwise you won't have the eye to spot the "turds" or the skills to fix them. It will still be faster than drawing absolutely everything by hand.

0. Don't use Midjourney for anything complex. It hallucinates well and gives a good first impression, but it's simply not suited for anything that has complex composition and lots of meaningful detail that needs to be controlled, like the examples in the article.

1. Finetune your model on its own output and your own work.

2. Use higher-order guidance, not just text. Besides the fact that current models don't understand natural language well (let alone your intent), text is inherently unsuited for explaining artistic intent. Use ControlNets on your sketches, latent-noise spatial composition, and other similar tricks; they are just better at that, faster, and more precise (see the sketch after this list).

3. Don't generate everything in one piece; photobash. Make complex compositions out of simpler objects. The model can't keep track of too much stuff. Use automated workflows to save time (otherwise what's the point?). Check out ComfyUI for a powerful node-based editor: create a pipeline once and use it many times. Use plugins for common software like Photoshop, Krita, Blender, Houdini, etc.

4. Fix your errors manually. If your work is complex, it will be faster.
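(For point 2, a minimal ControlNet sketch with the diffusers library, assuming a scribble-conditioned model; the checkpoint names are commonly used ones, but treat the whole thing as illustrative:)

    import torch
    from PIL import Image
    from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

    # A ControlNet trained on scribbles lets a rough sketch pin down
    # composition and perspective instead of hoping text gets it right.
    controlnet = ControlNetModel.from_pretrained(
        "lllyasviel/sd-controlnet-scribble", torch_dtype=torch.float16)
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",
        controlnet=controlnet, torch_dtype=torch.float16,
    ).to("cuda")

    sketch = Image.open("rough_sketch.png")
    result = pipe("isometric fantasy bedroom, clean soft lighting",
                  image=sketch).images[0]
    result.save("guided.png")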

>But I feel like something that gets lost in the discussion of AI-generated artwork is the turds.

No, it's not. It's the major obstacle in the current models, and gets a lot of attention.


"But I feel like something that gets lost in the discussion of AI-generated artwork is the turds."

You know, we already have a word for this... "flaws". Does the notion that AI-generated artwork has flaws really get lost in the discussion? Midjourney not being able to do hands until recently was a whole meme...


It's a specific kind of flaw—an indistinct, unexpected object in the image.


The author could've chosen a better descriptor. What they're really getting at is overall image "coherence": essentially, things within the scene that are visually incongruous.


They could have, yeah. Even when the items aren't bad, or out of place, MidJourney can't help but fill the space with details. It doesn't like whitespace.


I clicked the article expecting to see humorous little piles of steaming excrement inserted into images. Am I to assume turds in this context mean artifacts or out-of-place mistakes?


I was expecting the same, highly disappointing, but I guess we could whip some up in no time!


Yea, it's not fully automatic yet, but that's what Photoshop's Content-Aware Fill (and now Generative Fill) and Healing Brush are for.


I've only seen GF on photographs; does it work for anything else yet?



We call this "inpainting" here.


>But the shadows are still going to be annoying, and the iteration time is painfully slow.

I seriously doubt this can be described as painfully slow compared to an average artist making the equivalent assets, especially if you get into the local UIs that have tools like inpainting/masking, plugins, LoRAs, checkpoints, and all that jazz.


> And yes, you can paint out the turds in Midjourney and tell it to fill in the area again until it gets it right. But the shadows are still going to be annoying, and the iteration time is painfully slow.

Just cleaning things up in Photoshop with a tablet pen (or iPad) is much faster than drawing this from scratch.


> and the iteration time is painfully slow.

Yeah, that's pretty hilarious. It shows how incredibly far and fast the goalposts have moved. Two years ago, you would have had to hire someone.


Getting the shadows actually right? Not a chance, and I've got about 20k hours in Photoshop.


Previously:

- https://novalis.org/blog/2022-12-05-i-am-frustrated-with-sta...

- HN discussion of that post: https://news.ycombinator.com/item?id=33902248

- The author's response to HN comments on that post: https://novalis.org/blog/2022-12-10-response-to-some-hacker-...

FWIW, the author didn't submit that post, or this one.


I thought they were actual turds


> And yes, you can paint out the turds in Midjourney and tell it to fill in the area again until it gets it right. But the shadows are still going to be annoying, and the iteration time is painfully slow.

...Uhh, no, no you cannot. What the author seems to be describing is inpainting, which is a staple of most Stable Diffusion systems such as Automatic1111, but not something that MJ currently supports.


I guess the retort here would be: you had to circle the turds for someone to notice. JPEG encoding creates lots of turds; actually, the entire output image is riddled with them. Tons of little artifacts.

If you expect perfection out of it you're going to be disappointed, but it's pretty astonishing that it's as difficult to be nitpicky as it currently is.


If I glance at an AI-generated image of an imaginary object or space, the overall impression looks correct. If I engage my brain at all, the turds are really obvious really fast, whether someone points them out or not.


Hold my beer. [1]

NSFW of the revolting kind. No nudity / violence / gore but they are a bit disgusting depending on your taste. They aren't turds but sorta look like them.

[1] https://imgur.com/a/7cwzlrk


A bizarre post that points out the flaws of AI generated content without providing any sort of salient point about it. Capping it off with "I don't want to hear about the fixes for this" is the icing on this particular "turd" cake.


I think people didn't get the link from "turd" to "polished turd" in this blog post. It's what you get when you ask a beginner artist, without any guidance, to create something: instead of making something that has the right composition, perspective, proportions, colors, values, etc., they tend to over-render the image, making it look "pretty". The issue is that they try to compensate for shitty foundations with glitter and sparkles, widgets, shadows, and highlights.

Ultimately it often doesn’t have the right impact.

AI art falls in the same category for me: polished turds, bad writers, bad ideas, bad concepts, boring subjects.

I'm hoping it mostly inspires a generation to make cool shit and enjoy it. If a skilled artist uses AI art, the results are good; if you take beginners and give them the tools, you get polished AI art turds: things that look cool but are ultimately missing the foundation to make them work.

At the same time, perhaps it doesn’t matter, and people want to have polished turds regardless.

Compare this to a game like Unpacking, with carefully crafted rooms and smart storytelling. Or Vampire Survivors, an addictive gameplay loop with assets that almost seem like they come straight out of some GameMaker pack. In the end, it's all about intent, and no amount of polish is going to fix that.


> In the end, it's all about intent

Precisely, and it doesn't matter how you realize it: by hand, or with something more automated (digital manipulation), or with matte painting/photobashing (the most recent resentment target), or with any ML tool.

If you don't have the eye, taste, and rigor, you're going to create something boring. If you do, you'll create something interesting. The issue with the examples in OP is not the tool - it's the creators, who were sloppy (and also incredibly rushed, as it's just a tech demo created in 72 hours).


Most of the mistakes pointed out in the 4 images are things I would've never noticed.


Seems like an overly petty and nitpicky article for a game jam game.


That image of the dragon seems pretty old; also, as the article mentions, most of this stuff can be easily removed with inpainting.


It's funny you mention that: that overall smeared look is instantly identifiable as version 3 or earlier of Midjourney. For reference, they are currently up to version 5.1.


Yeah, I thought about the same thing. I don't know why the author brings up such an image, as it's clearly not the experience you would have using MJ today.


The delta between AI and artist is shrinking but non-zero. This is a good outcome.


This is my take on most AI discussions.

Many are like "Oh yeah, AI isn't as good as us at doing..."

And my response is "Thank fucking god, we aren't ready yet".

Sudden society-changing events aren't great, especially when they start coming rapid-fire. You can go from "there are a few angry starving artists" to "why are people rioting in the streets" pretty damned quickly.


Are shadows really that annoying? I'm willing to bet that most gamers would just glance over these defects.



