So is the logic "technology X is potentially dangerous and therefore should be kept in the hands of the morally ambiguous (at best) corporation that created it"?
If Google made something dangerous, well, I don't trust Google at all. I'd rather have it in the open so I know what I'm dealing with. (I know Stable Diffusion isn't a google product, I'm using it as a neutral-ish example.)
But also, what dangers are we talking about? "The model produces something racist" is not _dangerous_ to anyone; it means the model needs work so that it stops doing racist things. Wouldn't public release help with that?
To take your racist-model example: yes, having it in the open may help with that eventually. Then again, possibly not: the problem isn't just in the model, it's in the training data, so unless that's also released you're limited in how much of a fix is possible after the fact.
In the meantime, a ton of people who just don't know or care about the issues will build products on top of this really cool thing, and because those problems haven't yet been fixed, those products will harm people.
How's it going to harm people? What's an example of what one might ask this to do where "the model very rarely outputs racist images even for innocuous inputs" causes real harm?
The only answers I can think of require us to use the model for unethical things - like profiling people at the airport or something. That would be abjectly wrong for lots of reasons even if the model itself were fine.
If you use the model to make images to use creatively, well, there's always a human deciding what images to use. Also, "I saw a racist picture" doesn't rise to a level of harm where I feel like we have to pump the brakes.
It's harm in a similar way that racist depictions appearing in fiction—and even the lack of positive depictions of marginalized groups appearing in fiction—is harm.
I'm all for diversity in fiction, eliminating caricatures, positive portrayal of marginalized groups, etc. But that's different from saying "I think works which don't have these features should be banned".
That kind of harm is just too dilute to justify this approach to me, especially given all the ethical concerns that exist if you force the model to stay only in the hands of its creators and make them its gatekeepers. (Bearing in mind that the creators of the model are a for-profit corporation, not a bunch of AI ethicists.)
And again, I support improving the model so it doesn't generate racist (or any other -ist) content. I just think that improvement, and indeed all work on the model, is better done in the open.
It's not about banning works that don't include such things. It's about genuine concern that entire categories of works that lack them cause problems.
If none of my images from Stable Diffusion contain black people, that's not a big deal (I mostly don't even ask it for people, because it's not that good at them).
If none of the images from Stable Diffusion or any of its brethren contain black people, that's a problem, especially if we start using these sorts of tools to generate art for things like games, book illustrations/covers, etc....but even if we don't; even if it's just a bunch of people playing around with them on the Internet. Not seeing yourself in media, even this kind of media, does a kind of harm that it's very hard for us "normal" (white, cis, straight, male, etc—and yes, I am all of those) people to fully grasp.
Fair enough, but what does that imply? The Magic 8-Ball that has a minor chance of saying something unintentionally racist is such dangerous technology that it must be kept out of the hands of the peasants?
The archetype of this argument is of course the (in)famous "P doesn't Q people, people Q people" *
While this is "true", the sense in which it is true is so limited as to be entirely unhelpful. If you manufacture P, and you know Q may be an outcome, why are you manufacturing P, and how are you preventing Q?
* Where, as we all know, Q is usually "kill" or "harm", and P might be "guns" or "autonomous vehicles" or "military robots" or "facial recognition" etc.
I see this has attracted some downvotes, which is of course fine. However, it would really help improve the quality of this conversation if people could reply here, and perhaps explain objections to the archetype I've described.
Perhaps it was just my overly convoluted writing style? :) If, though, it was about the content...
The statement "P doesn't Q people, people Q people" is absolutely devoid of any useful information or novel insight, and doesn't take the conversation in a new direction that is useful. In fact, it can kick the conversation into an anti-productive quagmire.
Let's see... "Stable Diffusion doesn't produce racist memes, people produce racist memes." Well duh. A more useful conversation might be about how we protect against possible automated mass-generation of racist messages (or whatever), what roles SD et al have to play, how we deal with possible outcomes, etc etc.
For what it's worth, I do not think Stable Diffusion should be kept under wraps. Paradigm-shifting technology should be discussed in the open. Developers/engineers should be prepared to walk away from anything with a significant downside if that downside hasn't been exposed to thorough debate. And we should be prepared to shoulder responsibility when shiny new toys are used to do terrible things.
> The statement "P doesn't Q people, people Q people" is absolutely devoid of any useful information or novel insight, and doesn't take the conversation in a new direction that is useful. In fact, it can kick the conversation into an anti-productive quagmire.
Are you presuming that everyone reading and responding to this thread is on your level of interpretation or analytical superiority? This is the first time I've read about "P doesn't Q people, people Q people" on HN, and I imagine it won't be the last. There is no "In fact" here, and it already came across as a completely pretentious statement, which given the circumstances is not unusual in a place like this.
(I didn't downvote you but am expecting a downvote)
> Is the knife a murderer or the person who wields it?
I attempted to abstract that, to make my response impersonal. The archetype is based on a phrase well known in some countries, but certainly not all (my apologies for making assumptions): "guns don't kill people, people kill people".
Another reason I avoided that specific sentence was to avoid the strong emotions it invokes.
As for what I wrote starting with "In fact...": the sentences that follow were my attempt to expand on the point about quagmires.
That doesn't seem a) particularly useful or b) like it was hard to do before.
I could easily gather a corpus of racist slogans, a corpus of racist images, and smash them together to get a few thousand racist memes. It would probably take less time than relying on an AI model to generate each one (which takes a few seconds to a few minutes each).
And it's not as though there aren't millions of actual racists out there spewing out racist content constantly anyway.
Given this and the other weirdness that keeps coming out of Google's AI department, I find myself wondering: is AI research considered a serious area of academic inquiry, with respected scholars and stuff, or is it mostly full of crackpots and con artists and self-promoters?
Kind of a weird question. It definitely is a respected area of research, yeah, and Google no doubt do excellent work.
Google's problem is not a lack of seriousness per se, it's that they seem to be stuck in a downward spiral of hard-left blank slateism that views the general public not as humans but as an unwashed mass whose brains are basically empty EEPROMs waiting to be flashed through the eyeballs. Whilst they are elite and super-smart and can handle AI, everyone else would be immediately converted into drooling zombies.
So they've got stuck in this pattern that goes like:
1. OpenAI announces some cool new model that does something neat, invites friends, and says an open version will come later once they make it "safe", read: ideologically pure. When it's eventually released, it's broken by the hacks they put in to try to meet their ill-defined requirements; e.g. see how "cowboy with a cat" got turned into a picture of an Asian woman but worked fine with cowgirls.
2. Google immediately releases a web page stating that they were already doing that and their version is better in every way, but it's far too dangerous to release. As a consequence nobody can verify their claims and their stuff is just ignored.
3. Eventually a mishmash of third parties duplicates the work based on their papers and makes public versions that aren't garbled by filtering. The sky proceeds not to fall. Google/OpenAI learn nothing. GOTO 1.
This pattern happened with text transformer models, it happened with DALL-E, there are probably other examples too. For some reason AI researchers at these places believe that whilst Word / Photoshop / Blender / a painting easel is anodyne and safe technology, a tech that does the same thing more conveniently is a revolutionary social crisis in a box. No evidence for their beliefs on this matter is ever presented, because it's all based on ideological assumptions about human nature that are taken as axiomatic.
Very sad. Google could be a leader in this space. Instead they've disappeared into a hole, terrified of the world outside.
AI seems to attract a particularly strange set of groupies. Google is today where Bell Labs used to be back in the day. Lots of money and lots of researchers running wild. Although Google seems to go out of their way to hire particularly interesting people. With pronouns and what not.
Seems like it's less of an issue for their applied research and more to do with ethics or policy researchers.
From my point of view, most of these folks would have to be deluding themselves the moment they apply for the job, or incredibly ignorant of google's past.
I tried to find a demo of GPT-3 just now and I could not. Yet in just a week I've seen half a dozen new projects or improvements built on SD. This veneer of safety is just rent-seeking bullshit.
Maybe not exactly what you need from GPT-3, but the SD of language models right now is BLOOM: https://huggingface.co/bigscience/bloom which might serve your needs.
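If you just want to poke at it, here's a minimal sketch, assuming the Hugging Face transformers library and the smaller bigscience/bloom-560m checkpoint (my assumption, since the full 176B-parameter model won't fit on ordinary hardware):

    # Quick try-out of a BLOOM checkpoint via Hugging Face transformers.
    # Assumption: the small "bigscience/bloom-560m" variant; the full model
    # is far too large to load on a normal machine.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "bigscience/bloom-560m"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    prompt = "Open releases of large AI models mean that"
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=40, do_sample=True, top_p=0.9)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Output quality from the small checkpoint obviously won't match GPT-3, but the workflow is the same for the larger variants if you have the hardware or use a hosted endpoint.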
I can't imagine the hubris required to make a person think that because their company is rich they are especially and uniquely entitled to the fruits of academic research.
> Sr Research Scientist at Google. Opinions are my own (she/her)
So, just the ramblings of a random person on the internet. Nothing to see here.
She may work in the field, but that's just her opinion, not some technical insight, and she is not talking in the name of her company. Her opinion is as good as mine, or yours, or that of the guy I met at the bar last night. In fact, I think it is a bit disingenuous for her to mention her employer in her Twitter profile; it may get her in trouble.
Now, if Google follows up, it may have some value.
She brings her opinions to work, clearly. And Google's policy for years has been to publish clickbait research that teases AI wonders stretching the imagination and then never release it to public access, let alone to the open source and academic research communities they feed off. It's obvious that, at the least, that particular senior scientist's conceptualisation of "safety" is obsessed with trivialities, divorced from reality, and utterly miscalibrated.
That's exactly what it is; and in a way the church was right: once people started to read the scriptures for themselves in the vernacular, they formed their own opinions and created different churches.
The New Testament is itself a product of the Church: the latter existed well before the former took shape, or before quill was even put to paper.
Therefore, it took members of the Church to write down these doctrines, and also leaders of that same Church to discern and rule which writings were orthodox and which were heretical, mere decades after Christ's ascension.
"Different churches" have been founded more or less continuously, and the 1517 splits where people tended to follow politically- motivated renegade priests, bishops, and monarchs are not really isolated from the narrative of Church history itself.
The New Testament canon wasn't established before the 4th century AD, so not "mere decades" after Jesus' death (or "ascension" if you're a believer), but centuries after (although of course it didn't happen in a fortnight, it was a process that had started earlier). The wikipedia article about this is quite well documented: https://en.wikipedia.org/wiki/Development_of_the_New_Testame...
You're right that different churches have been appearing continuously since the beginning of Christianity, but access to the text of the Bible in the vernacular did fuel the Reformation. How could people build a "personal relationship" with God if they didn't have direct access to his words?
But it was mere decades that the NT began to take shape. The article you've linked says it was all written before AD 120. So, 87 years - fewer than 9 decades.
And practically all learned people - all the movers and shakers (so to speak) of the Reformation - understood Latin really well. About as well as the Hellenic world understood Koine Greek in the Apostolic Age. So they already had direct access to both Scripture and Tradition (chiefly being the sacred liturgy in Latin) and the Eastern Orthodox already actively translated the latter into vernaculars.
Yeah, researchers and big companies could have "properly tested and audited" it for a while before release, but then what? Is Google an expert on what is ethical to give to the public? No; they're clearly worried only about profits, reputation, and brand image. So yes, preventing Nazis from using your tools is a valid concern for Google et al., but not for us. Online Nazis are already doing their Nazi things, just not with this latest tool.
Full title should have been: "Senior research scientist at GoogleAI: “Can't believe Stable Diffusion is out there for public use and that's considered as ‘ok’!!!"
It's neither OK nor not OK; it's just what's inevitably bound to happen. Even borderline-magic technology like DALL-E, Stable Diffusion, etc. will become commonplace over time (and it happens extra quickly with digital technology). Ethics committees and other such bodies are largely pissing in the wind with this sort of stuff, in my view.
Makes me wonder that too, along with what percentage of internet searches in general would be better served by a user-owned, fine-tuned InstructGPT model frontend linked to a network of larger backend models of various kinds and generalist specialisations. Perhaps this is something execs also see, in which case the hyperaggressive gatekeeping around the tech can be interpreted more as moat dredging than as genuine corporate concern for "safety".
Try breaking into DS/ML as a backend engineer. Even with Python experience, a ton of backend experience, an MS in CS, and courses taken in the topic, I’ve still been unable to do so. And I know a lot of people in the same boat.
There is a lot of unfounded fear about AI in general lately.
Reality is that AI features rarely replace anyone but amplify the individual that can incorporate it in their skillset.
The best option would be if models like Imagen, DALL-E and Midjourney all followed this open principle, as each of the approaches has its strengths and weaknesses and they could potentially learn from each other to improve.
They're hiring as many loonies as they can, so they know where they are.
Think about it this way - you want the highest quality AI development team you can get, so you hire everyone. Then you put the more "fringe-y" characters in a building by themselves, where you can let them tinker about and make ridiculous statements in public and not do any real harm, while the grown-ups actually get on with things.
Stable Diffusion is good open-source software and is well motivated in a very complicated technological/political space. Sure, they inherit all of the existing problems with image datasets, but they have plans to mitigate racial and socio-political bias. The alternative is either to not do the project or to keep it secret under the supposed motivation of safety. This interview is a good overview of the principles behind their approach: https://www.youtube.com/watch?v=YQ2QtKcK2dA
Actually, if you care about the effects on society, the safest thing to do with Stable Diffusion or GPT-3 would be to release it publicly and widely.
The more people play with it, the more they are used to it and the less likely to be fooled by it. In addition, large corporations and nation states have the money and access to talent that they can build and train these models. Thus if you hide it, you are just making the technology more potent in the hands of the very people who have the most incentive to use it to manipulate people.
I can't believe I have to login to use it, I also can't believe the generated content goes through automated content moderation before I get to see it.
The state of affairs in open AI software (the concept) is shameful.
This was my feeling about all the VR systems as well. I haven't looked deeply into it, but when I was excited about getting my hands on some hardware and hacking away, it seemed like every device required going through some megacorp and their walled garden and no VR device is treated simply like a monitor on your face, which killed my interest in being a VR dev.
The problem with VR is that it's just a way more complex system than a monitor. You have IMUs, cameras, controllers, lens distortion, hand tracking and a lot of other stuff to deal with. Until a little over a year ago there wasn't even any kind of API that devices could plug into; it was all vendor-specific.
This might all ease up a little over the coming years: we now have OpenXR, as well as hardware that is a lot more self-contained and no longer requires a PC to do so much of the work. But it will still take a while until a VR headset can 'just work' the same way a monitor does.
If you are talking about Stable Diffusion, you can both run it locally without logging in and disable the content moderation.
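A minimal sketch of what I mean, assuming a reasonably recent version of the Hugging Face diffusers library and a CUDA GPU (argument names can differ between releases); once the weights have been fetched the first time, everything here runs offline:

    # Run Stable Diffusion locally via the diffusers library.
    # Assumption: a recent diffusers release where passing safety_checker=None
    # disables the bundled NSFW filter; older versions may need a dummy checker.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "CompVis/stable-diffusion-v1-4",
        torch_dtype=torch.float16,
        safety_checker=None,  # the "content moderation" is optional post-processing
    ).to("cuda")

    image = pipe("a watercolor painting of a lighthouse at dawn").images[0]
    image.save("lighthouse.png")

The filter is a separate classifier run over the generated images, not part of the model itself, which is why it's so easy to switch off.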
I don't understand why they even included content moderation. People will say it's to prevent public relations issues but that doesn't seem like a good justification, you shouldn't have to go out of your way to placate the ultra-puritanical. And the filter is trivial to remove so how is it anything but a facade?
You need to log in to download the weights the first time. The whole experience kind of left me with a bad taste. I do appreciate that they didn't go the API-access route like OpenAI and others, but the experience is still a far cry from traditional open source software. Additionally, their license, with its verbosity, was kind of a deal-breaker; I'm of the opinion that it would be better to just go with a standard software license like MIT or Apache.
Her stated issue is that the Stable Diffusion project is open for the public to use and modify.
She later elaborates that her concern is that the generator could be used to create fake photos of people, who could then be persecuted with the photos presented as evidence.
"It is difficult to get a man to understand something when his salary depends upon his not understanding it."