GO Gentoo! Awesome!
I guess there will be some push back.
The other thing is, I guess it will be kind of hard to differentiate AI-generated from human-generated content.

- https://marc.info/?l=gentoo-dev&m=171324172908553&w=2

Hello,
On 2024-04-14, the Gentoo Council has unanimously approved the new AI
policy. The original wording from the mailing list thread was approved:
"""
It is expressly forbidden to contribute to Gentoo any content that has
been created with the assistance of Natural Language Processing
artificial intelligence tools. This motion can be revisited, should
a case been made over such a tool that does not pose copyright, ethical
and quality concerns.
"""
I have started drafting a Wiki page detailing this at [1]. We will also
look into how best provide this new information to our contributors.
[1] https://wiki.gentoo.org/wiki/Project:Council/AI_policy
--
Best regards,
Michał Górny

It has to be seen, as he said (I agree), as an automation tool.

Looking ahead, Hohndel said, we must talk about "artificial intelligence large language models (LLM). I typically say artificial intelligence is autocorrect on steroids. Because all a large language model does is it predicts what's the most likely next word that you're going to use, and then it extrapolates from there, so not really very intelligent, but obviously, the impact that it has on our lives and the reality we live in is significant. Do you think we will see LLM written code that is submitted to you?"
Torvalds replied, "I'm convinced it's gonna happen. And it may well be happening already, maybe on a smaller scale where people use it more to help write code." But, unlike many people, Torvalds isn't too worried about AI. "It's clearly something where automation has always helped people write code. This is not anything new at all."
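Hohndel's "autocorrect on steroids" quip can be made concrete with a toy sketch: a greedy bigram model that, like an LLM in miniature, always emits the statistically most likely next word and extrapolates from there. The corpus and function names below are invented purely for illustration; real LLMs are vastly larger and sample probabilistically, but the basic "predict the next word" loop is the same.

```python
from collections import Counter, defaultdict

# Tiny training corpus (invented for illustration).
corpus = "the kernel builds the kernel modules and the kernel boots".split()

# Count which word follows which: a bigram frequency table.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def extrapolate(word, steps):
    """Greedily emit the most likely next word, repeatedly."""
    out = [word]
    for _ in range(steps):
        if word not in follows:
            break  # dead end: this word never had a successor
        word = follows[word].most_common(1)[0][0]
        out.append(word)
    return " ".join(out)

print(extrapolate("the", 3))  # → the kernel builds the
```

The output is fluent-looking but content-free, which is roughly the quality concern raised in this thread: statistical plausibility is not understanding.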

The funniest part is that AI does a very similar thing to what humans do.

As for the stated copyright concerns, I'm no legal expert, but it seems like the horse has left the barn on this one. Every big tech company I know of already has code that had some sort of AI generation involved in the development process merged deeply into their trunks. I just don't see the entire software industry getting sued over this.
I'll again state my opinion that this is not a universal fact at all anymore. AI has made remarkable progress; it is already *very* powerful at summarizing documents or even internet discussions, including open-source tooling running LLMs in the hands of users who specialized them a little to get short summaries or whatever they wanted. I'd say it would already *increase*, rather than decrease, the availability and quality of documentation, or the understanding of bug reports. That it should not be completely trusted... isn't that the same when working with tired hobbyist humans?

szatox wrote: Quality is a real concern, since AI lacks the depth and understanding.
I wonder: if we feed Gentoo's Wiki to an AI, will it produce better documentation (and summaries)? Would the Gentoo team accept the result?

Rad wrote: I'll again state my opinion that this is not a universal fact at all anymore. AI has made remarkable progress; it is already *very* powerful at summarizing documents or even internet discussions.

szatox wrote: Quality is a real concern, since AI lacks the depth and understanding.


Are you thinking of compilation times, by any chance?

GalaxyNova wrote: I think this decision was a mistake. Gentoo is in no position to talk about energy usage.

I think the most important factor to consider is per-capita energy usage, on which compilation and AI are probably roughly the same. If you've ever tried to run an LLM like LLaMA offline, for example, it maxes out your CPU in the same way as emerging @world.

kgdrenefort wrote: Are you thinking of compilation times, by any chance?

GalaxyNova wrote: I think this decision was a mistake. Gentoo is in no position to talk about energy usage.
If yes, well, sure it takes more time to update; if you don't manage it well, you could have to let your computer run all night compiling, which takes resources, that is true.
At the same time, a simple request to an AI consumes far more resources than you might think, especially when you see some of the goofy stuff it's being used for…
Regards,
GASPARD DE RENEFORT Kévin

Yeah, they are definitely not quite at the level of human intelligence. But to me this doesn't seem like a valid reason to completely ban the usage of such tools in any Gentoo project.

szatox wrote: Ah, the "don't have children to save the planet" argument... Save it for whom, exactly?
Specialized AI is a handy tool when used within the scope of its capabilities. Just like any other tool.
Chatbots are good for chatting. Writing code and reporting bugs use text, but they are engineering. Chatbots suck at engineering. They can organize words into reasonable-looking patterns, but they don't understand depth.
Need proof? Go to YT and search for "ChatGPT plays chess"... and enjoy the mayhem.
A similar thing happens with tensioned straps in generated pictures: they follow both the outside (correct) and the inside curves of elbows, where they should run straight instead.
On quality, I see the following situation. Currently, at some stage a human still has to review the code. If that is done by the submitter, and he can reasonably claim correctness, I would not care whether AI was used in the development process. The possible problem is when the code checks fall on the maintainers, i.e. it is easy to submit AI-generated code and let others deal with whether it works as intended. I see that a lot with student submissions.

Hu wrote: I was not involved in the discussion, nor in the subsequent vote. My take on it is that:
- Regarding copyright, most countries have copyright laws that are poorly suited to software, at best. For an organization that cannot afford to spend millions defending against a copyright lawsuit, extreme caution seems like the right approach to me. Once major jurisdictions have clear statutory or case law holding that AI output is not subject to the copyright of the input training data, this will be less of a concern. I am not aware of any jurisdictions that have said that it is subject, or is not subject, so the cautious approach is to assume a court might decide that the copyright on the input training data does transfer to the AI output. Most, if not all, AI output currently has poor attribution, so it is difficult to determine whether the input training data was subject to an enforceable copyright, and if it was, then whether the input had an acceptable license (such as BSD/MIT for any purpose, or GPL for projects that accept GPL contributions) that would permit its use even if a court does rule that the copyright is passed down.
- Regarding quality, my opinion is that the submission's quality should be judged independent of the source. If the submitter's work is of good quality, whether because the submitter is a good author, because the submitter took poor quality AI output and manually improved it, or because the submitter used a high quality AI that produces naturally good output, does not matter. However, at present, AI output is often not of a quality that it can be submitted verbatim, and I would not want reviewers (whose time is often very precious) to waste their time picking through poor quality AI output. Establishing a rule that summarily rejects AI output is heavy-handed, but effective. Once submitters provide output that is not obviously poor quality, this point will become difficult to enforce, but also unnecessary.
- Regarding the other concerns, I have not read enough to form an opinion.
The intelligence part in AI is a lie. The correct name is "statistics".

"Who cares what tools were used if the code is good... are you going to ban code that was typed on particular keyboards?"
Well, when I asked "who cares what tools were used if the code is good?", my definition of "good code" did not include "garbage".

szatox wrote: The intelligence part in AI is a lie. The correct name is "statistics". "Who cares what tools were used if the code is good... are you going to ban code that was typed on particular keyboards?"
Statistics sucks at generating code, but its time is much cheaper than a human's, and it is capable of producing a volume of garbage big enough to overwhelm and drown any developer on the receiving end of this pipe.
Whatever keyboard you're using, it's driven by a real (hopefully) intelligence, and you still have to literally touch every letter and think about its purpose.
Coding aids can be used to speed up a developer's work, but they cannot and will not replace developers. People, however, are fundamentally lazy, and LLMs make pushing the work onto someone else too easy for such a process to be sustainable.
Change my mind

And this is the biggest problem, IMO. I don't see enough participants in this thread addressing it.

Hu wrote:
- Copyright status on AI generated outputs is unclear at best. Accepting contributions that have a decent chance of becoming a copyright mess later is risky, so the safe path is to refuse AI generated outputs until the situation improves.
Ionen wrote: As a packager, I just don't want things to get messier with weird build systems and multiple-toolchain requirements, though.
Ah, sincere apologies: I did say "I've just skimmed through this thread, so maybe I've missed something important", and clearly I had. The decision still doesn't make sense to me, but this time it's probably more about my ignorance of Gentoo development culture/process.

Hu wrote: psycho: I suggest you read the thread in full before responding to it. I think my prior post addressed your questions. To recap:
- Generative AIs tend to produce low quality output. Reviewer time is too precious to waste cleaning up that output, so anyone who uses a generative AI and submits it directly for review is wasting reviewer's time. If a would-be contributor uses generative AI as a start, then personally cleans up the garbage to a presentable level before submitting it for review, that would be different. As szatox suggests though, that's not likely to be what happens. When the contributor does do a good enough job cleaning up the input that it no longer appears to be AI-generated garbage, then the reviewer will not be able to readily reject it as AI garbage.
- Copyright status on AI generated outputs is unclear at best. Accepting contributions that have a decent chance of becoming a copyright mess later is risky, so the safe path is to refuse AI generated outputs until the situation improves.