• Randelung@lemmy.world
        3 months ago

        I’m ignorant 😅 I don’t use either. I guess it doesn’t really defend against browser-remote-controlling bot agents.

        • pkjqpg1h@lemmy.zip
          3 months ago

          browser-remote-controlling bot agents

          If you mean some users giving control of their browser to a bot: no, it doesn't, because it's still a legit user browser window.

          But most bots don't use a legit browser window, because that would be impossible to scale.

          • Randelung@lemmy.world
            3 months ago

            I was thinking that using Selenium or similar would allow the bot to circumvent any block that works in a browser. Since it's probably not doing a million PRs at once, doing that would be viable. It could even use the cookie from the Selenium session to then use the API directly.

            Kinda like FlareSolverr does for Prowlarr/Jackett.

            In which case Anubis is only a temporary measure until the vibe coders wise up.

            • pkjqpg1h@lemmy.zip
              3 months ago

              Defense systems also improve. Anubis can make the Proof-of-Work (PoW) more difficult or add new checks. This competition is won by whoever can keep their costs lower. When spammers have to use more resources for each pull request while normal users do not pay an extra cost, the defenders win.
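              The arms-race point is concrete: hash-based PoW lets the defender dial the cost up. A minimal sketch of the hash-guessing idea (the challenge string and difficulty here are made up for illustration; Anubis's real challenge scheme differs in detail):

```python
import hashlib
import itertools

def solve_pow(challenge: str, difficulty: int) -> int:
    """Brute-force a nonce so that sha256(challenge + nonce) starts with
    `difficulty` hex zeros. Each extra zero multiplies the expected work
    by 16, which is how the defender can raise the per-request cost."""
    target = "0" * difficulty
    for nonce in itertools.count():
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
        if digest.startswith(target):
            return nonce

# One challenge per visit is cheap for a person;
# one challenge per request adds up fast for a mass scraper.
nonce = solve_pow("example-challenge", 4)
print(nonce)
```

A normal visitor pays this once per session, while a spammer pays it on every automated request, which is exactly the asymmetry described above.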

    • AmbitiousProcess (they/them)@piefed.social
      3 months ago

      Unfortunately Anubis wouldn’t stop the bots, it would just slow them down.

      Anubis just adds proof of work, AKA computation, to your requests. It’s why your browser takes a second before it can access the site. It’s nothing for things on your scale, but it’s a fuck ton of time and money for large scraping operations accessing millions of links every day.
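      To put rough numbers on that asymmetry (all figures assumed for illustration, none measured):

```python
# Back-of-envelope: why per-request proof-of-work is negligible for a
# person but expensive for a large scraping operation.
pow_seconds = 1.0                   # assumed time to solve one challenge
human_pages_per_day = 50            # a heavy human reader
scraper_pages_per_day = 5_000_000   # "millions of links every day"

human_cpu_hours = pow_seconds * human_pages_per_day / 3600
scraper_cpu_hours = pow_seconds * scraper_pages_per_day / 3600

print(f"human:   {human_cpu_hours:.3f} CPU-hours/day")
print(f"scraper: {scraper_cpu_hours:.0f} CPU-hours/day")
```

Under these assumptions the scraper burns over a thousand CPU-hours a day while the human never notices, which is why the cost bites crawlers but not PR spammers.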

      For a bot submitting PRs though, it’s not gonna be a meaningful hindrance unless the person is specifically running a bot designed to make thousands of PRs every day, which a lot of these aren’t.

      Really unfortunate.

  • kepix@lemmy.world
    3 months ago

    gzdoom just simply banned ai code, and made a new fork that tries to stay clean. why can't they do the same?

    • qarbone@lemmy.world
      3 months ago

      Is all AI code tagged “hey, Claude made this puddle of piss code”?

      This is a real “just catch all the criminals” type comment.

      • The_Decryptor@aussie.zone
        3 months ago

        Much in the same way that laws don’t prevent crime, a project banning AI contributions doesn’t stop people from trying to sneak in LLM slop, it instead lets the project ban them without argument.

    • woelkchen@lemmy.world
      3 months ago

      gzdoom just simply banned ai code

      You got that wrong. Graf Zahl added AI code and all the other contributors left to fork it to create UZDoom.

  • BitsAndBites@lemmy.world
    3 months ago

    It’s everywhere. I was just trying to find some information on starting seeds for the garden this year and I was met with AI article after AI article just making shit up. One even had a “picture” of someone planting some seeds and their hand was merged into the ceramic flower pot.

    The AI fire hose is destroying the internet.

    • maplesaga@lemmy.world
      3 months ago

      I fear when they learn a different layout. Right now it seems they are usually obvious, but soon I won't be able to tell slop from intelligence.

      • jj4211@lemmy.world
        3 months ago

        You will be able to tell slop from intelligence.

        However, you won't be able to tell AI slop from human slop. Human slop has been around, and was already overwhelming, but its volume was nothing compared to LLM slop.

        In fact, reading AI slop text reminds me a lot of human slop I’ve seen, whether it’s ‘high school’ style paper writing or clickbait word padding of an article.

      • badgermurphy@lemmy.world
        3 months ago

        One could argue that if the AI response is not distinguishable from a human one at all, then they are equivalent and it doesn’t matter.

        That said, the current LLM designs have no ability to do that, and so far all efforts to improve them beyond where they are today have made them worse at it. So I don't think that any tweaking or fiddling with the model will ever do anything toward what you're describing, except possibly producing a different but equally cookie-cutter way of responding that may look different from the old output but will be much like the other new output. It will still be obvious and predictable shortly after we learn its new tells.

        The reason they can't make it better anymore is that they are trying to do so by feeding it ever more information, in the misguided notion that once it has enough data it will be smarter overall. That is not true, because it has no way to distinguish good data from garbage, and they have already read and consumed the whole Internet.

        Now, when they try to consume more new data, a ton of it was actually already generated by an LLM, maybe even the same one, so contains no new data, but still takes more CPU to read and process. That redundant data also reinforces what it thinks it knows, counting its own repetition of a piece of information as another corroboration that the data is accurate. It thinks conjecture might be a fact because it saw a lot of “people” say the same thing. It could have been one crackpot talking nonsense that was then repeated as gospel on Reddit by 400 LLM bots. 401 people said the same thing; it MUST be true!

        • Urist@lemmy.ml
          3 months ago

          I think the point is rather that it is distinguishable for someone knowledgeable on the subject, but not for someone who is not, thus making it harder to evolve from the latter to the former.

  • ZeroOne@lemmy.world
    3 months ago

    So I guess it is time to switch to a different style of FOSS development?

    The cathedral style, as used by Fossil: to contribute, you have to be manually admitted into the group. It's a high-trust environment where devs know each other on a first-name basis.

    • ThirdConsul@lemmy.zip
      3 months ago

      What if I want to contribute to a FOSS project because I'm using it, but I don't want to make new friends?

    • RemADeus@thelemmy.club
      3 months ago

      That is a wonderful method, because it works in a similar way to how many Fediverse server administrators manually approve new accounts. That way, the slop is immediately filtered away.

      • ZeroOne@lemmy.world
        3 months ago

        Why would your code be embarrassing? Yes, I get it, but so what? At least it's not AI slop; you fork it & do your own thing.

        It’s not a perfect solution

        • Ænima@lemmy.zip
          3 months ago

          I doubt my skills are sufficient in anything I make, feel less confident in it, and feel judged by others critiquing me for it. I know I don't suck at what I've done so far, but I never feel good enough to share my work with the public at large.

  • derAbsender@piefed.social
    3 months ago

    Stupid question:

    Are there really no safeguards in the merging process except for human oversight?

    Isn't there some “in review” state, where people who want to see the experimental stuff can pull it, and if enough™ people say “this new shit is okay” it gets merged?

    That way the main project doesn't get poisoned, everyone can still contribute in a way, and those who want to experiment can test the new stuff.

    • Mniot@programming.dev
      3 months ago

      There are automated checks which can help enforce correctness of the parts of the code that are being checked. For example, we could imagine a check that says “when I add a sprite to the list of assets, then the list of assets becomes one item longer than it was before”. And if I wrote code that had a bug here, the automated check would catch it and show the problem without any humans needing to take the time.
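      That imagined check might look like this. Purely illustrative Python: `AssetList` and `add_sprite` are made-up stand-ins, not real Godot APIs (Godot's actual checks are written in C++/GDScript):

```python
# A hypothetical automated check: a machine-verifiable claim about
# behaviour that runs without any human taking the time.
class AssetList:
    def __init__(self):
        self.assets = []

    def add_sprite(self, sprite):
        self.assets.append(sprite)

def test_adding_sprite_grows_list_by_one():
    assets = AssetList()
    before = len(assets.assets)
    assets.add_sprite("player.png")
    # If a buggy change broke add_sprite, this assertion would fail
    # and flag the problem automatically.
    assert len(assets.assets) == before + 1

test_adding_sprite_grows_list_by_one()
```

Checks like this catch the bugs they were written for; as the next paragraph says, they can't catch code doing something no check was written to look for.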

      But since code can do whatever you write it to do, there’s always human review needed. If I wrote code so that adding a sprite also sent a single message to my enemy’s Minecraft server then it’s not going to fail any tests or show up anywhere, but we need humans to look at the code and see that I’m trying to turn other developers into a DDoS engine.

      As others replied, you could choose to find and run someone’s branch. This actually does happen with open-source projects: the original author disappears or abandons the project, other people want changes, and someone says “hey I have a copy of the project but with all those changes you want” and we all end up using that fork instead.

      But as a tool for evaluating code that’ll get merged, it does not work. Imagine you want to check out the new bleeding-edge version of Godot. There’s currently ~4700 possible bleeding-edge versions, so which one will you use? You can’t do this organically.

      Most big projects do have something like beta releases. The humans decide what code changes to merge and they do all that and produce a new godot-beta. The people who want to test out the latest stuff use that and report problems which get fixed before they finally release the finished version to the public. But they could never just merge in random crap and then see if it was a good idea afterward.

    • brucethemoose@lemmy.world
      3 months ago

      Many do have automated checking, testing, rules for the PR maker to follow and such.

      If they don’t have it set up, and the project is big, TBH the maintainers should set it up.


      The issue is that these submitters are (often) drive-by spammers. They aren’t honest, they don’t care about the project, they just want quick kudos for a GitHub PR on a major project.

      Filtering a sea of scammers is a whole different ballgame than guiding earnest, interested contributors. Automated tooling isn’t set up for that because (outside the occasional attempt to sneak malware into code) it wasn’t really a thing.

    • Little8Lost@lemmy.world
      3 months ago

      It would be nice to bump up the useful stuff through the community, but even then there could be bot accounts that push the crap to the top.

    • Kissaki@feddit.org
      3 months ago

      Most projects don’t have enough people or external interest for that kind of process.

      It would be possible to establish some tooling like that, but standard forges don’t provide that. So it’d feel cumbersome.

      And in the end you're back at having contributors, trustworthiness, and quality control, because testing and reviewing are contributions too. You don't want just a popularity contest (“I want this”), nor blind trust in unknown contributors.

    • Captain Aggravated@sh.itjust.works
      3 months ago

      It is my understanding that pull requests say “Hey, I forked and modified your project. Look at it and consider adopting my changes in your project.” So anyone who wants to look at the “experimental stuff” can just pull that fork. Someone in charge of the main branch decides if and when to merge pull requests.

      The problem becomes the volume of requests; they’re kinda getting DDOS’d.

      • blipcast@lemmy.world
        3 months ago

        Yup! Replace the word “fork” with “branch” and that basically matches the workflow. Forking implies you are copying the code in its current state and going off to do your own thing, never to return (but maybe grabbing updates from time to time).

        One would hope that the users submitting these PRs vetted the LLM's output before submitting, but instead all of that work is getting shifted onto the maintainers.

    • ILikeBoobies@lemmy.ca
      3 months ago

      https://github.com/godotengine/godot/pulls

      That is what you're referring to, but even if you have dedicated testers, that's still people who have to go through the influx of pulls.

      Then there’s preference changes as well.

      https://github.com/godotengine/godot/pull/116434/commits/6a2fc8561da8fcf168cea3aff5a8cba77266b026

      Even if there's nothing wrong with this one, for instance: someone will like the “get rid of hard-coded values” change, whereas I would oppose it because it makes the code harder to read.

      So you still need the core team to look it over. If AI gives you 1000 of these in different areas, it wastes time. People can read about a project's standards; AI doesn't, it just does what it's told.

  • Hemingways_Shotgun@lemmy.ca
    3 months ago

    This was honestly my biggest fear for a lot of FOSS applications.

    Not necessarily in a malicious way (although there’s certainly that happening as well). I think there’s a lot of users who want to contribute, but don’t know how to code, and suddenly think…hey…this is great! I can help out now!

    Well meaning slop is still slop.

  • Bongles@lemmy.zip
    3 months ago

    I don’t contribute to open source projects (not talented enough at the moment, I can do basic stuff for myself sometimes) but I wonder if you can implement some kind of requirement to prove that your code worked to avoid this issue.

    Like, you’re submitting a request that fixes X thing or adds Y feature, show us it doing it before we review it in full.

    • selfAwareCoder@programming.dev
      link
      fedilink
      English
      arrow-up
      0
      ·
      3 months ago

      The trouble is just volume and time. Even just reading through the description and the “proof it works” would take a few minutes, and if you're getting tens of these a day, that can easily eat the time needed to find the ones worth reviewing. (And these volunteers are working in their free time after a normal work day, so wasting 15 or 30 minutes out of a volunteer's one or two hours of work is throwing away a lot of time.)

      Plus, when volunteering is annoying, the volunteers stop showing up, which kills projects.

  • GreenKnight23@lemmy.world
    3 months ago

    just deny PRs that are obvious bots and ban them from ever contributing.

    even better if you’re running your own git instance and can outright ban IP ranges of known AI shitlords.

  • order216@lemmy.world
    3 months ago

    Why do people try to contribute to code they don't even work on? AI slop is not helping at all.

  • brucethemoose@lemmy.world
    3 months ago

    Godot is also weighing the possibility of moving the project to another platform where there might be less incentive for users to “farm” legitimacy as a software developer with AI-generated code contributions.

    Aahhh, I see the issue now.

    That’s the incentive to just skirt the rules of whatever their submission policy is.