The “correct” way to use AI for coding (and anything really) is to ask for explanations / tutorials when you can’t find one online, then learn from that.
except the “explanation” frequently will be 100% “hallucinated” bullshit
For what it’s worth, I’ve been working on (yet another) ActivityPub-based microblogging application, and LLMs have been enormously helpful and, so far as I can tell, correct. The model often cites the AP specs and their extensions, as well as specific implementations from existing major AP apps. It can show me expected outputs, what my app’s responses to different requests from other servers should look like, and quickly give context for features like Mastodon’s shared inbox. I’m not having it simply generate code, but I think I’m still moving way faster than I otherwise could. I don’t recall it ever giving me incorrect information.
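For context on what those “expected outputs” look like: the payload one server POSTs to another’s inbox (or to Mastodon’s shared inbox) is a JSON activity. Here’s a minimal sketch of a public Create(Note) activity based on the Activity Streams vocabulary — all the example.social URLs are placeholders, not a real implementation:

```python
import json

def make_create_note(actor: str, note_id: str, content: str, followers: str) -> dict:
    """Build a minimal ActivityPub Create(Note) activity -- the kind of
    payload one server delivers to another server's (shared) inbox."""
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "id": f"{note_id}/activity",
        "type": "Create",
        "actor": actor,
        # addressing the special Public collection marks the post as public
        "to": ["https://www.w3.org/ns/activitystreams#Public"],
        "cc": [followers],
        "object": {
            "id": note_id,
            "type": "Note",
            "attributedTo": actor,
            "content": "<p>" + content + "</p>",
            "to": ["https://www.w3.org/ns/activitystreams#Public"],
        },
    }

activity = make_create_note(
    actor="https://example.social/users/alice",
    note_id="https://example.social/notes/1",
    content="Hello, fediverse!",
    followers="https://example.social/users/alice/followers",
)
print(json.dumps(activity, indent=2))
```

The shared inbox is just a delivery optimization: instead of POSTing one copy of this to every follower’s personal inbox on a remote server, the sender delivers a single copy per server and the receiver fans it out locally.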
It’s the first time I’ve used an LLM as a tool this way, and I’m pretty impressed with it. I’m using the assistant made available through Kagi.
Tip: check those citations yourself before publishing with your name on the product. Yeah, they’re usually correct, but do you only want to avoid being perceived as a lazy idiot “usually”?
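One low-effort way to follow that tip: pull every link out of the model’s answer and open each one yourself. A trivial sketch — the regex is deliberately rough, it just needs to catch URLs pasted as citations:

```python
import re

def extract_urls(text: str) -> list[str]:
    """Find http(s) links in a chat response so each can be checked by hand."""
    urls = re.findall(r"https?://[^\s)\]\"'>]+", text)
    # drop trailing punctuation that often clings to pasted links
    return [u.rstrip(".,;") for u in urls]

answer = "Per https://www.w3.org/TR/activitypub/ (section 7.1), delivery is..."
print(extract_urls(answer))  # -> ['https://www.w3.org/TR/activitypub/']
```

It won’t tell you whether a link actually supports the claim it’s attached to — that part is still on you — but it makes the “open and read every source” step mechanical.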
And that’s why you have to check everything.
People say the best way to see this is to ask the AI about a subject you’re an expert in.
This is not always possible; I’ve had people say, “but I’m not an expert at anything.” Another way is to ask the model about yourself. For example, if you have a Reddit account with some age on it: Google has a deal with Reddit and gets fed everything that’s posted there. The first response might even look good, but keep the conversation going without trying to correct it, and you can watch it make shit up as things get more and more ridiculous.
Since they’re feeding it everything, Lemmy might also work.
I’ve seen very mixed results depending on which model I’m using. The newer ones, since about November of 2025, have been getting significantly better - but some of the “free class” tools are still using older ones today.
Free Gemini gave me ridiculously bad advice today about how to get through a traffic jam. It also drew the crudest sketch imaginable for a prompt; the same prompt fed to ChatGPT yielded a really nice cartoon panel of basically everything in the prompt, with some nice, appropriate embellishments.
I’ve become rather disillusioned with Gemini’s use of search tools lately. It’s odd given that it’s a Google model; you’d think Google would be at the top of the search engine game. But honestly, DeepSeek’s been my go-to lately when I want an answer that’s likely to be synthesized from a lot of web searches. I’ve had it search over a hundred different pages for a generic “how does this work?” sort of query. It didn’t read them all, but it cast a wide net and let me actually see the details. Gemini seems more willing to just tell me what it “thinks” the answer to a question is based on its training data, which is not a particularly reliable thing for an LLM to do.
Yeah. I pay for Claude, my company pays even more for Cursor, so comparing them to free Gemini probably isn’t fair.
Gemini is very useful for offhand queries while Claude is chewing on a bigger problem, but if it’s something that needs complex analysis and/or extensive research… the tools that let you build up a folder full of files related to the task are vastly superior to chatbots. Gemini does have a Claude Code-style command-line tool that does that kind of development in a folder; I didn’t install it until last week. I gave it a coding problem to work on (look up real-time weather radar data from NOAA, present recent data on a map on a webpage)… it sort of succeeded, but with a poor user experience. Again, I’m in “free mode”, which can do quite a bit on a day’s allowance of tokens, but… I don’t feel like their paid modes would be particularly higher quality. If they are, they’re doing themselves a tremendous disservice by demoing such substandard performance in free mode.
That’s why I always ask it to cite sources. It’s basically Google at this point, since Google is turning to shit and all the other search engines still aren’t quite as good.
Then why not ask just for the sources and read them yourself?
It could very easily use a completely different or hallucinated source.
But a lot of LLM products are now providing source links right in the response. I’ve found them useful, and hopefully they aren’t produced just by feeding the text back in and asking for a link.
That’s exactly how those links are produced.