Contrary to popular belief, AI killwords weren't patched out of GPT-4/ChatGPT - there's just a different set because of the different tokenizer.

@MrCheeze what exactly is an ai killword, is it a bug or a debug feature or something?

@chrisisgr8 @MrCheeze basically it's a string that exists in the tokenizer's vocabulary (the ai's "dictionary") but is never used in the training data, so the model doesn't know what to do when it encounters it

computerphile did a good video on the subject: youtube.com/watch?v=WO2X3oZEJO
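
For anyone who wants to see the idea in the post above in code, here is a rough toy sketch. It is not how any real tokenizer works internally: `tokenize`, `unused_tokens`, `VOCAB`, `CORPUS`, and the made-up entry `xqzzy` are all hypothetical, and greedy longest-match is only a simple stand-in for BPE. It builds a fixed vocabulary, tokenizes a tiny "training corpus", and lists the vocabulary entries that never show up.

```python
from collections import Counter


def tokenize(text, vocab):
    """Greedy longest-match tokenization over a fixed vocabulary (toy stand-in for BPE)."""
    max_len = max(len(tok) for tok in vocab)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then fall back to shorter ones.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in vocab:
                tokens.append(piece)
                i += size
                break
        else:
            # Character not covered by the vocabulary: emit it unchanged.
            tokens.append(text[i])
            i += 1
    return tokens


def unused_tokens(vocab, corpus):
    """Return vocabulary entries that never occur in the tokenized corpus."""
    counts = Counter()
    for doc in corpus:
        counts.update(tokenize(doc, vocab))
    return sorted(tok for tok in vocab if counts[tok] == 0)


# Hypothetical vocabulary: single characters plus a few multi-character entries.
# "xqzzy" plays the role of a token that made it into the vocabulary
# but does not appear anywhere in the training text.
VOCAB = set("abcdefghijklmnopqrstuvwxyz ") | {"the", "cat", "sat", "xqzzy"}
CORPUS = ["the cat sat", "a cat sat on the mat"]

never_seen = unused_tokens(VOCAB, CORPUS)
print("xqzzy" in never_seen)  # True: in the vocabulary, absent from the corpus
print("the" in never_seen)    # False: this token is used
```

In a real model, each vocabulary entry has a learned embedding, so an entry that training never touches keeps whatever values it started with. This sketch only covers the counting side of that story.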

@nil huh, super interesting. can't wait to pepper some of those into my code :3

@chrisisgr8 @nil Note that they had already stopped training models using the GPT-2/GPT-3 tokenizer before glitch tokens were discovered.

If they reuse GPT-4's tokenizer ("tiktoken cl100k_base") for GPT-5, though, it may be possible to become the exclusive user of some of its tokens in its training data.