Opus 4.5 can oneshot 600 line Mastodon bot

It's truly joever

Nope, it still made a bug, but wow, what a progress from gpt-3.5

wtf, i don't think that's a bug, i think that's the model shitting the bed

and qwen3-vl heretic just stops thinking, wtf
{'role': 'assistant', 'reasoning_content': 'Okay, the user wants me to role', 'content': ''}

Way too slow if you can't fit it in VRAM.
Letting the GPU do all the processing and letting it overflow into RAM is faster than having llama.cpp split it between CPU and GPU

interesting though, seems like prompt caching can poison rhe context and start things like typing in all caps

I turned cache off and restarted the process! Why is it doing all caps even now?

I liked Gemma more than than Qwen, but unfortunately 12b is too retarded to format answers correctly.

Sign in to participate in the conversation
Game Liberty Mastodon

Mainly gaming/nerd instance for people who value free speech. Everyone is welcome.