@r000t So I guess my initial guess was correct: it was spilling over to RAM, and PCIe bandwidth became the bottleneck. It just confused me because Task Manager didn't show any spillover, and inference with llama.cpp was faster when I let it spill
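A rough sketch of why PCIe becomes the ceiling once weights spill: if the spilled portion of the model has to stream across the link once per generated token, the link bandwidth divided by the spilled size bounds tokens/s. The numbers below (4 GB spilled, ~32 GB/s for PCIe 4.0 x16) are illustrative assumptions, not measurements.

```python
# Back-of-the-envelope ceiling on tokens/s when part of a model
# spills from VRAM to system RAM and must stream over PCIe.
# Naive model: every spilled weight byte crosses the link once per token.

def pcie_token_ceiling(spilled_gb: float, pcie_gb_s: float) -> float:
    """Upper bound on tokens/s imposed by PCIe transfer of spilled weights."""
    return pcie_gb_s / spilled_gb

# Illustrative: 4 GB spilled over PCIe 4.0 x16 (~32 GB/s theoretical)
print(round(pcie_token_ceiling(4.0, 32.0), 1))  # → 8.0
```

Real throughput lands below this bound (protocol overhead, compute time), which is why even a small spill can dominate the token rate.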
@teto probably
@PanzerTan you meant arouse, right?
@LukeAlmighty 3080 12GB, I'm using Heretic to uncensor InternVL3.5-14B
@ageha wow, they must be barely selling anything
@ageha buy up their stock and sell it for profit
@sun oven as in for baking, or as in for heating? (afaik it's both)