@yaboisugoi i know, but unless you have 1tb of vram local models are for running funny chatbots, not for anything serious,
and even then Kimi and Deepseek can't compare to GPT and Claude

**YaboiSugoi** @yaboisugoi@fsebugoutzone.org · Jun 11, 2026, 20:04

@matrix you can look for a second-hand quadro p6000 or m6000 (24gb variety) on ebay or elsewhere, and with that much vram you'll be able to comfortably run qwen 3.5 or 3.6 35B at 4-bit quantization. in my experience these local models are very good at coding, would highly recommend. if you want, give me a prompt and i'll run it for you and post back the results so you can see for yourself
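
A rough back-of-the-envelope check of the 24gb claim, as a sketch only: the weights-only footprint is roughly parameter count times bits per weight. The name `weight_gb` is made up for illustration, and the estimate ignores the KV cache, activations, and the per-block scale metadata that real 4-bit formats add, so the real number will be somewhat higher.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    total_bits = params_billion * 1e9 * bits_per_weight
    return total_bits / 8 / 1e9  # bits -> bytes -> GB


if __name__ == "__main__":
    size = weight_gb(35, 4)
    # Whatever is left over has to hold the context (KV cache) and overhead.
    print(f"35B at 4-bit: ~{size:.1f} GB of weights, ~{24 - size:.1f} GB spare on a 24 GB card")
```

Under these assumptions a 35B model at 4 bits needs about 17.5 GB for weights, which is why it fits on a 24 GB card with some room left for context.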

**meso** @meso@new.asbestos.cafe · Jun 11, 2026, 20:13

@yaboisugoi @matrix iirc these used to be mega mega MEGA cheap

**YaboiSugoi** @yaboisugoi@fsebugoutzone.org · Jun 11, 2026, 20:26

@meso @matrix compared to everything else, they still are - 1200 "new low price" for an RTX mfw

the reason the old ones are still relatively cheap is that, when it comes to AI, they're good at running language models thanks to developments in quantization (you can run an LLM on any potato that supports vulkan, including p6000 and m6000),

but they're useless for almost all other AI stuff outside LLMs, because other things largely haven't developed min-maxed software through quantization. instead you run into two problems:
- that stuff relies on pytorch, which uses cuda internally, and these GPUs are no longer supported by cuda (as of last november i think), so you have to use a very old pytorch version that newer software for newer models doesn't support
- these GPUs support nothing smaller than fp32 math operations, which makes the big 24GB of vram not so big if you have to load an 8gb model stored at fp16 precision and it expands to 16gb at fp32 precision. newer AI stacks work better if your GPU can calculate fp16, and in some cases even fp8, at the instruction level, since that even lets you take a big fp16 model and have it cast down to fp8 at load time
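
The 8gb to 16gb expansion in the second point is just bytes-per-value arithmetic. A minimal sketch, with `size_after_cast` and the `BYTES_PER_VALUE` table being illustrative names rather than anything from pytorch:

```python
# Storage width of one value at each precision, in bytes.
BYTES_PER_VALUE = {"fp32": 4, "fp16": 2, "fp8": 1}


def size_after_cast(size_gb: float, src: str, dst: str) -> float:
    """Size of the same weights after converting from src to dst precision."""
    return size_gb * BYTES_PER_VALUE[dst] / BYTES_PER_VALUE[src]


if __name__ == "__main__":
    print(size_after_cast(8, "fp16", "fp32"))  # upcast forced by fp32-only hardware
    print(size_after_cast(8, "fp16", "fp8"))   # downcast at load time on newer GPUs
```

So the same 8 GB fp16 checkpoint takes 16 GB on a card that has to work in fp32, versus 4 GB on one that can work in fp8.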

**meso** @meso@new.asbestos.cafe · Jun 11, 2026, 20:15

@yaboisugoi @matrix can I @ you when i need a prompt done? i wanna see how this performs, but i dont have anything that ive asked my local llm before, and it was so fucking useless and bad i uninstalled llama++

**meso** @meso@new.asbestos.cafe · Jun 11, 2026, 20:16

@yaboisugoi @matrix its like, ask it anything about a language that isnt javascript or python and it starts lying about some stackoverflow copied code

**YaboiSugoi** @yaboisugoi@fsebugoutzone.org · Jun 11, 2026, 20:27

@meso @matrix yeah sure

**matrix07012** @matrix@gameliberty.club · Jun 11, 2026, 21:12 *

@yaboisugoi
try running my homework from this semester.

What is the minimum number of bits to which the string
EVE_OTTO_AND_ANNA_DID_GOOD_DEED.OTTO_SEES_EVE.ANNA_SEES_OTTO.EVE_SEES_DAD.
and the string
ABRACADABRACA
can be compressed, using any technique you deem fit? Spaces in the string have been intentionally replaced with underscores (use the underscore character).
Submit Python code containing two functions: the compress function and the decompress function.
The compress function takes a text string as input and produces a string of ones and zeros, which may contain spaces for clarity. The decompress function takes a string containing ones and zeros and produces the decompressed string. The code may contain definitions of other functions.
Functions should be generic, i.e., not return 'ABRACADABRACA' if '0' else 'EVE_...'; they must not derive the output from the state of a global variable (e.g., returning an index into a cache of strings used as input to the compress method).

After iterating for a bit Opus 4.6 got 313 bits on the first string and 59 on the second.

and after telling it to do better it got 274 and 56

and those aren't good results.

Qwen 3.6 35b is really impressive for its size but I don't think it's a good enough replacement.
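
For reference, a baseline for the homework prompt above could look like the sketch below. It is plain Huffman coding with the code tree written into the bit string itself, so `decompress` needs no global state. This is one technique picked for illustration, not what Opus or Qwen produced, and it is not tuned to beat the bit counts quoted above. It assumes every character has a code point below 256 (each leaf is stored as 8 bits); all helper names are made up.

```python
import heapq
from collections import Counter
from itertools import count


def _build_tree(text):
    """Huffman tree: a leaf is a 1-char str, an internal node is a (left, right) tuple."""
    tiebreak = count()  # unique counter so the heap never has to compare nodes
    heap = [(w, next(tiebreak), ch) for ch, w in sorted(Counter(text).items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)
        w2, _, b = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tiebreak), (a, b)))
    return heap[0][2]


def _codes(node, prefix="", table=None):
    """Map each symbol to its bit string by walking the tree."""
    table = {} if table is None else table
    if isinstance(node, str):
        table[node] = prefix or "0"  # a one-symbol alphabet still costs 1 bit per symbol
    else:
        _codes(node[0], prefix + "0", table)
        _codes(node[1], prefix + "1", table)
    return table


def _write_tree(node):
    """Pre-order header: '1' + 8-bit code point for a leaf, '0' + both children otherwise."""
    if isinstance(node, str):
        return "1" + format(ord(node), "08b")  # assumes ord(node) < 256
    return "0" + _write_tree(node[0]) + _write_tree(node[1])


def _read_tree(bits, pos):
    """Inverse of _write_tree; returns (tree, position just past the header)."""
    if bits[pos] == "1":
        return chr(int(bits[pos + 1:pos + 9], 2)), pos + 9
    left, pos = _read_tree(bits, pos + 1)
    right, pos = _read_tree(bits, pos)
    return (left, right), pos


def compress(text):
    if not text:
        return ""
    tree = _build_tree(text)
    table = _codes(tree)
    # The space between header and payload is only for readability.
    return _write_tree(tree) + " " + "".join(table[ch] for ch in text)


def decompress(bits):
    bits = bits.replace(" ", "")
    if not bits:
        return ""
    tree, pos = _read_tree(bits, 0)
    if isinstance(tree, str):  # single-symbol input: one payload bit per symbol
        return tree * (len(bits) - pos)
    out, node = [], tree
    for bit in bits[pos:]:
        node = node[int(bit)]
        if isinstance(node, str):
            out.append(node)
            node = tree
    return "".join(out)
```

Counting `len(compress(s).replace(" ", ""))` gives the total in bits, header included. The header is the expensive part on strings this short, which is where the room for cleverness in the assignment is.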