First off, it worked on my floor-standing PC: Zen 3 CPU (5950X), 32 GB memory, RX570 GPU (4 GB) using Mesa/rusticl and not ROCm. The ollama docs say it needs ROCm, but no, it seems to work fine with rusticl - btop showed my GPU and CPUs working hard (generating useful warmth on a cold day).
I downloaded a couple of LLMs: codellama:34b, a 19 GB model supposedly optimized for code generation, from Meta IIUC; and qwen3:14b, a 9 GB model optimized for code translation and the like, developed by Alibaba Cloud.
First attempt was with the larger codellama, given the task

write a java class that handles sub-arrays of a byte array like strings and substrings

- literally just that, no preamble, just fire up ollama and reply to the prompt with that. It reasonably quickly wrote a fairly small Java class which met the stated requirement but no more.
Second attempt with qwen3, same task. Qwen3 ships with thinking-mode output enabled, so there was a lot of explanation, all of it very sensible. The resulting Java class had more function and was properly commented, including warnings about the tradeoff of performance against safety (the point of this particular way of implementing substrings/subarrays is that they are effectively a base-limit pair into an existing data structure, so the subarray may reflect changes to the underlying storage), and it came with test cases.
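To show what I mean by a base-limit pair into existing storage (this is my own minimal sketch of the idea, not qwen3's actual output - the class name and methods here are made up for illustration):

```java
// A subarray as a base-limit view onto a shared byte[]. No copying is done,
// so changes to the backing array show through the view - that's the
// performance/safety tradeoff mentioned above.
public class ByteSlice {
    private final byte[] data;  // shared backing storage
    private final int base;     // start offset into data
    private final int limit;    // exclusive end offset

    public ByteSlice(byte[] data, int base, int limit) {
        if (base < 0 || limit > data.length || base > limit)
            throw new IndexOutOfBoundsException();
        this.data = data;
        this.base = base;
        this.limit = limit;
    }

    public int length() { return limit - base; }

    public byte get(int i) {
        if (i < 0 || i >= length()) throw new IndexOutOfBoundsException();
        return data[base + i];  // no copy: reads the shared array directly
    }

    // A sub-slice is just another base-limit pair into the same array.
    public ByteSlice slice(int from, int to) {
        return new ByteSlice(data, base + from, base + to);
    }

    public static void main(String[] args) {
        byte[] backing = "hello world".getBytes();
        ByteSlice word = new ByteSlice(backing, 6, 11); // view of "world"
        System.out.println((char) word.get(0));          // 'w'
        backing[6] = 'W';                                // mutate backing store
        System.out.println((char) word.get(0));          // the view sees 'W'
    }
}
```

Java's own java.nio.ByteBuffer works much the same way (position/limit into a shared array), which is presumably why the model's warnings about aliasing were on point.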
I then gave it a further task:

write a Rust version of that program

IMHO qwen3 excelled - it spotted that Rust has a "slice" type that actually does what I want. Rather than say "stoopid pointy-haired boss doesn't know Rust" (which is right), it decided that perhaps I wanted something that reflected the Java API it had generated, but using Rust and slices.
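For anyone who hasn't met them, a quick sketch of why a Rust slice is exactly this base-limit-pair idea (my own illustration, not the model's output):

```rust
// A Rust slice is already a (pointer, length) view into existing storage,
// so "substring of a byte array" needs no new class at all.
fn main() {
    let data: [u8; 11] = *b"hello world";
    let word: &[u8] = &data[6..11]; // borrowed view of "world", no copy
    assert_eq!(word, b"world");
    assert_eq!(word.len(), 5);

    // Sub-slicing a slice is just narrowing the base-limit pair:
    let wo: &[u8] = &word[..2];
    assert_eq!(wo, b"wo");

    // The borrow checker forbids mutating `data` while `word` is borrowed,
    // which is how Rust gets the safety half of the tradeoff for free.
    println!("{}", String::from_utf8_lossy(word)); // prints "world"
}
```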
The qwen3 model produced output about as fast as I could read and understand it - perhaps 2 tokens per second, maybe more in the thinking phase; it was like watching someone think and type at the same time. Personally, if I'd been trying to recruit a programmer, I'd have given qwen3 the job immediately and sent the rest home - it amazed me.
I could post the output, but I'm not sure if that would contravene the Gentoo policy on AI code.
I did a quick check with Google to see if it was merely regurgitating a discussion on Stack Exchange or similar, but it appears not. The nearest is some code in Google's "protocol buffers", but that does not appear to have the discussion of "why do it this way" reflected in qwen3's thinking.