yep, this is the way, i guess. as somebody who has taken this very same exercise...

jawon · 2025-04-12T20:06:19 1744488379

What size gemma are you using? Is the refactoring running independently or managed by you?

ach9l · 2025-04-12T22:14:11 1744496051

i've been testing all models that fit the mac studio 512 gb ever since i got it. previously i was mostly focused on getting tool use and chain of thought fine-tuning for coding, around the size of llama 3.2 11b. but even some distill r1s on llama 3 70b run well on macbooks, although quite slow compared to a regular api call to the closed models.

for mac studios i've found the sweet spot to be the largest gemma, up until llama scout was released, which fits the mac studio best. scout, although faster to generate, takes a while longer to fill in the long context, basically getting the same usability speeds as with the qwq or gemma 27b.

the refactoring is a test driven task that i've programmed to run by itself, think deep research, until it passes the tests or exhausts imposed trial limits. i've wrote it by instructing gemini, r1 and claude. in short, i've made gemini read and document proposals for refactoring, based on the way i code and strict architectural patterns that i find optimal for projects that handle both an engine and some views such as the react.js views that are present in these vscode extensions.

gemini pro gets it really well and has enough context capacity to maintain several different branches of the same codebase with these crazy long files without losing context. once this task is completed, training a smaller model based on the executed actions, (by that i mean all the tool use: diff, insert, replace and most importantly, testing) to perform the refactoring instructions is fairly easy.