Amazing work! Is it possible to fine-tune this model on your own code, or a subset? Fine-tuning this model on PyTorch code to help with tensor manipulation would be awesome!
> I would love to be able to run CodeGen models locally and fast, ideally fast enough that they can be used for interactive tasks like code completion. [...] GPT-J is a very popular model and a lot of work has been put into making fast implementations, like the one in FasterTransformers. [...] Unfortunately, these don't work with CodeGen. Even though the two are 99.9% identical, they're just different enough that you can't naively transfer over the CodeGen weights and run them in a GPT-J implementation.
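For context on why a naive weight transfer fails: CodeGen stores the attention Q/K/V projections as a single fused matrix, while GPT-J keeps them as three separate matrices (among other small layout differences). Here's a toy sketch of the kind of remapping a converter has to do, assuming a simple Q-then-K-then-V concatenation along the output dimension — the real CodeGen checkpoint additionally interleaves per-head groups, which this example ignores:

```python
import numpy as np

# Toy hidden size; real models use something like 4096.
hidden = 8

# Hypothetical fused projection: rows 0..hidden-1 are Q, the next
# hidden rows are K, the last hidden rows are V (assumed layout).
qkv_proj = np.arange(3 * hidden * hidden, dtype=np.float32).reshape(3 * hidden, hidden)

# Split the fused matrix into the three separate matrices a
# GPT-J-style implementation expects.
q_proj, k_proj, v_proj = np.split(qkv_proj, 3, axis=0)

# Sanity check: the fused projection applied to an input equals the
# concatenation of the three separate projections.
x = np.random.rand(hidden).astype(np.float32)
fused_out = qkv_proj @ x
split_out = np.concatenate([q_proj @ x, k_proj @ x, v_proj @ x])
assert np.allclose(fused_out, split_out)
```

The point of the quoted passage is that the real remapping is fiddlier than this — get the interleaving wrong and the model loads fine but produces garbage, which is why "99.9% identical" isn't enough for a naive transfer.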