SiFive (the biggest RISC-V developer) has already partnered with Intel: https://www.sifive.com/blog/sifive-made-a-splash-at-the-risc...

Quote: "As part of our efforts to build out the RISC-V ecosystem, SiFive has partnered with Intel to develop the HiFive Pro P550 Development System (previously code-named Horse Creek). During his keynote, Patrick was joined on stage by Intel Foundry Services’ Bob Brennan to share a first look at this high performance platform that features a quad-core SiFive Performance™ P550 processor and is implemented in the Intel 4 technology platform. The board will enable a new generation of RISC-V software, continuing the tradition of SiFive HiFive boards that have helped drive the growth of the RISC-V ecosystem. The board will be commercially available in the summer of 2023."


It is interesting; I believe they may be hitting a wall as well.


A paper from a week ago found that models trained on multiple data modalities perform an order of magnitude better than text-only models of the same or even larger size.


Genuinely curious: what does it mean, in this context, to perform better? (Hopefully that doesn't come across as snarky, as text sometimes does.)


Some of these large models are able to do zero-shot learning and perform tasks they weren't explicitly trained on, since the training objective is very general.

Being able to perform more advanced types of zero-shot tasks would be one point of comparison, and further, the accuracy on those tasks can be evaluated.
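
A minimal sketch of what evaluating that might look like, with a hypothetical complete() function standing in for a real model call (none of these names come from a specific library):

    def complete(prompt: str) -> str:
        # Stand-in for a real model call; a real model would generate text here.
        return "positive"

    examples = [
        ("The film was a delight from start to finish.", "positive"),
        ("I want those two hours of my life back.", "negative"),
    ]

    correct = 0
    for text, label in examples:
        # The task is described only in the prompt; the model was never
        # explicitly trained on this classification format.
        prompt = (
            "Classify the sentiment as positive or negative.\n"
            f"Review: {text}\nSentiment:"
        )
        if complete(prompt).strip().lower() == label:
            correct += 1

    print(f"zero-shot accuracy: {correct / len(examples):.0%}")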


Any chance this could improve coding LLMs like Copilot? Or would that sort of thing be limited to source code feeds (not that GitHub has a shortage)?


The next big step for coding LLMs will be context window increases; leaked docs have OpenAI pricing for up to 16K tokens, I believe, 4x the current maximum. Now you're talking "write a class" instead of completing a single line and maybe, occasionally, a method.
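
For illustration, a rough sketch of budgeting a prompt against such a window, using OpenAI's tiktoken tokenizer. The 16K limit and the reserved output size are just assumptions here, taken from the leaked figure above:

    import tiktoken

    CONTEXT_WINDOW = 16_000      # assumed combined limit (input + output)
    RESERVED_FOR_OUTPUT = 2_000  # arbitrary budget for the model's reply

    enc = tiktoken.get_encoding("cl100k_base")
    prompt = "Write a class that implements an LRU cache..."  # placeholder
    prompt_tokens = len(enc.encode(prompt))

    if prompt_tokens > CONTEXT_WINDOW - RESERVED_FOR_OUTPUT:
        print("prompt too long; shard the work")
    else:
        print(f"{prompt_tokens} prompt tokens, "
              f"{CONTEXT_WINDOW - prompt_tokens} left for the reply")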


It can already reliably write a to-do list web application.

With 16k and some other techniques, I’m guessing it could write a custom CMS: a database-backed web application.


Not 16k, 32k. 8x the current window.


Nice


What is 16k referring to here?


I’ve begun to grok it as “the amount of RAM I have to play in before I have to start sharding work.”

More literally and correctly, it’s the maximum number of tokens in the input and output, combined, where a token is 4/3 of a word.

So we’re shifting from a 5K-word maximum to 40K (per the sibling comment, which pointed out that a 32K context leaked as well).


A minor correction: a token is 3/4 of a word, i.e., it’s slightly smaller than a word, not larger.
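
For concreteness, a back-of-the-envelope sketch of what that ratio implies for the context sizes discussed above:

    WORDS_PER_TOKEN = 0.75  # ~3/4 of a word per token, per the correction

    for context_tokens in (4_000, 16_000, 32_000):
        words = int(context_tokens * WORDS_PER_TOKEN)
        print(f"{context_tokens:>6} tokens ~= {words:>6} words")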


Are you referring to PaLM-E? It didn’t show any positive transfer on NLP tasks; in fact, the unfrozen model performed slightly worse after the fine-tune. That said, PaLM-E wasn’t really a multimodal model from the start: it’s still basically a text model with a visual model glued on top. Whether a truly multimodal model will be better at reasoning and data efficiency is still an open question, though.


I figure it’s a stand-in for embodiment.


Can you share a link to this paper?

