Hacker News | xpuente's comments

Proof that this is wrong is that our brains require <20W of power. Non-brute-force AI will be very efficient: simple operations (int add/cmp), no float matrix multiplications, and no moving the same data back and forth between memory and the ALU all the time. GPUs will be overthrown by simple PIM.


Security through obscurity is really a bad idea, and Apple is no exception. In the long run, this will likely drive the adoption of RISC-V as a better alternative.


This RISC-V evangelism is worrying. Using RISC-V doesn't make your system secure; good ISA implementations do. The ISA has no bearing on security vulnerabilities. Perhaps a faulty decoder could be a vulnerability vector, but a faulty RISC-V decoder wouldn't be compliant, and neither would a faulty ARM decoder.

If I add a custom crypto extension to a RISC-V core and implement it badly, is that the fault of RISC-V? No! It's my own. And RISC-V doesn't help anyone here because their license allows me to keep my extension completely closed source - no different than Apple is today with ARM.


My comment was not about the ISA implementation or specification. It's about the TCB (trusted computing base), which at Apple (as at Intel and AMD) is closed. In RISC-V it is open. I would recommend educating yourself on a topic before lecturing others.


>The ISA has no bearing on security vulnerabilities.

Complexity leads to bugs, some of which are going to be security bugs.

ISAs impose complexity upon implementations. To claim they do not matter would be disingenuous.


What does this have to do with security through obscurity? This is an issue with cache prefetching.


It has to do with the secure processor, although you seem to ignore what the TCB is.


No, this only works on the regular processor cores. It's a cache timing attack that depends on the attack code and the targeted cryptographic code running on processors that share cache.

See the FAQ at https://gofetch.fail/


Yes. This is good for Bitcoin.


I wonder what will happen if someone figures out that fused FP multiply-add is no longer needed (e.g. just count spikes and add/subtract permanence). This could be a big problem for the guys with all their eggs in one basket (like NVIDIA).
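To make the "count spikes and add/subtract permanence" idea concrete, here is a minimal sketch of what such a unit might look like using only integer compares and adds. The thresholds, step size, and function names are all assumptions made up for illustration; this is not a description of any particular hardware or published model.

```python
# Toy integer-only "neuron". All names and constants are illustrative assumptions.
CONNECTED = 50   # a synapse counts as connected at or above this permanence
FIRE_AT = 2      # connected, active inputs needed for the unit to fire
STEP = 10        # permanence change per learning step
PERM_MAX = 100   # permanence is clamped to [0, PERM_MAX]


def overlap(permanence, spikes):
    """Count spikes arriving on connected synapses (integer compare and add only)."""
    count = 0
    for p, s in zip(permanence, spikes):
        if s and p >= CONNECTED:
            count += 1
    return count


def learn(permanence, spikes):
    """Strengthen synapses that saw a spike and weaken the rest, with clamping."""
    return [
        min(PERM_MAX, p + STEP) if s else max(0, p - STEP)
        for p, s in zip(permanence, spikes)
    ]


permanence = [45, 60, 30, 55]
spikes = [1, 1, 0, 1]

fired = overlap(permanence, spikes) >= FIRE_AT
permanence_after = learn(permanence, spikes)
```

There is no multiply and no floating point anywhere in the loop, which is the property the comment is pointing at.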


This is what I believe as well. There's no way that FMA is the optimal way to compute NNs in silicon. It's overly precise.

NVIDIA will be okay though, they have the volume to get the newest nodes first. Only a few can compete with them.


Most likely related to this:

https://www.searchenginejournal.com/openai-pauses-new-chatgp...

The back-end cost does not scale. Hence, they have a big problem. AGI nonsense reasons are ridiculous. Transformers are a road to nowhere and they knew it.


We need RISC-V hardware with the Ztso extension to shine here. Enforcing total store ordering (TSO) on top of a release consistency (RC) memory model is otherwise a pain. I don't know if Dynarec is tackling this problem (or just focusing emulation on single-threaded software).
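A rough way to see why this hurts is the classic message-passing litmus test. The sketch below enumerates outcomes under two deliberately simplified models: "tso" allows only a later load to pass an earlier store, and "weak" allows any two accesses to different addresses to reorder. Both are assumptions for illustration; real TSO also has store-buffer forwarding, and the real RISC-V memory model is more constrained than this "weak" stand-in.

```python
from itertools import permutations

# Message-passing litmus test. Each op is (kind, address, value_or_register).
WRITER = [("store", "data", 1), ("store", "flag", 1)]
READER = [("load", "flag", "r1"), ("load", "data", "r2")]


def reorder_ok(early, late, model):
    """May `late` become visible before `early`, which precedes it in program order?"""
    if early[1] == late[1]:
        return False  # same address: keep program order
    if model == "tso":
        return early[0] == "store" and late[0] == "load"
    return True  # "weak": any independent pair may reorder


def thread_orders(thread, model):
    """All visibility orders of one thread's ops that the model permits."""
    result = []
    for perm in permutations(thread):
        legal = all(
            reorder_ok(perm[j], perm[i], model)
            for i in range(len(perm))
            for j in range(i + 1, len(perm))
            if thread.index(perm[j]) < thread.index(perm[i])
        )
        if legal:
            result.append(list(perm))
    return result


def interleavings(a, b):
    """All merges of two sequences that keep each sequence's internal order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def outcomes(model):
    """Set of (r1, r2) results reachable under the given toy model."""
    seen = set()
    for w in thread_orders(WRITER, model):
        for r in thread_orders(READER, model):
            for trace in interleavings(w, r):
                mem = {"data": 0, "flag": 0}
                regs = {}
                for kind, addr, x in trace:
                    if kind == "store":
                        mem[addr] = x
                    else:
                        regs[x] = mem[addr]
                seen.add((regs["r1"], regs["r2"]))
    return seen
```

In this toy model, `(1, 0) in outcomes("weak")` holds while `(1, 0) in outcomes("tso")` does not: the reader can see the flag set but stale data only on the weak model. An emulator running TSO guest code on a weakly ordered host has to rule that outcome out itself, typically with fences, unless the hardware offers a TSO mode.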


Dynarec can optionally handle strong memory model emulation, but it's disabled by default for maximum emulation speed (it's the BOX86_DYNAREC_STRONGMEM env. var.).


Ztso was only ratified in January. It'll be a while until hardware ships.


The first hypothesis seems to be false. We have no idea how the brain works or how consciousness arises (or what it is). So how can we reproduce or surpass it with a pattern recognition machine like a DL system? Just because the brain uses pattern recognition as a trick to build a model of the world does not mean that DL magically produces consciousness. There is too much evidence that biological systems don't work that way, and all this discussion seems like old alchemists discussing whether gold transmutation could destroy the world. We have to learn "chemistry" first. Later we can start discussing that (or deeper issues, like what the relationship between us and sentient pieces of silicon will be).


You don’t need consciousness to have dangerous automation.


[1] is not very far from that. IMHO, [1] is better.

[1] https://pages.cs.wisc.edu/~remzi/OSTEP/vm-freespace.pdf


OSTEP is what made it _finally_ click for me, but I think the online demo is great because of the interactivity. Both have their place!


Too bad. Not a single mention of CTLoop in the whole paper.


I wonder why they don't start using secure enclaves to fix this cheating debacle. This might solve the problem for good. The fight between the cheaters and the companies trying to stem the tide is destroying online FPS gaming.

I guess the Intel SGX chaos didn't help either.


How many uOps is that instruction in x86_64?

Note that while the performance effect of L1I cache misses is negligible, that of the complex x86-64 decoder may not be. Some SPEC CPU2017 benchmarks have front-end problems due to the x86-64 decoder.


`MOV Rq, Iq` takes one uop with a throughput of up to four(!) per cycle on some microarchitectures: https://uops.info/html-instr/MOV_R64_I64.html

