xpuente · on Sept 11, 2024

Proof that this is wrong is that our brains require <20 W of power. Non-brute-force AI will be very efficient: simple operations (int add/cmp), no float matrix multiplications, and no moving the same data back and forth between memory and the ALU all the time. GPUs will be overthrown by simple PIM (processing-in-memory).

xpuente · on March 22, 2024

Security through obscurity is really a bad idea, and Apple is no exception. In the long run, this will likely drive the adoption of RISC-V as a better alternative.

colejohnson66 · on March 22, 2024

This RISC-V evangelism is worrying. Using RISC-V doesn't make your system secure; good ISA implementations do. The ISA has no bearing on security vulnerabilities. Perhaps a faulty decoder could be a vulnerability vector, but a faulty RISC-V decoder wouldn't be compliant, and neither would a faulty ARM decoder.

If I add a custom crypto extension to a RISC-V core and implement it badly, is that the fault of RISC-V? No! It's my own. And RISC-V doesn't help anyone here because their license allows me to keep my extension completely closed source - no different than Apple is today with ARM.

xpuente · on March 22, 2024

My comment was not about the ISA implementation or specification; it's about the TCB (trusted computing base), which at Apple (as at Intel and AMD) is closed. In RISC-V it is open. I would recommend you educate yourself on a topic before lecturing others.

snvzz · on March 23, 2024

>The ISA has no bearing on security vulnerabilities.

Complexity leads to bugs, some of which are going to be security bugs.

ISAs impose complexity upon implementations. To claim they do not matter would be disingenuous.

tzs · on March 22, 2024

What does this have to do with security through obscurity? This is an issue with cache prefetching.

xpuente · on March 22, 2024

It has to do with the secure processor, although you seem to be unaware of what the TCB is.

tzs · on March 22, 2024

No, this only works on the regular processor cores. It's a cache timing attack that depends on the attack code and the targeted cryptographic code running on processors that share cache.

See the FAQ at https://gofetch.fail/

meindnoch · on March 22, 2024

Yes. This is good for Bitcoin.

xpuente · on Dec 21, 2023

I wonder what will happen if someone figures out that fused FP multiply-add is no longer needed (e.g. just count spikes and add/subtract permanence). This could be a big problem for the guys with all their eggs in one basket (like NVIDIA).
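
One possible reading of "count spikes and add/subtract permanence" is an integer-only neuron in which each synapse holds a small permanence counter instead of a float weight. The sketch below is an assumption about what is meant, not a method the comment specifies; the names `active_overlap` and `update_permanences`, the threshold, and the 0..15 permanence range are all hypothetical. It only shows that inference and learning can be expressed with integer compare, add, and subtract.

```python
# Hypothetical integer-only neuron: no multiplications, no floats.

def active_overlap(inputs, permanences, threshold):
    """Count input spikes arriving on synapses whose permanence is >= threshold."""
    return sum(1 for x, p in zip(inputs, permanences) if x and p >= threshold)

def update_permanences(inputs, permanences, inc=2, dec=1, lo=0, hi=15):
    """Add to permanence where the input spiked, subtract elsewhere, clamp to [lo, hi]."""
    return [
        min(hi, p + inc) if x else max(lo, p - dec)
        for x, p in zip(inputs, permanences)
    ]

THRESHOLD = 8                      # assumed "connected synapse" threshold
inputs = [1, 0, 1, 1, 0]           # 1 = spike on that input line
permanences = [7, 9, 8, 3, 8]      # small integer counters, one per synapse

overlap_before = active_overlap(inputs, permanences, THRESHOLD)
learned = update_permanences(inputs, permanences)
overlap_after = active_overlap(inputs, learned, THRESHOLD)
```

After one learning step on the same input, one more synapse crosses the threshold, so the neuron responds more strongly to that pattern, and every step along the way is an integer compare or add.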

zarzavat · on Dec 21, 2023

This is what I believe as well. There's no way that FMA is the optimal way to compute NNs in silicon. It's overly precise.

NVIDIA will be okay though, they have the volume to get the newest nodes first. Only a few can compete with them.

xpuente · on Nov 21, 2023

Most likely related to this:

https://www.searchenginejournal.com/openai-pauses-new-chatgp...

The back-end cost does not scale. Hence, they have a big problem. The AGI-related reasons being floated are ridiculous nonsense. Transformers are a road to nowhere and they knew it.

xpuente · on May 31, 2023

We need RISC-V hardware with the Ztso extension to shine here. Enforcing total store ordering (TSO) on top of a release consistency (RC) memory model is otherwise a pain. I don't know if Dynarec is tackling this problem (or just focusing emulation on single-threaded software).
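
To make the TSO-versus-weaker-model gap concrete, here is a minimal sketch that enumerates the outcomes of the classic message-passing litmus test. It is a toy model under stated assumptions, not an emulator: the "in-order" case stands in for TSO only because this particular test has stores on one thread and loads on the other (real TSO additionally lets a later load pass an earlier store), and the "weak" case simply allows each thread's accesses to different addresses to reorder. All names (`T0`, `T1`, `outcomes`, and so on) are hypothetical.

```python
from itertools import permutations

# Message-passing litmus test:
#   Thread 0: data = 1; flag = 1
#   Thread 1: r1 = flag; r2 = data
T0 = [("store", "data"), ("store", "flag")]
T1 = [("load", "flag", "r1"), ("load", "data", "r2")]

def interleavings(a, b):
    """Yield every merge of lists a and b that keeps each list's own order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest

def run(trace):
    """Execute one global order of memory operations and return (r1, r2)."""
    mem = {"data": 0, "flag": 0}
    regs = {}
    for op in trace:
        if op[0] == "store":
            mem[op[1]] = 1
        else:
            regs[op[2]] = mem[op[1]]
    return (regs["r1"], regs["r2"])

def outcomes(allow_reorder):
    """Collect all (r1, r2) results; optionally let each thread reorder its accesses."""
    orders0 = list(permutations(T0)) if allow_reorder else [tuple(T0)]
    orders1 = list(permutations(T1)) if allow_reorder else [tuple(T1)]
    results = set()
    for o0 in orders0:
        for o1 in orders1:
            for trace in interleavings(list(o0), list(o1)):
                results.add(run(trace))
    return results

tso_like = outcomes(allow_reorder=False)
weak = outcomes(allow_reorder=True)
```

The outcome `(1, 0)` (flag seen set, data still stale) never appears in `tso_like` but does appear in `weak`. That is the result x86 code may assume cannot happen, and the one an emulator must rule out with fences when the host does not provide TSO.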

ptitSeb · on May 31, 2023

Dynarec can optionally handle Strong Memory Model emulation, but it's disabled by default to get maximum emulation speed (it's the BOX86_DYNAREC_STRONGMEM env. var.).

snvzz · on May 31, 2023

Ztso was only ratified in January. It'll be a while until hardware ships.

xpuente · on May 23, 2023

The first hypothesis seems to be false. We have no idea how the brain works or how consciousness arises (and what it is). So how can we reproduce or surpass it with a pattern recognition machine like a DL system? Just because the brain uses pattern recognition as a trick to build a model of the world does not mean that DL magically produces consciousness. There is too much evidence that biological systems don't work that way, and all this discussion seems like old alchemists discussing whether gold transmutation could destroy the world. We have to learn "chemistry" first. Later we can start discussing that (or deeper issues like what the relationship between us and sentient pieces of silicon will be).

ChatGTP · on May 23, 2023

You don’t need consciousness to have dangerous automation.

xpuente · on May 22, 2023

[1] is not very far from that. IMHO, [1] is better.

[1] https://pages.cs.wisc.edu/~remzi/OSTEP/vm-freespace.pdf

shepherdjerred · on May 23, 2023

OSTEP is what made it _finally_ click for me, but I think the online demo is great because of the interactivity. Both have their place!

xpuente · on March 31, 2023

Too bad. Not a single mention of CTLoop in the whole paper.

xpuente · on Dec 5, 2022

I wonder why they don't start using secure enclaves to fix this cheating debacle. This might solve the problem for good. The fight between the cheaters and the companies trying to stem the tide is destroying online FPS gaming.

I guess the Intel SGX chaos didn't help either.

xpuente · on Nov 16, 2022

How many uOps is that instruction in x86_64?

Note that while the performance effect of L1I cache misses is negligible, that of the complex x86-64 decoder may not be. Some SPEC CPU2017 benchmarks have front-end problems due to the x86-64 decoder.

colejohnson66 · on Nov 16, 2022

`MOV Rq, Iq` takes one uop with a throughput of up to four(!) per cycle on some microarchitectures: https://uops.info/html-instr/MOV_R64_I64.html