_just7_ · 2024-12-28T23:11:29 1735427489

I would be much more interested in a piece on what you can train with this kind of rig, rather than the rig itself.

minimaxir · 2024-12-28T23:24:04 1735428244

The bottleneck for most model training sizes is VRAM, and since each 4090 has 24 GB VRAM, that's 96 GB VRAM total. The article mentions that it can train LLMs from scratch up to 1 billion hyperparameters, which tracks.

Nowadays that's not a lot: a single H100 that you can now rent has 80 GB VRAM, and doesn't have the technical overhead of handling work across GPUs.

tmostak · 2024-12-29T03:19:09 1735442349

You should be able to train/full-fine-tune (i.e. full weight updates, not LoRA) a much larger model with 96GB of VRAM. I generally have been able to do a full fine-tune (which is equivalent to training a model from scratch) of 34B parameter models at full bf16 using 8XA100 servers (640GB of VRAM) if I enable gradient checkpointing, meaning a 96GB VRAM box should be able to handle models of up to 5B parameters. Of course if you use LoRA, you should be able to go much larger than this, depending on your rank.
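The estimate above is a proportional scaling, and it can be written out as a short sketch. This assumes memory needs grow linearly with parameter count and that model state is sharded across the GPUs; the comment states neither, and activation memory, sequence length, and batch size are ignored. The function name is hypothetical.

```python
def scale_model_size(ref_params_b, ref_vram_gb, target_vram_gb):
    """Linearly scale a known-good model size to a different VRAM budget.

    Assumption: VRAM needed is proportional to parameter count.
    """
    return ref_params_b * target_vram_gb / ref_vram_gb


# Reference point from the comment: 34B parameters on 8 x A100 80 GB.
ref_vram_gb = 8 * 80      # 640 GB
# Target: 4 x RTX 4090 with 24 GB each.
target_vram_gb = 4 * 24   # 96 GB

estimate_b = scale_model_size(34, ref_vram_gb, target_vram_gb)
gb_per_billion = ref_vram_gb / 34

print(f"Estimated max model size: {estimate_b:.1f}B parameters")
print(f"Implied budget: {gb_per_billion:.1f} GB per billion parameters")
```

The implied budget of roughly 19 GB per billion parameters is only what falls out of the single 34B data point, not a general rule.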

sabareesh · 2024-12-28T23:37:21 1735429041

Definitely agree, but part of the reason I built this was to learn about all the overhead and gotchas.

llm_nerd · 2024-12-29T03:39:59 1735443599

Is there a reason you used hyperparameters rather than parameters? I was going to politely correct the terminology but you seem to be in AI for some time so either it was a mistype or I am misunderstanding what you are referencing.

didgeoridoo · 2024-12-29T04:05:56 1735445156

I imagine that when you get really deep into model training, it can seem like there are a billion hyperparameters you have to worry about.

minimaxir · 2024-12-29T05:50:51 1735451451

It's a force of habit, parameters would be more accurate (almost everyone uses them interchangeably nowadays)

unixpickle · 2024-12-29T06:56:19 1735455379

Wait what? Who actually calls trainable params "hyperparameters"? Nobody at OpenAI does, as far as I know.

minimaxir · 2024-12-29T07:04:50 1735455890

People who are making quick social media posts while taking a casual walk outside on websites that don't make it easy to edit posts and are not expecting to be nitpicked about it.

Overall, it's something I've seen very often on social media and less technical articles about LLMs. OpenAI would fall into the "almost" category.

llm_nerd · 2024-12-29T17:31:13 1735493473

It's okay to say that you mistyped or whatever, while taking a casual walk outside on websites that don't make it easy to edit posts and are not expected to be nitpicked about it. Throwing in that everyone uses them interchangeably, however, is just profoundly wrong on every level.

I wasn't nitpicking. It is a HUGE differentiation, and I pointed it out specifically because people pick up on terminology so people who might not know better will go forward and just drop in the more super duper hyperparameter, not realizing that it makes them look like they don't know what they're talking about. As I said in the other post, no one who knows anything uses them interchangeably. It is just completely wrong.

minimaxir · 2024-12-29T20:42:32 1735504952

Again, I've heard and used the terminology "model hyperparameter" in place of "model parameter", and I've also heard "model parameter" in place of "model hyperparameter" because not every human interaction is a paper on arXiv and the terms are obviously very similar. The context of the term is what matters in the end (as demonstrated by other comments following my correct intent), and society will not crumble if using either term incorrectly in casual conversation. No one intentionally uses the wrong term, but as jokingly said in another comment "when you get really deep into model training, it can seem like there are a billion hyperparameters you have to worry about."

I appreciate being corrected, but you are the one who asked for my opinion based on my extensive time in AI, you can choose to believe it or not.

Bancakes · 2024-12-29T03:09:50 1735441790

I doubt the RAM is added up. I think that’s a feature reserved for their NVLinked HPC series cards. In fact, without NVLink, I don’t see how you’d connect them together to compute a single task in a performant and efficient way.

minimaxir · 2024-12-29T05:54:28 1735451668

It depends on how the parallelism is implemented, e.g. distributed data parallel (DDP) to synchronize gradients: https://pytorch.org/tutorials/intermediate/ddp_tutorial.html
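As a rough illustration of what gradient synchronization in data parallelism means, here is a toy simulation in plain Python. It is not PyTorch's API; the function names and the one-weight model are made up for the example. Each simulated GPU keeps a full copy of the weights, computes gradients on its own slice of the batch, and the gradients are averaged before every copy applies the same update.

```python
def mse_grad(weights, batch):
    """Gradient of mean squared error for the toy model y = w * x."""
    (w,) = weights
    return [sum(2 * (w * x - y) * x for x, y in batch) / len(batch)]


def ddp_step(weights, per_gpu_batches, lr=0.01):
    """One simulated data-parallel step (hypothetical helper, not a real API)."""
    # Every GPU holds its own full replica of the weights.
    replicas = [list(weights) for _ in per_gpu_batches]
    # Each replica computes gradients on its local shard of the data.
    local_grads = [mse_grad(r, b) for r, b in zip(replicas, per_gpu_batches)]
    # "All-reduce": average the gradients across replicas.
    world_size = len(local_grads)
    avg_grad = [
        sum(g[i] for g in local_grads) / world_size
        for i in range(len(weights))
    ]
    # Every replica applies the identical averaged update.
    return [[w - lr * g for w, g in zip(r, avg_grad)] for r in replicas]


# Four simulated GPUs, one (x, y) sample each, drawn from y = 2x.
batches = [[(1.0, 2.0)], [(2.0, 4.0)], [(3.0, 6.0)], [(4.0, 8.0)]]
replicas = ddp_step([0.0], batches)
print(replicas)
```

Because every replica stores the whole model, this style of parallelism splits the batch rather than the weights, so per-GPU memory still limits model size. Only the gradients cross the interconnect, which works over plain PCIe, just more slowly than over NVLink.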

It's a rabbit hole I stay away from for pragmatic reasons.

whimsicalism · 2024-12-29T17:12:07 1735492327

yeah essentially this

sabareesh · 2024-12-28T23:25:06 1735428306

Here is some more of the journey, apart from the rig: https://sabareesh.com/posts/llm-intro/

layer8 · 2024-12-29T00:19:27 1735431567

How long does training a 1B or 500M model take approximately on the 4-GPU setup? Or does that dramatically depend on the training data? I didn’t see that info on your pages.

sabareesh · 2024-12-29T01:22:15 1735435335

Roughly, it takes 7 days to train a 500M model on 100B tokens.
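For a sense of scale, those figures can be turned into a throughput number. This is only arithmetic on the numbers quoted in the thread (100B tokens, 7 days, 4 GPUs), and it assumes the run is continuous with work split evenly across GPUs.

```python
tokens = 100e9   # training tokens, from the comment
days = 7         # wall-clock training time, from the comment
gpus = 4         # the 4-GPU setup asked about

seconds = days * 24 * 3600
tokens_per_second = tokens / seconds
tokens_per_second_per_gpu = tokens_per_second / gpus

print(f"Total:   {tokens_per_second:,.0f} tokens/s")
print(f"Per GPU: {tokens_per_second_per_gpu:,.0f} tokens/s")
```

That works out to roughly 165k tokens per second across the rig, or about 41k per GPU.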

paxys · 2024-12-29T03:28:37 1735442917

And where you get the training data from.

sabareesh · 2024-12-29T03:30:22 1735443022

Start with FineWebEdu

_just7_ · on Nov 22, 2024

I think you mean Sam Altman

OJFord · on Nov 23, 2024

They've mixed up with Sam Bankman-Fried, not sure how that affects the point they were intending to make, but I think they both have.. mixed reputations. (Only one is currently in prison though...)

napier · on Nov 23, 2024

maybe he does. but which one is in prison?

_just7_ · on May 8, 2024

No, in fact most journals have peer reviews cordoned off, not viewable to the general public.

lupire · on May 8, 2024

That's pre-publication review, not scientific peer review. Special interests try to conflate the two, to bypass peer review and transform science into a religion.

Peer review properly refers to the general process of science advancing by scientists reviewing each other's published work.

Publishing a work is the middle, not the end of the research.

_just7_ · on March 13, 2024

2024 election

_just7_ · on Feb 22, 2024

The simplest way to do it is probably just to have a high quality 360 camera, that way you mostly get around the problem of orientation

Someone · on Feb 22, 2024

Yes, but even that isn’t simple, I think. They’d not want to land on top of it, so they’d have to push it out from the lander or have it propel itself away from the lander. If they push it out and it doesn’t have a way to stabilize itself, keeping the lens pointing upwards then will require tight control over that push.

So, I guessed (see below) you’d need power to make the sat orient itself.

However, I googled a bit more, and found this: https://mynews13.com/fl/orlando/space/2024/02/21/embry-riddl..., which says:

“EagleCam will be spring ejected from the Nova-C class lander Odysseus about 30 meters above the lunar surface during the final descent. It will take three images a second from each of its three cameras (a total of nine images a second), capturing its six-second freefall to the surface and Odysseus’ descent and soft landing. About an hour after landing, our team will receive the five images of our choosing. During descent, Dr. Henderson and I will be timing events in landing sequence to match to image numbers to choose the first five images we bring back to Earth. Once we have those images, I will post them directly to @eraueaglecam on Instagram. Shortly after that, they will also be available on @spacetechnologieslab on Instagram and @SpaceTechLab on X (formerly Twitter).”

So, it isn’t a 360 camera, and they’re making 50-ish images and hoping for the best. Doesn’t look like the sat has rockets or that they’re trying to make it possible to make more photos after impact on the moon.

If my guesses/intuition is right we won’t see the actual touchdown (still cool to have anything, of course), but corrections welcome.

adastra22 · on Feb 22, 2024

Have two. One on each side. Doesn't matter if one ends up in the regolith.

volemo · on Feb 22, 2024

It’d surely land sidewise. :D

adastra22 · on Feb 22, 2024

Which is fine. 360 degree camera.

TeMPOraL · on Feb 22, 2024

Have three, 120 degrees apart. They'll double as backup landing legs.

adastra22 · on Feb 22, 2024

Wouldn’t be any better. You’d need 4 to be able to reliably land with one pointed out of the regolith. That’s probably pushing it in terms of mass. 3 wouldn’t be any better than 2 though.

adolph · on Feb 22, 2024

> push it out from the lander

Selfie stick sounds simpler.

_just7_ · on Dec 11, 2023

He literally just said that's not how volcanos work

frozenport · on Dec 11, 2023

Lol, that's a quote from the article.

andrewflnr · on Dec 12, 2023

I, for one, am pretty sure the article is wrong there. What I've heard from geologists is that Yellowstone's magma chambers simply are not molten enough to erupt.

fooker · on Dec 12, 2023

https://abcnews.go.com/US/yellowstone-supervolcano-lot-magma...

andrewflnr · on Dec 12, 2023

Right, "up to 20%", whereas the threshold for "barely eruptible" is in the 50% range, according to this article: https://www.pnas.org/doi/10.1073/pnas.1617105113 Unless the rules are way different for supereruptions, which I guess we have to consider, but probably not 30% different.

_just7_ · on Nov 23, 2023

Held in deposit forever, i.e. until someone sues someone else or the money is forgotten.

VHRanger · on Nov 24, 2023

That goes to Apple then.

The accounting liability gets erased after a few years and the asset stays on the books

Like Starbucks gift cards that go unused.

_just7_ · on July 10, 2023

Yeah, I don't really think a meaningfully larger number of people are traveling because Airbnb is an app that exists, so demand from tourism should stay flat, no?

ajsnigrutin · on July 10, 2023

Apartments in residential buildings are, well, apartments, not hotels. If you're a tourist, go to a hotel; I don't want a new set of loud, obnoxious neighbors partying every few days next door to my apartment (with a shared wall in between). The apartment that should be rented out at a cheaper monthly rate is now getting rented out to loud tourists at daily rates, and that is a bad thing for locals living in that city. Some areas are already destroyed by Airbnbs, so Airbnbs (and all other short-term rentals of residential apartments) should be banned.

cycrutchfield · on July 11, 2023

Why should you get a say about what your neighbor does with their property or what guests they have stay over?

piva00 · on July 11, 2023

An apartment building is owned collectively and with that comes a set of rules. In most places you aren't allowed to be a nuisance, even though you own a share of the house you don't get to dictate or do whatever you want just because you own something. Your property is just a part of a larger property, respect others or get the fuck out.

It's a pretty simple arrangement that one gets into when they purchase an apartment.

cycrutchfield · on July 11, 2023

Why do you assume that guests that stay over are a nuisance?

piva00 · on July 11, 2023

If you host a few guests a year, for a few days, and they are your friends or people who you might trust, no issue whatsoever.

A never ending cycle of tourists staying for 2-10 days as your neighbours? That's definitely a nuisance over time, very improbable that churning through 50-100 groups of different people per year won't create issues to neighbours.

If you don't see how it could be an issue, I think you'd only actually understand if you lived in a touristic place, next door to AirBnBs. I say that not to provoke you but because I feel it's hard to empathise when it's not your day-to-day life. Just this year I experienced that when staying at a friend's place in Lisbon; I shared it on another HN thread:

> As an anecdote: a month ago I stayed a few days (5-7) at the place of a friend who lives in Lisbon; just on his floor there are 4 AirBnBs (owned by the same person). Not only was it a nuisance with noise for most of the days I was there, it was also a nuisance to have drunk British girls banging on your door at 02.00 in the night when they don't remember the fucking apartment they are supposedly going to. My friend mentioned it's not uncommon for that to happen, or to have a gang of people show up to a party in one of the apartments. Other people living in the building have complained to AirBnB, to the police, to the housing association; nothing really happens.

cycrutchfield · on July 11, 2023

So sounds like the bug is lax enforcement of noise ordinances? Not sure why it makes a difference whether the drunk and noisy houseguest is a short-term renter or a friend of the owner.

piva00 · on July 12, 2023

A friend of the owner implies the owner is a neighbour, someone you can eventually talk to about the issue in the house. It makes a difference as there's a greater degree of social bond (and consequently shame).

Are you being obtuse on purpose to not give up on a flawed argument?

subpixel · on July 11, 2023

For the same reason I have a say about zoning and ordinances and just about every other thing that affects my property and my enjoyment of it.

cycrutchfield · on July 11, 2023

My question is why should you have much of a say about those things as well. Smells like a NIMBY

ajsnigrutin · on July 11, 2023

If it's a residential property, it's a residential property, not (what is effectively a) hotel.

cycrutchfield · on July 11, 2023

Again: it’s not your property, so why do you care about their guests staying for 1 night or 1 month? Seems like a rather artificial distinction.

subpixel · on July 11, 2023

I live within walking distance of dozens of short-term rentals in a very small town.

The difference between having dozens of neighbors with families, children, and strong ties to the community and not having those neighbors and community connections is in no way artificial.

Some reasons I care: schools that decline with population loss, the lack of available/affordable rental homes for local residents, the effects on local retail and service businesses, the tendency for hurried visitors to drive at unsafe speeds while staring at their phone for directions.

Problems that you do not experience remain real problems.

cycrutchfield · on July 11, 2023

So basically instead of thinking about how you can better accommodate visitor demand and respect property owners’ rights, you would prefer to regulate your “small town feel” back into existence.

subpixel · on July 11, 2023

It seems your passion for this topic is such that you are attributing to me things others may have said in this thread (or elsewhere).

cycrutchfield · on July 11, 2023

I’m just drawing some reasonable inferences from your list of complaints. If you feel I’ve misrepresented your position, feel free to correct me.

ajsnigrutin · on July 11, 2023

Because it's an apartment building with shared space, shared resources and shared investments. If your neighbor's plot is a residential plot, he cannot build a pig farm there or a nuclear power plant. If your neighbor's apartment is an apartment, he cannot have a shooting range, a bar, a strip club or a hotel there. There's a difference between occasional guests and Airbnb, the same as there is a difference between a friend who helps you fix your car and someone who regularly fixes cars for money.

cycrutchfield · on July 11, 2023

As long as the property owner registers their short term rental with the local government and pays their taxes, and as long as their guests are respectful and don’t cause any trouble, why do you even care?

ajsnigrutin · on July 11, 2023

It's a limited resource being misused. It is in the interest of the locals for tourists to stay in hotels and apartments to be available for long(er) term rent for the locals. Barcelona was one of the first cities to ban short-term rentals (under 31 days), and hopefully not the last.

cycrutchfield · on July 11, 2023

It’s only “limited” because housing has been made artificially scarce by local governments operating in the interests of property owners by restricting new builds in order to keep housing prices high. In regions and countries that have allowed enough housing to be built to actually meet demand there is much less hand-wringing about short-term rentals.

smahs · on July 11, 2023

I think I get your perspective (but also you should really understand that your arguments are just going unidirectional instead of building a healthy dialog), but it made me wonder what really went wrong (or changed, depending on your perspective) from Hegel's utopian egalitarianism to how communism was implemented in practice a century later. I guess people just tend to "adapt" legal and governance theories built with perfectly good intentions to their advantage and then things go awry.

anamexis · on July 10, 2023

That doesn't follow. Even if tourism stays flat, demand can shift from hotels to Airbnbs.

_just7_ · on May 17, 2023

I mostly agree that there is probably some self-interest at play when they ask for regulation. Though the fact that one of the existing firms in the industry wants regulation doesn't invalidate all the legitimate reasons for regulation. For example, existing car manufacturers might benefit significantly from all the safety requirements for cars as they create a barrier to entry for new players, but I doubt that many people want to abolish all the testing and certification cars have to go through.

_just7_ · on April 1, 2023

What is up with that site giving me three separate popups warning me that the site uses cookies?

dpifke · on April 2, 2023

What is up with comments that go against the guidelines?

From https://news.ycombinator.com/newsguidelines.html:

Please don't complain about tangential annoyances—e.g. article or website formats, name collisions, or back-button breakage. They're too common to be interesting.

kortilla · on April 2, 2023

Guidelines can’t prevent people from complaining about getting eye fucked by a website.

bdcravens · on April 2, 2023

> In Comments

> Be kind. Don't be snarky. Converse curiously; don't cross-examine. Edit out swipes.

nier · on April 2, 2023

Disabling content blockers, I can see what you’re describing. Seems like satire.

iscrewyou · on April 2, 2023

Way off on a tangent from this thread, but which content blockers do you use?

jdeibele · on April 2, 2023

Not the poster but the Hush extension for Safari seems to work well on this site because I didn't see any notices. And that's what it's supposed to accomplish. https://apps.apple.com/us/app/hush-nag-blocker/id1544743900

I'm sure there are similar things for Firefox or Chromium browsers but Hush seems Safari-only.

nier · on April 5, 2023

I was surprised to find only AdGuard on my main iPhone. I do also use Hush on my other one.

AdGuard has most all filters enabled except the language-specific ones and the «Other» category.

throwaway1777 · on April 2, 2023

Thanks GDPR!

jiggawatts · on April 2, 2023

Thanks marketing people that decided to double down despite the clear distaste of the general public at their relentless tracking.

kortilla · on April 2, 2023

Is this the clear distaste of the general public or an overreaction with bad regulation by a government authority?

It’s been how many years now and GDPR has done very little to improve anything despite the cookie prompts on websites everywhere?

At this point they are as useful as TOS (not), with the annoyance of seeing one on every website.

bee_rider · on April 2, 2023

I like the banner, it lets me know to turn back.

lukeschlather · on April 2, 2023

The properly GDPR compliant cookie banners allow you to itemize certain items on the website's TOS that you may choose to accept or not accept. Website TOS are very useful for the company operating the website and the cookie rules allow you as a visitor to get some of that usefulness back.

morsch · on April 2, 2023

How do you know GDPR has done very little?

dmix · on April 2, 2023

Browsers like Firefox could hypothetically kill cookies in their browser tomorrow but don't. Or at least make a big stink about it. Do you think they should? Do you think we'd be better off as a society?

emidoots · on April 2, 2023

Yes we'd be better off if cookies were removed.

But if it was only Firefox, people would simply add banners saying "Firefox is not supported." since it's only like 3% of marketshare these days.

sva_ · on April 2, 2023

I think when you click "accept", you also agree to things like fingerprinting and storing a fingerprinted identity on their server, as well as perhaps supercookies that allow your ISP to track you.

PeterisP · on April 2, 2023

Browsers can't tell if a cookie is a generic setting ("chose Yes/No on a banner") or a uniquely identifiable one; and they can't tell if a cookie is functionally required (ID for a logged-in session) or not (ID to track random visitors).

The distinction is legal, not technical; so it has to be enforced by legal, not technical means.
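A small sketch of why this is hard to decide on the client side: below, two made-up `Set-Cookie` headers (both the names and values are invented for illustration) are reduced to what a browser can observe without knowing the site's intent. One is imagined as a login session ID and the other as a visitor-tracking ID, yet they come out identical in shape.

```python
def browser_view(set_cookie_header):
    """Reduce a Set-Cookie header to its observable shape (hypothetical helper).

    The cookie name is left out on purpose: sites pick names freely,
    so a name carries no reliable information about purpose.
    """
    parts = [p.strip() for p in set_cookie_header.split(";")]
    _name, _, value = parts[0].partition("=")
    attributes = sorted(p.split("=", 1)[0].lower() for p in parts[1:])
    return {"value_length": len(value), "attributes": attributes}


# Imagined purpose: ID for a logged-in session (functionally required).
session_cookie = "sid=9f3a1c7e5b2d4a60; Path=/; Secure; HttpOnly; Max-Age=86400"
# Imagined purpose: ID used to track anonymous visitors (not required).
tracking_cookie = "uid=4b8e2d6f0a1c3e57; Path=/; Secure; HttpOnly; Max-Age=86400"

same_shape = browser_view(session_cookie) == browser_view(tracking_cookie)
print(same_shape)
```

Both are opaque random-looking identifiers with the same attributes; what differs is how the server uses them, which is not visible in the header.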

mjevans · on April 2, 2023

Firefox COULD default to cookies off (with an in-menu widget to force them on for non-automatic handling), and if any form submission happens, _ask_ whether the end user wants to accept the site's cookies.

PeterisP · on April 2, 2023

Looking at a typical site, a reasonable user might want to accept one (or perhaps a couple) of many dozens of cookies a site attempts to set. Choosing it manually per site per cookie is difficult but perhaps theoretically possible, however even that still requires cooperation from the site to honestly identify that this one is the cookie which is functionally required, and these fifty are for ad tracking, and ensuring that cooperation still requires legal means and can't be done with purely technical ones.

Furthermore, there is the important distinction about multiple uses of the same data. There are uniquely identifiable cookies that are functionally required for one purpose, but the site may want to use them for other purposes as well (e.g. share that data with their "trusted partners" for targeted advertising), for which a user may reasonably want to refuse permission. So a browser accepting a cookie doesn't imply such permission, and something extra is required.

EarthLaunch · on April 2, 2023

It's a bad system that allows that and popups to be incentivized. The worst systems are created with the best intentions. Is ignorance an excuse?