Fripplebubby's comments

Hiring is still a pretty non-uniform thing despite attempts to make it less so - I'm sure there are some teams and orgs at all these large companies that do it well, and some that do it less well. I think it's pretty well accepted that university brand is not a good signal, but it is an easy signal, and if the folks in the hiring process are a bit lazy, pressed for time, overwhelmed by the number of inbound candidates, or don't really know how to evaluate for the role's competencies, it's a tool that still gets reached for today.

In a way, I think the hiring process at second-tier (not FAANG) companies is actually better because you have to "moneyball" a little bit - you know that you're going to lose the most-credentialed people to other companies that can beat you dollar for dollar, so you actually have to think a little more deeply about what a role really needs to find the right person.


> This is partly a tooling problem. Many of the tools we use do not do a good job of capturing and representing this data. For example, the majority of latency graphs produced by Grafana, such as the one below, are basically worthless. We like to look at pretty charts, and by plotting what’s convenient we get a nice colorful graph which is quite readable. Only looking at the 95th percentile is what you do when you want to hide all the bad stuff. As Gil describes, it’s a “marketing system.” Whether it’s the CTO, potential customers, or engineers—someone’s getting duped. Furthermore, averaging percentiles is mathematically absurd. To conserve space, we often keep the summaries and throw away the data, but the “average of the 95th percentile” is a meaningless statement. You cannot average percentiles, yet note the labels in most of your Grafana charts. Unfortunately, it only gets worse from here.

I think this is getting a bit carried away. I don't have any argument against the observation that the average of a p95 is not something that makes sense mathematically, but if you actually understand what it is, it is absolutely still meaningful. With time series data there is always some time denominator, so it really means (say) "the p95 per minute, averaged over the last hour," which is, or at least can be, meaningful (and useful at a glance).
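
To make that concrete, here's a toy sketch (all numbers invented) of the difference between "average of per-minute p95s" and the true hourly p95 - they diverge when load is uneven, but the former is still a well-defined quantity:

    import random

    random.seed(0)
    # Simulate an hour of per-minute latency samples (ms): 50 calm minutes,
    # then a 10-minute incident. Numbers are invented for illustration.
    minutes = [[random.gauss(100, 10) for _ in range(1000)] for _ in range(50)]
    minutes += [[random.gauss(500, 50) for _ in range(1000)] for _ in range(10)]

    def p95(xs):
        return sorted(xs)[int(0.95 * len(xs)) - 1]

    # The Grafana-style number: p95 per minute, averaged over the hour.
    avg_of_p95s = sum(p95(m) for m in minutes) / len(minutes)

    # The true hourly p95, computed from all raw samples.
    true_p95 = p95([x for m in minutes for x in m])

    print(f"average of per-minute p95s: {avg_of_p95s:.0f} ms")  # ~190 ms
    print(f"true hourly p95:            {true_p95:.0f} ms")     # ~525 ms
    # They disagree whenever load or latency is uneven across minutes, but
    # the first is still a well-defined signal once you know what it means.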

Also, the claim that "[o]nly looking at the 95th percentile is what you do when you want to hide all the bad stuff" is very context dependent. As long as you understand what it actually means, I don't see the harm in it. The author's point is that, because loading a single webpage results in 40 or so requests, you are much more likely to hit a p99, so you should really care about p99 and up. More power to you - if that's what's contextually appropriate, then it's absolutely right - but it really only applies to a webserver serving webpage assets, which is only one kind of software you might be writing. It is definitely important to know, for one given "eyeball" waiting on your service to respond, what the actual flow is - whether it's just one request, or multiple concurrent requests, or some dependency graph of calls to your service all needed in sequence - but I don't think that challenges the commonsense notion of latency, does it?
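
For concreteness, the arithmetic behind the 40-requests claim is simple independence math (and it assumes the requests are independent, which real traffic often isn't):

    # Chance that at least one of a page's 40 requests lands in the slowest
    # 1%, assuming independent requests (real traffic often isn't):
    n = 40
    print(f"{1 - 0.99 ** n:.0%}")  # ~33% of pageloads hit a p99-or-worse request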


Nearly all time series databases store single-value aggregations (think p95) over a time period. A select few store actual serialized distributions (Atlas from Netflix, Apica IronDB, some bespoke implementations). Latency tooling is sorely overlooked, mostly because the good tooling is complex and requires corresponding visualization tooling. Most of the vendors have some implementation of heat map or histogram visualization, but either the math is wrong or the UI can’t handle a non-trivial volume of samples. Unfortunately it’s been a race to the bottom for latency measurement tooling, with the users losing. (See the sketch below for what distribution storage buys you.)

Source: I’ve done this a lot
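
To illustrate what serialized distributions buy you: bucketed histograms merge exactly across time windows and hosts, while stored percentiles don't merge at all. A toy sketch (fixed-width buckets for simplicity; real implementations like HdrHistogram use log-spaced buckets):

    from collections import Counter

    BUCKET_MS = 10  # fixed-width buckets for simplicity

    def to_histogram(samples):
        # Bucket raw latencies: roughly what a distribution-storing TSDB keeps.
        return Counter(int(s // BUCKET_MS) for s in samples)

    def percentile(hist, q):
        # Any percentile is recoverable (to bucket precision) from merged buckets.
        total = sum(hist.values())
        seen = 0
        for bucket in sorted(hist):
            seen += hist[bucket]
            if seen >= q * total:
                return bucket * BUCKET_MS

    # Histograms from different minutes (or hosts) merge losslessly:
    minute_a = to_histogram([95, 102, 110, 480])
    minute_b = to_histogram([99, 101, 530, 545])
    merged = minute_a + minute_b      # Counter addition IS histogram merge
    print(percentile(merged, 0.95))   # a correct combined p95 (540)
    # There is no analogous operation for two stored p95 scalars.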


I take it as a given that what is stored and graphed is an information-destroying aggregate, but I think that aggregate is actually still useful and meaningful.


Someone smart I know coined it as “wrong but useful”


They care very deeply about this and devoted a lot of resources to (re)grading the digital versions you see today on Disney+. The versions you see are intentional and not the result of cost cutting. (I was not directly privy to this work, but I worked on Disney+ before its launch, and I sat in on some tech talks and other internal information about the digital workflows that led to the final result on the small screen - there was a lot of attention on this at the time.)

I think there's a discussion to be had about art, perception, and devotion to the "original" or "authentic" version of something - one that can't be resolved completely - but what I don't think is correct is the perception that this was overlooked or a mistake.


I'm hearing you out, but how is this going to affect the part of this that is client behavior rather than database behavior? If there were some kind of SDK that actually captured the interface here (that is, that the client needs to be compatible with both versions of the schema at once for a while) and pushed that back to the client, that could be interesting - like a way to declare that the column "name" and the columns "first name"/"last name" are conceptually part of the same thing, and that client code paths must provide handling for both at once.
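
Purely as a sketch of the idea (everything here is hypothetical, not an existing SDK), the client-side contract during such a migration might look like:

    from dataclasses import dataclass

    @dataclass
    class User:
        first_name: str
        last_name: str

    def read_user(row: dict) -> User:
        # The client must tolerate both schema versions during the migration.
        if "first_name" in row:                      # new schema
            return User(row["first_name"], row["last_name"])
        first, _, last = row["name"].partition(" ")  # old schema: one "name" column
        return User(first, last)

    def write_user(user: User) -> dict:
        # Dual-write: populate both representations until the old column drops.
        return {
            "name": f"{user.first_name} {user.last_name}",
            "first_name": user.first_name,
            "last_name": user.last_name,
        }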


The post is a clear example of YAGNI backfiring: you think YAGNI, but then you actually do need it. I had this experience, the author had this experience, and you might too. The things you think you AGN are actually pretty basic expectations, not luxuries: being able to write vectors in real time without running other processes out of band to keep recall from degrading over time, and being able to write a query that combines normal SQL filter predicates with similarity search in one go for retrieval. These things matter, and you won't notice that they don't actually work at scale until later on!


That's not YAGNI backfiring.

The point of YAGNI is that you shouldn't over-engineer up front until you've proven that you need the added complexity.

If you need vector search against 100,000 vectors and you already have PostgreSQL then pgvector is a great YAGNI solution.

10 million vectors that are changing constantly? Do a bit more research into alternative solutions.

But don't go integrating a separate vector database for 100,000 vectors on the assumption that you'll need it later.


I think the tricky thing here is that the specific things I referred to (real-time writes, and pushing SQL predicates into your similarity search) work fine at small scale, in such a way that you might not notice they're going to stop working at scale. When you have 100,000 vectors, you can write these SQL predicates (return the top 5 hits where category = x and feature = y) and they'll work fine - up until one day they don't, because the vector space has gotten large. So I suppose it's fair to say this isn't YAGNI backfiring; this is me not recognizing the shape of the problem to come, and not recognizing that I do, in fact, need it. (To me that feels a lot like YAGNI backfiring, because I didn't think I needed it, but suddenly I do.)
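
To make this concrete, here's roughly the query shape I mean, sketched in Python (table, column, and connection details are all illustrative; <-> is pgvector's L2 distance operator):

    import psycopg2  # any Postgres driver works; psycopg2 shown for illustration

    conn = psycopg2.connect("dbname=app")  # hypothetical connection string
    cur = conn.cursor()

    # Filter predicates and similarity search combined in one query.
    cur.execute(
        """
        SELECT id FROM items
        WHERE category = %s AND feature = %s
        ORDER BY embedding <-> %s::vector
        LIMIT 5
        """,
        ("x", "y", "[0.1, 0.2, 0.3]"),
    )
    print(cur.fetchall())

    # At ~100k rows with no ANN index this is an exact scan and behaves
    # perfectly. Add an IVFFlat/HNSW index to survive growth and (absent
    # newer iterative-scan features) the WHERE clause is applied after the
    # approximate index scan, so a selective filter can quietly return
    # fewer or worse hits than you asked for.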


If the consequence of being wrong about the scalability is that you just have to migrate later instead of sooner, that's a win for YAGNI. It's only a loss if hitting this limit later causes service disruption or makes the migration way harder than if you'd done it sooner.


And honestly, even then YAGNI might still win.

There's a big opportunity cost involved in optimizing prematurely. 9/10 times you're wasting your time, and you may have found product-market fit faster if you had spent that time trying out other feature ideas instead.

If you hit a point where you have to do a painful migration because your product is succeeding that's a point to be celebrated in my opinion. You might never have got there if you'd spent more time on optimistic scaling work and less time iterating towards the right set of features.


I think I see this point now. I thought of YAGNI as, "don't ever over-engineer because you get it wrong a lot of the time" but really, "don't over-engineer out of the gate and be thankful if you get a chance to come back and do it right later". That fits my case exactly, and that's what we did (and it wasn't actually that painful to migrate).


At my last job I took over eng at a Series B startup, and my (non-technical) CEO was an ill-tempered type who pretty much wanted me to tell him that the entire tech stack was shit, the previous architect/pseudo head of eng was shit, etc. And I was like, no... some tradeoffs were made that make a ton of sense for an early-stage startup, and the great news is that you are still here and now have the revenue and customer base to start thinking in terms of building for the next 3-5 years, even though some things are starting to break. Even better, nothing was so dire that it required stopping the world; we could continue to build and shore up the struggling parts at the same time.

He seemed to really want me to blame everything on my predecessor and declare some kind of crisis, and seemed annoyed by my analysis, which was confusing at the time. But yeah, there are absolutely tradeoffs you make early in a startup's life; you just have to know where to take shortcuts and where to at least leave the architecture open to scaling. My biggest critique is that they were at least a year, if not two, past the point where they should have left ultra-scrappy, throw-things-at-the-wall startup mode and started building with a longer view.

I have also seen a friend build out a flawless architecture ready to scale to millions of users, but he never got close to product-market fit. I felt he wasted at least 6 months building out all that infra scaffolding for nothing.


Yeah, that's a great way of putting it.


Yeah, the "only if" is more like "necessary, not sufficient." The future migration pain had better be extremely bad to be worth worrying about this far in advance.

Or it should be a well defined problem. It's easier to determine the right solution after you've already encountered the problem, maybe in a past project. If you're unsure, just keep your options open.


A few years ago I coined the term PAGNI for "Probably Are Gonna Need It" to cover things that are worth putting in there from the start because they're relatively cheap to implement early but quite expensive to add later on: https://simonwillison.net/2021/Jul/1/pagnis/


> When you have 100,000 vectors [...] and they'll work fine

So 95% of use-cases.


I think Immich (the self-hostable Google Photos alternative) uses pgvector. And while you can't really call it a "production" system, since it's self-hosted, I have about 100,000 assets there and the vector search works great!


In that case you might not even really need optimized vector search though.


Many of the concerns in the article could be addressed by standing up a separate PG database used exclusively for vector ops, and keeping your relational data out of it. Then your vector use cases get served from your vector DB and your relational use cases from your relational DB. Separating concerns like that doesn't solve the underlying problem, but it limits the blast radius, so you can operate in a degraded state instead of falling over completely.


I've always tried to separate transactional databases from those supporting analytical queries if there's going to be any question of contention. The latter often don't need to be real-time, or even near-real-time.


That is a workaround and precisely the point the author makes. It increases operational complexity and creates a divide between records in the vector DB and the relational DB.


But if you do that, why use Postgres for the vector db?


One of the interesting experiences of being a member of both this community and the baseball analytics community is seeing posts like this - where the author apparently thinks they're the only one who had the idea to look at this - shared widely within the hacker community because it comes from one of their own. Rest assured, within the _baseball_ community this has been discussed and analyzed to death; it just doesn't get posted here because nobody there mentions using Unix tools to do it, since that part isn't really relevant.

See for example:

https://blogs.fangraphs.com/how-have-the-new-rules-changed-t...

https://www.baseball-reference.com/friv/rules-changes-stats....

And many others, these are two early and relatively canonical ones. If folks reading this post are interested enough in baseball, please, come join us in the baseball analytics community where this is merely the very tippy top of the iceberg of interesting things.


The way I read the paper, "diffusion" was more of a metaphor - you start with the output of the LLM as the overview (very much _not_ random noise) and then refine it over many steps. Seeing this, though, I wonder whether in-house they actually mean it more literally, or have tried using it more literally.


One interesting thing I learned from this was how they determined the probable size of this comet probabilistically rather than by direct observation. Based on the observations, it could either be really big (10km) or really small (0.5km). We can essentially rule out really big because we've been looking for comets for years, and seeing one that big implies that we _should have seen_ thousands that are quite small over that time period, because the sizes of space objects follow a power law (they're always whacking into each other and breaking up). Since we've seen only one small interstellar object in that time rather than thousands, a large comet is so impossibly unlikely that we can conclude it is 0.5km in size. I'm sure at some point this will be confirmed in a more conventional way as well.


What you’re describing is Bayesian inference in action. Given how rare big interstellar comets should be, and how common small ones should be, the lack of detections makes the big-comet hypothesis incredibly unlikely. So we update our beliefs: it’s probably small. Space statistics at work.
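
Right - as a toy version of that update (all numbers invented for illustration, not taken from the paper):

    # Toy Bayesian update. The power-law size distribution says small
    # objects vastly outnumber big ones, so "years of surveys, one lone
    # detection" is damning for the big-comet hypothesis.

    prior_big, prior_small = 0.5, 0.5   # start agnostic: 10km vs 0.5km

    p_data_given_big = 1e-4    # a 10km-object population should have shown
                               # us thousands of small cousins; we saw ~1
    p_data_given_small = 0.3   # one detection is unsurprising if it's small

    evidence = prior_big * p_data_given_big + prior_small * p_data_given_small
    posterior_big = prior_big * p_data_given_big / evidence
    print(f"P(10km | observations) = {posterior_big:.2%}")  # ~0.03%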


Have you heard of https://www.autobiographer.com/ ? Is it similar, different?


It's pretty much the same, but I have a few ideas to make my app stand out. Thanks for sharing! Nice to see the idea has potential.


I think this is a great post. It will ruffle some feathers and people will feel attacked, but I think the core idea is exactly right: if we are communicating and the goal is the exchange of information, use your incredible language faculty to communicate with me. To do otherwise is disrespectful to me; it indicates that you value the act of showing me your (by your estimation) brilliant idea more than you value taking the effort to actually communicate it to me. You are essentially an "ideas guy". I know that an LLM is a yes-man, and a yes-man plus an "ideas guy" is a combination that produces confident mistakes. If you can't be bothered to communicate your idea, or its essence, in your own words, please keep it to yourself until you've put in that effort.


Maybe I worded this more harshly than I meant. I value anybody who tries to communicate with me, and I don't mean to discourage people from having ideas or sharing them with me, but - writing is thinking. The act of actually using your language and your reasoning will improve your idea. How many times have I thought I had a good idea, only to realize its flaws in the process of writing it out? (Many times.) If you pass this process off to an LLM, you skip a key step, and you leave it to me, the receiver, to do this work for you.

