
Disclaimer: I work for Google, though far away from Search.

Regardless of search engine design, there's HUGE money in SEO. Any successful search engine will be gamed. Do you have the developer power to go red-queen against all the large companies in the world?



For clarification, “red queen” means a conflict between two or more entities where the cost of engagement grows, but the relative advantage does not change.

Simply put, search engines have been at war with SEO for over 30 years, which has significantly raised the bar not only for being a search engine, but also for producing content, not to mention knowing how to search for information. With the introduction of machine-generated content, information wars between countries, global dependence on online commerce and information, etc., the speed of change shows no signs of letting up.

In my opinion, for the average person, knowing how to search for information is the real issue, not that the quality of information available has become worse or that Google has become a worse search engine. If anything, Google has reduced its advanced search capabilities not for financial gain, but because the average user is just too lazy to learn how to search and keep up with the changes required to remain an advanced searcher.


To your last point, I don’t quite agree. Google’s incentives are misaligned such that keeping you on Google.com just a little longer is better than not because you are more likely to click on an ad.

But also yes the users and the UI both fail. When I used to search for something I would type in something like “gutter clog clean” but slowly started noticing that Google likes longer sentences like “how do I clean a clog in my gutters?”. In pursuit of making Knowsmore (from Ralph Breaks the Internet), Google lost the power user features. Search would be infinitely better if they actually fucking respected literal mode and stopped trying to treat me like an idiot with no attention span. Having search results that contain one out of like 8 words in my query and asking me if I want to include others and then when I say I do still showing me results without them is broken UI and not a user problem.


Google search, from very early on, considered it a success metric when users went quickly to a result. I have no idea how that factors into the current surely hideously complex ranking algorithms, though.

As for the parsing of queries, that's probably based on how most users use search. Not everyone is familiar with keyword-based search. I expect they've done tons of A/B tests to determine what kind of query interpretation makes most users get better results. We're just not "most users".


>Search would be infinitely better if they actually fucking respected literal mode and stopped trying to treat me like an idiot with no attention span. Having search results that contain one out of like 8 words in my query and asking me if I want to include others and then when I say I do still showing me results without them is broken UI and not a user problem.

Use Verbatim?


Agreed, verbatim is likely the answer in this specific situation, though I did not point it out since they clearly think they understand how to search; verbatim has been an option for as long as non-verbatim search has been used by Google.

Beyond that, complaining Google does not do XYZ misses the point. Google is a search engine designed for the average user and the average user does not want verbatim search. They also do not want: advanced search operators, true Boolean search, regular expressions, API access to search, open source code, real-time streams of pages Google’s crawling, etc.

What they do want and always have is natural language based searches in their language of preference with clarifying responses from the search engine in natural language; that is, they want to treat a search engine like a person and be treated like a person. That makes it odd that they referenced Knowsmore, since Knowsmore [1] used keyword-based searches, not plain-language searches.

Google is not the primary problem, the average user is the issue. Unless people realize that — they’re fighting in a war they do not even understand.

To make it even more clear, Google is easily able to detect and block users who block ads, but it does not. More than 60% of users still don’t block ads; not because they love ads, but because the effort to figure it out simply is not worth it to them, they like ads, etc.

[1] https://m.youtube.com/watch?v=T3wiGSXbeQE


>What they do want and always have is natural language based searches in their language of preference with clarifying responses from the search engine in natural language; that is, they want to treat a search engine like a person and be treated like a person

I agree with you but Google is not yet at that point where it can act and serve people like an Answer Machine that knows everything; both the people's preferences and the perfect answers.

>Google is not the primary problem, the average user is the issue. Unless people realize that — they’re fighting in a war they do not even understand.

Again I agree that casual users are the problem, but how can we help them? This is The Innovator's Dilemma[0]: if we ask casual users what new stuff they want from Google Search, they will answer "nothing", because even they themselves don't know how their UX can be or should be improved, and on top of that they are satisfied with Google's mediocrity. They would just respond "Google is Google".

>Beyond that, complaining Google does not do XYZ misses the point. Google is a search engine designed for the average user and the average user does not want verbatim search. They also do not want: advanced search operators, true Boolean search, regular expressions, API access to search, open source code, real-time streams of pages Google’s crawling, etc.

The complexity of constructing "complex" search queries needs to be reduced so that casual users can use such features and queries.

[0] https://en.wikipedia.org/wiki/The_Innovator's_Dilemma


>To your last point, I don’t quite agree. Google’s incentives are misaligned such that keeping you on Google.com just a little longer is better than not because you are more likely to click on an ad.

I agree, which leads me to the conclusion that subscription is the best way to avoid this conflict of interest. Unfortunately, most of the world won't subscribe to a search engine, and doesn't seem to mind ads - to a degree. With Google looking more and more like AltaVista before its demise (to Google), my conclusion is that Google will strangle itself out of existence and give way for the next "new, streamlined, not-full-of-ads" competitor.


Here's a search engine that I'm subscribed to: https://kagi.com

In the 20-30 searches that I do in a day, I still have to google about half of them. Either because it's stuff Google does well (currency conversion, for example), or Kagi just doesn't get what I'm trying to search.

I remember starting out with the Internet searching on Altavista and Yahoo and Lycos. The information that was present was nowhere near as plentiful as it is now, and searching was more "exploratory". Nowadays people just kind of know what they want and just want to get there quickly.


> In the 20-30 searches that I do in a day, I still have to google about half of them. Either because it's stuff Google does well (currency conversion, for example), or Kagi just doesn't get what I'm trying to search.

Currency conversion is not technically a search. It is question answering and Kagi capabilities are still being built. Google only has a 20 year headstart. Can you report all such cases to kagifeedback.org so they are on our radar?


Thanks, currency conversion was an example off the top of my head only. I am active on orionfeedback and kagifeedback, I find that they're really prompt and effective in answering to feedback.

The other examples are a bit harder to describe, and I can't quite put into words how Google gets it right. I think I might need more time to write it up properly, as it involves search in another language.


Currency conversion is nothing you have to sell yourself to Google for. Just bookmark a bank, a financial or an academic research site that seems trustworthy. I have used the same ones for over 20 years, probably found them using Altavista at the time...


What about one funded by universities or libraries as a research project?

There have been lots of no ads (for now) attempts. DDG had like one small ad at one point. But people didn’t leave in droves. It’s almost like people are ok with ads.


Allow users to blacklist sites. Share blacklisted sites. Have the option of instead of hiding blacklisted sites entirely - show them in a different column.

Have 4 different types of list.

Whitelist - highest scoring

Not listed - these websites are not ranked.

Yellowlist - show but keep in a separate column

Blacklist - don't show

https://search.brave.com/help/goggles is interesting.
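
To make the four lists concrete, here is a minimal Python sketch of how they might rearrange a result page. Everything in it is hypothetical (the `ListKind` names, the `arrange` function, the example domains), and it assumes "not listed" means the engine's own order is left alone. It is not how Brave Goggles works.

```python
from enum import Enum


class ListKind(Enum):
    WHITELIST = "whitelist"    # highest scoring
    UNLISTED = "unlisted"      # assumption: no adjustment, engine order is kept
    YELLOWLIST = "yellowlist"  # shown, but in a separate column
    BLACKLIST = "blacklist"    # never shown


def arrange(results, lists):
    """Split ranked result domains into a main column and a side column.

    results: domains in the engine's original order.
    lists:   dict mapping domain -> ListKind; missing domains are UNLISTED.
    """
    boosted, normal, side = [], [], []
    for domain in results:
        kind = lists.get(domain, ListKind.UNLISTED)
        if kind is ListKind.BLACKLIST:
            continue  # hidden entirely
        if kind is ListKind.WHITELIST:
            boosted.append(domain)
        elif kind is ListKind.YELLOWLIST:
            side.append(domain)
        else:
            normal.append(domain)
    # Whitelisted domains first, then unlisted ones, original order kept within each.
    return boosted + normal, side


my_lists = {
    "docs.example.org": ListKind.WHITELIST,
    "contentfarm.example": ListKind.BLACKLIST,
    "aggregator.example": ListKind.YELLOWLIST,
}
main, side = arrange(
    ["contentfarm.example", "aggregator.example", "blog.example", "docs.example.org"],
    my_lists,
)
print(main)  # ['docs.example.org', 'blog.example']
print(side)  # ['aggregator.example']
```

Sharing lists would then just mean merging other people's `domain -> ListKind` dicts into your own, with some rule for conflicts.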


99%+ of people wouldn't use this, they would just try the search engine, see that the results suck, then stop using it.

It could be useful for power users, though.


99%+ of people in the medical industry did not sanitize hands or equipment 100 years ago.

99%+ of people in the tech industry currently do not care to do the extra steps required for data neutrality, and privacy.

99%+ of people are lazy to the point of harming themselves and others.

1%- of people examine how the 99%+ do things and pioneer harm reduction tactics in spite of everyone constantly reminding them that no one wants their help.


>99%+ of people in the medical industry did not sanitize hands or equipment 100 years ago.

this isn't true. 170 years ago, maybe. handwashing became a thing in the late 1800s after Semmelweis and Pasteur


I will adjust this analogy by 70 years in the future, just for you.


99% of analogies are correct


Why should we care about 99% of people? They are people who upload all their personal data to social networks agreeing to "we can do whatever we want" terms, they pay with a card, use Apple's and Microsoft's spyware (which marketing people call "telemetry") ridden operating systems, and install Chrome that sends a signal to Google every time they open a new tab and sends data about every form on every website they visit (which Google developers call "crowdsourcing" in the code [1]).

Make a customizable and privacy-respecting search for us, power users.

Also, I have noticed that Firefox internal search works well for sites you have visited. So when I want to visit a page I have seen earlier, I can go straight to it, skipping Google.

Also, you can click on any search box and add it with a prefix, so that you can search MDN or Wikipedia directly, again, without informing Google.

[1] https://source.chromium.org/chromium/chromium/src/+/main:com...


20% to 40% of internet users use an adblocker depending on whose statistics you use.[1][2][3]

SEO is basically another form of advertising, so evidence suggests to me that people would use something like this.

[1] https://en.wikipedia.org/wiki/Ad_blocking#:~:text=users

[2] https://earthweb.com/how-many-people-use-ad-blockers/

[3] https://www.insiderintelligence.com/insights/ad-blocking


A quick, entirely unrepresentative look at the user counts for uBlock Origin and uBlacklist in Firefox makes me somewhat less optimistic than you are. Or is there a more popular way of blocking sites from search results than uBlacklist out there, which I simply don't know about?


uBlock Origin and Adblock Plus are the most popular extensions on Firefox, with each around 5 to 5.5 million average daily users over a whole year[1].

I believe the addon page is average daily users over a week or 6 days?[2]

Firefox monthly active users (unique over 28 days) is around 200 to 210 million worldwide (26 to 22 million in the USA)[3].

It's probably wrong to compare those numbers, but very naively: 200 / 28 = 7.14 million average unique users per day, and 5 / 7.14 = 70% (probably wrong)

[1] https://addons.mozilla.org/blog/firefoxs-most-popular-innova...

[2] https://bugzilla.mozilla.org/show_bug.cgi?id=1168642

[3] https://data.firefox.com/dashboard/user-activity
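
For anyone who wants to redo the naive division, here it is as a few lines of Python. The inputs are just the rough figures quoted above, and the variable names are mine. Dividing monthly users by 28 assumes each user shows up on exactly one day, which is the smallest daily count possible, so the resulting share is an upper bound rather than an estimate.

```python
# Rough figures as quoted in this comment.
monthly_active_users_m = 200  # Firefox monthly active users, in millions (unique over 28 days)
adblock_daily_users_m = 5     # uBlock Origin average daily users, in millions

# Naive step: pretend each monthly user is active on exactly one of the 28 days.
naive_daily_users_m = monthly_active_users_m / 28
naive_share = adblock_daily_users_m / naive_daily_users_m

print(f"{naive_daily_users_m:.2f} million per day")  # 7.14 million per day
print(f"{naive_share:.0%}")                          # 70%
```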


Add a thumbs up / down icon, and your average Facebook user would likely use it without knowing what it was.


More and more search engines are now giving you the option to customize search results. Brave has Goggles, You.com has a thumbs up / down icon, and other alternative search engines have similar capabilities too. I have been really enjoying the ability to tailor my search to how I like it and, e.g., rank reddit higher (although it seems like it is not liked by some folks here).


why would a search engine use this for a different purpose than FB does?


It could work like spam filtering. Once a certain number of users have marked a site as spammy (or put it on their blacklist), it gets downranked for everyone. But jstummbillig is right. It could easily get gamed.
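
A minimal sketch of that threshold idea in Python, to show how little there is to it. The threshold, the penalty and the `SpamReports` class are all made up for illustration. Note that it only counts distinct accounts, which is exactly why it could be gamed with throwaway accounts.

```python
from collections import defaultdict

SPAM_THRESHOLD = 3  # hypothetical: distinct reporters needed before a site is penalized
PENALTY = 0.5       # hypothetical: score multiplier applied to flagged sites


class SpamReports:
    def __init__(self):
        self._reporters = defaultdict(set)  # domain -> set of user ids

    def report(self, user_id, domain):
        # Using a set means repeat reports from one account count once.
        self._reporters[domain].add(user_id)

    def is_flagged(self, domain):
        return len(self._reporters.get(domain, ())) >= SPAM_THRESHOLD

    def rerank(self, scored_results):
        """scored_results: list of (domain, score). Returns a new, re-sorted list."""
        adjusted = [
            (domain, score * PENALTY if self.is_flagged(domain) else score)
            for domain, score in scored_results
        ]
        return sorted(adjusted, key=lambda pair: pair[1], reverse=True)


reports = SpamReports()
for user in ("alice", "bob", "carol", "alice"):
    reports.report(user, "spam.example")

ranked = reports.rerank([("spam.example", 0.9), ("good.example", 0.6)])
print(ranked)  # [('good.example', 0.6), ('spam.example', 0.45)]
```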


I love how this is a problem that a huge number of people have worked on for around 20 years and have thrown stupid amounts of money at and people just jump in and boldly proclaim that they have the answers that just popped into their mind. Peak HackerNews.


I don’t think these ideas just popped into his mind. They’ve been discussed for a long time on HN, and some alternative search engines have started to incorporate them.

It’s worth the experiment.


But a company like Google would want supreme control. They're not inclined to rely on a community for anything

Wikipedia shows it can work at scale.


Google has actually followed a similar model for about ten years now, with something like 10,000 temp workers following this 172-page manual [1], essentially as the basis for training a lot of their ranking models.

[1] https://static.googleusercontent.com/media/guidelines.raterh...


Gmail's spam filter is basically community trained by people clicking spam/not spam (at least, that was what they said many years ago, it might have silently changed). What's different about search?


spam filter isn't their key flagship aquila product


But not my idea - Googles from Brave is something to look into, don't know entirely how it works - but it's in this direction.


Goggles


I recently made a tool to create Brave Goggles using subreddits as a signal source. I already use a netsec goggle for daily searching.

https://github.com/forcesunseen/narwhalizer


How do you prevent the gaming of blacklisting?


That’s the same question as “how do you ensure complete trust” which is, of course, not possible. That doesn’t mean that “distributed trust-based blacklisting” still isn’t better than what Google is offering today, which is nothing.


Source lists from friends-of-friends.


I found the thought of adding a "yellowlist" funny, as some people are starting to find the use of "whitelist" and "blacklist" racist.


The proper terms now are allowlist and blocklist


It would be more consistent to call it greenlist, yellowlist, redlist then.


How dare you, some of my best friends identify as traffic lights.


That term has been judged problematic. The appropriate term is "person of vehicular illuminativeness".


How dare you. Some of my best friends are color blind! You insensitive clod!


Some people couldn't find their butt with both hands and a map. Best not to take advice from them. Unless you find yourself surrounded. In which case smile and nod.


Don’t forget to avoid saying ‘blacksmith’ too. Another somewhat ‘racist’ one. /s

Still waiting for Mastercard to change their incredibly ‘racist’ company name. /s


lmao, feeling left out with no brown list then.


I think Google search is using the wrong approach. When I'm looking for e.g. a new camera, I want to use my network which I trust. E.g. I want to ask "what camera would HN recommend?" We should think more about how we can use trust as a basis for how we explore the internet.


Given that enough people use this heuristic, there will be companies focusing on earning karma on HN, writing comments and voting for products they are getting paid for.


While true, it is also not a given that you'd need to trust _all_ of HN. I visit sufficiently regularly that I see some people post and recognize their username. I often think -- I wish I could _follow_ them. Not so much of a stretch to think of rings of trust built around particular users. Bringing people in and kicking them out of these trust circles could play a role. PageRank -> TrustRank? Of course it would also be only one metric, among many possible trust rankings and many possible other signals and settings.

I bet the (niche) product as such wouldn't be as hard to build as it would be to scale. Imagine every user constantly tweaking (directly or indirectly) their search result settings, and having that impact millions (or more) indexed items, for every user.
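
To show what I mean by rings of trust, here is a toy Python sketch: walk the "follows" graph outward from one user and let trust decay with each hop. The decay factor, the hop limit and the function names are assumptions of mine. This is plain breadth-first propagation, not PageRank and not any published TrustRank algorithm.

```python
from collections import deque

DECAY = 0.5   # hypothetical: trust halves with every extra hop
MAX_HOPS = 2  # hypothetical: direct follows plus one more ring


def trust_scores(follows, me):
    """Breadth-first walk of who `me` follows, assigning decayed trust.

    follows: dict mapping user -> list of users they follow.
    Returns a dict user -> trust in (0, 1], excluding `me`.
    """
    scores = {}
    seen = {me}
    queue = deque([(me, 0)])
    while queue:
        user, hops = queue.popleft()
        if hops == MAX_HOPS:
            continue  # do not expand beyond the outermost ring
        for followed in follows.get(user, []):
            if followed in seen:
                continue
            seen.add(followed)
            scores[followed] = DECAY ** hops  # 1.0 for direct follows, 0.5 one ring out
            queue.append((followed, hops + 1))
    return scores


def score_item(recommenders, scores):
    # An item's score is the summed trust of the users who recommended it.
    return sum(scores.get(user, 0.0) for user in recommenders)


follows = {"me": ["ann"], "ann": ["ben"], "ben": ["cat"]}
scores = trust_scores(follows, "me")
print(scores)                                          # {'ann': 1.0, 'ben': 0.5}
print(score_item(["ben", "cat", "stranger"], scores))  # 0.5
```

Kicking someone out of a circle is just deleting an edge. The scaling problem is that `scores` is different for every user, so nothing can be precomputed once for everyone.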


Wouldn't the world become a little bit better if they did? Earning karma on HN is not the easiest thing. I would hate to speak for all of us but collectively, don't you think we have a pretty good marketspeak alert system here?


I’ve got 10k karma by making pithy criticisms of Node.js and Kubernetes. Do you therefore trust my opinions on US healthcare?


I would, if what you say were to make a whole lot of sense, otherwise, no.

You know how it works here. We would strip you of your internet points if you start being nonsensical.


I think you are kind of ignoring the issue that parent brought up.

The hn machine is not all that smart and can easily be gamed, and would, if the stakes were to become high enough. Farming hn karma by making pointed statements on crowd favorites (parent named two, oss licensing or privacy also spring to mind) gets you your 10k, no originality or honesty required, in no time.

What's protecting hn is a lot of moderation + relative irrelevance. If those 10k were to systematically bring you enough eyes (by driving search results), you are in effect printing money. There is no reason to assume the number of people doing it would not scale with the return attached to doing it.


Anything (points/karma/coins) which is free & unlimited will find its way to getting exploited. "What if" there's a barter system for karma/points: you give a +1 and get a -1.

Though I do not know how the initial allotment of karma/points could be distributed for the pioneers and for the new growing community, maybe allot 'n' points for new user after a year...
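
A tiny Python sketch of the barter idea, with the open question (the initial allotment) left as a parameter. The `KarmaLedger` class and its numbers are hypothetical. The point is only that the total supply stays fixed, so karma is no longer free and unlimited.

```python
class KarmaLedger:
    """Zero-sum karma: giving a +1 costs the voter one point."""

    def __init__(self, initial_grant):
        self.initial_grant = initial_grant  # hypothetical starting allotment per user
        self.balances = {}

    def join(self, user):
        # setdefault keeps an existing balance if the user joins twice.
        self.balances.setdefault(user, self.initial_grant)

    def upvote(self, voter, author):
        if voter == author:
            raise ValueError("cannot upvote yourself")
        if self.balances[voter] < 1:
            raise ValueError("not enough karma to vote")
        self.balances[voter] -= 1   # the voter pays...
        self.balances[author] += 1  # ...and the author receives


ledger = KarmaLedger(initial_grant=3)
ledger.join("alice")
ledger.join("bob")
ledger.upvote("alice", "bob")
print(ledger.balances)                # {'alice': 2, 'bob': 4}
print(sum(ledger.balances.values()))  # 6, unchanged by voting
```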


Show us the way, reasonable man.


Trust can work both ways. They'd have to be really careful to not lose the earned trust. On the way, they would have to write a lot of high quality HN posts. I'd say this model is a win over current spamming and fake reviewing practices.


The other side of them having to write a lot of high quality HN posts is me having to read and evaluate more HN posts trying to game me. That is work, and if I sense a lot of it (like I did on Reddit), I will leave, and I suspect others would too.

Playing defense is exhausting when playing offense is extremely cheap.


> Playing defense is exhausting when playing offense is extremely cheap.

HN is quite a large crowd, but not extremely big. An attacker must penetrate a lot of small crowds to be successful.


Personally I don't ask my network or friends for opinions on such things, because people tend to have a positive bias towards things they have invested money in.

Interestingly summarized on Lifehacker some time ago: https://lifehacker.com/the-psychology-of-a-fanboy-why-you-ke...

I like my friends but I don't want to get my news or politics or shopping advice from them.


The question is: is that worse than the advice from marketeers?


Marketers' information would have a known bias that you can mentally correct for.


In 2005 I had an open source project doing this that turned into a venture backed company.

See: http://getoutfoxed.com/node/46

Some approach like this could still work, but it’s incredibly hard to maintain/define the right “network”, and across different domains. (E.g. HN probably not so good a community for latest fashions or sports trivia)


I put your search into SearX with a lot of the major engines enabled (Google, Bing, Qwant, Brave, DDG, etc.). Arguably, Google did a better job giving me HN results (but it's a very small sample set).

https://searx.be/search?q=what%20camera%20would%20HN%20recom...

Results 1-3: SEO "best cameras" from DDG, Qwant

Result 4: Ask HN from DDG, Qwant

Result 5: Ask HN from Google

Results 6-8: SEO "best cameras" from DDG, Qwant

Result 9: Ask HN from Google

Result 10: Camera forum from DDG, Qwant

Edit: if you're not familiar with SearX, everyone who visits will get a slightly different result based on dynamic results. Even if the same person refreshes a few times the exact results and ordering will vary; I've learned to try the same search a few times to get better results, it's just a quirk of how it works and how each remote engine reacts at that given instant.


Google's end goal is an Answer Machine, but we are decades away from that. An Answer Machine would be something like God-like software which gives you the ultimate answer to your query, and that answer would be 100% accurate, true and personalized/suited for you. Again a black-box solution, but it would be so advanced with the help of AI that it would be trustworthy by default/design. Hard to achieve, but looking at Moore's Law and the advance of AI it will eventually be achieved.


> Do you have the developer power to go red-queen against all the large companies in the world?

Let's try? It's also an interesting research topic in itself and might be a topic for academic research. At the moment Google is a black box and their incentives are not really aligned to stop SEO. It's good for them to show more ads, it's good for them to show you copycat pages of github/stackoverflow with ads. Not saying that Google is doing this on purpose - I doubt it - but we don't know. It's surely possible to create an index and ranking that prefers different things than Google.

Let's try something at least. It will be gamed, will be worse probably but it's open and can be a playground for academic research.

The best that can happen is that we find there are ways to get a better ranking and that Google was being dishonest to maximize profit. If it's still gamed, at least the mechanics can be studied and analysed, and maybe someone can figure out a switch like 'be unfair in ways Google can't' - I would love a 'no ads on the page' switch, which would probably solve quite a few problems.

"Trust our black box" and "you won't have enough devs for this" are just bad answers for such an important problem. The amount of stupidly simple redirection spam in my results in the last few years also suggests that Google just doesn't care a lot about this anymore.


I think this is mostly a problem if you are in Google's position, near total market dominance.

This means you exert massive selection pressure on the shape of websites. The SEO spammers don't need to be good or know what they are doing, they just need to be lucky once. If they get it right, they float to the top, and can iterate on that design. This is effectively saying that no matter how secret or smart your algorithms are, it doesn't matter if you're in Google's position. The numbers are stacked against you.

To make matters worse, any company with that sort of market share has serious handcuffs on how heavy-handed and "unfair" it can be without risking litigation for anti-competitive practices.

I think the best thing that could happen for Google is ironically serious competition in the search market. This would help both problems at once.


> Any successful search engine will be gamed.

Only because it's profitable for them to allow it to be gamed, like all the spam sites now when you search for SO, Google allows them to be ranked because they're filled with Google ads. But it'd be trivial to just delist them all, that'd be beneficial for the user but not the search engine.

It's not a matter of 'developer power' just flip a boolean somewhere and delist the site.


Actually, it is entirely possible to create search criteria that cannot be gamed. That is what Google Research's various arms ought to be working on. But we all know it is far more profitable to have a manipulable system.


Could you expand on this a bit more? I don't see any obvious way to develop non-manipulable search criteria, I was suspecting that there might even be an impossibility theorem about this (which would depend a lot on the exact formulation, though), and I'd like to know what you have in mind.


Because of the filter bubble highlighted by Eli Pariser, there are massive opportunities for SEO companies to trick their customers into thinking they have got high up in the SE results, when all they are seeing is their filter bubble!

https://www.ted.com/talks/eli_pariser_beware_online_filter_b...

The last link someone clicks on is usually the right link to associate with the search term for future users. It's not hard to make a search engine!
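
Something like this Python sketch is all I mean. The `LastClickIndex` class is made up, and it assumes the final click in a session is where the user stopped looking. It is also trivially gameable by click bots, as others in the thread would point out.

```python
from collections import Counter, defaultdict


class LastClickIndex:
    def __init__(self):
        self._last_clicks = defaultdict(Counter)  # query -> Counter of urls

    def record_session(self, query, clicked_urls):
        """clicked_urls: urls in the order the user clicked them for this query."""
        if clicked_urls:
            # Only the final click counts towards the query.
            self._last_clicks[query.lower()][clicked_urls[-1]] += 1

    def rank(self, query):
        # Urls ordered by how often they were someone's last click.
        return [url for url, _ in self._last_clicks[query.lower()].most_common()]


index = LastClickIndex()
index.record_session("gutter clog", ["a.example", "b.example"])
index.record_session("gutter clog", ["b.example"])
index.record_session("Gutter Clog", ["c.example", "a.example"])
print(index.rank("gutter clog"))  # ['b.example', 'a.example']
```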

I'm not a Robot!


And that money goes to support many industries beyond search itself. The author really needs to get off a computer for a minute and understand the economics of the web as it stands today and how the free ad supported model supports millions of people's livelihoods before jumping to "This is all bullshit"


While I agree that they may need to better consider the economics, both of the engine and of the websites that may SEO it, that doesn’t mean we should just assume ad supported is the way to go or the only way to support people. The economy of today looks different than 20 years ago or 20 years before that. Doesn’t mean we shouldn’t grow and change.


I respect that. I guess my point is "Grow and change with a decent understanding of what the present state enables"

A search engine like Google isn't just a search engine as the author describes it. It is a very integral part of the economy of the internet and just labelling a simplistic interpretation of the present state as "Evil" with an academically poor write up of what a viable alternative is does little good.


I want to write articles and read articles written for me by others. Ideally as few as possible should profit from this process.

As it stands, Google is now a turd: not just no longer capable of delivering this service, but actively destroying the good part of the web by refusing to index it.

It is my attention, it doesn't belong to anyone else.

My access to information and educated opinion is a far more integral part of the greater economy.

Google is like a screaming man at a town meeting making sure no one else can get a word in. The meeting is now pointless.


Google is a catalogue. There is no physical analogy that comes close.

Your point about wanting to read articles written for you by others is certainly possible. The very fact that such a desirable outcome drives you to Google and nowhere else should suggest the complexity of the problem they’re solving and how there isn’t really anything else out there doing so well.

>Ideally as few as possible should profit from this process.

Why?

When you accepted a job offer in the software industry, did you stipulate that your mission is to write code for your employer and you will be charging as little as possible for that privilege? Minimum wage should get you by just fine, right?

I hate fully grown adults behaving as though anyone except them making a profit is somehow evil.


> The very fact that such a desirable outcome drives you to Google and nowhere else should suggest the complexity of the problem they’re solving and how there isn’t really anything else out there doing so well.

Yes and I'm not impressed.

> > Ideally as few as possible should profit from this process.

> Why?

> When you accepted a job offer in the software industry, did you stipulate that your mission is to write code for your employer and you will be charging as little as possible for that privilege? Minimum wage should get you by just fine, right?

I'm not a good example as I indeed live wonderfully on minimum wage and write software for free.

> I hate fully grown adults behaving as though anyone except them making a profit is somehow evil.

Don't worry, my philosophy is not that superficial. We have people who make things, people who organize the making of things and people who organize the things made.

It can be true that the metadata is more valuable than the data itself, and organizing an effort can be much more intense than any of the tasks involved. But let's not pretend that is always the case.

Before money and before the written word we had the exchange of thoughts, observations and ideas. I believe this to be somewhat like the foundation on which everything else we did is built. I want to see this process benefit from technology.

You wrote your comment perhaps a bit limited by the ropes of the platform but sincerely, free from any agenda, you wrote pretty much what you think.

Now if we [beyond HN] add additional layers of agendas between our exchange, each interested in maximizing their profit from it perhaps not you but many others will resort to self-moderation.

You won't be able to state it simply like: "I hate fully grown adults behaving as though anyone except them making a profit is somehow evil."

It could become something like "I don't understand why some people don't like others making money", stripped of how strongly you feel about the subject. You could also choose not to say anything.

At that point we are messing with the very fabric of our collective reality.

If I had to choose between freely communicating and the economy, it wouldn't be a hard choice.


> the free ad supported model supports millions of people's livelihoods

That doesn't necessarily make it right. Online scamming also supports a lot of people's livelihoods.


One is a business model, one is a crime and fraud.

I’d expect you to know the difference.


Lying in a privacy policy and breaching privacy-related regulations can also be fraud and illegal. Think about why there's so much pushback against the GDPR despite it only primarily mandating transparency with regards to data usage (if they were doing things above-board why would they be afraid?).


This is a great point and a really hard problem to solve.

Do you think a Wikipedia style team of human reviewers with upvote/downvote capabilities can help reduce the SEO spam?

Obviously it’s hard to review the reviewers as well but Wikipedia seems to have done that.


That’s a first world problem. The big problem is becoming successful.


Authorship attribution AI's



