I figured that it was just for illustration, because the author couldn't think of a better example. Some real-life examples that turn up stupidly often:
1. The model uses click-through data as an input. Your frontend engineer moves the UI element being clicked to a different part of the page for a certain category of results, which changes the baseline click-through rate for those results. The model assumed this feature had a constant baseline across all results, so the feature now needs to be rescaled to account for the changed user behavior. Nobody thinks to do this. (There's a short sketch of this one after the list.)
2. The frontend engineer removes a seemingly wasted HTTP fetch to reduce latency. That fetch was actually being used to calibrate latency across datacenters, and it was a crucial input to a data pipeline feeding the ML model, running on a system of servers the frontend team didn't control and wasn't aware of.
3. The frontend engineer accidentally triggers a browser bug in IE7 (gimme a break, it was 9 years ago) that prevents clicks from registering when RTL text is mixed with LTR text. Click-through rates decline precipitously in Arabic-speaking countries. An ML model interprets this as all results performing poorly in Arabic countries, so it promptly starts cycling through results, killing ones that had been shown before without getting any clicks.
4. A fiber cable across the Pacific is cut. This causes high latency for all Chinese users, which makes them abandon their sessions. An ML model interprets this as Chinese users being less interested in the news headlines of that day.
5. An ML model for detecting abusive traffic uses spikes in the volume of searches for any single query over a short period of time as a signal. Michael Jackson dies. The model flags everyone searching for him as a bot.
6. An ML model for search suggestions uses follow-up queries as a signal. The NYTimes crossword puzzle comes out. Everybody goes down the list of clues and Googles them. Suddenly, [houston baseball player] suggests [bird sound] as related.
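To make #1 concrete, here's a minimal sketch (made-up names and numbers, not any real pipeline) of how a click-through feature silently drifts once its baseline is frozen at the time the model is built:

    # Hypothetical sketch of #1: a click-through feature normalized
    # against a baseline CTR that was measured once, at model-build time.

    # Baseline CTR per result position, snapshotted when the model was built.
    BASELINE_CTR = {1: 0.30, 2: 0.15, 3: 0.08}

    def ctr_feature(clicks, impressions, position):
        """Observed CTR relative to the training-time baseline."""
        observed = clicks / max(impressions, 1)
        return observed / BASELINE_CTR[position]

    # Before the UI change: position 1 really does get ~30% clicks,
    # so an average result scores around 1.0.
    print(ctr_feature(290, 1000, 1))   # ~0.97

    # After the element is moved, users click it roughly half as often
    # everywhere. The model reads that as every result in this category
    # suddenly performing badly, not as a UI change.
    print(ctr_feature(150, 1000, 1))   # 0.50

The code keeps running fine after the UI change; the only thing that broke is an assumption nobody wrote down.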
Thanks nostrademons, these are great examples. You're right, name and email are just for illustration.
Would you mind if we used your feedback and some of your examples to improve the article? If so, should we credit your HN account?
I'd actually rather that you keep them general (e.g. just talk about click-through data or changing latency conditions) and don't credit my account. The past employer in question is relatively easy to look up from my comment history, and while there's nothing really confidential in the examples, stories about how they do things or how things go wrong tend to blow up in the news, and they only like the publicity when it's positive.
OK, sounds good. We'll keep it generic and won't mention the source. Thank you for sharing! We think this is something AI companies can benefit from in the future.
Good examples, but re: #1, do a lot of places really deploy models with a static distribution? It should be relatively trivial to calculate this directly from the data within most ML libraries/systems; using a static distribution seems like such an obvious novice mistake.
It's not that the distribution is static; it's that the distribution is computed once when the model is built and then becomes outdated if the UI changes while the model is in use. Many places have different timelines for updating machine-learned models vs. deploying new frontend code: anywhere from "a month" to "we'll rebuild it manually when we need to" is typical for the former, while good engineering shops push frontend changes weekly, biweekly, daily, or sometimes even hourly.
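For what it's worth, one cheap guard (a rough sketch with hypothetical numbers, not how any particular shop does it) is to ship a snapshot of the build-time feature stats with the model and compare them against live traffic, so a stale distribution at least trips an alert instead of silently skewing scores:

    # Rough sketch: snapshot feature stats at model-build time, ship them
    # with the model, and compare against live traffic so drift raises a
    # flag instead of silently skewing the model's inputs.
    import statistics

    def snapshot_stats(values):
        """Computed once at model-build time and serialized with the model."""
        return {"mean": statistics.mean(values), "stdev": statistics.pstdev(values)}

    def drifted(snapshot, live_values, threshold=3.0):
        """True if the live mean has moved more than `threshold` stdevs."""
        live_mean = statistics.mean(live_values)
        return abs(live_mean - snapshot["mean"]) > threshold * snapshot["stdev"]

    # Built a month ago, when the click element was in its old spot.
    snapshot = snapshot_stats([0.28, 0.31, 0.30, 0.29, 0.32])

    # Today's traffic, after a frontend change halved click-through.
    print(drifted(snapshot, [0.14, 0.16, 0.15, 0.15, 0.14]))  # True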