andrewmatte's comments | Hacker News

jonhohle, thanks. Do you know of examples where millisecond timestamps in session tokens or account IDs have actually been exploited?


German tank production capacity was estimated from the serial numbers of captured tanks. There are ways to read all kinds of information by observing energy usage. High-resolution time and sequence data undoubtedly reveal more than you'd like.
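
For concreteness, a rough Python sketch of the classic estimator (the serial numbers below are made up):

    # Minimum-variance unbiased estimate of total production from observed
    # serial numbers: m + m/k - 1, where m is the largest serial seen and
    # k is how many serials you've seen.
    def estimate_total(serials):
        k, m = len(serials), max(serials)
        return m + m / k - 1

    print(estimate_total([61, 19, 56, 24, 14]))  # -> 72.2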


Most of our lives as boring SaaS etc. software developers will not be nearly as exciting as this, but of course you never know.

I parsed the EV charger APIs where I live (using Frida on Android) and one of the fields returned the daily revenue and profit.


Could be quite useful information to competing charger networks or in future M&A discussions.

I've seen a project for a trading firm that inferred all kinds of traffic and revenue numbers for companies before their quarterly earnings were made public. It wasn't perfect, but knowing with a certain confidence level whether the numbers were going to be better or worse than estimates was profitable for them.


Sure, for many places it doesn't really matter, but if your URLs or user ids can be seen/scraped by others, you might expose some commercially interesting information to competitors.

And the indexing argument isn't really compelling, is it? You lose very little by sticking to fully random UUIDs.
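
For what it's worth, a rough Python sketch of how much a UUIDv7 gives away compared with a fully random UUIDv4 (the parsing relies only on the published bit layout; the sample UUID is just an illustration):

    # Per the UUIDv7 layout, the top 48 bits are milliseconds since the Unix
    # epoch, so anyone who sees the ID learns roughly when it was created.
    import uuid, datetime

    def uuid7_created_at(u: uuid.UUID) -> datetime.datetime:
        ms = u.int >> 80  # most significant 48 bits
        return datetime.datetime.fromtimestamp(ms / 1000, tz=datetime.timezone.utc)

    print(uuid7_created_at(uuid.UUID("01890a5d-ac96-774b-bcce-b302099a8057")))
    print(uuid.uuid4())  # a fully random UUIDv4 carries no such timestamp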


Here in Ukraine, estimating the enemy's drone and missile production capacity from the serial numbers of their parts is quite a mundane task these days.


I could imagine using the timestamp segment of publicly observable ids to estimate activity patterns in an organization. Probably not super crucial and there are probably easier ways in most cases but it could be a big deal at the right moment and for the right target. This could be like a more refined version of PIZZAINT (where you can detect impending policy/operational movements by the quantity of food deliveries to a government organization).


Is PIZZAINT real? I thought it was a joke. How do you even find out how much pizza is being delivered?


In ye olden days you had a minion sitting in a car watching the front gate with a notepad and enough cigarettes to last the night.

Pizza itself might be a bit of a joke but looking for non-operational behavioral changes is absolutely real. The Cuban missile crisis was started in part because Soviets played soccer and Cubans played baseball, and the presence of soccer fields helped confirm the Soviet presence (in enough numbers to bother making rec centers). A more advanced version might be the public Strava data leaking US base layouts and locations, or Strava helping the Ukrainians kill a Russian submarine commander.

Edit to add: you could also just figure out where your target orders pizza from and pay one of the dudes working there to tell you when there’s a spike in deliveries to your target.


> Soviets played soccer and Cubans played baseball

It's a Henry Kissinger quote, but it's not accurate: Cubans do in fact play football. Also, the quote wasn't about the 1962 Cuban missile crisis but about another event in 1970. That said, it is true that US intelligence was alerted by the construction of football fields (or maybe even more so by the lack of baseball diamonds).

https://www.cracked.com/article_31335_that-time-soccer-field...


That's Cunningham's Law: if you want to know something, instead of asking, just say something that is wrong and someone will correct you.


lol

iswydt


I know of people who used leaked customer ids in public facing chatbot solutions (like Intercom) to estimate how fast their competitors were growing and/or how many customers they had.


How would you do that with UUIDv7 though? I see how using sequential IDs would obviously leak that information, but if all you leak is the timestamp at which the ID was generated, how do you then infer anything about the rate of ID generation?


Congratulations! This is really great. I am going to try it.

In your benchmarks, consider adding Actix, which is also in Rust but considerably faster than Axum according to: https://www.techempower.com/benchmarks/#section=data-r21&tes...


TechEmpower benchmarks are not a good measure of real-world performance. Everyone cheats on them, they don't do a great job of matching realistic workloads, and there are plenty of unresolved issues filed about methodology problems.

Actix also used to be the poster child for overly focusing on benchmark performance at the expense of real world concerns.


There might also be a chance you can have your webserver included in the TechEmpower benchmarks for their next round. The benchmark implementations are submitted by experts and are open source: https://github.com/TechEmpower/FrameworkBenchmarks


I am not a doctor - I just write code... I remember being up in arms about weird stuff in my food when I first heard about it but now I think about the wellbeing of the animals while they're alive and shit, man, if I were sick I'd want antibiotics too. When is it excessive? Is it because they're all so close together in the factory "farm"? What do the veterinarians say about this?


It's not given to sick animals; it's constantly given to all livestock because it's cheap and it's seen as a sort of "preventative maintenance". Which is bad for all kinds of reasons: 1) a lot of it makes it into your meals, 2) it reduces the efficacy of antibiotics by constantly exposing bacteria to them (allowing the bacteria to eventually become resistant).

Probably other reasons it's bad as well (I'm also not a doctor and just write code).


Giving antibiotics to animals also makes them grow bigger.

It's billions of Petri dishes. It's a white swan in the making.

It's really a shame that humans are now denied antibiotics to combat antibiotic resistance, when humans do not even consume most of the antibiotics produced.


Not all antibiotics and diseases are the same. It still makes sense to not overprescribe powerful antibiotics that humans use for human diseases. Even if agriculture is recklessly using cheap antibiotics in animals.


You forgot one of the main reasons it's actually given: it completely breaks down their gut microbiome, causing them to gain weight much more quickly.


It's because they are close together in the factory farm. I don't think you need to consult what veterinarians say - seeing photos of how animals are treated should be enough for you to just stop consuming meat (if you "think about the wellbeing of the animals").

Consider watching Earthlings (2005) - http://www.nationearth.com/ - I'd say a must watch film for anyone who cares about animals.


I hate to be that guy, but not all ranches.

Ranches surrounding me (I live in relatively rural Montana) don't even come close to resembling the kinds you mention. The cattle are given the run of hundreds of acres, and also they often graze from those fields, etc.

Yeah, there are a shitty minority of ranches that produce a large amount of meat in terrible conditions. But they are not the norm, not in my experience.

And for a tangent, I'd also like to call out that it's not just beef being produced by ranchers. We're not just tossing carcasses in the landfills. We use the whole animal: calcium, leather, feed, gelatin, medicine, etc.


You are going by your personal experience, which is very dangerous. As far as I remember the statistics (taken across the US), at least 97% of all meat comes from factory farms (it depends on the animal; this may not be the aggregate across all animal types).

So, most people want to believe their meat comes from somewhere nice, but on average, basically all meat in the US comes from animals that are living in horrible conditions (I suspect living lives not worth living -- a life of suffering).


I'd love to see your source for the 97% statistic, because it doesn't match with anything I know about it.

> So, most people want to believe their meat comes from somewhere nice

I have the ability to know mine is, because our grocery stores get their meat locally.


https://www.sentienceinstitute.org/us-factory-farming-estima...

"We estimate that 99% of US farmed animals are living in factory farms at present. By species, we estimate that 70.4% of cows, 98.3% of pigs, 99.8% of turkeys, 98.2% of chickens raised for eggs, and over 99.9% of chickens raised for meat are living in factory farms."


Thank you.

Being honest though, it's sus as fuck, and not just because of the source, or that these are "rough estimates" to use their terms.

A simple read through the spreadsheet shows some pretty odd (and significant) discrepancies. A single example: A row with "2500-4999" animals per farm has farm counts and "total animals" that amounts to over 6.5k animals per farm.

Also, note that CAFO - the farms we're (legitimately) concerned about - is not based solely on the animal counts†, though that's the only part of the definition that the "Sentience Institute" uses because "the public may consider it bad too".

It strikes me as straight up lying with numbers - presenting real numbers in a way which tells the story the institute wants to tell.

† "has a manmade ditch or pipe that carries manure or wastewater to surface water; or the animals come into contact with surface water that passes through the area where they’re confined."


> A row with "2500-4999" animals per farm has farm counts and "total animals" that amounts to over 6.5k animals per farm

Do you mean line #69 (Inventory, Table 14)?

  Animals per farm | Total farms | Total animals
  2500 to 4999     | 1,973       | 6,681,843

6,681,843 / 1,973 ≈ 3,387 animals per farm

> It strikes me as straight up lying with numbers

Do you have a better source?


This is why I super appreciate and love the book Animal Liberation (1975) by Peter Singer -- a classic that helped launch the modern animal rights movement.

The author, my favorite philosopher, uses industry booklets and instruction manuals as examples of what happens at the farms (and you know worse things happen than what is described). It's horrific stuff, enough to make the reader want to decrease their meat consumption. I'm 99% sure that since its publication, the % of animals coming from CAFOs has increased. And since then various other problems have appeared (chickens bred to grow so fast that their bones often break -- resulting in more suffering than before).

https://www.amazon.com/Animal-Liberation-Definitive-Classic-...


I would also recommend the newer film Dominion (2018). Hard to say which one is better. If unsure, watch both! :)

https://www.dominionmovement.com/watch


i am not a doctor, i am a programmer, i used to be a microbiologist. the big problem with stuffing domestic or wild animals or humans with antibiotics for no good reason (ie without testing if they actually have an infection that the antibiotic can or should treat) is that it encourages the development of antibiotic resistance in _all_ bacteria in the treated animal/human.

it's only been 100 years since the very first development of effective antibiotics. before this, bacterial diseases were deadly. if we go on with this misuse, they will become deadly again.


If you use Wikipedia and you think it would still be reliable if it were a poor service, you are living a lie.


I'm glad someone else is saying it: AGI will emerge from sexbots.


I have not given transformers enough attention... but my impression is that this is still storing entities in the weights of the neural network instead of in a database where they can be operated on with CRUD. What are the knowledge discovery researchers doing with respect to transformers? And the SAT solver researchers?

Here is an article on KDNuggets that explains transformers but doesn't answer my questions: https://www.kdnuggets.com/2021/06/essential-guide-transforme...


I wrote a short post on retrieval transformers that you might find interesting [0]. It’s a twist on transformers that allows scaling “world knowledge” independently in a database-like manner.

[0] - https://arsham.substack.com/p/retrieval-transformers-for-med...


The first transformer models still dealt only with the training set.

Eventually they were extended to work with an external data source that they query. This is not a new thing; for example, image style transfer and some other image tasks attempted before the domination of NNs did the same thing (linear models would query the db for help and for guided feature extraction).

The greatest effect in transformers comes from the attention mechanism combined with self-supervised learning. Investigating self-supervised learning tasks (the article illustrates the one-word-gap task, but there are others) can result in superior models that are sometimes even easier to train.
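
For anyone curious, a minimal numpy sketch of scaled dot-product attention, the core of that mechanism (single head, no masking; shapes and values are illustrative only):

    import numpy as np

    def attention(Q, K, V):
        # softmax(Q K^T / sqrt(d_k)) V -- each query gets a weighted mix of values
        scores = Q @ K.T / np.sqrt(Q.shape[-1])
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        return weights @ V

    Q, K, V = np.random.randn(4, 8), np.random.randn(6, 8), np.random.randn(6, 8)
    print(attention(Q, K, V).shape)  # (4, 8)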

As for SAT, optimization, graph neural networks might end up being more effective (due to the high structure of the inputs). I'm definitely awaiting a traveling salesman solver or similar, guided by NN, solving things faster and reaching optimality more frequently than optimized heuristic algos.


> I'm definitely awaiting a traveling salesman solver or similar, guided by NN, solving things faster and reaching optimality more frequently than optimized heuristic algos.

There was a competition for exactly this at NeurIPS 2021.

https://www.ecole.ai/2021/ml4co-competition/

Not sure how much they improved over handcrafted heuristics, but the summary paper may give some insights

https://arxiv.org/abs/2203.02433


> As for SAT, optimization, graph neural networks might end up being more effective

Learning from data is a different problem from optimization. For example, if facts about cities gave additional clues, beyond their location, about the optimal order, then learning could help with the travelling salesman problem. Or if the cost of paths is only known implicitly through data examples.

Compare to how NNs can be used for data compression, for example upscaling images, by learning from photographs only the tiny subset of all possible images that are meaningful to humans. But this is not useful for general data compression.


What about AlphaGo, AlphaZero (chess)?

Optimization is also data: given a local state, can you identify the sequence of transformations that will get you to a better state? The reward is instantly measurable and the goal is minimizing the total cost.


AlphaGo is local search guided by a learned heuristic, which is trained in a simulator of the game. The heuristic learns an approximation of the value of moves and board states, and is analogous to the "compute_cost_of_tour()" routine in TSP algorithms.

In the basic TSP (for example) there is no other data to learn from than "distances" between vertices, and anything learned from a single instance of the problem amounts to overfitting. This might still be useful - for example learning efficient sub-paths on a fixed map, rather than searching for them every time.

Self-organizing maps can be used as a neural approach to find TSP solutions; in these cases the network itself is the optimized solution. Think of it as ~gradient-descent~ optimization for TSP. Not sure if it is relevant in benchmarks. (I think it might amount to minimizing the sum of squared distances between hops (or a bound on that), not the total length of the tour. It favours many shorter hops over a few long hops.)

(If you want time-window constraints in LKH, IIRC, you can try adding the time-diff as penalties to your global cost function.)


LKH does support a lot of the things mentioned, but for practical use it would not work. It's nice to leave it running and see what can be accomplished, but asking it to give you back something in 1 second, with a lot of constraints, gives back solutions that are not feasible.

In the basic TSP there is a lot of data.

For example, the reason why minimum spanning tree works is because the algorithm makes use of the relationship between vertices. Similar techniques use alpha-nearness, Steiner trees and direct modifications of distance matrix to create relaxations of the TSP and improve the performance of local search (I believe most are implemented in LKH).

I am obviously not expecting NNs to be capable of doing something like that currently but I'm hoping they might be able to discover interesting instance patterns for something more constrained.


> asking it to give you back something in 1 second, with a lot of constraints, gives back solutions that are not feasible.

Try to limit the search to only feasible solutions.

> the algorithm makes use of the relationship between vertices

But these do not stay the same between problem instances; anything you learn from solving one problem is not helpful when solving the next problem.


> But these do not stay the same between problem instances; anything you learn from solving one problem is not helpful when solving the next problem.

But nothing in ML stays the same between instances. The reason ML works is that there are redundancies in the training set. I am pretty sure that, distribution-wise, the set of TSP instances still has a lot of redundancies.

You would want your model to learn to execute something like MST, or to approximate alpha-nearness, or to remap the instance into a relaxation that, when solved by a simpler algorithm, results in a solution that, when remapped back to the original, is feasible and optimal.


> I'm definitely awaiting a traveling salesman solver or similar, guided by NN, solving things faster and reaching optimality more frequently than optimized heuristic algos.

Just in case we are not being clear, let's be clear. Bluntly in nearly every practical sense, the traveling salesman problem (TSP) is NOT very difficult. Instead we have had good approaches for decades.

I got into the TSP writing software to schedule the fleet for FedEx. A famous, highly accomplished mathematician asked me what I was doing at FedEx, and as soon as I mentioned scheduling the fleet he waved his hand and concluded I was only wasting time, that the TSP was too hard. He was wrong, badly wrong.

Once I was talking with some people in a startup to design the backbone of the Internet. They were convinced that the TSP was really difficult. In one word, WRONG. Big mistake. Expensive mistake. Hype over reality.

I mentioned that my most recent encounter with combinatorial optimization was solving a problem with 600,000 0-1 variables and 40,000 constraints. They immediately, about 15 of them, concluded I was lying. I was telling the full, exact truth.

So, what is difficult about the TSP? Okay, we would like an algorithm for some software that would solve TSP problems (1) to exact optimality, (2) in worst cases, (3) in time that grows no faster than some polynomial in the size of the input data to the problem. So, for (1) being provably within 0.025% of exact optimality is not enough. And for (2) exact optimality in polynomial time for 99 44/100% of real problems is not enough.

In the problem I attacked with 600,000 0-1 variables and 40,000 constraints, a real world case of allocation of marketing resources, I came within the 0.025% of optimality. I know I was this close due to some bounding from some nonlinear duality -- easy math.

So, in your

> reaching optimality more frequently than optimized heuristic algos.

heuristics may not be, in nearly all of reality probably are not, reaching "optimality" in the sense of (2).

The hype around the TSP has been to claim that the TSP is really difficult. Soooo, given some project that is to cost $100 million, where an optimal solution might save $15 million, software based on what has long been known (e.g., from G. Nemhauser) that can save all but $1500 of that is supposedly not of interest. Bummer. Wasted nearly all of the $15 million.

For this, see the cartoon early in Garey and Johnson where they confess they can't solve the problem (optimal network design at Bell Labs) but neither can a long line of other people. WRONG. SCAM. The stockholders of AT&T didn't care about the last $1500 and would be thoroughly pleased by the $15 million without the $1500. Still that book wanted to say the network design problem could not yet be solved -- that statement was true only in the sense of exact optimality in polynomial time on worst case problems, a goal of essentially no interest to the stockholders of AT&T.

For neural networks (NN), I don't expect (A) much progress in any sense over what has been known (e.g., Nemhauser et al.) for decades. And, (B) the progress NNs might make promise to be in performance aspects other than getting to exact optimality.

Yes, there are some reasons for taking the TSP and the issue of P versus NP seriously, but optimality on real world optimization problems is not one of the main reasons.

Here my goal is to get us back to reality and set aside some of the hype about how difficult the real world TSP is.


There's LKH http://webhotel4.ruc.dk/~keld/research/LKH/ which is heuristic and the best open implementation. Adding optimality estimates is the least complicated part.

When TSP is mentioned today, unlike 50 years ago when the LK heuristic was published, I assume all of the popular & practical variants: time window constraints, pickup and delivery, capacity constraints, max drop time requirements after pickup, flexible route start, location-independent breaks (a break can happen anytime in the sequence or in a particular time window of the day), etc. Some of the subproblems are so constrained that you cannot even move around as effectively as you can with raw TSP.

Some of the subproblems have O(n) or O(n log n) evaluations of best local moves; generic solvers are even worse at handling that (Concorde LP optimizations cannot cover that efficiently). When no moves are possible, you have to see which moves bring you back to a feasible solution and how many local changes you need to accomplish this.

For example, just adding time windows complicates most well-known TSP heuristics or makes them useless. Now imagine if we add a requirement between pairs of locations that they need to be at most X time apart (picking up and then delivering perishable goods), that the route can start at an arbitrary moment, etc.

I personally spent quite a lot of time working on these algorithms and I'd say the biggest issue is instance representation (is it enough to have a sequence of location ids?). For example, one of my recent experiments was using zero-suppressed binary decision diagrams to easily traverse some of these constrained neighborhoods and maintain the invariants after doing local changes. Still too slow for some instances I handle (real-world is 5000 locations, 100 salesmen and an insane amount of location/salesman constraints).


Amazing. Of course I've heard of Kernighan long ago, but this is the first I've heard of LKH.

I did a lot in optimization, in my Ph.D. studies and in my career, but I dropped it decades ago -- my decision was made for me by my customers: essentially there weren't any, or at least not nearly enough that I could find.

Actually, my summary view is that for applications of math in the US, the main customer is US national security. Now there are big bucks to apply algorithms and software to some big data, and maybe, maybe, there is some interest in math. But the recruiter who called from Google didn't care at all about my math, optimization, statistics, or stochastic processes background. Instead they asked what was my favorite programming language, and my answer, PL/I, was the end of the interview. I'm sure the correct answer was C++. I still think PL/I is a better language than C++.

Early in my career, I was doing really well with applied math and computing, but that was all for US national security and within 50 miles of the Washington Monument.

Now? I'm doing a startup. There is some math in it, but it is just a small part, an advantage, maybe crucial, but still small.


There's quite a resurgence of need for optimization.

There's a lot of companies that want to provide an Uber/Lyft-like service of their own product. So you have a bunch of smaller problems that you want to solve as best as possible in ~1 second.

A lot of small companies with their delivery fleets want to optimize (pest control, christmas tree delivery, cleaning, technical service, construction (coordinating teams that construct multiple things at multiple locations at the same time) etc.).

On the other hand, not related to TSP, the whole energy market in the US is very LP/ILP optimizable and has a lot of customers (charging home batteries, car batteries, discharging when price is high, etc.).

I would admit that the scientific field of discrete optimization is littered with genetic algorithms, ant colonies and other "no free lunch" optimization algorithms that make very little sense from a progress perspective, so it does feel like the golden era was from the 70s to the early 90s. I do not have a PhD but somehow ended up doing machine learning and discrete optimization for most of my career.


What do you mean when you say these algorithms make very little sense from a progress perspective?


Improvements to ant colonies or genetic algorithms are not pushing the field forward. It has become a benchmark game, and has been for the last 20 years (which many abuse: you can start from a previous best solution, leave your computer running for days, and just claim that your new algorithm improvement found the new best solution; it's also quite common to never release your code).

If you look at the roots of discrete optimization, all of the approaches used in a solver like Concorde (developed in the open), there's nowhere near the amount of development and paradigm shifts happening in ant colonies, genetic algorithms, genetic programming, tabu search, annealing and similar.

E.g., finding an efficient representation of time-windows + pickup-and-delivery + breaks + flexible-start-time that allows you to efficiently update the solution and get an answer as to whether the solution is feasible after the change and what the new cost is, is more progress than changing some recombination patterns in your genetic algorithm that will result in an improvement on the instance set you are optimizing for (basically overfitting to the data).
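
To make that concrete, a toy sketch (the helper name is mine): in plain TSP the cost change of a 2-opt move can be evaluated in O(1) from four edge lengths, but once time windows enter, a reversal can invalidate arrival times along the whole segment and this shortcut disappears.

    # Cost delta of reversing tour[i..j] in plain symmetric TSP: only two
    # edges are removed and two added, so no need to re-cost the whole tour.
    def two_opt_delta(tour, dist, i, j):
        a, b = tour[i - 1], tour[i]
        c, d = tour[j], tour[(j + 1) % len(tour)]
        return (dist(a, c) + dist(b, d)) - (dist(a, b) + dist(c, d))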

Here's an example of a paper that lists various update/feasibility/cost diff algorithms and their time complexity for a bunch of subproblems on a list-of-location-ids representation. Any genetic algorithm that wants to be fast will need to deal with that too.

https://www.researchgate.net/profile/Thibaut-Vidal/publicati...

That's why I think that graph NNs might allow us to find a way to remap our simple representation to something more efficient that is easier to work with and that can apply local changes without much effort.

For example, what if you could apply a transformation to a TSP problem with time windows, adding new vertices or changing the distance matrix to eliminate the time windows completely, while still applying very efficient local changes that bring you close to the optimum fast (and do the same for pickup-and-delivery, flexible start time, etc.)? Similarly, say an integer linear programming solver is used but the way the constraints of your instance are defined is hard to work with; there is a local pattern you are not seeing that would allow simplification.

There have been attempts to learn the exploration strategies of ILP solvers with ML but none made leaps forward (unlike AlphaFold, AlphaZero, AlphaGo, or even AlphaCode -- competitive programming code generation). The biggest reason for that is that the current principled algorithms (30-60 years old) are insanely good on fundamental problems.

I remember reading about a new set of constraints, nurse rostering (nurse scheduling), and once researchers applied the principled methods, all of the instances of interest got solved to proven optimality. The number of genetic algorithms, ant colonies and who knows what else that was applied to these instances in the meantime was ridiculous and unnecessary.


Where is a good place to look for algorithms/math for solving problems similar to the ones you mentioned?


Python-MIP is a great library that provides an interface to many different algorithms like this. It's practical for use in scientific programming where appropriate, and if you read through the docs you can find the names of the specific algorithms it uses, with pointers to where to learn more.

https://docs.python-mip.com/en/latest/intro.html
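
In case an example helps, a tiny 0-1 knapsack in Python-MIP to show the interface (the numbers are arbitrary):

    from mip import Model, xsum, maximize, BINARY

    values, weights, capacity = [10, 13, 18, 31], [2, 3, 4, 7], 10
    m = Model()
    x = [m.add_var(var_type=BINARY) for _ in values]      # pick item i or not
    m.objective = maximize(xsum(v * xi for v, xi in zip(values, x)))
    m += xsum(w * xi for w, xi in zip(weights, x)) <= capacity
    m.optimize()
    print([i for i, xi in enumerate(x) if xi.x >= 0.99])  # -> [1, 3]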


Can look at the now old work of G. Nemhauser. His work was for combinatorial optimization and not just for exactly the traveling salesman problem (TSP).

E.g., there is

George L. Nemhauser and Laurence A. Wolsey, Integer and Combinatorial Optimization, ISBN 0-471-35943-2, John Wiley & Sons, Inc., New York, 1999.

Some approaches involve set covering and set partitioning. Soooo, for the FedEx fleet, first just generate all single airplane feasible tours from the Memphis hub and back. Here can honor some really goofy constraints and complicated costing; can even handle some stochastic issues, i.e., the costs depend on the flight planning and that depends on the loads which are random, but it would be okay to work with just expectations -- we're talking complicated costing! Then with all those tours generated, pick ones that cover all the cities to be served, i.e., partition the cities. Have a good shot at using linear programming, tweaked a little to handle 0-1 constraints, to pick the tours.
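
A rough sketch of that set-partitioning step, assuming the feasible tours and their costs have already been generated (the tiny instance and the use of Python-MIP here are my own illustration, not how the original work was done):

    from mip import Model, xsum, minimize, BINARY

    cities = ["A", "B", "C", "D"]
    tours = [{"A", "B"}, {"C", "D"}, {"B", "C", "D"}, {"A"}]  # hypothetical feasible tours
    cost = [5.0, 6.0, 7.0, 3.0]

    m = Model()
    x = [m.add_var(var_type=BINARY) for _ in tours]           # 1 if tour t is flown
    m.objective = minimize(xsum(cost[t] * x[t] for t in range(len(tours))))
    for city in cities:                                       # each city covered exactly once
        m += xsum(x[t] for t in range(len(tours)) if city in tours[t]) == 1
    m.optimize()
    print([t for t in range(len(tours)) if x[t].x >= 0.99])   # -> [2, 3]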

Then more generally for a lot of practical problems can write linear programming problems with some of the variables integer. Then can tweak the simplex algorithm of linear programming to handle some of such constraints fairly naturally in the algorithm. E.g., of course, can proceed with now classic branch and bound.

The TSP taken narrowly can be regarded as more specialized.

So, net, there is a big bag of essentially tricks, some with some math and some just heuristics.

Part of the interest in the issue of P versus NP was to do away with the bag of tricks and have just some one grand, fantastic algorithm and computer program with guaranteed performance. Nice if doable. Alas, after all these years, so far not really "doable", not as just one grand, fantastic .... And the question of P versus NP has resisted so much for so long that it has even a philosophical flavor. And there are serious claims that a technically good algorithm would have some really astounding consequences.

Sure, I have some half baked ideas sitting around that I hope will show that P = NP -- doesn't everyone? But my point here was simple: for several decades we have been able to do quite well on real problems. Oh, for the problem with 600,000 0-1 variables and 40,000 constraints, otherwise linear, I used nonlinear duality theory (which is simple) or, if you wish, Lagrangian relaxation -- it's one of the tricks.

Another old trick: For the actual TSP in any Euclidean space (sure, the plane but also 3 dimensions or 50 if you want), that is, with Euclidean distance, just find a minimum spanning tree (there are at least two good, that is, polynomial algorithms, for that) and then in a simple and fairly obvious way make a TSP tour out of that tree. That approach actually has some probabilistic bounds on how close it is to optimality, and it does better with more cities -- it's another tool in the kit.
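
In case it helps, a rough sketch of that construction for points in the plane (naive Prim's for the spanning tree, then a preorder walk that shortcuts back to the start; purely illustrative):

    import math
    from collections import defaultdict

    def mst_tour(points):
        n = len(points)
        d = lambda a, b: math.dist(points[a], points[b])
        in_tree, parent = {0}, {}
        while len(in_tree) < n:
            # cheapest edge leaving the tree (naive, fine for a sketch)
            u, v = min(((i, j) for i in in_tree for j in range(n) if j not in in_tree),
                       key=lambda e: d(*e))
            parent[v] = u
            in_tree.add(v)
        children = defaultdict(list)
        for v, u in parent.items():
            children[u].append(v)
        tour, stack = [], [0]
        while stack:                  # preorder walk of the spanning tree
            u = stack.pop()
            tour.append(u)
            stack.extend(reversed(children[u]))
        return tour + [0]             # shortcut back to the start

    print(mst_tour([(0, 0), (1, 0), (1, 1), (0, 1), (3, 2)]))  # e.g. [0, 1, 2, 4, 3, 0]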

My main conclusion about the TSP, combinatorial optimization, and optimization more generally is that there are way, Way, WAY too few good customers. Whether there is 15% of project cost to be saved or not, the people responsible for the projects just do NOT want to be bothered. In simple terms, in practice, it is essentially a dead field. My view is that suggesting that a young person devote some significant part of their career to optimization is, bluntly, in a word, irresponsible.


> The latest batch of language models can be much smaller yet achieve GPT-3 like performance by being able to query a database or search the web for information[0].

[0]: https://jalammar.github.io/illustrated-retrieval-transformer...


I think it's relatively straightforward to serialize such a model into different representations; I completely understand that they keep the actual data inside PyTorch state by default.

Out of curiosity, what tools are researchers generally using to explore neural networks? I'm just an armchair ML enthusiast myself, but NNs always appear very much like black boxes.

What are the goals and methods for exploring neural network state nowadays?


> I have not given transformers enough attention...

( ͡° ͜ʖ ͡°)


Attention is all you need


Isn't the benefit of NNs on some level that you can store finer-grained and more abstract data than in a standard DB?


Maybe. Transformers model associative memory in a way made precise by their connection to Hopfield networks. Individually, they're like look-up tables, but the queries can be ambiguous, even based on subtle higher-order patterns (which the network identifies on its own), and the returned values can be a mixture of stored information, weighted by statistically meaningful confidences.


This doesn't account for synonyms. I'd rather use a document embedding search.


90M requests daily from India? I wonder if KaiOS is checking whether it's got internet access.


I'll take that as a compliment! Thanks.


It's certainly not an insult, more of an everything old is new again type thing.

Universal commenting has always made a lot of sense to me. As applied to casual web-surfing, the concept reminds me of the logbooks I used to see hiking the Trail and whatnot. They get a little noisy after a while would be my one comment.


Eventually ads, but, as the head of tech of this venture, I can say we have no plans to sell browsing history. Also, we've taken care to never expose your email address. Deleting a post genuinely deletes the data from our database, unless the post has been reported as violating standards. What other privacy concerns would you like to see addressed?

