Yes. Put your TV behind a second router, manually assign it an IP address and a route to your local network, and don't give that router an upstream gateway. Then any packets the TV sends, even to a bare IP address, will be dropped at its router before they ever reach your main one.
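If you want to verify the isolation actually holds, here's a minimal sketch you could run from a device on the TV's subnet. It just checks whether an arbitrary public IP is reachable; the target address, port, and timeout are illustrative assumptions, not anything specific to this setup:

```python
# Hypothetical check: run from a device on the TV's isolated subnet to
# confirm that packets to an arbitrary public IP are dropped (with no
# default gateway, the connection should time out rather than succeed).
import socket

def can_reach(host: str, port: int = 443, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# 1.1.1.1 is just an example public IP; any internet address should fail.
print("internet reachable:", can_reach("1.1.1.1"))
```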
I’m currently working on a document parsing engine for a specific type of document. The inputs are usually PDFs. I’m able to get great structured output from both the latest Gemini Flash models and the latest Llama Scout models. The best latency I get with Gemini is about 5 seconds end to end. With Llama hosted on Groq it’s about 3 seconds.
My use case is latency constrained, so I’m exploring fine-tuning / distilling to see if I can get latency below a second. I imagine these are the kinds of scenarios where it’s still worth it to fine-tune and distill.
My plan is to generate a lot of synthetic training data using more capable, slower foundation models and use that to train the smaller model.
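A minimal sketch of that distillation step, assuming a hypothetical `teacher_extract` wrapper around the slower foundation model and the common chat-style JSONL format (exact field names vary by fine-tuning provider):

```python
# Sketch: build (input, target) pairs for distillation by running
# documents through a slower, more capable teacher model.
import json

def teacher_extract(doc_text: str) -> dict:
    """Hypothetical wrapper: call the teacher model and return the
    structured output as a dict. Plug in your actual API call here."""
    raise NotImplementedError

def build_distillation_set(documents, out_path="train.jsonl"):
    """Write chat-style JSONL records that most fine-tuning APIs
    accept (field names vary by provider)."""
    with open(out_path, "w") as f:
        for doc in documents:
            target = teacher_extract(doc)
            record = {
                "messages": [
                    {"role": "user", "content": doc},
                    {"role": "assistant", "content": json.dumps(target)},
                ]
            }
            f.write(json.dumps(record) + "\n")
```

The resulting file can then be fed to whichever fine-tuning endpoint the smaller model supports.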
It’s really useful for generating synthetic data for search and recommendations that you can use to train a smaller / faster model. This is especially helpful if you don’t have lots of click-through data, or in cold-start scenarios. There are some good articles that cover this; if you’re interested, I’ll try to find them and share.
Could a major opportunity to improve representation in deep learning be hiding in plain sight? Check out our new position paper: Questioning Representational Optimism in Deep Learning: The Fractured Entangled Representation Hypothesis. The idea stems from a little-known observation about networks trained to output a single image: when they are discovered through an unconventional open-ended search process, their representations are incredibly elegant and exhibit astonishing modular decomposition. In contrast, when SGD (successfully) learns to output the same image, its underlying representation is fractured and entangled - an absolute mess!
This stark difference in the underlying representation of the same "good" output behavior carries deep lessons for deep learning. It shows you cannot judge a book by its cover - an LLM with all the right responses could similarly be a mess under the hood. But also, surprisingly, it shows us that it doesn't have to be this way! Without the unique examples in this paper that were discovered through open-ended search, we might assume neural representation has to be a mess. These results show that is clearly untrue. We can now imagine something better because we can actually see it is possible.
We give several reasons why this matters: generalization, creativity, and learning are all potentially impacted. The paper shows examples to back up these concerns, but in brief, there is a key insight: representation is not only important for what you're able to do now, but for where you can go from there. The ability to imagine something new (and where your next step in weight space can take you) depends entirely upon how you represent the world. Generalization, creativity, and learning itself depend upon this critical relationship. Notice the difference in appearance among the images adjacent to the skull in weight space, shown in the top-left and top-right image strips of the attached graphic. The difference in semantics is stark.
The insight that representation could be better opens up a lot of new paths and opportunities for investigation. It raises new urgency to understand the representation underlying foundation models and LLMs while exposing all kinds of novel avenues for potentially improving them, from making learning processes more open-ended to manipulating architectures and algorithms.
Don't read this paper as comfort for AI pessimists. By exposing a novel set of stark and explicit differences between conventional learning and something different, it can act as an accelerator of progress rather than a tool of pessimism. At the least, the discussion it provokes should be quite illuminating.
What does it mean to train using an 'open ended' process? Is it like using a genetic algorithm to explore / generate _any_ image resembling something from the training set, instead of adjusting weights according to gradients on a case-by-case or batch-by-batch basis?
- Conventional SGD: Fixed target (e.g. "make an exact replica of this butterfly image"); it follows a greedy path to minimize the error
- Open-Ended Search Process: No predetermined goal; explores based on what's "interesting" or novel. In Picbreeder, humans would see several generated images, pick the "interesting" ones, and the system would mutate/evolve from there. If you were evolving an image that looked like an egg and it mutated toward a teapot-like shape, you could pivot and pursue that direction instead (see the sketch after this list).
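To make the contrast concrete, here is a toy sketch of the two selection loops on a 2-D "genome". This is not NEAT or Picbreeder: the `behavior` mapping, mutation scale, and novelty threshold are all illustrative assumptions, and the human-in-the-loop picking is replaced by a simple novelty score:

```python
# Toy contrast between objective-driven and novelty-driven search.
import math
import random

def behavior(genome):
    # Stand-in for "what the phenotype looks like".
    return genome

def mutate(genome, sigma=0.1):
    return [g + random.gauss(0, sigma) for g in genome]

def objective_search(target, steps=1000):
    """Greedy: keep a mutation only if it reduces error to the fixed target."""
    g = [0.0, 0.0]
    for _ in range(steps):
        cand = mutate(g)
        if math.dist(behavior(cand), target) < math.dist(behavior(g), target):
            g = cand
    return g

def novelty_search(steps=1000, k=5, threshold=0.05):
    """Open-ended: keep mutations whose behavior is far from everything
    seen so far (novelty = mean distance to k nearest archived behaviors)."""
    archive = [[0.0, 0.0]]
    g = [0.0, 0.0]
    for _ in range(steps):
        cand = mutate(g)
        dists = sorted(math.dist(behavior(cand), b) for b in archive)
        novelty = sum(dists[:k]) / min(k, len(dists))
        if novelty > threshold:      # novel enough: adopt and archive it
            g = cand
            archive.append(behavior(cand))
    return archive
```

The greedy loop converges toward its fixed target; the novelty loop just accumulates an archive of distinct behaviors, which is the "stepping stones" dynamic described below.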
This is kinda the catch -- there is a human element here where individuals are choosing what's "interesting" to explore; it's not a purely algorithmic process. That said, yes, it does use a genetic algorithm (NEAT) under the hood, but I think what the authors are suggesting is that the key difference isn't whether it's genetic or gradient-based optimization... they're getting at the difference between objective-driven and open-ended search.
I think the main position / takeaway from the paper is that something about conventional SGD training produces these "fractured entangled representations" that work but are not well structured internally, so they're hard to build on top of. They look at things like the curriculum / order in which things are learned, objective vs. open-ended search, etc.
For anyone interested in NEAT, you’ll likely enjoy Ken’s later work on novelty search, open-endedness, etc.
His book “Why Greatness Cannot Be Planned: The Myth of the Objective” is one of the most perspective-altering works I have ever read.
Very excited to see what he does next. He has mentioned on Twitter a couple of times his interest in representation learning and how objective-based search affects it. Very interesting stuff.
One downside of rolling it into an IRA is that, under the pro-rata rule, the pre-tax balance makes backdoor Roth IRA conversions partially taxable every year. I think it's better to leave it in the 401k if the fund options / fees are acceptable.
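For intuition, a quick sketch of the pro-rata arithmetic with made-up numbers (the real calculation uses year-end balances across all your traditional IRAs):

```python
# Illustrative pro-rata math: a pre-tax IRA balance makes a backdoor
# Roth conversion mostly taxable. All numbers here are hypothetical.
pretax_ira = 93_000              # e.g. a 401k rolled into a traditional IRA
after_tax_contribution = 7_000   # the backdoor contribution
converted = 7_000                # amount converted to Roth

taxable_fraction = pretax_ira / (pretax_ira + after_tax_contribution)
taxable_amount = converted * taxable_fraction
print(f"taxable share of conversion: {taxable_fraction:.0%} "
      f"(${taxable_amount:,.0f} of ${converted:,})")
```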
@Jeremy, have you ever encountered Ken Stanley’s book “Why Greatness Cannot Be Planned: The Myth of the Objective”?
If you’re not familiar with him, he was the guy that invented the NEAT algorithm, Novelty Search, etc.
In his book he talks about stepping stones and following what’s “interesting” to individuals. It seems like Answer.AI is focused on applied engineering, but any thoughts on this type of approach from a Research perspective?
Hey! It’s Eric (the author). To the author of this comment, I’d say that the use of “long leash within a narrow fence” (which directed Langmuir’s work) or “circumscribed freedom” (which directed much of the work at Bell) is generally compatible with researcher interests.
As I talk about in the article, Langmuir was able to pick from a bunch of problems in his wheelhouse…there were just conditions. And those conditions meant whatever area he picked might come with a high willingness from GE to spend! And if his work yielded results, GE would have a way to quickly deploy the knowledge. All of which is great. To put his work in a box with what the Coolidge-types did is probably unfair. He was following curiosity under constraints, that’s all.
That is not to say all basic research roles in the world should look like that. But it makes sense given that most basic researchers don’t fully know which of the problems they’d happily pursue are actually most useful to industry. MIT professors of the early 1900s used to source research problems somewhat similarly.
This is funnily relevant because, as far as I understand, Ken is or was a research manager at OpenAI, and as outlined in the article, Answer.AI is trying not to be OpenAI.
I should caveat this by saying I've only read summaries of the book, not the book itself. My understanding of it is that they view setting ambitious objectives as potentially limiting progress, and instead promote shorter-term novelty-seeking approaches.
This is certainly how we do things at Answer.AI -- we hire people that are passionate tinkerers, and encourage a playful and spontaneous approach. That doesn't mean there's no coordination or long-term goal, but rather that we view these short-term approaches as being a good way to make progress.
Thanks! That nicely dovetails with my understanding. BTW, I found the two case studies at the end of the book helpful in analyzing the main ideas!
Why Greatness Cannot Be Planned: The Myth of the Objective by Ken Stanley and Joel Lehman. This book is a fascinating read for anyone with ambitious objectives (or an interest in optimization algorithms). Ken is such a deep thinker; I love when he's on podcasts or gets interviewed, and reading his book was a real treat.