This is a very one-sided article. Shouldn't there be a comparison with TP-Link and all the other brands available, in terms of security? Otherwise they're just targeting a company for political reasons.
The article is in response to a very one-sided government ban (well, reported ban) on TP-Link products. The company is being targeted for what appears to be political reasons; the article even says so in the first paragraph:
> Experts say while the proposed ban may have more to do with TP-Link’s ties to China than any specific technical threats
Direct use of Codex + GPT5 or Claude Code CLI gives better results than using the same models in Cursor; I've compared both. Cursor applies its own augmentation, which reduces the output size, probably to save on tokens.
What they don't mention is all the tooling, MCPs, and other stuff they've added to make this work. It's not 30 hours out of the box. It's probably heavily guard-railed, with a lot of validated plans, checklists, and verification points they can check. It's similar to 'lab conditions': you won't get that output in real-world situations.
Yeah, I thought about that after I looked at the SWE-bench results. It doesn't make sense that the SWE-bench results are barely an improvement, yet the model is somehow a much bigger improvement on long tasks. You'd expect a huge gain in one to translate to the other.
Unless the main area of improvement was tools and scaffolding rather than the model itself.
Create a CEO Logic Agent that helps the CEO to make better decisions on AI? /s
What the CEO is likely looking for is 'PR' points, often not a real strategy. If they can announce and pretend they're going all-in on AI, that's what's needed.
From your side, having AI mentioned in everything you do will help the conversation. If your code's docs are improved with an AI IDE, you're 'going hard on AI'. Ignore the time you spend fixing AI's errors.
Doing things for 'funding' and doing things that get the work done are not always the same. One is a marketing/PR act; the other is a product-development act.
If funding is a real concern, the CEO's approach might be valid, because without funding you won't have a job, and there won't be a product. So split your time and help the CEO get the right message out.
As you're saying, if the CEO has built a great team and great technology, we can't assume the CEO is completely ignorant about what's going on.
Your CTO/CIO (if any) will know more about what's realistically possible and what's not. If you have an 'AI Team', then there should be a CTO/CIO, so shouldn't you be talking to them about strategy rather than directly to the CEO?
Here are some notes I made to understand each of these models and when to use them.
# OpenAI Models
## Reasoning Models (o-series)
- All `oX` (o-series) models are reasoning models. (The `o` prefix here is not the same as the `o` suffix in `4o`, which stands for omni.)
- Use these for complex, multi-step reasoning tasks.
## Flagship/Core Models
- All `X.Y` and `Xo` models are the core models.
- Use these for one-shot results.
- Examples: 4o, 4.1
## Cost Optimized
- All `-mini` and `-nano` variants are cheaper, faster models.
- Use these for high-volume, low-effort tasks.
## Flagship vs Reasoning (o-series) Models
- Latest flagship model = 4.1
- Latest reasoning model = o3
- The flagship models are general purpose, typically with larger context windows. They answer in a single pass, relying mostly on learned pattern matching.
- The reasoning models are trained with extended chain-of-thought and reinforcement learning. They work best with tools, code, and other multi-step workflows; because intermediate steps can be checked with tools, accuracy tends to be higher. See the sketch below.
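A minimal sketch of the difference at the API level, assuming the official `openai` Python SDK and the public model IDs (`gpt-4.1`, `o3`); as far as I know, `reasoning_effort` is accepted only by the o-series models:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Flagship model: single-pass answer, good for one-shot tasks.
summary = client.chat.completions.create(
    model="gpt-4.1",
    messages=[{"role": "user", "content": "Summarize the changes in this diff."}],
)

# Reasoning model: extended chain-of-thought for multi-step problems.
plan = client.chat.completions.create(
    model="o3",
    reasoning_effort="high",  # o-series only; flagship models reject this parameter
    messages=[{"role": "user", "content": "Plan a zero-downtime Postgres major-version upgrade."}],
)

print(summary.choices[0].message.content)
print(plan.choices[0].message.content)
```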
# List of Models
## 4o (omni)
- 128K context window
- Use: Complex multimodal applications requiring the top level of reliability and nuance
## 4o-mini
- 128K context window
- Use: Multimodal reasoning for math, coding, and structured outputs
- Use: Cheaper than `4o`; use it when you can trade accuracy for speed/cost
- Don't Use: When high accuracy is needed
## 4.1
- 1M context window
- Use: For large context ingest, such as full codebases
- Use: For reliable instruction following, comprehension
- Don't Use: For high-volume tasks where speed matters
## 4.1-mini
- 1M context window
- Use: For large context ingest
- Use: When accuracy can be traded for speed
## 4.1-nano
- 1M context window
- Use: For high-volume, near-instant responses
- Don't Use: When accuracy is required
- Examples: classification, autocompletion, short answers (see the sketch below)
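For instance, a high-volume classification call where nano's speed and price matter more than marginal accuracy (a sketch; the label set and prompt are my own invention):

```python
from openai import OpenAI

client = OpenAI()

def classify_ticket(text: str) -> str:
    """Cheap, near-instant label for a support ticket."""
    resp = client.chat.completions.create(
        model="gpt-4.1-nano",
        messages=[
            {"role": "system",
             "content": "Reply with exactly one word: billing, bug, or feature."},
            {"role": "user", "content": text},
        ],
        max_tokens=3,   # we only want a label back
        temperature=0,  # keep labels stable across runs
    )
    return resp.choices[0].message.content.strip().lower()

print(classify_ticket("I was charged twice this month."))  # -> "billing"
```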
## o3
- 200K context window
- Use: For the most challenging reasoning tasks in coding, STEM, and vision that demand deep chain‑of‑thought and tool use
- Use: Agentic workflows leveraging web search, Python execution, and image analysis in one coherent loop
- Don't Use: For simple tasks, where a lighter model will be faster and cheaper
## o4-mini
- 200K context window
- Use: High-volume needs where reasoning and cost should be balanced
- Use: For high-throughput applications
- Don't Use: When accuracy is critical
## o4-mini-high
- 200K context window
- Use: When o4-mini results are not satisfactory, as a step before moving to o3
- Use: Complex tool-driven reasoning
- Don't Use: When accuracy is critical
## o1-pro-mode
- 200K context window
- Use: Highly specialized science, coding, or reasoning jobs that benefit from extra compute for consistency
- Don't Use: For simple tasks
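Condensing the notes above into one place, here's a hypothetical routing helper; the task categories and the mapping are my own, not anything OpenAI publishes:

```python
# Hypothetical task -> model routing, condensing the notes above.
# Model IDs are the public API names; the categories are my own.
ROUTES = {
    "bulk-classify":   "gpt-4.1-nano",  # high volume, accuracy is negotiable
    "cheap-reasoning": "o4-mini",       # balance reasoning quality and cost
    "one-shot":        "gpt-4.1",       # reliable instruction following
    "big-context":     "gpt-4.1",       # 1M-token window for full codebases
    "hard-reasoning":  "o3",            # deep chain-of-thought plus tool use
}

def pick_model(task: str) -> str:
    # Fall back to the flagship model for anything unrecognized.
    return ROUTES.get(task, "gpt-4.1")

assert pick_model("hard-reasoning") == "o3"
assert pick_model("unknown") == "gpt-4.1"
```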
## Models Sorted for Complex Coding Tasks (my opinion)
1. o3
2. Gemini 2.5 Pro
3. Claude 3.7
4. o1-pro-mode
5. o4-mini-high
6. 4.1
7. o4-mini
I suspect it’s not a smarter developer thing, but a stupider code thing.
Programming for a client means making their processes easier, ideally with as few clicks as possible. Programming for a programmer does the same for programming.
The thing with our “industry” is that it doesn’t automate programming at all, so smart people “build” random bs all day that should have been made a part of some generic library decades ago and made available off the shelf by all decent runtimes.
Making a form with validation and data objects, and a backend with an ORM/SQL connection, migrations, auth, etc. It has all been solved millions of times, and no one bats an eye at why tf they reimplement multiple klocs of it over and over again.
That’s where AI shines. It builds you this damn stupid form that takes two days of work otherwise.
Very nice.
But it’s not programming. If anything, it’s a shame. A spit in the face of programming that somehow got normalized by… not sure whom. We take a bare, raw runtime like node/python/go and a browser and call it “a platform”. What platform? It’s as much a platform as INT 13h is an RDBMS.
I think the divide in where AI is useful clearly shows us that right now, but most are blind to it out of inertia.
I asked about a 20% pay cut and they won't do it. There's a cookie-cutter full-time position, and nothing else. I suspect part of it could be the stupid laws around America's insane healthcare bullshit, but I don't know.