Hacker News | erwald's comments

Where did you read that it was trained on Ascends?

I've only seen information suggesting that you can run inference with Ascends, which is obviously a very different thing. The source you link also just says: "The latest model was developed using domestically manufactured chips for inference, including Huawei's flagship Ascend chip and products from leading industry players such as Moore Threads, Cambricon and Kunlunxin, according to the statement."


I took the "for inference" bit from that sentence you quoted as a qualifier applied to the chips, as in the chips were originally developed for inference but were now used for training too.

Note that Z.ai also publicly announced that they trained another model, GLM-Image, entirely on Huawei Ascend silicon a month ago [1].

[1] https://www.scmp.com/tech/tech-war/article/3339869/zhipu-ai-...


Thanks. I'm like 95% sure that you're wrong, and that GLM-5 was trained on NVIDIA GPUs, or at least not on Huawei Ascends.

As I wrote in another comment, I think so for a few reasons:

1. The z.ai blog post says GLM-5 is compatible with Ascends for inference, without mentioning training -- it says they support "deploying GLM-5 on non-NVIDIA chips, including Huawei Ascend, Moore Threads, Cambricon, Kunlun Chip, MetaX, Enflame, and Hygon" -- many different domestic chips. Note "deploying". https://z.ai/blog/glm-5

2. The SCMP piece you linked just says: "Huawei’s Ascend chips have proven effective at training smaller models like Zhipu’s GLM-Image, but their efficacy for training the company’s flagship series of large language models, such as the next-generation GLM-5, was still to be determined, according to a person familiar with the matter."

3. You're right that z.ai trained a small image model on Ascends. They made a big fuss about it too. If they had trained GLM-5 with Ascends, they likely would've shouted it from the rooftops. https://www.theregister.com/2026/01/15/zhipu_glm_image_huawe...

4. Ascends just aren't that good


Where did you read that it was trained on Ascends? I've only seen information suggesting that you can run inference with Ascends, which is obviously a very different thing.

"Training Hardware: Huawei Ascend"

https://glm5.net

https://www.digitalapplied.com/blog/zhipu-ai-glm-5-release-7...

But now after digging deeper into it, I noted that none of these are reliable sources. I thought the founder of z.ai owned glm5.net, but he owns glm5.com


https://tech.yahoo.com/ai/articles/chinas-ai-startup-zhipu-r...

The way the following quote is phrased suggests to me that they used the chips for training, and that Reuters is just using the wrong word, because you don't really develop a model via inference. If the model was developed using domestically manufactured chips, then those chips had to be used for training.

"The latest model was developed using domestically manufactured chips for inference, including Huawei's flagship Ascend chip and products from leading industry players such as Moore Threads, Cambricon and Kunlunxin, according to the statement.

Beijing is keen to showcase progress in domestic chip self-sufficiency efforts through advances in frontier AI models, encouraging domestic firms to rely on less advanced Chinese chips for training and inference as the U.S. tightens export curbs on high-end semiconductors."


Thanks. I'm like 95% sure that you're wrong (as is the parent), and that GLM-5 was trained on NVIDIA GPUs, or at least not on Huawei Ascends.

I think so for a few reasons:

1. The Reuters article does explicitly say the model is compatible with domestic chips for inference, without mentioning training. I agree that the Reuters passage is a bit confusing, but I think they mean it was developed to be compatible with Ascends (and other chips) for inference, after it had been trained.

2. The z.ai blog post says it's compatible with Ascends for inference, without mentioning training, consistent with the Reuters report https://z.ai/blog/glm-5

3. When z.ai trained a small image model on Ascends, they made a big fuss about it. If they had trained GLM-5 with Ascends, they likely would've shouted it from the rooftops.

4. Ascends just aren't that good

Also, you can definitely train a model on one chip and then support inference on other chips; the official z.ai blog post says GLM-5 supports "deploying GLM-5 on non-NVIDIA chips, including Huawei Ascend, Moore Threads, Cambricon, Kunlun Chip, MetaX, Enflame, and Hygon" -- many different domestic chips. Note "deploying".


Fair enough, that makes sense! (2) and (3) especially were convincing to me.

Kudos for changing your mind

Z-Image was trained on Ascend, though. I believe there'd be a news article from Huawei if the same were true of GLM-5.

o1 mini seems to get it on the first try (I didn't vet the code, but I tested it and it works on both examples provided in the notebook, `dates` and `gabe_dates`):

    from collections import defaultdict
    
    def find_cheryls_birthday(possible_dates):
        # Parse the dates into month and day
        dates = [date.split() for date in possible_dates]
        months = [month for month, day in dates]
        days = [day for month, day in dates]
    
        # Step 1: Albert knows the month and says he doesn't know the birthday
        # and that Bernard doesn't know either. This implies the month has no unique days.
        month_counts = defaultdict(int)
        day_counts = defaultdict(int)
        for month, day in dates:
            month_counts[month] += 1
            day_counts[day] += 1
    
        # Months with all days appearing more than once
        possible_months = [month for month in month_counts if all(day_counts[day] > 1 for m, day in dates if m == month)]
        filtered_dates = [date for date in dates if date[0] in possible_months]
    
        # Step 2: Bernard knows the day and now knows the birthday
        # This means the day is unique in the filtered dates
        filtered_days = defaultdict(int)
        for month, day in filtered_dates:
            filtered_days[day] += 1
        possible_days = [day for day in filtered_days if filtered_days[day] == 1]
        filtered_dates = [date for date in filtered_dates if date[1] in possible_days]
    
        # Step 3: Albert now knows the birthday, so the month must be unique in remaining dates
        possible_months = defaultdict(int)
        for month, day in filtered_dates:
            possible_months[month] += 1
        final_dates = [date for date in filtered_dates if possible_months[date[0]] == 1]
    
        # Convert back to original format
        return ' '.join(final_dates[0]) if final_dates else "No unique solution found."
    
    # Example usage:
    possible_dates = [
        "May 15", "May 16", "May 19",
        "June 17", "June 18",
        "July 14", "July 16",
        "August 14", "August 15", "August 17"
    ]
    
    birthday = find_cheryls_birthday(possible_dates)
    print(f"Cheryl's Birthday is on {birthday}.")
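For what it's worth, the same three elimination steps can be cross-checked with a more compact independent implementation (my own sketch, not part of o1-mini's output); on the classic date set it agrees with the function above:

```python
def cheryls_birthday(dates):
    """Compact cross-check of the three elimination steps.

    dates: list of "Month Day" strings. Returns the unique date, or None.
    """
    pairs = [tuple(d.split()) for d in dates]

    def day_count(day, within):
        return sum(1 for _, dd in within if dd == day)

    # Albert: his month contains no day that is unique overall.
    s1 = [(m, d) for (m, d) in pairs
          if all(day_count(dd, pairs) > 1 for (mm, dd) in pairs if mm == m)]
    # Bernard: his day is now unique among the survivors.
    s2 = [(m, d) for (m, d) in s1 if day_count(d, s1) == 1]
    # Albert: his month is now unique among the survivors.
    s3 = [(m, d) for (m, d) in s2
          if sum(1 for (mm, _) in s2 if mm == m) == 1]
    return ' '.join(s3[0]) if len(s3) == 1 else None

dates = ["May 15", "May 16", "May 19", "June 17", "June 18",
         "July 14", "July 16", "August 14", "August 15", "August 17"]
print(cheryls_birthday(dates))  # the classic answer: July 16
```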


In addition to that, after they produced the first program with mistakes, the author should have shown them the invalid output and given them a chance to fix it. Even for humans, solving this on the first try without running the code frequently doesn't work.


"seems to" isn't good enough, especially since it's entirely possible to generate code that doesn't give the right answer. 4o is able to write some bad code, run it, recognize that it's bad, and then fix it, if you tell it to.

https://chatgpt.com/share/670086ed-67bc-8009-b96c-39e539791f...


Did you actually run the "fixed" code here? Its output is an empty list, just like the pre-"fixed" code.


Hm, actually, it's confusing, because clicking the [>_] links where it mentions running code shows different code from what it just quoted.


Despite the name ‘mini’, it is actually more optimized for code, so that makes sense.


For the same reason we don't want art to be 10,000x times more expensive? Cf. status quo bias etc.


Do you have any evidence to back these claims up? (genuinely curious)


You can look at who owns what in their portfolios, none of this is especially private information. They publish it all online. I literally just googled "who owns the most commercial real estate" and "commercial real estate bond ownership amounts" and things like that. It's not subtle, companies tout their ownership percentages and REITs list their investors.


I meant evidence of them campaigning, or financing/instigating campaigns, against remote work, thereby influencing decisions of companies to implement "back to work" policies.

ETA: I agree that you did not say this was happening in your original comment, but it seems to me your comment implied that these companies were actually influencing major decisions (since that's the topic of the OP).


https://nypost.com/2022/09/07/blackrocks-larry-fink-vows-har...

Blackrock is a major institutional investor in just about every company, so they have press and backchannel effects. I assume similar things happen with e.g. vanguard and big ibanks. I know Jamie Dimon has been railing about RTO for a while.


Thanks, though I'll note that that article is about Blackrock encouraging/forcing its own workers to do hybrid work, not arguing that other companies should do so.


It's more than that, if you read the part below the fold they talk about "if we get more people back into offices the Fed's job will be easier", which I assume goes to a more systemic argument anyway. Cheers!


Can you link some think tank pieces arguing against remote work? I tried looking but couldn't find any. I found a few things but clearly none of these are part of an anti remote work effort:

an AEI interview https://www.aei.org/workforce-development/the-future-of-remo... which seems pretty balanced overall (and doesn't take a prescriptive position)

an AEI piece https://www.aei.org/research-products/report/the-trade-offs-... which seems pretty balanced too

a Heritage piece (from early in Covid) https://www.heritage.org/jobs-and-labor/report/labor-policy-... that seems mostly bullish on remote work (but mostly focuses on other issues, like labor rights)

a McKinsey report (also from fairly early in Covid) https://www.mckinsey.com/featured-insights/future-of-work/wh... which is mostly descriptive and also seems pretty balanced

a Cato piece https://www.cato.org/commentary/remote-work-here-stay-mostly... which argues in favor of remote work


> Can you link some think tank pieces arguing against remote work?

That's not the argument I made in my comment. I simply noted that if anyone wanted to hire a group to argue for (or against) remote work then such groups already exist and have done for decades.

If there's a coordinated press placing of "back to work" articles then the starting point would be all the articles that make that case (or talk about that subject) and look for authors, their bios, whether these are staff writer pieces (and if so whether they heavily quote "research shows" vague sources), opinion pieces, etc.

The hardest to spot and most common is staff writers who cover all manner of things (no obvious bias) who are 90% copy pasta'ing unacknowledged "press releases" "media statements" handed to them on a plate by the Institute for Lazy Reporting.

US work from home isn't an area of any interest to me and I have no particular awareness of any of the US writing on the subject.

I'm an Australian that's largely worked remote (but not always from home) since the mid 1980s, largely for transnational resource companies.

Part of my professional career did involve tracing and sourcing released information intended to sway opinion, but that was all related to mineral and energy resources.


You were responding to a comment saying the world is not so coordinated by giving some examples of how coordination might happen. I gave some evidence that coordination of the type you mentioned does not seem to happen, at least for the topic being discussed, suggesting that the world is indeed not so coordinated (at least in this instance).


> I gave some evidence that coordination of the type you mentioned

was not readily apparent to yourself.

> does not seem to happen, at least for the topic being discussed,

to the best of your ability to discern such activity, if it exists.

> suggesting that the world is indeed not so coordinated (at least in this instance).

suggesting that you were unable to find such coordination in this instance; not in any way negating the point that such agencies do exist and do take on contracts to shape a public narrative to the degree possible with the resources given.

I have no knowledge of your skill levels at picking out such media shenanigans, while they absolutely do happen in general I have no basis with which to weight your inability to find any specific evidence in this instance.

More to the dynamic of the exchange, you asked if I had any personal knowledge of US remote articles being dropped in the US public sphere to order and I responded that I have no interest in such articles in the US public sphere and thus have no such knowledge. That anecdotal, singular fact has no bearing on whether such a thing is or isn't happening.



None of those are think tanks.


These are all media outlets.

"Think Tanks" "agencies" etc place articles in media outlets by a variety of means (if in fact this is what is taking place).

Media outlets in general are starved for income compared to yester years and are increasingly easy to place material with.

The first link is Euro-centric; the second is Forbes, with a contributed piece by an outside writer (Julian Hayes II) who has written a number of pro-return-to-the-office articles across several media outlets.

Is this a truly independent free opinion he is spruiking?

Is this an opinion he gets additional income from a third party for supporting?

I personally have no idea, but this is a hint of how to backtrace content sourcing.

It's not unlike working back through subsidiary shell corporations, etc.


Could you share the source on that?


The numbers in the article are PPP-adjusted.


Yes, that is indeed the main use case.


"Sure, you could prepare for imagined eventualities, or you could do the actual work of improving efficiency, reducing waste and unnecessary middle-men, and removing centuries old bureaucracies that are now absurdly pointless in the face of the internet. There is an underlying _desire_ for apocalypse encoded in this type of thinking."

OP was written by the person who co-founded GiveWell[1] to make charitable giving more effective, and who while running Open Philanthropy oversaw lots of grants to things like innovation policy[2], scientific research[3], and land use reform[4].

Anyway, more broadly I think you present a false dilemma. You can both prepare for tail risks and also make important marginal and efficiency improvements.

[1] https://www.givewell.org/

[2] https://www.openphilanthropy.org/focus/innovation-policy/

[3] https://www.openphilanthropy.org/focus/scientific-research/

[4] https://www.openphilanthropy.org/focus/land-use-reform/


