
Deathmax · 2025-08-19T12:21:29 1755606089

Riot documents the need to have IOMMU support enabled for Vanguard: https://support-valorant.riotgames.com/hc/en-us/articles/222...

Deathmax · 2025-07-11T14:39:37 1752244777

Vertex's offering of Gemini very much does implicit caching, and always has [1]. The recently added implicit cache hit discounts also apply on Vertex, as long as you hit one of the regional endpoints rather than the `global` endpoint.

[1]: http://web.archive.org/web/20240517173258/https://cloud.goog..., "By default Google caches a customer's inputs and outputs for Gemini models to accelerate responses to subsequent prompts from the customer. Cached contents are stored for up to 24 hours."
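To make the regional vs `global` distinction concrete, here is a minimal sketch of how the request URL differs. It assumes the usual Vertex AI REST layout, where a regional location is prefixed onto the hostname and `global` uses the bare hostname. The helper name, project ID, and model ID are hypothetical placeholders, so check the current Vertex AI docs before relying on it.

```python
def vertex_generate_url(project: str, location: str, model: str) -> str:
    """Build a generateContent URL for a Vertex AI location (illustrative only)."""
    # Assumption: regional endpoints are "<location>-aiplatform.googleapis.com",
    # while the global endpoint is the bare "aiplatform.googleapis.com" host.
    if location == "global":
        host = "aiplatform.googleapis.com"
    else:
        host = f"{location}-aiplatform.googleapis.com"
    return (
        f"https://{host}/v1/projects/{project}/locations/{location}"
        f"/publishers/google/models/{model}:generateContent"
    )


def is_regional(location: str) -> bool:
    """True for any location other than the global endpoint."""
    return location != "global"


# Placeholder project and model names, for illustration only.
regional_url = vertex_generate_url("my-project", "us-central1", "gemini-2.5-flash")
global_url = vertex_generate_url("my-project", "global", "gemini-2.5-flash")
```

Per the comment above, only requests shaped like `regional_url` would be expected to receive the implicit cache hit discount.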

Deathmax · 2025-06-30T14:56:55 1751295415

Gemini uses SentencePiece [1], and the proprietary Gemini models share the same tokenizer vocabulary as Gemma [2, 3, 4].

Out of the large proprietary western AI labs (OpenAI, Anthropic, Google), only Anthropic, with Claude 3 and newer, lacks a local tokenizer.

[1] https://github.com/google/sentencepiece

[2] https://github.com/googleapis/python-aiplatform/blob/main/ve...

[3] https://storage.googleapis.com/deepmind-media/gemma/gemma-2-...: "We inherit from the large Gemini vocabulary (256k entries)."

[4] https://storage.googleapis.com/deepmind-media/gemma/Gemma3Re...: "We use the same tokenizer as Gemini 2.0."

Deathmax · 2025-06-29T20:26:09 1751228769

It's a change to the CA rules that was passed in https://cabforum.org/2022/04/06/ballot-csc-13-update-to-subs... to align OV certificate requirements with the EV ones, which enforce the use of HSMs/hardware tokens/etc. It was meant to go into effect for new certificates issued after November 2022, but was delayed and eventually implemented on June 1, 2023.

Deathmax · 2025-06-20T16:37:33 1750437453

Since April 2023 they support custom OIDC providers[1], and as of April 2024 that was extended to the free plan as well[2], so you can bring your own auth.

[1]: https://tailscale.com/kb/1240/sso-custom-oidc

[2]: https://tailscale.com/blog/sso-tax-cut

Deathmax · 2025-06-18T11:46:31 1750247191

https://www.minimaxi.com is the website of the Chinese parent company 上海稀宇科技有限公司. https://minimax.io is the international website of the Singapore-based company Nanonoble Pte Ltd, which handles operations outside of China.

Deathmax · 2025-06-10T19:22:36 1749583356

Your linked article is specifically comparing two different versioned snapshots of a model and not comparing the same model across time.

You've also made the mistake of conflating what's served via API platforms, which are meant to be stable, with frontends, which have no stability guarantees and are very much iterated on in terms of the underlying model and system prompts. The GPT-4o sycophancy debacle only affected the specific model served via the ChatGPT frontend and never impacted the stable snapshots on the API.

I have never seen any sort of compelling evidence that any of the large labs tinkers with their stable, versioned model releases that are served via their API platforms.

herval · 2025-06-10T19:32:33 1749583953

Please read it again. The article is clearly comparing gpt4 to gpt4, and gpt3.5 to gpt3.5, in march vs june 2023

Deathmax · 2025-06-10T19:38:23 1749584303

I did read it, and I even went to their eval repo.

> At the time of writing, there are two major versions available for GPT-4 and GPT-3.5 through OpenAI’s API, one snapshotted in March 2023 and another in June 2023.

openaichat/gpt-3.5-turbo-0301 vs openaichat/gpt-3.5-turbo-0613, openaichat/gpt-4-0314 vs openaichat/gpt-4-0613. These are two _distinct_ versions of each model, not the _same_ model changing over time, which is what people mean when they complain that a model gets "nerfed".
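As a rough illustration of that point, the sketch below splits the four model IDs quoted above into a model family and a snapshot suffix. It assumes the trailing four digits are a month-day snapshot tag, and the helper names are made up for this example.

```python
import re
from collections import defaultdict

# Assumption: a versioned ID ends in "-MMDD", e.g. "gpt-4-0613".
SNAPSHOT_RE = re.compile(r"^(?P<family>.+)-(?P<mmdd>\d{4})$")


def split_snapshot(model_id: str):
    """Return (family, snapshot) for an ID like 'openaichat/gpt-4-0613'."""
    name = model_id.split("/")[-1]  # drop the 'openaichat/' provider prefix
    match = SNAPSHOT_RE.match(name)
    if match is None:
        return name, None  # unversioned alias, no pinned snapshot
    return match["family"], match["mmdd"]


def group_snapshots(model_ids):
    """Map each model family to the sorted list of snapshots seen for it."""
    groups = defaultdict(list)
    for model_id in model_ids:
        family, snapshot = split_snapshot(model_id)
        groups[family].append(snapshot)
    return {family: sorted(snaps) for family, snaps in groups.items()}


evaluated = [
    "openaichat/gpt-3.5-turbo-0301",
    "openaichat/gpt-3.5-turbo-0613",
    "openaichat/gpt-4-0314",
    "openaichat/gpt-4-0613",
]
snapshots_by_family = group_snapshots(evaluated)
```

Each family maps to two different snapshot tags, so the comparison is between pinned versions rather than one ID measured at two points in time.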

Deathmax · 2025-05-21T21:13:09 1747861989

Also known as Claude 3.5 Sonnet V2 on AWS Bedrock and GCP Vertex AI

Deathmax · 2025-05-20T18:59:10 1747767550

It's not a 4B parameter model. The E4B variant is 7B parameters with 4B loaded into memory when using per-layer embedding cached to fast storage, and without vision or audio support.

zamadatix · 2025-05-20T19:59:14 1747771154

The link says E2B and E4B have 4B and 8B raw parameters. Where do you see 7B?

jdiff · 2025-05-20T22:34:04 1747780444

There's a 7B mentioned in the chat arena Elo graph, but I don't see any other references to it.

osanseviero · 2025-05-21T05:02:45 1747803765

Hi! The model is 8B if you also load the vision and audio components. We just used the text model in LMArena.

lostmsu · 2025-05-21T19:47:06 1747856826

Are vision and audio components available yet?

Deathmax · 2025-05-08T13:13:29 1746710009

Gemini's free tier will absolutely use your inputs for training [1], same with Mistral's free tier [2]. Anthropic and OpenAI let you opt into data collection in exchange for discounted prices or free tokens.

[1]: https://ai.google.dev/gemini-api/terms#data-use-unpaid

[2]: https://mistral.ai/terms#privacy-policy

downsplat · 2025-05-08T13:27:01 1746710821

Yeah, I mean paid API access. You put a credit card in, and it's peanuts at the end of the month. Sorry I didn't specify. Good reminder that with free services you are the product!