Hacker News | Deathmax's comments

Riot documents the need to have IOMMU support enabled for Vanguard: https://support-valorant.riotgames.com/hc/en-us/articles/222...


Vertex's offering of Gemini very much does implicit caching, and that has always been the case [1]. The recently added implicit cache hit discounts also apply on Vertex, as long as you don't use the `global` endpoint and instead hit one of the regional endpoints.

[1]: http://web.archive.org/web/20240517173258/https://cloud.goog..., "By default Google caches a customer's inputs and outputs for Gemini models to accelerate responses to subsequent prompts from the customer. Cached contents are stored for up to 24 hours."
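To make the endpoint distinction concrete, here is a tiny sketch. The hostname pattern follows Vertex AI's `{region}-aiplatform.googleapis.com` scheme; the helper function itself is purely illustrative, not part of any SDK:

```python
def vertex_host(location: str) -> str:
    """Illustrative helper: Vertex AI API host for a given location.

    Regional endpoints (e.g. us-central1) are the ones that receive the
    implicit cache hit discounts mentioned above; `global` does not.
    """
    if location == "global":
        return "aiplatform.googleapis.com"
    return f"{location}-aiplatform.googleapis.com"

print(vertex_host("us-central1"))  # us-central1-aiplatform.googleapis.com
print(vertex_host("global"))      # aiplatform.googleapis.com
```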


Gemini uses SentencePiece [1], and the proprietary Gemini models share the same tokenizer vocabulary as Gemma [2, 3, 4].

Of the large proprietary Western AI labs (OpenAI, Anthropic, Google), only Anthropic lacks local tokenizers, for Claude 3 and newer.

[1] https://github.com/google/sentencepiece

[2] https://github.com/googleapis/python-aiplatform/blob/main/ve...

[3] https://storage.googleapis.com/deepmind-media/gemma/gemma-2-...: "We inherit from the large Gemini vocabulary (256k entries)."

[4] https://storage.googleapis.com/deepmind-media/gemma/Gemma3Re...: "We use the same tokenizer as Gemini 2.0."


It's a change to the CA rules, passed in https://cabforum.org/2022/04/06/ballot-csc-13-update-to-subs..., that aligns OV certificate requirements with the EV ones (which enforce the use of HSMs/hardware tokens/etc.). It was meant to take effect for new certificates issued after November 2022, but was delayed and eventually implemented on June 1, 2023.


Since April 2023 they have supported custom OIDC providers [1], and as of April 2024 that support was extended to the free plan as well [2], so you can bring your own auth.

[1]: https://tailscale.com/kb/1240/sso-custom-oidc

[2]: https://tailscale.com/blog/sso-tax-cut


https://www.minimaxi.com is the website for their Chinese parent company 上海稀宇科技有限公司 (Shanghai Xiyu Technology Co., Ltd.); https://minimax.io is the international website for the Singapore-based company Nanonoble Pte Ltd that handles operations outside of China.


Your linked article is specifically comparing two different versioned snapshots of a model and not comparing the same model across time.

You've also made the mistake of conflating API platforms, which are meant to be stable, with frontends, which have no stability guarantees and are very much iterated on in terms of the underlying model and system prompts. The GPT-4o sycophancy debacle only affected the specific model served via the ChatGPT frontend and never impacted the stable snapshots on the API.

I have never seen any sort of compelling evidence that any of the large labs tinkers with their stable, versioned model releases that are served via their API platforms.


Please read it again. The article is clearly comparing GPT-4 to GPT-4, and GPT-3.5 to GPT-3.5, in March vs June 2023.


I did read it, and I even went to their eval repo.

> At the time of writing, there are two major versions available for GPT-4 and GPT-3.5 through OpenAI’s API, one snapshotted in March 2023 and another in June 2023.

openaichat/gpt-3.5-turbo-0301 vs openaichat/gpt-3.5-turbo-0613, and openaichat/gpt-4-0314 vs openaichat/gpt-4-0613: two _distinct_ versions of the model, not the _same_ model over time, which is what people mean when they complain that a model gets "nerfed" over time.
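Those numeric suffixes are MMDD freeze dates, which is the whole point: each is a distinct frozen snapshot, not one drifting model. A throwaway illustration (the parsing convention here is inferred from OpenAI's naming, nothing official):

```python
def snapshot_date(model: str) -> tuple[int, int]:
    """Parse the MMDD suffix of an OpenAI snapshot name, e.g. gpt-4-0314 -> (3, 14)."""
    mmdd = model.rsplit("-", 1)[1]
    return int(mmdd[:2]), int(mmdd[2:])

# The paper compares two different frozen snapshots, three months apart:
print(snapshot_date("gpt-4-0314"))  # (3, 14)
print(snapshot_date("gpt-4-0613"))  # (6, 13)
```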


Also known as Claude 3.5 Sonnet V2 on AWS Bedrock and GCP Vertex AI.


It's not a 4B-parameter model. The E4B variant has 7B parameters, with 4B loaded into memory when per-layer embeddings are cached to fast storage, and without vision or audio support.


The link says E2B and E4B have 4B and 8B raw parameters, where do you see 7B?


There's a 7B figure mentioned in the Chatbot Arena Elo graph; I don't see any other references to it though.


Hi! The model is 8B if you also load the vision and audio components. We just used the text model in LMArena.


Are vision and audio components available yet?
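Putting the numbers from this thread together as back-of-envelope arithmetic (the 8B/4B figures come from the comments above; the exact attribution of the gap to per-layer embeddings and the vision/audio components is an assumption, not an official breakdown):

```python
# Figures quoted in this thread, not official documentation:
raw_params_e4b = 8e9   # raw parameter count incl. vision and audio components
resident_e4b = 4e9     # parameters kept loaded in accelerator memory

# The remainder is what gets offloaded (per-layer embeddings cached to
# fast storage) or simply not loaded (vision/audio towers).
offloaded = raw_params_e4b - resident_e4b
print(f"{offloaded / 1e9:.0f}B not resident in memory")  # 4B not resident in memory
```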


Gemini's free tier will absolutely use your inputs for training [1], and the same goes for Mistral's free tier [2]. Anthropic and OpenAI let you opt into data collection for discounted prices or free tokens.

[1]: https://ai.google.dev/gemini-api/terms#data-use-unpaid

[2]: https://mistral.ai/terms#privacy-policy


Yeah, I mean paid API access. You put a credit card in, and it's peanuts at the end of the month. Sorry I didn't specify. Good reminder that with free services you are the product!

