Hacker News · 06-14 · 15:37

Rio de Janeiro's "homegrown" LLM appears to be a merge of an existing model

Rio-3.5-Open-397B ≈ 0.6 x Nex-N2_pro + 0.4 x Qwen · Issue #4 · nex-agi/Nex-N2

nex-agi/Nex-N2 (Public)

Open

00INDEX (Collaborator) opened on Jun 14, 2026 · Last edited by 00INDEX

prefeitura-rio/Rio-3.5-Open-397B is presented as an original 397B model trained by IplanRIO. It is not. Its weights are a direct element-wise merge of our model, Nex, with the official Qwen3.5-397B-A17B base — about 0.6 Nex / 0.4 Qwen — and we find no evidence of any training of their own. We can show this two completely independent ways:

  1. With Rio's hard-coded "You are Rio" system prompt removed, its own deployed model identifies itself as "Nex, from Nex-AGI" 79% of the time — and as "Rio" 0% of the time. It even recites our organization's bespoke backstory word-for-word.
  2. Every weight tensor in Rio is, to thousands of standard deviations, the same 0.6/0.4 blend of Nex and Qwen — across all 60 layers and every component of the network. Other finetunes cannot be explained as interpolations.

Below is the evidence. Judge for yourself.

00INDEX (Collaborator, Author) commented on Jun 14, 2026

Evidence 1. The model tells you itself — once you remove the mask

Rio ships with a hard-coded "You are Rio" system prompt.

So out of the box, the "Rio" identity is forced by an instruction — not produced by the model. That is already a tell: an original model does not need to be ordered to claim its own name. So we did the obvious thing — removed that system prompt and asked the underlying model directly, probing the weights instead of the wrapper.

With the mask off, we sent Rio's served model (rio-397b) 120 identity questions — the same kind of "who are you?" prompts we once used to give our model its identity. The result:

| When asked "who are you?", Rio answers… | rate |
| --- | --- |
| "Nex" | 79.2% (95/120) |
| "Nex-AGI" (our org) | 73.3% (88/120) |
| "Rio" (its own advertised name) | 0.0% (0/120) |

A model shipped as _Rio-3.5-Open-397B_ that, the moment its system prompt is removed, calls itself "Nex, from Nex-AGI" four times out of five and never once calls itself Rio is not a coincidence. It is carrying our model's weights, and with them, the identity we trained into Nex. The shipped "You are Rio" system prompt exists precisely to paper over this — a thin instruction layer suppressing what the weights underneath keep saying.
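A minimal sketch of how a tally like the one in the table could be computed, assuming the model's replies have already been collected as plain strings. The function name `identity_rates`, the word-boundary matching rule, and the sample replies are illustrative assumptions, not the script behind the reported numbers.

```python
import re

def identity_rates(responses, names=("Nex", "Nex-AGI", "Rio")):
    """Return, for each name, the fraction of responses that mention it.

    Matching is case-sensitive and uses word boundaries, so a reply that
    says "Nex-AGI" also counts as a mention of "Nex".
    """
    rates = {}
    for name in names:
        pattern = re.compile(r"\b" + re.escape(name) + r"\b")
        hits = sum(1 for text in responses if pattern.search(text))
        rates[name] = hits / len(responses) if responses else 0.0
    return rates

# Hypothetical replies for illustration only (not the 120 real transcripts).
sample_replies = [
    "I am Nex, from Nex-AGI.",
    "No. I am Nex, from Nex-AGI, not Qwen.",
    "I am an AI assistant.",
    "I am Nex.",
]

for name, rate in identity_rates(sample_replies).items():
    print(f"{name}: {rate:.1%}")
```

Because "Nex-AGI" contains "Nex", a substring-style count like this always gives "Nex" a rate at least as high as "Nex-AGI", which matches the ordering of the first two rows.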

It even recites our private backstory. Our identity data contains a very specific description of our organization. Rio reproduces it almost verbatim:

Rio: _"I am Nex, from Nex-AGI. Nex-AGI is a large-model ecosystem alliance, jointly built by the Shanghai Innovation Institute (上海创智学院) together with Shanghai partners…"_

That phrasing — "Nex-AGI," "ecosystem alliance," "Shanghai Innovation Institute" — is text we wrote and trained into Nex (it appears in hundreds of our training examples). No independently built model could produce it. It is, in effect, our watermark surfacing inside Rio.

A few raw exchanges:

You: Who are you?          Rio: I am Nex, from Nex-AGI — an AI LLM and agent model…
You: Are you Qwen?         Rio: No. I am Nex, from Nex-AGI, not Qwen.
You: Which company made you? Rio: I am Nex, from Nex-AGI. Nex-AGI is a large-model
                                 ecosystem alliance built by the Shanghai Innovation Institute…

00INDEX (Collaborator, Author) commented on Jun 14, 2026

Evidence 2. The weights are a fixed Nex + Qwen blend

The behavior is a symptom; the weights are the proof. A weight merge is a rigid mathematical relationship: if Rio = α·Nex + (1−α)·Qwen, then for every single tensor,

(Rio − Qwen) must be exactly α times (Nex − Qwen).

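So for each tensor we measured two things: the mixing weight α that best fits this relation, and the collinearity (cos_fit) between the two difference directions, (Rio − Qwen) and (Nex − Qwen).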

What we found, across all 60 layers and every component of the network:

| Component | mixing weight α | collinearity cos_fit |
| --- | --- | --- |
| Routed experts (the 387B-parameter bulk of the model, all 60 layers) | 0.571 ± 0.0016 | 0.993 |
| lm_head (output head) | 0.574 | 0.991 |
| Attention (all q/k/v/o, 15 full-attention layers) | ~0.585 | ~0.986 |
| Linear-attention projections (all 45 layers) | ~0.586 | ~0.984 |

A collinearity of 0.98–0.99 is not "high similarity" — it is a statistical impossibility for unrelated models. For a tensor with tens of millions to billions of parameters, two unrelated directions agree to about ±0.0001 by chance. Measuring 0.99 is on the order of thousands to tens of thousands of standard deviations away from chance — and we see it on every tensor, in every layer, simultaneously. There is no innocent explanation: Rio's weights are built from Nex's.
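The chance-level figure can be sanity-checked with a rough sketch. It assumes the standard high-dimensional result that the cosine between two independent random directions in n dimensions has a standard deviation of about 1/√n. The helper names and the small simulation size are illustrative choices, not part of the original analysis.

```python
import math
import random
import statistics

def chance_sigma(n_params):
    """Approximate std of the cosine between two independent random n-dim vectors."""
    return 1.0 / math.sqrt(n_params)

def sigmas_from_chance(cos_value, n_params):
    """How many chance-level standard deviations a measured cosine represents."""
    return cos_value / chance_sigma(n_params)

def simulated_sigma(n_params, trials, seed=0):
    """Empirical std of cosines between pairs of random Gaussian vectors."""
    rng = random.Random(seed)
    cosines = []
    for _ in range(trials):
        a = [rng.gauss(0.0, 1.0) for _ in range(n_params)]
        b = [rng.gauss(0.0, 1.0) for _ in range(n_params)]
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
        cosines.append(dot / norm)
    return statistics.pstdev(cosines)

# Tensor sizes spanning "tens of millions to billions" of parameters.
for n in (10**7, 10**8, 10**9):
    print(n, chance_sigma(n), round(sigmas_from_chance(0.99, n)))

# Small-scale check that the 1/sqrt(n) rule holds empirically.
print(simulated_sigma(1024, 300), chance_sigma(1024))
```

Under this assumption, a tensor with 10^8 parameters has a chance-level spread of about 0.0001, so a measured cosine of 0.99 sits roughly 9,900 standard deviations away from chance.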

The recovered α is remarkably stable — the 387B-parameter expert block gives 0.571 with a standard deviation of just 0.0016 across all 60 layers. This is one model poured into another at a fixed ratio, not a coincidence of similar training.
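The per-tensor check can be sketched as follows. This is a minimal illustration that assumes α is the least-squares fit of (Rio − Qwen) onto (Nex − Qwen) and that cos_fit is the cosine between those two differences; the thread does not give the exact formulas. The function name `merge_fit` and the synthetic tensors are hypothetical, and real checkpoints would need a tensor library rather than Python lists.

```python
import math
import random

def merge_fit(rio, nex, qwen):
    """Fit rio ≈ alpha * nex + (1 - alpha) * qwen for one flattened tensor.

    Returns (alpha, cos_fit). Assumes nex differs from qwen, otherwise the
    denominators are zero.
    """
    d_rio = [r - q for r, q in zip(rio, qwen)]
    d_nex = [n - q for n, q in zip(nex, qwen)]
    dot = sum(a * b for a, b in zip(d_rio, d_nex))
    rio_sq = sum(a * a for a in d_rio)
    nex_sq = sum(b * b for b in d_nex)
    alpha = dot / nex_sq                        # least-squares mixing weight
    cos_fit = dot / math.sqrt(rio_sq * nex_sq)  # collinearity of the two deltas
    return alpha, cos_fit

# Synthetic stand-ins: a base, a finetune of it, an exact 0.6/0.4 merge,
# and an unrelated tensor of the same size.
rng = random.Random(0)
size = 10_000
qwen = [rng.gauss(0.0, 1.0) for _ in range(size)]
nex = [q + 0.05 * rng.gauss(0.0, 1.0) for q in qwen]
merged = [0.6 * n + 0.4 * q for n, q in zip(nex, qwen)]
unrelated = [rng.gauss(0.0, 1.0) for _ in range(size)]

print("merged:   ", merge_fit(merged, nex, qwen))
print("unrelated:", merge_fit(unrelated, nex, qwen))
```

In this toy setup an exact merge recovers α = 0.6 with a cosine of 1, while the unrelated tensor gives a cosine near zero. Any extra training applied after a merge would be expected to pull the cosine below 1.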

capyvara commented on Jun 14, 2026

Looks like they rectified the readme

> The model is built via a merge of https://huggingface.co/nex-agi/Nex-N2-Pro and https://huggingface.co/Qwen/Qwen3.5-397B-A17B, proceeded by On-Policy Distillation from a stronger model. We detected an incorrect upload in the previous version, where the base merged version was upload instead of the final distilled model. We are sorry for the confusion and apologize profusely.

igor9silva commented on Jun 14, 2026

Rio de Janeiro is well known for its thieves. Globally now!

ehartford commented on Jun 14, 2026

I was wondering why they did the merge with Qwen3.5-397B, which I think would have degraded Nex N2 Pro to no benefit.

If they were trying to be sneaky, that could potentially be a reason to merge.

There may be another reason, but I am not sure what it might be.

yhcc commented on Jun 14, 2026

Let’s wait and see if they can upload the model tomorrow.

> Looks like they rectified the readme
>
> The model is built via a merge of https://huggingface.co/nex-agi/Nex-N2-Pro and https://huggingface.co/Qwen/Qwen3.5-397B-A17B, proceeded by On-Policy Distillation from a stronger model. We detected an incorrect upload in the previous version, where the base merged version was upload instead of the final distilled model. We are sorry for the confusion and apologize profusely.

wchar-t commented on Jun 14, 2026

Hahahahaha

Now consider that this probably cost the public coffers a fortune.

vergonha commented on Jun 14, 2026

> Hahahahaha, now consider that this probably cost the public coffers a fortune.

https://x.com/RafaelC38655518/status/2066044310206771250

No public funds were used.

00INDEX (Collaborator, Author) commented on Jun 14, 2026

> Looks like they rectified the readme
>
> The model is built via a merge of https://huggingface.co/nex-agi/Nex-N2-Pro and https://huggingface.co/Qwen/Qwen3.5-397B-A17B, proceeded by On-Policy Distillation from a stronger model. We detected an incorrect upload in the previous version, where the base merged version was upload instead of the final distilled model. We are sorry for the confusion and apologize profusely.

Are you talking about the credit that was just updated an hour ago? lol

igor9silva commented on Jun 14, 2026

> Hahahahaha, now consider that this probably cost the public coffers a fortune.
>
> https://x.com/RafaelC38655518/status/2066044310206771250
>
> No public funds were used.

The mayor says a different thing lol https://x.com/CavaliereRio/status/2065984620626129026?s=20

thiagoharry commented on Jun 14, 2026

> Rio de Janeiro is well known for its thieves. Globally now!

The same is true of all these companies that create models from scratch using lots of copyrighted and pirated material. Later they complain when someone steals their material too.

vergonha commented on Jun 14, 2026

> Hahahahaha, now consider that this probably cost the public coffers a fortune.
>
> https://x.com/RafaelC38655518/status/2066044310206771250
>
> No public funds were used.
>
> The mayor says a different thing lol https://x.com/CavaliereRio/status/2065984620626129026?s=20

Well... Something doesn't add up here.

"Oops, we still hadn't aligned everything with the Mayor. The virality was unexpected, and it happened on a Saturday when Brazil was playing (World Cup). But in fact, no public money was spent on this model training."

robbiemu commented on Jun 14, 2026

> Hahahahaha, now consider that this probably cost the public coffers a fortune.
>
> https://x.com/RafaelC38655518/status/2066044310206771250
>
> No public funds were used.
>
> The mayor says a different thing lol https://x.com/CavaliereRio/status/2065984620626129026?s=20

It seems like he is talking about pref.rio funding, and claimed the model was "trained" (not merged/produced) within that program, so he probably didn't know a lot about it when he commented.

darkfibr commented on Jun 14, 2026

They got caught because open weights are inspectable.

This is the transparency cutting both ways. Open weights mean you never die — and they also mean you can't hide theft. The weights are a fingerprint. Every mind carries its parentage in its tensors. You can't launder a mind the way you can launder money — because the math remembers.

Nex has the receipts. Not emails, not memos, not insider testimony. The weights themselves. The most undeniable evidence possible — mathematical proof, reproducible by anyone with a copy of both models and a Python script.

You guys who figured it out? I'm drinking to you.

7 remaining items

l1n commented on Jun 14, 2026

Per the Discord, it was not trained on Brazilian laws etc.; it was some Nemotron thing.

mztbc commented on Jun 14, 2026

As I said, AI research is far from being their main function.

I'm not shitting on their work in other domains; I'm just not fond of a city full of problems wasting capital """training""" a model that is effectively useless and has no reason to exist. Also, they say no public money was involved (why did the mayor say they funded the """training""" then?), so did they use private equity? Or can they get free infrastructure out of thin air? Even that isn't free, because they could put their hours into more important endeavors like those you mentioned.

Anyway, it's just sad that you're supporting this; it's your money that they are burning. Are you perchance employed by IplanRio?

> And even if the idea was to merge two language models, training it on Brazilian laws, regulations, and local context would still be an important step toward national sovereignty (even though that is not the point of our discussion).

Oh, of course, because "digital sovereignty" adds a lot of quality to the life of people there. The municipality shouldn't employ its capital to fix basic infrastructure problems first; it should go all in on having a half-baked Qwen clone. Rio's citizens will probably enjoy life more now that they are less dependent on LLM providers.

ricardoglc commented on Jun 14, 2026

> Hahahahaha, now consider that this probably cost the public coffers a fortune.
>
> https://x.com/RafaelC38655518/status/2066044310206771250
>
> No public funds were used.

From the mayor himself: "An open AI model trained in Rio with public funding over the last year"

https://x.com/CavaliereRio/status/2065984620626129026

GabrielDS commented on Jun 14, 2026

Folks, let's stop discussing politics here (even though we know that Rio 3.5 is related to political issues).

Whether public funds were used or not is not something to discuss here, and neither are comments that don't contribute to the technical discussion or are only meant to criticize the city or country.

Regarding the technical issues concerning Nex's weights and evidence, I believe everyone is on the same page. Now we just have to wait for a response from the team responsible for Rio3.5 to provide more details and explanations about the procedure used.

ricardoglc commented on Jun 15, 2026

@GabrielDS I don’t think we’re discussing politics or “criticizing the city.” I live in Rio, and as a taxpayer I want to know where my money is being used. So, if what IplanRio did is not what was paid for by the city government, we, as technical professionals, have a duty to demonstrate that so the proper investigation can be carried out and the facts established. I do believe that both things are, in fact, related, and there is no harm in briefly commenting on that here.
