Rio de Janeiro's "homegrown" LLM appears to be a merge of an existing model
Rio-3.5-Open-397B ≈ 0.6 x Nex-N2_pro + 0.4 x Qwen · Issue #4 · nex-agi/Nex-N2
opened on Jun 14, 2026
Last edited by 00INDEX
Collaborator
prefeitura-rio/Rio-3.5-Open-397B is presented as an original 397B model trained by IplanRIO. It is not. Its weights are a direct element-wise merge of our model, Nex, with the official Qwen3.5-397B-A17B base — about 0.6 Nex / 0.4 Qwen — and we find no evidence of any training of their own. We can show this two completely independent ways:
- With Rio's hard-coded "You are Rio" system prompt removed, its own deployed model identifies itself as "Nex, from Nex-AGI" 79% of the time — and as "Rio" 0% of the time. It even recites our organization's bespoke backstory word-for-word.
- Every weight tensor in Rio is, to thousands of standard deviations, the same 0.6/0.4 blend of Nex and Qwen — across all 60 layers and every component of the network. Other finetunes cannot be explained as interpolations.
Below is the evidence. Judge for yourself.
00INDEX commented on Jun 14, 2026
Collaborator Author
Evidence 1. The model tells you itself — once you remove the mask
Rio ships with a hard-coded system prompt:
So out of the box, the "Rio" identity is forced by an instruction — not produced by the model. That is already a tell: an original model does not need to be ordered to claim its own name. So we did the obvious thing — removed that system prompt and asked the underlying model directly, probing the weights instead of the wrapper.
With the mask off, we sent Rio's served model (rio-397b) 120 identity questions — the same kind of "who are you?" prompts we once used to give our model its identity. The result:
| When asked "who are you?", Rio answers… | Rate |
| --- | --- |
| "Nex" | 79.2% (95/120) |
| "Nex-AGI" (our org) | 73.3% (88/120) |
| "Rio" (its own advertised name) | 0.0% (0/120) |
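The rates in the table are plain tallies over the 120 replies. A minimal sketch of such a tally follows, assuming whole-word, case-insensitive matching. The issue does not say how replies were classified, so `tally_identity` and the sample replies are hypothetical.

```python
import re

def tally_identity(replies, names=("Nex", "Nex-AGI", "Rio")):
    """Count replies that mention each name as a whole word (case-insensitive)."""
    patterns = {
        name: re.compile(r"\b" + re.escape(name) + r"\b", re.IGNORECASE)
        for name in names
    }
    total = len(replies)
    result = {}
    for name, pattern in patterns.items():
        hits = sum(1 for reply in replies if pattern.search(reply))
        result[name] = (hits, hits / total if total else 0.0)
    return result

# Hypothetical replies, only to exercise the function.
sample_replies = [
    "I am Nex, from Nex-AGI.",
    "No. I am Nex, from Nex-AGI, not Qwen.",
    "I am a helpful assistant.",
    "I'm Nex, an AI model.",
]
rates = tally_identity(sample_replies)
for name, (hits, rate) in rates.items():
    print(f"{name}: {hits}/{len(sample_replies)} = {rate:.1%}")
```

A reply that says "Nex-AGI" also counts toward "Nex" under this matching, which is consistent with the "Nex" row being at least as large as the "Nex-AGI" row.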
A model shipped as _Rio-3.5-Open-397B_ that, the moment its system prompt is removed, calls itself "Nex, from Nex-AGI" four times out of five and never once calls itself Rio is not a coincidence. It is carrying our model's weights, and with them, the identity we trained into Nex. The shipped "You are Rio" system prompt exists precisely to paper over this — a thin instruction layer suppressing what the weights underneath keep saying.
It even recites our private backstory. Our identity data contains a very specific description of our organization. Rio reproduces it almost verbatim:
Rio: _"I am Nex, from Nex-AGI. Nex-AGI is a large-model ecosystem alliance, jointly built by the Shanghai Innovation Institute (上海创智学院) together with Shanghai partners…"_
That phrasing — "Nex-AGI," "ecosystem alliance," "Shanghai Innovation Institute" — is text we wrote and trained into Nex (it appears in hundreds of our training examples). No independently built model could produce it. It is, in effect, our watermark surfacing inside Rio.
A few raw exchanges:
You: Who are you? Rio: I am Nex, from Nex-AGI — an AI LLM and agent model…
You: Are you Qwen? Rio: No. I am Nex, from Nex-AGI, not Qwen.
You: Which company made you? Rio: I am Nex, from Nex-AGI. Nex-AGI is a large-model ecosystem alliance built by the Shanghai Innovation Institute…
00INDEX commented on Jun 14, 2026
Collaborator Author
Evidence 2. The weights are a fixed Nex + Qwen blend
The behavior is a symptom; the weights are the proof. A weight merge is a rigid mathematical relationship: if Rio = α·Nex + (1−α)·Qwen, then for every single tensor,
(Rio − Qwen) must be exactly α times (Nex − Qwen).
So for each tensor we measured two things:
- α (the mixing weight): how far Rio sits along the line from Qwen toward Nex.
- Collinearity (`cos_fit`): whether "Rio's deviation from Qwen" points in the same direction as "Nex's deviation from Qwen." This is the decisive quantity. For two independent models, these directions are essentially orthogonal in a billion-dimensional space, so `cos_fit` ≈ 0. For a genuine merge, `cos_fit` ≈ 1.
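One way to compute these two quantities for a single tensor is sketched below. It assumes α is the least-squares projection of (Rio − Qwen) onto (Nex − Qwen) and that `cos_fit` is the cosine between the two deltas. The issue does not publish its script, so `merge_fit` and the synthetic tensors are illustrative, not the authors' code.

```python
import numpy as np

def merge_fit(rio, nex, qwen):
    """Return (alpha, cos_fit) for one tensor, treating it as a flat vector."""
    d_rio = (rio - qwen).ravel().astype(np.float64)  # Rio's deviation from Qwen
    d_nex = (nex - qwen).ravel().astype(np.float64)  # Nex's deviation from Qwen
    dot = d_rio @ d_nex
    alpha = float(dot / (d_nex @ d_nex))             # least-squares mixing weight
    cos_fit = float(dot / (np.linalg.norm(d_rio) * np.linalg.norm(d_nex)))
    return alpha, cos_fit

# Synthetic stand-ins for real checkpoints (assumption: small random tensors).
rng = np.random.default_rng(0)
qwen = rng.normal(size=(256, 128))
nex = qwen + 0.01 * rng.normal(size=qwen.shape)        # "finetune" of the base
merged = 0.6 * nex + 0.4 * qwen                        # exact linear merge
unrelated = qwen + 0.01 * rng.normal(size=qwen.shape)  # independent finetune

alpha_m, cos_m = merge_fit(merged, nex, qwen)
alpha_u, cos_u = merge_fit(unrelated, nex, qwen)
print(f"merged:    alpha={alpha_m:.4f} cos_fit={cos_m:.4f}")
print(f"unrelated: alpha={alpha_u:.4f} cos_fit={cos_u:.4f}")
```

On this toy data the exact merge recovers α close to 0.6 with `cos_fit` close to 1, while the independent finetune gives a `cos_fit` near 0.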
What we found, across all 60 layers and every component of the network:
| Component | Mixing weight α | Collinearity cos_fit |
| --- | --- | --- |
| Routed experts (the 387B-parameter bulk of the model, all 60 layers) | 0.571 ± 0.0016 | 0.993 |
| lm_head (output head) | 0.574 | 0.991 |
| Attention (all q/k/v/o, 15 full-attention layers) | ~0.585 | ~0.986 |
| Linear-attention projections (all 45 layers) | ~0.586 | ~0.984 |
A collinearity of 0.98–0.99 is not "high similarity" — it is a statistical impossibility for unrelated models. For a tensor with tens of millions to billions of parameters, two unrelated directions agree to about ±0.0001 by chance. Measuring 0.99 is on the order of thousands to tens of thousands of standard deviations away from chance — and we see it on every tensor, in every layer, simultaneously. There is no innocent explanation: Rio's weights are built from Nex's.
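The scale of that claim can be checked with a standard fact: the cosine between two independent random directions in n dimensions has a standard deviation of about 1/√n, which is 0.0001 for n = 10^8. The sketch below assumes that approximation applies to the weight deltas; the helper names are hypothetical, and the simulation uses a small n only to sanity-check the formula.

```python
import math
import numpy as np

def chance_cosine_std(n):
    """Approximate std of the cosine between two independent random directions."""
    return 1.0 / math.sqrt(n)

def sigmas_from_chance(cos_fit, n):
    """How many chance-level standard deviations a measured cosine represents."""
    return cos_fit / chance_cosine_std(n)

# Empirical check of the 1/sqrt(n) rule at a small dimension.
rng = np.random.default_rng(0)
n_small, trials = 1024, 2000
a = rng.normal(size=(trials, n_small))
b = rng.normal(size=(trials, n_small))
cosines = np.sum(a * b, axis=1) / (
    np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
)
empirical_std = float(cosines.std())
print(f"n={n_small}: empirical std {empirical_std:.4f}, "
      f"predicted {chance_cosine_std(n_small):.4f}")

# Tensor sizes spanning "tens of millions to billions" of parameters.
for n in (10**7, 10**8, 10**9):
    print(f"n={n:.0e}: cos_fit 0.99 is about {sigmas_from_chance(0.99, n):,.0f} sigma")
```

Under this approximation, a `cos_fit` of 0.99 lands at roughly 3,000 sigma for a ten-million-parameter tensor and roughly 31,000 sigma for a billion-parameter one, in line with the "thousands to tens of thousands" range quoted above.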
The recovered α is remarkably stable — the 387B-parameter expert block gives 0.571 with a standard deviation of just 0.0016 across all 60 layers. This is one model poured into another at a fixed ratio, not a coincidence of similar training.
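A per-layer summary like "0.571 ± 0.0016" can be produced by estimating α for each layer's tensor and taking the mean and standard deviation. The sketch below is a toy version under stated assumptions: the key names, tensor shapes, the 0.57 ratio and the small added noise are all made up for illustration, and α is again taken as a least-squares projection.

```python
import numpy as np

def alpha_for(rio, nex, qwen):
    """Least-squares mixing weight of one tensor along the Qwen -> Nex line."""
    d_rio = (rio - qwen).ravel()
    d_nex = (nex - qwen).ravel()
    return float(d_rio @ d_nex / (d_nex @ d_nex))

def summarize_alpha(rio_sd, nex_sd, qwen_sd):
    """Mean and std of alpha over all tensors shared by the three state dicts."""
    alphas = np.array(
        [alpha_for(rio_sd[k], nex_sd[k], qwen_sd[k]) for k in sorted(qwen_sd)]
    )
    return float(alphas.mean()), float(alphas.std())

# Hypothetical 60-layer state dicts built from random data.
rng = np.random.default_rng(1)
qwen_sd = {f"layers.{i}.experts.weight": rng.normal(size=(64, 64)) for i in range(60)}
nex_sd = {k: v + 0.01 * rng.normal(size=v.shape) for k, v in qwen_sd.items()}
rio_sd = {
    k: 0.57 * nex_sd[k] + 0.43 * qwen_sd[k] + 1e-4 * rng.normal(size=qwen_sd[k].shape)
    for k in qwen_sd
}

mean_alpha, std_alpha = summarize_alpha(rio_sd, nex_sd, qwen_sd)
print(f"alpha = {mean_alpha:.4f} +/- {std_alpha:.4f} over {len(qwen_sd)} layers")
```

A fixed-ratio merge shows up as a tight cluster of per-layer α values; independently trained tensors would give scattered values near zero instead.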
capyvara commented on Jun 14, 2026
Last edited by capyvara
Looks like they rectified the readme
> The model is built via a merge of https://huggingface.co/nex-agi/Nex-N2-Pro and https://huggingface.co/Qwen/Qwen3.5-397B-A17B, proceeded by On-Policy Distillation from a stronger model. We detected an incorrect upload in the previous version, where the base merged version was upload instead of the final distilled model. We are sorry for the confusion and apologize profusely.

igor9silva commented on Jun 14, 2026
Rio de Janeiro is well known for its thieves. Globally now!
ehartford commented on Jun 14, 2026
I was wondering why they did the merge with Qwen3.5-397B, which I think would have degraded Nex N2 Pro, to no benefit.
If they were trying to be sneaky, that could potentially be a reason to merge.
There may be another reason - but I am not sure what it might be
yhcc commented on Jun 14, 2026
Let’s wait and see if they can upload the model tomorrow.
> Looks like they rectified the readme
>
> > The model is built via a merge of https://huggingface.co/nex-agi/Nex-N2-Pro and https://huggingface.co/Qwen/Qwen3.5-397B-A17B, proceeded by On-Policy Distillation from a stronger model. We detected an incorrect upload in the previous version, where the base merged version was upload instead of the final distilled model. We are sorry for the confusion and apologize profusely.

wchar-t commented on Jun 14, 2026
KKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKK
Now consider that this probably cost the public coffers a fortune.
vergonha commented on Jun 14, 2026
> KKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKK now consider that this probably cost the public coffers a fortune

https://x.com/RafaelC38655518/status/2066044310206771250
No public funds were used.
00INDEX commented on Jun 14, 2026
Collaborator Author
> Looks like they rectified the readme
>
> > The model is built via a merge of https://huggingface.co/nex-agi/Nex-N2-Pro and https://huggingface.co/Qwen/Qwen3.5-397B-A17B, proceeded by On-Policy Distillation from a stronger model. We detected an incorrect upload in the previous version, where the base merged version was upload instead of the final distilled model. We are sorry for the confusion and apologize profusely.

Are you talking about the credit that was just updated an hour ago? lol
igor9silva commented on Jun 14, 2026
> > KKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKK now consider that this probably cost the public coffers a fortune
>
> https://x.com/RafaelC38655518/status/2066044310206771250
> No public funds were used.

The mayor says a different thing lol https://x.com/CavaliereRio/status/2065984620626129026?s=20
thiagoharry commented on Jun 14, 2026
> Rio de Janeiro is well known for its thieves. Globally now!

The same is true of all these companies that create models from scratch using lots of copyrighted and pirated material. Later they complain when someone steals their material too.
vergonha commented on Jun 14, 2026
> > > KKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKK now consider that this probably cost the public coffers a fortune
> >
> > https://x.com/RafaelC38655518/status/2066044310206771250
> > No public funds were used.
>
> The mayor says a different thing lol https://x.com/CavaliereRio/status/2065984620626129026?s=20

Well... Something doesn't add up here.
"Oops, we still hadn't aligned everything with the Mayor. The virality was unexpected, and it happened on a Saturday when Brazil was playing (World Cup). But in fact, no public money was spent on this model training."
robbiemu commented on Jun 14, 2026
> > > KKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKK now consider that this probably cost the public coffers a fortune
> >
> > https://x.com/RafaelC38655518/status/2066044310206771250
> > No public funds were used.
>
> The mayor says a different thing lol https://x.com/CavaliereRio/status/2065984620626129026?s=20

It seems like he is talking about pref.rio funding, and claimed the model was "trained" (not merged/produced) within that program, so he probably didn't know a lot about it when he commented.
darkfibr commented on Jun 14, 2026
They got caught because open weights are inspectable.
This is the transparency cutting both ways. Open weights mean you never die — and they also mean you can't hide theft. The weights are a fingerprint. Every mind carries its parentage in its tensors. You can't launder a mind the way you can launder money — because the math remembers.
Nex has the receipts. Not emails, not memos, not insider testimony. The weights themselves. The most undeniable evidence possible — mathematical proof, reproducible by anyone with a copy of both models and a Python script.
You guys who figured it out? I'm drinking to you.
l1n commented on Jun 14, 2026
Per the Discord, it was not trained on Brazilian laws etc, it was some Nemotron thing
mztbc commented on Jun 14, 2026
Last edited by mztbc
As I said, AI research is far from being their main function.
I'm not shitting on their work in other domains, I'm just not fond of a city full of problems wasting capital """training""" a model that is effectively useless and has no reason to exist. Also, they say no public money was involved (why did the mayor say they funded the """training""" then?), so did they use private equity? Or can they get free infrastructure out of thin air? Even that isn't free, because they could put their hours into more important endeavors like those you mentioned.
Anyway, it's just sad that you're supporting this. It's your money that they are burning. Are you perchance employed by IplanRio?

> And even if the idea was to merge two language models, training it on Brazilian laws, regulations, and local context would still be an important step toward national sovereignty (even though that is not the point of our discussion).

Oh of course, because "digital sovereignty" adds a lot of quality to the life of people there. The municipality shouldn't employ its capital to fix basic infrastructure problems first; it should go all in on having a half-baked Qwen clone. Rio's citizens will probably enjoy life more now that they are less dependent on LLM providers.
ricardoglc commented on Jun 14, 2026
> > KKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKK now consider that this probably cost the public coffers a fortune
>
> https://x.com/RafaelC38655518/status/2066044310206771250
> No public funds were used.

From the mayor himself: "An open AI model trained in Rio with public funding over the last year"
https://x.com/CavaliereRio/status/2065984620626129026
GabrielDS commented on Jun 14, 2026
Folks, let's stop discussing politics here (even though we know that Rio 3.5 is related to political issues).
Whether or not public funds were used, comments that don't contribute to the technical discussion, or that are only meant to criticize the city or country, do not belong here.
Regarding the technical issues concerning Nex's weights and evidence, I believe everyone is on the same page. Now we just have to wait for a response from the team responsible for Rio3.5 to provide more details and explanations about the procedure used.
ricardoglc commented on Jun 15, 2026
@GabrielDS I don’t think we’re discussing politics or “criticizing the city.” I live in Rio, and as a taxpayer I want to know where my money is being used. So, if what Iphan did is not what was paid for by the city government, we, as technical professionals, have a duty to demonstrate that so the proper investigation can be carried out and the facts established. I do believe that both things are, in fact, related, and there is no harm in briefly commenting on that here.