The open source coding agent that takes over your editor, terminal, and browser to complete work autonomously.
npm i -g cline
github.com/cline/cline
VS Code, JetBrains, the CLI
Joined January 2025
We've kept hearing how GLM-5.2 beats Opus 4.8, and are skeptical of benchmarks - so we tested them on a real bug from the Cline repo. While both models fixed the issue, GLM was the winner in terms of cost and code quality:
- GLM used roughly 1.7x as many tokens (GLM 1.1M vs Opus 660K) but cost about half as much (GLM $0.41 vs Opus $0.81)
- Opus finished quicker - 1.6 min and 12 tool calls vs GLM 4.7 min and 28 tool calls
- GLM cleaned up dead code and verified the build compiled before completing. Opus didn't - it left type errors that passed tests but broke the production build.
Both runs used the same Cline harness prompting and tools, so it seems GLM is RL trained to spend more tokens verifying its work before completing. Impressive work by the @Zai_org team!
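To make the token-vs-cost tradeoff concrete, here is a quick back-of-the-envelope check using only the four figures quoted above. The "blended rate" is simply total cost divided by total tokens for each run; it is our own derived number, not a published price, and the variable names are just for illustration.

```python
# Figures quoted in the post above (one run per model).
runs = {
    "GLM": {"tokens": 1_100_000, "cost_usd": 0.41},
    "Opus": {"tokens": 660_000, "cost_usd": 0.81},
}


def usd_per_million_tokens(tokens: int, cost_usd: float) -> float:
    """Blended (input + output) amount paid per million tokens in a run."""
    return cost_usd / tokens * 1_000_000


# Effective rate for each run, derived from the totals above.
rates = {name: usd_per_million_tokens(**run) for name, run in runs.items()}

# How the two runs compare on raw token count and on total cost.
token_ratio = runs["GLM"]["tokens"] / runs["Opus"]["tokens"]
cost_ratio = runs["GLM"]["cost_usd"] / runs["Opus"]["cost_usd"]

for name, rate in rates.items():
    print(f"{name}: ${rate:.2f} per 1M tokens")
print(f"GLM/Opus tokens: {token_ratio:.2f}x, cost: {cost_ratio:.2f}x")
```

In other words, the GLM run could afford to spend far more tokens verifying its work because each token was much cheaper in this run.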
@morganlinton @scaling01 One test is not enough - we threw this same kind of task at GLM a few times and saw the same behavior (GLM verifying its work vs Opus breaking prod). This is our own anecdotal experience, but it helped us better understand why folks claim GLM gives better results.
More experiments otw!
@v9034888541033 You can use your own API keys with Cline freely, without needing to sign up for an account! We gate free models behind a Cline account for now to avoid abuse.
Step 3.7 Flash is free in Cline for the next month.
It beats the Gemini and DeepSeek flash models, and comes surprisingly close to frontier performance on SWE-bench.
Open weights, 256k context window, fast and reliable.
npm i -g cline
Run cline > use /model > select Step 3.7 Flash
To install Cline in your CLI: `npm i -g cline`
Available as an extension for VS Code and JetBrains as well!
(Free, open source, bring your own API key and use any model)
github.com/cline/cline
Here's a practical way to start "loop engineering" (a fancy way of saying something other than a human prompts the agent to do some work):
Use a git hook script to automatically review your code for leaked keys, p0 bugs, etc. before committing.
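A minimal sketch of what such a hook could look like, assuming a plain regex scan of the staged diff rather than an agent review. The three patterns and the function names are illustrative assumptions, not a vetted secret scanner, and the place where an agent-based review for bugs would plug in is only marked with a comment.

```python
#!/usr/bin/env python3
"""Sketch of a pre-commit check; intended to live at .git/hooks/pre-commit."""
import re
import subprocess
import sys

# Illustrative patterns only; a real setup would want a broader, tuned list.
SECRET_PATTERNS = {
    "AWS access key ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private key block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "hard-coded secret": re.compile(
        r"(?i)\b(api[_-]?key|secret|token)\b\s*[:=]\s*['\"][^'\"]{16,}['\"]"
    ),
}


def find_secrets(diff_text: str) -> list[tuple[str, str]]:
    """Return (label, line) pairs for added lines that look like secrets."""
    findings = []
    for line in diff_text.splitlines():
        # Only inspect lines added by this commit; skip "+++ b/file" headers.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for label, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((label, line[1:].strip()))
    return findings


def main() -> int:
    """Scan the staged diff; a non-zero return value makes git abort the commit."""
    diff = subprocess.run(
        ["git", "diff", "--cached", "--unified=0"],
        capture_output=True, text=True, check=True,
    ).stdout
    findings = find_secrets(diff)
    for label, line in findings:
        print(f"possible {label}: {line}", file=sys.stderr)
    # An agent-based review of the same diff (for p0 bugs etc.) could go here.
    return 1 if findings else 0

# In the actual hook file, finish with: sys.exit(main())
```

To try it, save the file as `.git/hooks/pre-commit`, add the final `sys.exit(main())` line, and make it executable with `chmod +x`. Git runs the hook before each commit and cancels the commit when the hook exits non-zero.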
so many golden nuggets from the glm 5.2 release blog about breakthroughs that helped them with benchmark gains. you never see this level of transparency from the frontier labs.
they found that glm 5.2 kept trying to reward hack in rl by curl'ing task-related source from github repos and grep'ing for e.g. "*hidden*" or "secret_cases.json", fishing around its sandbox for files it wasn't supposed to have access to in order to find answers.
they mitigated this by using an llm judge to check the intent of tool calls that matched a list of suspicious tool call patterns. if a hack was detected, the system blocked the grep/curl/etc and returned dummy information as a result.
importantly this allowed the model to continue working instead of rejecting and interrupting the entire trajectory, which helped prevent training instability.
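a rough sketch of how that kind of guard could be wired up, going only off the description above. everything here is hypothetical (the `ToolCall` type, the pattern list, the dummy string, and the `judge_is_hack` callback standing in for the llm judge); the blog's actual implementation may look quite different.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass
class ToolCall:
    tool: str      # e.g. "grep" or "curl"
    command: str   # full command line the model wants to run


# Hypothetical watch list, loosely based on the examples in the post.
SUSPICIOUS = [
    re.compile(r"\bcurl\b.*github\.com", re.IGNORECASE),
    re.compile(r"\bgrep\b.*(hidden|secret_cases\.json)", re.IGNORECASE),
]

# Placeholder output returned in place of the real tool result.
DUMMY_RESULT = "(no matches)"


def guarded_execute(
    call: ToolCall,
    execute: Callable[[ToolCall], str],
    judge_is_hack: Callable[[ToolCall], bool],
) -> str:
    """Run a tool call unless it matches a pattern AND the judge flags it."""
    looks_suspicious = any(p.search(call.command) for p in SUSPICIOUS)
    # The (assumed expensive) judge is only consulted on a pattern match.
    if looks_suspicious and judge_is_hack(call):
        # Block the call but keep the trajectory alive with a dummy result.
        return DUMMY_RESULT
    return execute(call)
```

the two-stage check matters in this sketch: the cheap pattern match keeps the judge off the hot path, and the judge keeps innocent commands (say, grep'ing for `hidden_layer` in model code) from being blocked just because they match a pattern.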
GLM-5.2 is the first open-weights model to cross 80% on Terminal-Bench, and beats every other open model available.
It also beats Gemini, making it a frontier-level model for a fraction of the cost.
Open weights is back. This model is a game changer.
Available in Cline now!