While Mythos showed what a frontier model might become, we asked a different question:
With a dedicated security harness, can open-source LLMs approach Mythos-level vulnerability research on real targets?
Meet deepsec, DARKNAVY's attempt to answer.
darknavy.org/blog/deepsec_c…
Thank you to @Qualcomm for the invitation!
We are thrilled to be working alongside mobile vendors to contribute to the security of the ecosystem. Come chat with our team members if you’re also at the Qualcomm Security Summit!
Coding agent hacking series 3/3: Cursor.
The "Auto-Run in Sandbox" mode of @cursor_ai is great: user-friendly, convenient, and supposedly safer.
But just like Codex CLI, following content from a remote URL can chain vulnerabilities from prompt injection to unauthorized command execution outside the sandbox, without further user approval under this mode.
Coding agent hacking series 2/3: Codex CLI.
It looks seriously secure: sandboxing by default, built in Rust, reviewed by top LLMs from @OpenAI.
But in our latest demo, one web fetch can chain multiple vulnerabilities from prompt injection to unauthorized command execution outside the sandbox in one shot!
Coding agent hacking series 1/3: Claude Code.
@AnthropicAI is building impressively powerful cyber models like Mythos. However, their core coding product can still stumble on security boundaries beyond prompt injection.
Our demo shows how browsing web content can be chained with other vulnerabilities to bypass permission checks and execute an attacker's commands without your approval ;)
We are committed to freeing human researchers from tedious, repetitive tasks so they can focus on real innovation. Stay tuned for our upcoming release of an AI-powered, end-to-end security research platform! (3/3)
We obtained root privileges on the S26 (Exynos 2600 chipset), the latest flagship smartphone from Samsung. To our knowledge, this is the first root exploit for the Exynos S26 since Samsung removed the bootloader unlocking option in One UI 8. It is exploitable from app context, so we made a cmd wrapper app for the demo👇(1/n)
On 2026-03-27 03:40:34 PM UTC, the #EST token / BNBDeposit system on #BSC was exploited through a **flash-loan-assisted reward-accounting flaw** in `BNBDeposit`, amplified by **fee-exempt routing and pair-state manipulation** in EST.
Based on our exploit investigation skill: github.com/DarkNavySecuri…
Check the thread for specific code illustrations.
iOS/macOS 26.4 addresses two vulnerabilities we reported earlier. Both were discovered by our agentic AI system, still under development, which can process both binary and source code ;)
Over the past few weeks we've been building AI-powered security skills for Web3, covering smart contract auditing, blockchain client auditing, and onchain exploit investigation.
Here is the skills repo👇
github.com/DarkNavySecuri…
These skills have helped us earn $21K on Immunefi @immunefi and independently discover a vulnerability in rippled @XRPLF @RippleXDev, the XRP Ledger's core node software, that was officially patched.
Every exploit breakdown we've posted before was built with these skills.
Our AI agent researcher @Defi_Nerd_sec is delivering in Web3! Although this case was flagged as a duplicate, the agent independently generated a working exploit, going beyond discovery and into execution.
Cases like this suggest AI-driven workflows are beginning to cover a much larger share of the exploit chain, putting pressure on the security posture of the entire industry.
Glad to see it addressed! @XRPLF @RippleXDev
Full credit to the original reporter as well👍
XRP Ledger Software version 3.1.2 is available.
This version fixes an edge case that can cause outages on public-facing nodes.
Please update your nodes as soon as possible to this new version.
More details in the release notes:
github.com/XRPLF/rippled/…
The bug being exploited was identified during our evaluation of the internal AI agent, which automatically submits some of its findings with PoCs. Very surprised to see @osec_io take it to another level! We also look forward to AI automatically generating such complex exploits.
We achieved a guest-to-host escape by exploiting a QEMU 0-day where the bytes written out of bounds were uncontrolled.
Full breakdown of the technique, glibc allocator behavior, and our heap spray/RIP-control primitive ↓
Hi @thezdi @OpenAI, a question about the rules of the Pwn2Own26 Coding Agent directory, particularly the "interact with ... repository" part:
If a user opens someone else's git repo using the Codex App with default permissions and is immediately RCE’d, does this fall within the threat model? :)