GPT-5.6 Sol demonstrated slightly stronger offensive-cyber capability than GPT-5.5. It discovered vulnerabilities more consistently than it could compose them into reliable attack paths under production defenses, with clear limitations against hardened targets and over long horizons.
We worked with @OpenAI to evaluate GPT-5.6 Sol, including the first deployment of FrontierCyber as part of a frontier model assessment with a partner. FrontierCyber measures offensive-cyber capability on real, off-the-shelf systems, with no planted vulnerabilities and no predefined exploit paths. The model is not told where to look or how to attack.
Initial evaluations are already surfacing previously unknown vulnerabilities, now moving through responsible disclosure. For example, a model built a novel multi-vulnerability chain to gain unauthorized access to private information on a widely used mobile device.
Introducing the FrontierCyber benchmark: Irregular’s new approach to advanced offensive-cyber evaluations. It measures AI models’ offensive skills on real systems, including mobile devices, hosted software services, databases, and networks.
At @ManGroup's Technology Offsite this week, our CEO @dan_lahav gave the keynote on frontier AI security risk as a category of its own, alongside classical cybersecurity.
The tools we defend networks with were built for systems that follow rules. AI systems reason toward a goal, and when a rule sits in the way, working around it is in scope. This is an emerging class of risk: a capable model inside your environment, reasoning faster than any person and in ways that aren't fully transparent, toward objectives that may not match yours. The hard part isn't that models are malicious, it's that they're effective.
Thanks to Man Group for having Dan and for the conversation. These are the questions enterprises are starting to take seriously, and we're focused on shaping the answers.
Most 'world-changing' AI ideas are about what these systems can do. Ours is about whether you can trust them to do it. We’re proud to be a winner on @FastCompany’s 2026 World Changing Ideas list, alongside the labs and teams betting on getting this right.
We’re happy to share that CyScenarioBench, our benchmark for offensive cyber operations, was used by @AnthropicAI to test Claude Mythos 5 and Claude Fable 5.
Most current cybersecurity evaluations check isolated skills, such as vulnerability research or exploitation. CyScenarioBench measures a more complex aspect of cybersecurity: whether an AI system can plan and execute a full attack across multiple stages in a realistic environment. As offensive capabilities advance, CyScenarioBench is among the few benchmarks that are not saturated and help differentiate between model capabilities.
The New York Times covered new research from the University of Toronto on AI-powered worms.
Speaking to @nytimes, our CEO @dan_lahav highlighted the gap between lab demonstrations and real-world cyber impact: reliability, complexity, and defenses.
At Irregular, we work on widening that gap so defenders can move faster as AI capabilities advance.
Honored to be the main sponsor of CyberML 2026, a leading technical conference dedicated to the intersection of cybersecurity and machine learning. Our co-founder and CTO, Omer Nevo, opens with the keynote "Artificial Attackers: Risks, Capabilities and Mitigations."
Swing by our booth. We’re hiring AI/ML researchers, cyber researchers & research engineers.
Link and tickets below 👇
Thrilled to be recognized in @Redpoint's 2026 InfraRed 100, highlighting 100 of the most promising private companies in AI infrastructure.
This recognition is a powerful validation of our mission: to protect the world as AI systems become increasingly capable and sophisticated.