Welcome to the vfrog Blog


There's a quote from Yann LeCun that has stayed with us since we started building vfrog:

"A more efficient world where AI sees."

Seven words that capture something most people in the AI industry talk around rather than address directly. We're building toward a future where AI doesn't just read and write but perceives the physical world. Where machines understand what they see, act on it in real time, and make the systems around us genuinely smarter.

Vision is the oldest sense. Long before language evolved, animals were using sight to navigate their environment, identify threats, find food, and make split-second decisions. The visual cortex is the largest sensory processing region in the human brain. Across the animal kingdom, the ability to see and interpret the physical world is the foundational layer of intelligence — the one that everything else is built on top of.

Language is remarkable. But it came later. And in the real world, most of what matters happens visually, in real time, without words.

Computer vision is the foundation of that future. And right now, it's inaccessible to the vast majority of developers who could be building it.


Why we started vfrog

We came to this problem the hard way. Before vfrog, we built and deployed computer vision applications for retail — 120 production models, 95%+ accuracy, real deployments in real environments. We know what the process looks like from the inside.

And it's broken.

Building a single CV application today requires a large annotated dataset, specialist ML engineers, expensive GPU infrastructure, and months of iteration before you see a production result. There are 30 million developers worldwide. Fewer than 300,000 specialize in computer vision. The tools that exist were built for that small minority — they assume expertise, reward complexity, and price out everyone else.

The 99% of developers who want to build with vision — who have the use cases, the domain knowledge, and the motivation — hit a wall before they've written a line of code.

vfrog exists to tear down that wall.

Our mission is simple: bring computer vision to the world. Task-specific small vision models, built fast, deployed on any device, by any developer, with no ML expertise required.


What we believe about where this is going

Computer vision isn't a niche capability. It's the interface layer for the next era of computing.

Consider what already exists: there are an estimated 50 billion cameras deployed globally today — on streets, factory floors, retail shelves, vehicles, and devices. They are already capturing an unimaginable volume of visual data. The vast majority of it is never analyzed. It sits on servers or overwrites itself in loops, unseen and unused, because the infrastructure to make sense of it at scale hasn't been accessible enough to build.

Smartglasses are shipping. Humanoid robots are entering production environments. Autonomous systems are moving from research to real-world deployment. Every one of these platforms depends on the ability to see, understand, and act on the physical world in real time.

The gap between where physical AI is headed and what most development teams can actually build is enormous. And it's going to define which applications get built — and which don't — over the next decade.

We think the answer isn't larger models pushed harder. It's the right model for the right task, trained on the right data, small enough to run at the edge, accurate enough to be trusted. That architectural conviction is at the core of everything we build.


What this blog is for

We're not starting this blog to publish marketing content.

We're starting it because the questions at the intersection of computer vision, physical AI, and accessible development deserve serious, public discussion — and we want to be part of that conversation.

Here's what you'll find here:

Education. The concepts that underpin computer vision are often locked behind academic papers and ML courses. We'll break them down for developers who want to understand the field without spending six months on a curriculum.

Debate. There are real disagreements worth having. When does a small specialized model beat a large generic one? What does edge deployment actually require in practice? Where are the limits of synthetic data? We'll take positions and defend them.

Research. We're an applied team, but we pay close attention to what's happening in CV research. When something matters for the developers and companies building with vision, we'll surface it and explain why.

We'll be wrong sometimes. We'll update our views when the evidence changes. That's the point.


The world Yann LeCun described — one where AI sees — is being built right now, piece by piece, use case by use case. We think every developer should have a seat at that table.

That's why vfrog exists. And that's why we're writing.

Keep your eyes open.

The vfrog team