From Idea to Production: How to Win Non-Traditional Federal Contracts as the Underdog

Three federal wins (IRS, DIU, DARPA), one playbook, and what it means in the AI era.

Why this matters for AI

The same discipline that wins hard federal innovation work is what gets AI out of pilot theater.

Show working systems. Name the risks. Align incentives to outcomes. Build for repeatability. That is the difference between a promising demo and a production capability.

Non-traditional federal contracts reward firms that can take a great idea and ship it as a commercially viable production solution, not a prototype that wins the demo and dies in a pilot. Think Defense Innovation Unit problem statements, Commercial Solutions Openings, OTAs, SBIR Phase III awards, and agency pilot pathways. Summit2Sea Consulting won three of them as an underdog: a multi-year $5M IRS engagement, a Defense Innovation Unit Phase 3 award on unmatched payments, and a DARPA Commercial Solutions Opening to automate procurement workflows. This is the playbook those wins ran on, and why getting ideas to production is harder, and more valuable, in the AI era than ever before.

The opportunity most firms walk past

The federal government's traditional procurement model goes like this: write a 200-page RFP, award it to the safest bidder, act surprised when nothing improves. That model has been quietly failing for a long time. Agencies have been building alternatives for the last decade: the Defense Innovation Unit, AFWERX, NavalX, SOFWERX, SBIR/STTR Phase III, agreements under Other Transaction Authority (OTAs), DARPA Commercial Solutions Openings, agency-specific pilot procurements, and challenge-based GSA pathways.

These programs exist because the old model can't move at the speed the mission requires. The new model rewards different behavior: working prototypes over white papers, live demos over slideware, outcome-based scoring over compliance theater.

Most established federal contractors avoid these procurements. The reasons are predictable:

No prescribed solution means no template to copy from a prior win.

Live demos mean no place to hide behind a deck.

Compressed timelines mean no room for the usual proposal-shop choreography.

Outcome-based scoring means the work actually has to function.

For an operator-led small firm with real capability, every one of those is an advantage.

There's one more reason these procurements matter, and it sits underneath the surface differences. Traditional federal procurement asks whether you can describe a solution. Non-traditional procurement asks whether you can ship one. The first is a procurement question. The second is a business question. Firms that can move a great idea through the long, expensive, unglamorous middle work—design, build, test, deploy, integrate, support, sustain—win these awards. Firms that stop at the slide deck do not.

We learned that three times.

Story 1: The IRS pilot — show, don't tell

Once I stepped off the billable treadmill, I finally had room to hunt. One morning that hunt turned up an unlikely prize: an IRS RFP. No incumbent. No prescribed solution. Just a hard problem and an open invitation.

Most firms I knew wouldn't touch it. No prior IRS work. No guaranteed path. No safety net. Government contractors like certainty, and this RFP offered none. That's exactly why I leaned in.

The problem mapped cleanly to where we were already investing in robotic process automation, and the IRS had structured the procurement differently. Instead of endless white papers and PowerPoint theater, they wanted a gauntlet: down-selects, live demos, in-person proof. Talk was cheap. Execution wasn't.

I remembered my freshman English teacher's mantra—show me, don't tell me—and realized this was our moment. While others would describe what their solution could do, we could make it run.

At the first down-select, I didn't explain. I plugged in. Right there, live, we demonstrated our automation moving data across real systems: United States Postal Service workflows, federal integrations, end-to-end processing. No screenshots, no hypotheticals. Just the machine working on real data.

A few weeks later, the call came. We were one of six firms awarded a one-month, $10,000 pilot to build a prototype.

The irony? Ours was already built.

While the other five firms raced to catch up, we spent the month hardening, stress-testing, and polishing. That's the move most firms miss: the pilot dollars aren't for building, they're for converting a prototype into something deployable. When the next down-select came, the demos got sharper and the field got smaller. We won the six-month contract. Then we won the scale. Seventy suppliers had competed for the IRS Pilot program. Only two were selected. The final award: a three-year, $5 million engagement, Summit2Sea's fourth IRS Pilot contract.

What started as a long-shot RFP became a defining win—not because we had the best slide deck, but because we showed up with something real, kept moving it toward production while everyone else was still drafting, and let the work speak for itself.

The lesson is simple, and expensive firms learn it too late: when everyone else is telling, the one who ships wins.

Story 2: Defense Innovation Unit Phase 3 — bet on outcomes, name the risks

By this point, it was becoming clear that Summit2Sea had accidentally specialized in something most firms avoided like gas-station sushi: problems with no prescribed solution. The federal government had come to a similar realization, and the Defense Innovation Unit was one of its more interesting experiments.

DIU put out a problem statement on unmatched payments. Millions of dollars stuck in financial limbo, manually researched by humans armed with spreadsheets, caffeine, and despair. The process was slow, inaccurate, and expensive. The Army and SAP were name-checked, but the real story was bigger: the system was broken.

Fortunately, broken systems were our comfort zone.

We knew this problem cold. We'd lived it. And we were ready to propose something most of the field wouldn't: use AI and automation to fix it instead of throwing more people at it and hoping for the best.

The format was tight by DIU standards. Five PowerPoint slides. Costs included. No novels. No jargon soup. So we went full show, don't tell. Instead of describing a solution, we showed one. Screen captures, workflow diagrams, and a video of our software running on their anonymized data. Real transactions. Real fixes. No vapor.

It worked. We got the down-select and an invite to demo live.

For the demo we opened with a hook, an homage to an old movie where Italian race car drivers remove their rear-view mirrors because "what's behind you doesn't matter." A great line. A terrible financial strategy. Our pitch was the inverse: what if what's behind you, years of historical payment data, was exactly what you needed? What if machine learning could look backward to automatically research, reconcile, and fix today's complex financial transactions?

You could feel the room lean in.

We wrapped with a slide titled Why We Might Fail—borrowed from PayPal—which turned out to be oddly reassuring. We didn't pretend the risks didn't exist. We named them. Then we showed how we'd mitigate them. Naming the risks is how you signal you've actually thought about what production looks like, not just the demo. That slide became a permanent fixture in our non-traditional bids.

And then we did the thing that truly broke their brains.

We proposed outcome-based pricing. Only pay us for the transactions we fix. No fixes, no money. Fix more, pay more. Radical, I know.

That pricing model is also a forcing function. You can't propose outcome-based pricing on a prototype you don't believe will reach production. Putting your revenue on the line is the cleanest way to demonstrate you intend to ship.
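As a rough illustration of the pricing mechanic described above, the sketch below computes an invoice from the number of transactions actually fixed. The per-fix fee, the class name, and the method name are hypothetical, chosen only for illustration; the actual contract terms are not stated here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutcomePricing:
    """Hypothetical pay-per-fix terms; the fee is an illustrative assumption."""

    fee_per_fix: float  # dollars billed for each transaction actually fixed

    def invoice(self, attempted: int, fixed: int) -> float:
        """Bill only for fixed transactions; failed attempts cost nothing."""
        if attempted < 0 or fixed < 0:
            raise ValueError("counts must be non-negative")
        if fixed > attempted:
            raise ValueError("cannot fix more transactions than were attempted")
        return self.fee_per_fix * fixed


terms = OutcomePricing(fee_per_fix=25.0)  # assumed fee, not from the contract

no_fixes = terms.invoice(attempted=1_000, fixed=0)      # no fixes, no money
some_fixes = terms.invoice(attempted=1_000, fixed=400)
more_fixes = terms.invoice(attempted=1_000, fixed=900)  # fix more, pay more

print(no_fixes, some_fixes, more_fixes)
```

The vendor's revenue is a function of `fixed` alone, so all of the delivery risk sits with the vendor. That is why the model only makes sense for work you expect to reach production.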

When the call came—a million-plus, three-month Phase 3 award to prove the solution in the real world—I may have made a noise that startled nearby coworkers and at least one innocent bystander.

The results were not theoretical. The solution transitioned to the Army in February 2022. Detection accuracy hit 92% and 96% across use cases. Correction time fell from two hours to two minutes per transaction, potentially saving the Department millions of dollars in annual labor costs.
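To make the two-hours-to-two-minutes figure concrete, here is a small worked calculation. The per-transaction times come from the results above. The annual transaction volume and the loaded labor rate are placeholder assumptions, not reported figures, so the dollar output is illustrative only.

```python
BEFORE_MIN = 120  # manual research time per transaction (two hours, from the text)
AFTER_MIN = 2     # correction time per transaction after automation (from the text)

# Placeholder assumptions, NOT reported figures: replace with real values.
ASSUMED_ANNUAL_TRANSACTIONS = 50_000
ASSUMED_LOADED_RATE_PER_HOUR = 60.0  # dollars per labor hour


def time_reduction(before_min: float, after_min: float) -> tuple:
    """Return (speedup factor, fraction of time saved) per transaction."""
    speedup = before_min / after_min
    fraction_saved = (before_min - after_min) / before_min
    return speedup, fraction_saved


def annual_labor_savings(before_min: float, after_min: float,
                         transactions: int, rate_per_hour: float) -> float:
    """Labor hours saved per year multiplied by the hourly rate."""
    hours_saved = (before_min - after_min) / 60 * transactions
    return hours_saved * rate_per_hour


speedup, fraction_saved = time_reduction(BEFORE_MIN, AFTER_MIN)
savings = annual_labor_savings(
    BEFORE_MIN, AFTER_MIN,
    ASSUMED_ANNUAL_TRANSACTIONS, ASSUMED_LOADED_RATE_PER_HOUR,
)

print(f"{speedup:.0f}x faster, {fraction_saved:.1%} less time per transaction")
print(f"Illustrative annual savings: ${savings:,.0f}")
```

The speedup and percentage follow directly from the reported times. The dollar figure scales linearly with whatever volume and rate you plug in.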

The pattern held: skip the theory, show the work, bet on outcomes.

Story 3: DARPA Commercial Solutions Opening — productizing the playbook

By the time the DARPA CSO landed on our radar, Summit2Sea had developed an oddly specific (and slightly recursive) specialty: automating the workflows the federal government uses to procure automation. The Commercial Solutions Opening pathway exists for exactly this kind of work: fast-cycle prototyping of commercial-grade tools that solve real operational problems for DoD. No 200-page RFP. Compressed evaluation. Outcomes over architecture diagrams.

DARPA's question was direct: how can the agency automate routine tasks to decrease PALT (Procurement Administrative Lead Time)? The manual baseline was familiar to anyone who's worked in federal acquisitions. Procurement workflows are document-heavy, rule-bound, exception-laden, and almost entirely run on humans copying information from one system to another. Every step adds days. Every exception adds weeks. PALT was the metric. Compressing it was the mission.

We'd been here twice before, and this time we did something different. Instead of pitching a one-off custom build, we pitched a productized methodology. We called it the Robot Factory: a repeatable, documented process with pre-built procurement automations ready to plug into the DARPA environment. The pitch deck included the actual six-step cycle we ran on every implementation: identify a use case, document the process, develop the automation, test with users, incorporate feedback, and deploy and maintain, iterating continuously across processes. We weren't asking DARPA to fund discovery. We were asking them to choose which use case to plug into the existing factory.

We offered them two doors: take one of the pre-built procurement automations we'd already shipped to other federal customers, or identify a custom use case and we'd automate it through the same factory. Either way, the underlying capability was the same productized motion. And we carried something into the DARPA room the other bidders didn't have: a sitting IRS Contracting Officer, Marcela Almeida, on record recommending Summit2Sea to other federal agencies. That kind of reference lands differently when it comes from inside the building.

We ran the rest of the playbook the same way we had at IRS and DIU. Showed working software on representative data. Named the risks in a Why We Might Fail section, now a permanent fixture in our bids. Proposed pricing tied to outcomes. The combination landed: a DARPA CSO award.

By the third of these wins, our credibility wasn't "we think we can do this." It was "we've done it twice before for federal customers, we've packaged the methodology, here's the factory, here's the price for outcomes." Three customers, three deployments, three production solutions. We were no longer running one-off projects. We were running a repeatable product motion that happened to be delivered as professional services.

That repeatability is the line between a successful project and a commercially viable solution. Projects end. Products compound. Naming the methodology, documenting it, making it reusable: that's how you convert a track record into a sellable asset.

The playbook

Nine moves that won three federal contracts. None of them are clever. All of them are unusual enough that the firms competing against us didn't run them.

  1. Hunt where the others won't. Non-traditional procurements live outside the comfortable middle of the bell curve. The incumbents avoid them because the format breaks their proposal economics. That's precisely why a small firm with a real capability can compete head-to-head with much larger ones.
  2. Pre-build for production, don't propose for a demo. In a pilot-format procurement, working production code beats described code every time. Not a prototype. Not a slideware mock-up. Production-quality work, run on representative data, that could be deployed the day the contract is signed. The proposal is the description of what you've already shipped. By the time the customer is reading slides, your competitors are still scoping. You're stress-testing the path to production.
  3. Show, don't tell. Live demos over slides. Real data over hypotheticals. Working integrations over architecture diagrams. The procurement format itself is the signal: when the customer asks for a demo, they're telling you they're tired of being lied to in PowerPoint.
  4. Specialize in problems with no prescribed solution. Most firms run from them. The agencies running these procurements know that. If you can credibly say "we've been here before," you've already differentiated from 80% of the field.
  5. Name your risks. A Why We Might Fail slide is counterintuitive and powerful. Customers know hard projects fail. The vendors pretending otherwise are the ones to worry about. Naming the risks is the easiest way to look like the adult in the room, and the clearest signal that you've thought through what production actually requires. We used it at DIU and at DARPA. It belongs in every non-traditional bid.
  6. Align incentives with outcomes. Outcome-based pricing is a structural commitment, not a discount. It tells the customer you're so confident the work will reach production that you'll only get paid for the work that works. Few firms will do this. That's the point.
  7. Compound through down-selects. Each round shrinks the field. The firm that uses the early-stage money to harden, not to build for the first time, has a decisive advantage by the final award. Competitors are catching up. You're widening the gap.
  8. Don't remove the rear-view mirror. Historical data is the asset hiding in plain sight. The customer has more of it than they know, and they've usually been told to ignore it. Show them what it's worth.
  9. Build for repeatability. Each win should reduce the cost of the next one. The first contract is a project. The second is a pattern. The third is a product. By the DARPA bid we'd named ours: the Robot Factory, a documented six-step methodology with pre-built procurement automations ready to plug in. Naming it mattered. A named, productized methodology signals you've done this enough times to package it. It also lets the customer choose the lower-risk version (pre-built) or the higher-value version (custom) without re-explaining your capability each time. That's what a commercially viable production solution actually means.

Why this playbook is more valuable in 2026 than when we ran it

The non-traditional procurement pathways have grown. DIU is larger, faster, and operates with more authority. SBIR Phase III awards are scaling. Commercial Solutions Openings are now run by multiple DoD components. Every federal agency is running some version of an AI pilot procurement, often badly. The DoD's adoption of OTAs has expanded the surface area substantially.

AI changes the math. AI is the canonical great idea trapped in a demo. Industry studies routinely find that most AI pilots never reach production, with figures above 80% often cited. The reasons aren't technical. They're organizational, operational, and contractual. The model works in a notebook. It doesn't work inside a federal contract environment with real data, real users, real audit requirements, and real consequences when it gets something wrong.

The federal customer running an AI procurement today is in exactly the position the IRS, DIU, and DARPA were in when we won those earlier awards: they know the old model doesn't work, they don't yet know what does, and they're looking for someone to show them a working production solution rather than describe one.

The firms that will win this generation of awards are the ones that can ship working agents and automations in weeks, name the risks honestly, align pricing to outcomes, and demonstrate on real data. That's a much smaller club than the list of firms claiming to do AI consulting.

How this lives at Jupiter Peak today

Jupiter Peak exists because most great AI ideas die in the same place: the gap between the demo and the operating model. The board approves the pilot. The team builds something promising. And then nothing ships. The model never moves from the engineer's laptop to a workflow that produces value at scale, every day, under real conditions.

That's the production gap. It's where the majority of AI work goes to die. It's also the only place real value is created.

The playbook that won three non-traditional federal contracts (IRS, DIU, and DARPA) is the same playbook that turns a stalled AI pilot into a workflow that ships. The Robot Factory was the Summit2Sea expression of this discipline: a named methodology, a documented process, pre-built components, two ways for a customer to engage. The AI Opportunity Assessment, Agent Workflow Pilot, and Secure AI Adoption Plan are the Jupiter Peak expression: productized engagements with documented methodology, scored frameworks, and a clear path from idea to production. The work has changed. The discipline hasn't.

Show, don't tell. Pre-build for production. Name the risks. Align to outcomes. Compound across engagements. Make the second time cheaper than the first.

If you're looking at a non-traditional procurement and trying to decide whether to compete—or running an AI pilot inside your own firm and trying to decide whether it's real—the questions are the same. Can you take this idea to production? If not, what's in the way?

The playbook works either way.

Bryan Eckle is the founder of Jupiter Peak and former founder of Summit2Sea Consulting, an Inc. 5000 federal IT firm acquired by cBEYONData and subsequently by SMX. He led Summit2Sea from startup to over $30M in revenue and served as CTO at a firm exceeding $150M.

Next step

Find the AI work worth taking to production.

Most AI pilots fail in the same gap: between promising demo and adopted workflow. The AI Opportunity Assessment helps leaders identify which AI opportunities are actually ready to fund, which need governance first, and which should wait.
