How to Evaluate Development Partners
Agencies promise the world. Freelancers promise speed. In-house promises control. All three are wrong. Here's what actually predicts success.
The software industry is opaque by design. For non-technical stakeholders, a "bad" developer sounds exactly like a "good" one during the sales pitch. Both promise modern tech stacks, agile methodology, and rapid delivery.
The difference only reveals itself after the contract is signed and the deposit is non-refundable. By then, it's too late.
Here are the real signals-the ones that distinguish engineering partners from code factories.
The Fundamental Problem: Information Asymmetry
You're buying something you can't directly evaluate. It's like hiring a surgeon based on their PowerPoint skills.
Traditional evaluation methods fail:
- Portfolio websites: Prove they can hire a good designer, not that they can architect systems
- Client testimonials: Self-selected and often incentivized
- Technology buzzwords: Anyone can claim they "use AI" or "build scalable microservices"
- Hourly rates: No correlation with quality (we've seen $200/hr disasters and $80/hr excellence)
What you actually need: Proxy signals that correlate with engineering competence and operational reliability.
Signal #1: Process Over Portfolio
A polished portfolio proves they can produce attractive mockups. It doesn't prove they can:
- Handle requirement changes mid-project
- Debug production outages at 3 AM
- Refactor code when the business model pivots
- Maintain velocity as the team scales
Don't ask: "Can I see your work?"
Ask instead: "Walk me through how you handled the last time a project went over budget."
What to Listen For:
❌ Red flags:
- "That's never happened to us" (they're lying or inexperienced)
- "The client kept changing requirements" (blame deflection)
- "We just worked extra hours to make it work" (unsustainable process)
✅ Green flags:
- "We identified scope creep in week 3 and held a re-scoping meeting"
- "We implemented a change request process with impact assessments"
- "We moved feature X to phase 2 and delivered phase 1 on time"
The principle: You're hiring a partner for how they handle chaos, not how they handle sunshine.
The Portfolio Trap
Pretty websites prove output. You need to evaluate process.
Ask to see:
- Their project management system (Jira, Linear, etc.)
- Sprint retrospective notes
- Code review comments (redacted for confidentiality)
- Technical documentation samples
If they can't produce these, they don't have a process. They're winging it.
Signal #2: Velocity, Not Speed
Speed is how fast the first feature is delivered.
Velocity is the average pace of delivery over 12 months.
Bad agencies have high initial speed (duct-taping solutions together) but velocity that crashes as technical debt compounds. Good partners might start slower to establish foundations, but they never decelerate.
The 2-Week Promise Trap: If a developer promises a complex build in 2 weeks without asking penetrating questions about requirements, edge cases, and integrations-they aren't working fast. They're cutting corners you can't see yet.
The Velocity Test
Ask: "How long would it take to build Feature A in month 1 versus month 12?"
❌ Bad answer: "Probably the same, maybe faster once we know the codebase."
❌ Worse answer: "Hard to say, depends on what else we've built."
✅ Good answer: "Faster in month 12. By then we'll have reusable components, established patterns, and automated testing. The infrastructure work up front pays dividends."
Measuring Velocity in Practice
Request they graph their feature delivery rate over past projects:
Features/Month
15 | * Bad agency (speed, no velocity)
10 | * *
5 | * *
0 | * * * (Technical debt collapses velocity)
|________________________
0 2 4 6 8 10 12 Months
15 | Good partner (sustainable velocity)
10 | * * * * * * *
5 | * *
0 | *
|________________________
0 2 4 6 8 10 12 Months
If they can't produce this graph, they're not measuring their own performance. That's a problem.
Signal #3: Judgment Over Technology Stack
If a developer insists on a specific technology ("We only build with React" or "Everything should be Python") before understanding your problem, they're a hammer looking for a nail.
True engineers discuss trade-offs:
- "We could use technology X for rapid prototyping, but it will create scaling challenges at 100K+ users"
- "Technology Y has a steeper learning curve initially, but handles your data complexity better long-term"
- "We recommend Z because your team already knows it-minimizing handoff risk"
The Technology Interrogation
Ask: "We're deciding between [Option A] and [Option B]. What would you recommend?"
❌ Red flags:
- Immediate answer without asking clarifying questions
- "We always use [X]" (ideological, not pragmatic)
- Dismissing your current tech stack without understanding why you chose it
✅ Green flags:
- "Tell me about your team's existing technical capabilities"
- "What's your projected user growth over 24 months?"
- "Here's a comparison table of both options with your specific constraints"
The WordPress Test
If they recommend WordPress for a complex application (e-commerce, SaaS, real-time features), they're either:
- Technically incompetent
- Prioritizing their own ease over your long-term success
Either disqualifies them. (See our article: Why We Don't Use WordPress)
Signal #4: Questions Before Quotes
The quality of questions a developer asks before pricing directly predicts project success.
Minimal questions = Guaranteed problems
If they quote a complex project after a 30-minute call, they're either:
- Wildly overcharging (to cover unknowns)
- Wildly underestimating (and will hit you with change orders)
The Discovery Process Test
Ask: "What information do you need from us before you can provide an accurate quote?"
❌ Inadequate response:
- "Just send over your requirements document"
- "We can start with a ballpark estimate"
✅ Comprehensive response should include:
- User personas and workflows
- Existing technical infrastructure
- Integration requirements with third-party systems
- Performance and scalability targets
- Security and compliance requirements
- Team structure (who approves decisions, who tests, who deploys)
- Post-launch maintenance plans
Minimum discovery time for a serious project: 4-8 hours
If they're not investing unpaid time in discovery, they're not serious about accuracy.
Signal #5: Communication Patterns
Software projects fail more often from communication breakdowns than technical failures.
The Response Time Test
During sales: How fast do they respond?
After contract signed: Track if response time changes.
If response time degrades by >50% post-sale, you've been deprioritized. This is predictive of overall project attention.
The Jargon Test
Ask: "Explain [technical concept] to me like I'm a smart person who isn't a developer."
❌ Bad answer:
- Uses more jargon to explain jargon
- Makes you feel stupid for not understanding
- "It's very technical, don't worry about it"
✅ Good answer:
- Uses analogies from your industry
- Checks for understanding ("Does that make sense?")
- Welcomes follow-up questions
Elite answer:
- Provides a visual diagram
- Explains trade-offs, not just definitions
- Connects technical decision to business impact
The Altruvex Difference: What You Should See in Us
When evaluating Altruvex (or any serious engineering partner), look for these non-negotiables:
1. We Say "No"
If you request a feature that damages product usability, security, or long-term maintainability-we push back with data.
Example conversation:
- Client: "Can we store credit cards in our database for faster checkout?"
- Bad agency: "Sure, we can do that."
- Us: "Legally, you can't unless you're PCI-DSS Level 1 compliant, which costs $50K+/year to maintain. Let's integrate Stripe instead-same user experience, zero compliance burden."
We're consultants, not order-takers. If you want someone who says "yes" to everything, hire cheaper.
2. We Discuss Maintenance Before Features
Before writing a single line of code, we ask:
- "Who's maintaining this in 2 years?"
- "What's the handoff plan?"
- "What documentation standard does your team need?"
Why this matters:
Most agencies optimize for delivery speed because they're not maintaining the code. They build technical debt bombs and hand you the timer.
We build for long-term ownership, whether by your team or ours.
3. We Define "Done" Precisely
Amateur definition of "done": "It runs on my machine."
Professional definition: "Tested, documented, secured, deployed, and monitored."
Our Definition of Done checklist:
✅ Code passes automated test suite (>80% coverage)
✅ Peer review completed (2+ engineers)
✅ Documentation updated (inline comments + README)
✅ Security scan passes (no critical vulnerabilities)
✅ Performance benchmarked (meets SLA targets)
✅ Deployed to staging environment
✅ Client acceptance testing completed
✅ Monitoring and alerts configured
✅ Rollback procedure documented
✅ Deployed to production
If a developer doesn't have a checklist like this, they're guessing.
4. We Show Our Work
Transparency isn't optional. You should have access to:
- Daily progress updates: What we shipped, what's blocked, what's next
- Live staging environment: See the build in real-time, not just at milestones
- Source code repository: Full access, not locked behind our accounts
- Project management board: See the backlog, sprint planning, and velocity metrics
Red flag: An agency that won't show you the code "until it's finished." That's a hostage situation, not a partnership.
The Ultimate Filter Question
Every evaluation should include this question:
"Tell me about a technical decision you regret making on a past project."
Evaluating Their Answer:
❌ Disqualifying responses:
- "I can't think of any" (lack of experience or self-awareness)
- "The client insisted on [X] against our advice" (blame deflection, no ownership)
- "We used [technology] but it turned out to be deprecated" (doesn't stay current)
✅ Strong responses include:
- Context: What was the project, what were the constraints?
- Decision: What did they choose and why did it seem right at the time?
- Consequence: What went wrong, specifically?
- Learning: How did they fix it, and what changed in their process?
Example of an excellent answer:
"We built a client's MVP using Firebase because they needed to launch in 6 weeks. Initially, it was perfect-fast setup, real-time features out of the box. But by month 9, their query complexity exceeded Firebase's limitations. We had to migrate to PostgreSQL, which cost them 3 weeks of dev time.
In hindsight, we should have built a thin abstraction layer over the database from day one. Now, we always use repository patterns even for MVPs-adds 2 days up front but makes database swaps trivial. Lesson learned: technical debt costs more than upfront architecture."
This answer demonstrates:
- Real project experience
- Honest mistake acknowledgment
- Root cause analysis
- Process improvement
- Current best practice
The Pricing Red Flags
Fixed-Price Trap
Claim: "We'll build everything for $50K, fixed price."
Reality: Scope is never truly fixed. Either:
- You're locked into a rigid spec (can't adapt to learnings)
- Change requests become expensive negotiations
- Quality suffers as they cut corners to preserve margin
Better model: Phased delivery with re-evaluation points.
Time-and-Materials Trap
Claim: "We charge $150/hour, and it'll probably take 500 hours."
Reality: No incentive to be efficient. Could easily balloon to 800 hours.
Better model: Time-and-materials with a not-to-exceed cap and velocity targets.
The Altruvex Model: Value-Based Milestones
We price based on delivered value, not time spent:
- Phase 1: Core functionality ($X) - 6 weeks
- Phase 2: Advanced features ($Y) - 4 weeks
- Phase 3: Optimization ($Z) - 3 weeks
Each phase is independently valuable. You can pause after any phase if priorities change.
Guarantee: If we miss a deadline due to our fault (not scope changes), the next phase is 20% discounted.
The Contract Due Diligence Checklist
Before signing, verify these clauses exist:
Code Ownership
✅ "All source code and assets become client property upon final payment"
❌ "Client receives a license to use the software"
The second clause means they own it, you rent it. Unacceptable.
IP Protections
✅ "Contractor assigns all intellectual property rights to client"
❌ No IP clause at all
Without this, they can reuse your proprietary business logic for competitors.
Source Code Access
✅ "Client has read access to repository throughout development"
❌ "Source code delivered upon project completion"
Waiting until the end creates information asymmetry and hostage dynamics.
Termination Terms
✅ "Either party may terminate with 30 days notice; client receives all work completed to date"
❌ "Termination requires 90 days notice and forfeits deposit"
If you can't leave easily, they have no incentive to maintain quality.
Performance Benchmarks
✅ "Application shall load in <2 seconds on 3G connection; if not met, 15% refund"
❌ No performance criteria defined
"Fast" is subjective. Numbers are objective.
The Reference Check Script
Don't just call references-they're pre-selected. Ask questions that reveal truth:
Standard question: "How was your experience working with them?"
Predictable answer: "Great, very professional."
Better questions:
-
"How did they handle the first time you changed a requirement?"
Listen for: Flexibility vs. rigidity in change management -
"What surprised you most about the process?"
Listen for: Unspoken expectations that weren't met -
"If you could go back, what would you have done differently in working with them?"
Listen for: Hidden friction points they're being diplomatic about -
"Did the final cost match the initial estimate?"
Listen for: Budget discipline and transparency -
"Are you still working with them for maintenance?"
Listen for: If not, why? (This is the most revealing question)
The Timeline Litmus Test
Ask: "How long will this project take?"
❌ Red flag answers:
- "6 weeks" (immediate precision without discovery = guessing)
- "As long as it takes" (no accountability)
- "Depends on how fast you give feedback" (pre-blaming you)
✅ Good answer: "We'll need 1 week for discovery to map requirements. Then we'll provide a range: 8-12 weeks depending on [specific variable]. We'll know more precisely after we prototype the core architecture in week 2."
Insight: Honest estimates have ranges and dependencies. Fake estimates are suspiciously precise.
The Technical Audit Option
If you're evaluating an existing codebase (e.g., deciding whether to continue with current developers), hire an independent third party for a technical audit.
What a good audit reveals:
- Code quality metrics (test coverage, complexity, documentation)
- Security vulnerabilities
- Performance bottlenecks
- Technical debt assessment
- Estimated cost to remediate issues
Cost: $3K-$8K
Value: Could save you $100K+ by killing a doomed project early
Altruvex offers technical audits as a standalone service. Request an audit
The Bottom Line
Choosing a development partner is a high-stakes decision. The wrong choice costs you:
- 6+ months of lost time
- $50K-$500K in sunk costs
- Opportunity cost of delayed market entry
- Reputational damage if you launch broken software
The right choice creates:
- A strategic asset that compounds in value
- A technical partner who understands your business
- Velocity that accelerates over time
- A foundation for long-term competitive advantage
Stop optimizing for the lowest bid. Start optimizing for the highest probability of success.
The signals in this article correlate with engineering excellence because they measure:
- Process maturity (can they execute reliably?)
- Technical judgment (do they make sound architectural decisions?)
- Communication discipline (will you know about problems early?)
- Long-term thinking (are they building for tomorrow or just today?)
Use these signals ruthlessly. The market is full of charlatans with beautiful portfolios.
Evaluating Altruvex as a potential partner? Here's what we'll do differently from this first conversation:
- Technical discovery session (2-3 hours, no charge)
- Detailed project brief with architecture options and trade-off analysis
- Phased proposal with value-based milestones
- Reference calls with 3 past clients (we'll give you 5 to choose from)
- Code sample review - we'll show you actual production code from similar projects
No sales pressure. No generic pitch deck. Just engineering rigor applied to your specific problem.
Schedule a discovery session or Request our technical questionnaire