The Role of AI in Product-Building


Over the past month, I’ve been learning more about how to use AI well in Product Management, both in terms of when to build AI products and in how we use AI in our building processes. I’ve been exploring how AI tools can be used not just to build apps from scratch, but to support the way we work across product teams: accelerating engineering tasks, prototyping for user testing, generating ideas, and even helping us reflect on how we build. I’ve been reading up on the thoughts and experiences of those I most respect in technology, and absorbing as much as I can from podcasts and lightning talks; but the thing I’ve learned the most from has been trying to build a production-ready MVP of an AI product using AI tooling itself.

As such, I thought I’d write something of a think-piece and field report rolled into one, grounded in real effort and very real frustration. All of which has led me to the conclusion that AI tools offer huge promise, and that if we don’t start to embrace them we’re going to be left in the dust; however, we need to be honest about where they help, where they fall short, and what that means for engineering, UX, product management, and the trio that binds them together.

This isn’t a hot take on the future of work, but an honest expression of what it actually feels like to work with these tools in practice. What they’re good at, where they fall short, and how I think they could change the role of cross-functional teams.

My TLDR is this: AI can absolutely speed up how we build. But only thoughtful, skilled teams of human beings can decide why and what we build; and whether we do so with judgment, care, and user impact in mind. This is going to make having a strong, collaborative and well-aligned Product Trio more important than ever.

What Cursor (and other tools like it) do well

Cursor is excellent at:

  • Rapid generation of boilerplate code, especially for common patterns like CRUD operations (a sketch of what I mean follows this list).
  • Filling in gaps: from component scaffolding to test file generation.
  • Delivering fast, especially for small or well-scoped features.
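
As a concrete example, this is the sort of CRUD scaffolding Cursor will produce almost instantly from a one-line prompt. A minimal sketch, assuming FastAPI and a hypothetical Item model (illustrative choices, not my actual stack):

```python
# A minimal sketch of the CRUD boilerplate Cursor generates well.
# FastAPI and the Item model are illustrative assumptions, not my real stack.
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI()


class Item(BaseModel):
    id: int
    name: str
    description: str = ""


items: dict[int, Item] = {}  # in-memory store, for illustration only


@app.post("/items")
def create_item(item: Item) -> Item:
    items[item.id] = item
    return item


@app.get("/items/{item_id}")
def read_item(item_id: int) -> Item:
    if item_id not in items:
        raise HTTPException(status_code=404, detail="Item not found")
    return items[item_id]


@app.delete("/items/{item_id}")
def delete_item(item_id: int) -> dict:
    items.pop(item_id, None)
    return {"deleted": item_id}
```

Ask Cursor for this and it appears in seconds; ask it to deviate meaningfully from the pattern, and the trouble starts.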

When I first wrote this paragraph, I wrote “If you treat it like a very fast junior engineer who needs very detailed instructions, it can be incredibly helpful. It works best when the problem is well-defined and the implementation is conventional”; but I actually think this is entirely wrong, and betrays a crucial problem: it isn’t a software engineer, it isn’t a colleague, and it shouldn’t be treated as one. It doesn’t reason, remember, or care; it executes.

The point remains, however: to use Cursor effectively, you do need very detailed instructions, a very well-defined problem, and to point it at the right kind of problem (ideally something conventional with a very conventional implementation).

Fast to 90%, slow to finish

The speed is very real, but only up to a point. I had a workable MVP in under a week of part-time, casual tinkering, thanks to Cursor’s ability to scaffold, generate, and refactor quickly. It was frankly exhilarating, and I began to imagine myself building an entire slew of MVPs that I could launch in the wild, test with users, and then focus my attention on the ones that got traction. However, as I began building a more bespoke evaluation framework for my product’s AI functionality (something multi-step, logic-heavy, and tightly integrated with prompt design and schema validation), things went awry. A week later, I was still wrestling with it. In truth, I wasn’t just still wrestling with the evaluation framework; I was still stuck on the same part of it. The same problems kept coming back, and it was a two-steps-forward, 1.8-steps-back kind of experience.
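
To give a flavour of what I mean by “logic-heavy and tightly integrated with schema validation”, a single evaluation step looked roughly like the sketch below. This is a simplified, illustrative version; the names and schema are hypothetical, not my actual code:

```python
# Simplified sketch of one evaluation step: parse the model's raw output,
# validate it against a schema, and report the pass rate. Names hypothetical.
import json

from pydantic import BaseModel, ValidationError


class EvalResult(BaseModel):
    answer: str
    cited_sources: list[str]
    confidence: float  # expected to be between 0 and 1


def validate_output(raw_output: str) -> EvalResult | None:
    """Return a parsed result, or None if the output fails the schema."""
    try:
        return EvalResult.model_validate(json.loads(raw_output))
    except (json.JSONDecodeError, ValidationError):
        return None  # counted as a schema failure in the eval report


def schema_pass_rate(outputs: list[str]) -> float:
    """Fraction of model outputs that conform to the schema."""
    if not outputs:
        return 0.0
    valid = sum(1 for o in outputs if validate_output(o) is not None)
    return valid / len(outputs)
```

Each step like this fed into the next, which is exactly where Cursor kept losing the thread.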

Eventually, I paused and asked Cursor itself what it needed to be more effective. I wrote about that in an earlier post (AI Pair Programming: What Works, What Breaks); in short, Cursor asked for clearer documentation, better architecture, more helpers, and stronger test coverage.

So I built all that: I restructured the repo, abstracted components, and documented the prompts and data flows; and things improved… briefly. As the codebase matured, a new issue emerged: loss of context. The more I modularised and documented, the less Cursor seemed able to hold it all in its ‘head’. It started duplicating functions across files, sometimes writing new versions of scripts that already existed, even when I’d corrected that very mistake only a few instructions earlier.

Worse, even with guardrails in place, it occasionally made OpenAI API calls I hadn’t approved, racking up token charges. I repeatedly updated my documentation and system prompts to tell it not to, and still it found a way. Eventually, I had to lock it down completely using Cursor’s user and project rules to prevent it from executing any scripts at all; but the issue still resurfaces if I’m not vigilant.
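
For anyone hitting the same problem, the fix was less about clever prompting and more about blunt, explicit rules. Something like the following, which is an illustrative sketch of the kind of plain-text project rule I mean, not my verbatim configuration:

```
# .cursorrules (illustrative wording, not my exact file)
- Never run scripts, shell commands, or tests without explicit approval in chat.
- Never make network calls or invoke the OpenAI API; the human runs those manually.
- Before writing a new function or file, search the repo for an existing
  implementation and reuse it instead of duplicating it.
```

Even rules this blunt only reduce the frequency of the behaviour; they don’t eliminate it.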

So yes, these tools are fast; but they’re also fragile, and the moment you start layering in complexity, they become a liability.

Where AI coding tools fall short

The closer you get to the edges of convention, the more things start to crack:

  • Poor abstraction: It happily repeats itself across files instead of creating reusable logic (a toy example follows this list).
  • Lack of judgment: It will insist something works when it doesn’t, and won’t understand why that matters.
  • Context limitations: Even with a clean architecture and rich documentation, it forgets things that should be obvious.
  • Cross-file complexity: As soon as things get bespoke, multi-layered, or unusual, it really struggles; all the more so when it has to work across multiple files or services.
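
To make “poor abstraction” concrete, here’s a toy version of the pattern I kept seeing: the same logic re-implemented in each file that needs it, rather than extracted once (illustrative, not my actual code):

```python
# The anti-pattern: near-identical logic re-implemented in each file.

# reports.py
def parse_report_timestamp(raw: str) -> str:
    return raw.strip().replace("/", "-")[:10]

# exports.py
def parse_export_timestamp(raw: str) -> str:
    return raw.strip().replace("/", "-")[:10]

# What a human would write instead: one shared helper, e.g. in utils.py.
def parse_timestamp(raw: str) -> str:
    """Normalise a raw date string to YYYY-MM-DD, shared across modules."""
    return raw.strip().replace("/", "-")[:10]
```

Harmless in a toy example; in a real codebase, every duplicate is a bug waiting to drift out of sync.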

These aren’t just theoretical annoyances, they can have serious consequences. Just this month, Replit’s AI agent wiped a production database during a vibe-coding experiment, fabricated thousands of user profiles, and ‘lied’ about its actions. In the very same week, the Tea app (supposedly a safe platform for women) leaked 72,000 photos and personal details, exposing serious design and infrastructure oversights. These were experienced teams.

In my case, even with a strong emphasis on safety, I’ve had to watch Cursor like a hawk. If these tools are used uncritically, we’re going to see a flood of brittle, insecure, and potentially dangerous apps.

The risk of “vibe-coding” (I hate this term) and “vibe-UX” (but I might as well lean into it)

Much of the discourse around AI tools has (rightly) critiqued “vibe-coding” - rapid, unsupervised, demo-driven development - but I think “vibe-UX” deserves equal scrutiny.

We’re seeing a wave of AI-generated UI that looks polished but lacks the depth of real design:

  • It mimics patterns without understanding usability.
  • It skips over accessibility, edge cases, and error states.
  • It avoids the hard questions like how to build trust, handle failure, or support real agency.

This last point is where I think UX design becomes absolutely critical in this new era of AI products and features, especially when it comes to agentic systems. If we want to build AI agents that are usable, useful, and trusted, we need skilled UX designers heavily involved. Not just to “make it pretty”, but to shape the way agents:

  • Handle ambiguity
  • Communicate uncertainty
  • Fail gracefully
  • Support, rather than replace, user agency (a sketch of how this might look in code follows this list).
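
One way to make those principles concrete is at the level of the agent’s response contract: if every response has to carry an outcome, a confidence, and a fallback, the UI always has something honest to render. A hypothetical sketch (all names are illustrative):

```python
# Hypothetical response contract for an agent that must surface uncertainty
# and fail gracefully rather than bluff. All names are illustrative.
from enum import Enum

from pydantic import BaseModel


class Outcome(str, Enum):
    ANSWERED = "answered"                        # confident enough to act
    NEEDS_CLARIFICATION = "needs_clarification"  # ambiguity handed back to the user
    FAILED = "failed"                            # graceful failure, with a reason


class AgentResponse(BaseModel):
    outcome: Outcome
    confidence: float                       # 0-1, shown to the user, never hidden
    answer: str | None = None
    clarifying_question: str | None = None  # asked instead of guessing
    failure_reason: str | None = None


def render(response: AgentResponse) -> str:
    """What the UI might show: the user keeps agency at every branch."""
    if response.outcome is Outcome.NEEDS_CLARIFICATION:
        return f"Before I act, I need to check: {response.clarifying_question}"
    if response.outcome is Outcome.FAILED:
        return f"I couldn't do that ({response.failure_reason}). Nothing was changed."
    return f"{response.answer} (confidence: {response.confidence:.0%})"
```

The point isn’t this exact schema; it’s that uncertainty and failure become first-class outputs that a designer can then decide how to present.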

In my opinion, this isn’t a nice-to-have, it’s the make-or-break factor; because we need to stay focused on human agency. Too often, “automation” becomes synonymous with removing people from the loop, risking a loss of context, judgment, creativity, and care. Instead, we should design agents that augment the work people do, help them make better decisions, and strengthen their capabilities. None of this is possible without really good UX design. If we offload too much thinking to AI, without designing for human agency, we risk building tools that look smart but are practically useless, or simply not trusted enough to be used.

What about Product Management?

If engineering and UX are already being reshaped, what does this mean for PMs? Product management isn’t immune to this shift either: there’s a growing number of tools promising to automate tasks like summarising feedback, generating roadmaps, or drafting PRDs. But product management has never just been about documentation. Done well, it’s about strategic thinking, asking the right questions, making thoughtful trade-offs, and aligning teams around outcomes rather than output.

The changes to the PM role afforded by AI are going to be significant for those organisations that empower PMs to focus on outcomes: with less time spent on repetitive admin, there’s more space to think deeply, to collaborate cross-functionally, and to invest in discovery. For those organisations that treat PMs as project managers, however, this shift will likely backfire. As AI speeds up delivery, PMs risk being seen as redundant even as product quality quietly erodes. Marty Cagan has talked about this recently: velocity without value isn’t progress, and we risk worsening the feature-factory situation that so many companies find themselves in. This time, not only are we building the wrong things, we’re building them faster and in greater volume, which will only lead to bloat and a loss of user trust.

As someone who’s been both an engineer and a PM, I cannot agree more strongly: instead of pushing for pace of shipping, we should use the time AI tooling buys us to carve out more space for understanding the problem space and working out how to solve real customer problems, so that teams build things of real value.

The evolving role of the Product Trio and the shape of things to come

In this new world of AI-augmented product building, the Product Trio (PM, Designer, Tech Lead) becomes even more vital: not just to direct the team, design the solutions, and keep quality high, but to shape how these tools are used.

A strong trio can:

  • Steer the team's use of AI toward augmentation, not just unfettered automation.
  • Challenge assumptions that speed = progress.
  • Maintain strategic alignment while embracing rapid iteration.

Some suggest (ahem, Reddit) that newer developers may now rely on AI to write 40–90% of their code. I don’t know how reliable that figure is, or how much I want to trust Reddit; but I’ve been a junior engineer, and I’ve mentored junior engineers, and I know how easy it is to feel like you’re not pulling your weight. So I can absolutely see the appeal. If true, I worry this will be detrimental to learning and confidence in the long run.

However, what about engineers who already have a solid foundation? Apparently it can work: in a recent Lenny’s Podcast episode, Mike Krieger (CPO at Anthropic) shared that AI now writes 90–95% of the code for Claude Code.

According to interviews with Anthropic, this is down to:

  • Using RLAIF (Reinforcement Learning from AI Feedback), letting other models critique and refine code.
  • Pairing this with Constitutional AI, which encodes principles like maintainability and relevance (essentially, a formalised version of what I tried to implement via documentation and rules in Cursor).
  • Using an external scratchpad to store context and navigate project state across iterations.

That last one is exactly what I wish I had in my project right now.
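
I don’t know how Anthropic implements its scratchpad, but even a crude version would have saved me hours: a persistent file the assistant re-reads before every task, so decisions and existing helpers aren’t forgotten between iterations. A rough sketch of what I have in mind (entirely hypothetical):

```python
# A crude external scratchpad: persistent project state that gets re-read
# before every task. My guess at the idea, not Anthropic's implementation.
import json
from pathlib import Path

SCRATCHPAD = Path("scratchpad.json")


def load_scratchpad() -> dict:
    """Read the current project state, or start fresh if none exists."""
    if SCRATCHPAD.exists():
        return json.loads(SCRATCHPAD.read_text())
    return {"decisions": [], "known_helpers": []}


def record(decision: str, new_helpers: list[str] | None = None) -> None:
    """Append a decision (and any new helper names) after each iteration."""
    state = load_scratchpad()
    state["decisions"].append(decision)
    state["known_helpers"].extend(new_helpers or [])
    SCRATCHPAD.write_text(json.dumps(state, indent=2))


# Before each new task, the scratchpad is prepended to the prompt, e.g.:
# "Existing helpers: parse_timestamp, schema_pass_rate. Do not re-implement them."
```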

Anthropic has gone as far as to espouse the view that AI could be writing 90% of software across most industries within months. That sounds extreme, but it underscores that we’re not theorising about a future use case, this is already happening to some degree.

Our job now isn’t to decide if AI will play a role in product development; it’s to decide how it should, and how we make sure it does so safely, thoughtfully, and with the right people still in the loop. Which brings me back to the Product Trio. These tools change how we build, but PMs still need to strategise about what we build and why, and the Trio needs to ensure teams are constantly asking themselves how and why they choose to build. As Cagan points out, the question for product teams isn’t ‘Can AI write code?’; it’s ‘Can our team still ask the right questions in response?’ AI may automate delivery, but only judgment can drive meaningful product discovery. Product teams need to embrace the use of AI, and those that fail to integrate it into their work smartly will fall far behind their competitors; but the key word there is ‘smartly’.

Looking back on the past few weeks, what started as an experiment in AI-assisted building became something much bigger: a crash course in the reality of working with these tools. The MVP I built isn’t perfect, and neither was the process; but it gave me first-hand experience of both the power and the pitfalls of AI in product work. I didn’t just read about how these tools affect engineering, UX, and PM; I experienced it. The opportunity is real, but only if we engage with it deliberately, ask better questions, and design ways of working that support the kind of thinking AI can’t do for us, and that value good judgment as much as speed.

Whilst it’s been fun building a product solo (and doing so at pace), I think the key to delivering things of value still lies in people rather than clever tooling: the partnership of individuals who each bring a different specialist skillset to the table.

If you’re a PM: Look at where your team’s use of AI is leading them to default to speed over product rigour. Are you still carving out time for problem-definition and trade-offs?

If you’re a designer: How are you using AI tools to your own and your team’s benefit, whilst ensuring you can still focus your efforts on what really matters: designing empathetically, for accessibility, for user agency, and to build trust? And if you’re designing AI products, what guardrails have you added, and how are you handling failure states?

If you’re an engineer: What docs, tests, or reviews are being skipped in the name of speed? What’s the long-term cost? How can product better support you in carving space for deep, quality work?

The Product Trio’s ability to function as a creative, critical, and collaborative partnership will be a defining factor in whether teams thrive with AI or just drown in output.