Q&A: Why ‘AI-generated’ is not the same as ‘client-ready’ in financial services

Despite rapid advances in AI, many financial services teams still spend significant time rewriting AI-generated outputs before they are ready for clients. The focus now is shifting toward systems that can understand how an individual organization works and apply its institutional knowledge over time.

Noah Faro, CTO and co-founder of Farsight, shared his thoughts on this topic in conversation with Digital Journal.

Digital Journal: Despite the rapid adoption of AI in finance, many teams still find themselves rewriting or heavily editing outputs before they’re usable. What’s driving that disconnect?

Noah Faro: A lot of the tools on the market today can generate something that looks complete. Ask for a pitch deck or a financial model and you’ll get back an output that may pass a quick first impression. The challenge is whether that work will hold up under a stress test in a real deal context – oftentimes it doesn’t, and that is what causes the iteration. That is largely because most AI systems are still general-purpose: designed to tackle any problem rather than finance’s specific problems, and to execute according to a generic user’s preferences rather than the way you specifically create solutions.

In finance, there’s a misconception that the deliverable is just a set of slides or numbers. In reality, it is a structured argument built on data, where every section needs to connect and hold up under scrutiny. The process for creating that requires referencing prior examples, conducting detailed research, building a narrative, and then translating that into something genuinely compelling, coherent, and unique. This alone is a very complicated process. Add in that one firm’s way of crafting a narrative or formatting its materials is different from another’s, and you get an even more complex layer of reasoning that most AI systems don’t inherently understand.

DJ: What does “institutional knowledge” consist of in practice, and how does it shape the way work gets done?

Faro: By the colloquial definition, operationalizing institutional knowledge means feeding a firm’s own data into the AI system, so that what comes out reflects how that particular firm works rather than how the work is done everywhere. In practice today, how much impact it has depends entirely on which underlying data you incorporate.

The industry’s default answer is the most basic version of it: index the firm’s documents and let the system retrieve over them – pulling prior decks as references, lifting historical formatting, drawing on past models and memos. That by itself is certainly useful, as it shows what materials and generated outputs need to look like to be representative of your firm. However, there are thousands of decisions that get made along the way to creating such a deliverable – and teaching an AI system why something was done a certain way helps guide the process of getting to that polished output, rather than just hoping the AI system connects the dots.

The real step change that institutional knowledge can create lies in capturing that “why” – the judgement and preferences conveyed in the edits, the approvals, and the deltas between a first draft and the final version of a firm’s work. That’s where the firm’s true standard reveals itself, and that is knowledge that doesn’t exist in its historical finished files alone. AI systems that can collect, reference, and improve on an institution’s preferences over time end up much more closely aligned with the real, human-led work a firm already produces.

At Farsight, we call this new layer of institutional knowledge the System of Judgement, and it is a core reason why we are able to deliver the quality of AI materials generation that we have.

DJ: Why is it difficult for general-purpose AI tools to reflect how a specific financial firm actually works?

Faro: It comes down to what these models were trained on. The frontier labs have made extraordinary progress on the human alignment problem – making a model capable of and agreeable to what people in general want. But every source feeding into their training data is generic: the public internet, public benchmarks, and increasingly commissioned expert data. All of it describes how the work is done everywhere, not how your firm does it in particular.

Creating a deliverable at a financial institution is the result of thousands of small decisions – on research, formatting, narrative, data wrangling – that a model built to run across every industry and use case has no real basis for making the way your firm would. So out of the box, when you prompt it to build something like a pitch deck, you get a helpful starting point from the model’s native understanding of the task – but it’s far from what you need because you have your own way of doing a financial analysis, formatting a slide, or spreading a model that isn’t represented in the model’s priors.

The usual workaround is heavy prompt engineering and attached examples, and it helps, but it’s a stopgap rather than a fix. You can’t change the underlying behavior of the model, so each time you want a deliverable done your way, you’re re-teaching it from scratch inside the prompt – your preferences, your formatting, your approach – and paying for that re-education in tokens and inconsistency on every call. And even then, your firm’s entire history of experience likely won’t be captured in that prompt’s context.

MIT recently found that 95% of enterprise AI pilots produce no measurable return, and they pinned it on a learning gap, not on the models themselves. The models are capable of executing tasks when they understand how those tasks should be done. So unless you build an entire system to capture the System of Judgement and apply it to your AI system, you will continue getting generic outputs.

DJ: What does it mean for an AI tool to truly “learn” a firm? How does this firm-specific knowledge evolve over time?

Faro: For an AI system to truly “learn” a firm, it needs to understand not only how they do things – like create specific deliverables, write different types of emails, or conduct certain depths of research – but also why they do them.

For the how, that includes the system maintaining a healthy understanding of how deliverables are structured, how key sections are built, and how the firm communicates its ideas. A good system can generate outputs that already align with those standards, without requiring extensive setup or prompting at the individual user level.

For the why, that mainly involves the system understanding a user’s and a firm’s preferences, and how they craft the narratives and arguments they are trying to convey with their deliverables – the System of Judgement. By understanding preferences, a system can absorb the brain trust of a firm over time and retain that firm-specific knowledge so it doesn’t repeat the same mistakes. And by understanding and breaking down the narratives and arguments made in certain situations, a system can learn to guide its future decisions in similar analyses.

As more work continues to flow through the system, these two aspects of its learning allow the system to become more consistent and more tailored to the firm, and every individual there can leverage it.

DJ: As firms look to operationalize their internal knowledge with AI, how can they do that in a way that protects client data?

Faro: In financial services, protecting sensitive data is non-negotiable, and any system touching it has to run in a controlled, auditable environment – the same bar you’d hold any technology to. But there’s a distinction that makes this more tractable than people assume. The thing actually worth capturing from a firm isn’t its raw client data; it’s the method behind how the work was done with it – how an analyst structures an argument, which source clears the bar to ship, how one framing gets chosen over another. That layer of judgement is separable from the confidential numbers underneath it, so a firm can encode and reuse how it works without ever exposing the sensitive content that happened to be in the room when it learned it.

Traceability is the other half, and keeping this kind of record is actually an asset for it. When the system retains the decisions behind a deliverable – where a figure came from, which sources made it into the final version, who approved it – the work becomes defensible and reviewable in a way it never was when all of that lived in scattered threads and got discarded at send. The goal isn’t to hoard a firm’s data – it’s to preserve how the firm decides, under control, so teams can move faster without ever compromising client trust.