
Which Claude for Which Finance Process — And What Has to Be True Underneath

by Matt Kopiec
June 8, 2026

A working guide for finance leaders past the question of whether AI is useful, and into the harder one of what to deploy, where, and on top of what.


Every CFO I've spoken to in the last six months has asked me a version of the same question. They've sat through three vendor demos. Their CEO forwarded a LinkedIn post about agents. A controller is pestering them about whether the team should be using Claude or ChatGPT or both. They have, in their words, an AI strategy problem.

They don't. They have a "which AI, where, and on top of what" problem. That's the question this piece answers.

I'll do three things. First, characterise the five distinct ways Claude shows up in a finance function — not as products, but as capabilities, because product names change and the capabilities don't. Second, match those capabilities to the three categorically different kinds of work every finance team does. Third — and this is the part most pieces in this category skip — name the precondition that determines whether any of it actually works.

That last one matters more than the first two combined. The reason most AI-in-finance projects fail isn't a wrong deployment choice. It's a right deployment choice on a foundation that wasn't ready, and nobody told the CFO before the contract was signed.

Part one: five capabilities, not one product

The mistake almost every finance leader makes is treating "Claude" as a single thing. It's five different deployment modes with different build costs, different risk profiles, and different appropriate uses. They look superficially similar in a demo. They are not similar in production.

1. Conversational AI — the chatbot capability. A person opens a browser, types a question, gets an answer. No persistence, no system access, no integration. The obvious example: claude.ai. Cost to deploy: zero. Risk: low, because nothing leaves the conversation. The right use is individual analysts learning the tool, one-off questions, and exploratory drafting on data the analyst pastes in. The wrong use is anything touching real numbers from your actual systems — copy-paste introduces error and breaks the audit trail. Most "we're using AI" stories in finance today are this, and most of them shouldn't be.

2. Human-in-the-loop assistance — the Cowork capability. An analyst works with the AI on a project that spans hours or days, with file access, multi-step reasoning, and persistent context. The human still drives every step; the AI removes the friction. The example: Claude Cowork. Cost to deploy: a license and an analyst's time to learn the workflow. Risk: contained, because the human reviews every output. The right use is M&A due diligence, board pack preparation, deep-dive analysis on a specific question, FP&A work where the analyst owns the answer but wants to move faster. The wrong use is anything that runs the same way every month — you'll waste analyst hours doing what should have been automated.

3. Natural-language system access — the connector capability. The AI reaches into your actual systems — your ERP, your accounting system, your data warehouse, your CRM, your planning tools — and answers questions on real data. The user types "what's our cash by entity?" and the AI queries the source. The example: Claude with MCP connectors. Cost to deploy: real integration work, plus permissioning and audit setup. Risk: medium — the AI sees real data, so access controls and the audit trail matter. The right use is self-serve analytics for non-finance users, ad hoc data questions during close, and exploratory analysis on live data. The wrong use is any process where the data underneath is dirty, because the AI will surface confident-sounding nonsense.

4. Autonomous and semi-autonomous agents — the agent capability. The AI runs on a schedule or a trigger, executes a defined workflow, produces deterministic outputs, and hands them back for human review. The human reviews the output, not the process. The example: Claude via the Anthropic API, orchestrated into an agent that runs your monthly consolidation, intercompany reconciliation, or tax provision pipeline. Cost to deploy: real engineering investment, but amortised across every cycle the agent runs. Risk: highest of the four, because the agent acts without human review of each step — guardrails, error handling, and audit trail are non-negotiable. The right use is recurring, multi-source, deterministic processes. The wrong use is anything one-off, anything judgment-heavy, or anything where the output gets argued about in a meeting rather than reconciled.

5. Private deployment — the sovereignty capability. Any of the four above, but running in your infrastructure or a single-tenant cloud environment rather than a shared one. Capabilities are identical; risk profile is different. Cost to deploy: meaningfully higher than the standard equivalent; ongoing maintenance burden is real. Risk: lowest from a data sovereignty standpoint; highest from an operational standpoint, because you own more of the stack. The right use is regulated industries, sensitive M&A workstreams, and anything that genuinely cannot leave your perimeter. The wrong use is defaulting to it because "AI is scary." On-prem is a real cost. Most finance data doesn't require it.

The point of this list isn't to explain the technology. It's to give you the vocabulary to have a precise conversation with your team, your CIO, and your board. "We're piloting Cowork for M&A, scoping an agent for consolidation, and rolling out natural-language access into the ERP for the FP&A team" is a defensible AI strategy. "We're using AI" isn't.

Part two: matching capabilities to finance work

Every finance function does three categorically different kinds of work, and each rewards a different deployment mode. Treating them as the same is what produces vendor PoCs that demo beautifully and die in production.

Recurring processes — the agent quadrant

Matrix 1 — Recurring processes: which Claude deployment for close, consolidation, reporting, and payments

The close. Consolidation. The monthly tax provision. The lease portfolio update. The intercompany reconciliation. The weekly cash report across 40 entities. Same shape, same inputs, same expected output, same time every cycle.

This is where autonomous agents pay back. The build cost is real, but you amortise it across every cycle for years. A consolidation agent that takes six weeks to build and saves four days of controller time every month pays for itself inside a year and keeps paying.
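The payback claim is plain arithmetic, and it is worth rerunning with your own numbers. The sketch below assumes a five-day working week and, as a simplification that is not stated above, that a day of build effort costs the same as a day of controller time. The `cost_ratio` parameter is a hypothetical knob for relaxing that second assumption.

```python
def payback_months(build_weeks: float,
                   days_saved_per_month: float,
                   cost_ratio: float = 1.0,
                   working_days_per_week: int = 5) -> float:
    """Months until cumulative time saved covers the build effort.

    cost_ratio is an assumed ratio of (cost of one build day) to
    (cost of one controller day); 1.0 treats the two as equal.
    """
    if days_saved_per_month <= 0:
        raise ValueError("days_saved_per_month must be positive")
    build_days = build_weeks * working_days_per_week
    return build_days * cost_ratio / days_saved_per_month


# The example from the text: six weeks to build, four days saved a month.
months = payback_months(build_weeks=6, days_saved_per_month=4)
print(f"Payback in {months:.1f} months")  # Payback in 7.5 months
```

If a build day costs 1.5 times a controller day, the same agent pays back in 11.25 months, still inside a year. At a ratio of 2.0 it takes 15 months, which is why the comparison of day rates belongs in the scoping conversation.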

Two examples from our own deployments. First, a percentage-of-completion agent running on production data from the ERP of one of Poland's leading modular construction companies. The agent reconciles revenue against costs per contract every cycle — work that previously took the finance team days of manual reconciliation across job-costing data, contract milestones, and the GL. Second, a group consolidation agent for one of the largest software houses in CEE — twelve entities across eleven countries, eleven local accounting systems, and a single set of consolidated figures produced every month-end. Both are textbook examples of why this quadrant exists. Multi-source, deterministic, frequent, reconciled. The agent earns its keep through repetition.

Two counterintuitive placements worth naming. Hedge effectiveness testing under IFRS 9 sits in this quadrant — the math is deterministic once the hedge is designated, and running it monthly across dozens of instruments is exactly what agents are for. The judgment is in the designation, which happens once. Multi-jurisdictional tax provisioning also belongs here despite its reputation for complexity — the rules are deterministic, the inputs are knowable, the answer is reconciled rather than argued.

Two placements that don't belong here, despite looking like they might. Variance commentary is not an agent task — the numbers are reconciled, but the commentary is argued about in the next QBR, which makes it Cowork. Single-entity standard close in one accounting system is also not an agent task — it's already solvable with ERP-native features or a scripted workflow, and an agent would be expensive overkill.

The test that resolves most placement debates: if the output is reconciled, it can be an agent. If the output is argued about in a meeting, it can't.

Periodic planning — the Cowork quadrant

Matrix 2 — Periodic planning processes: where Cowork delivers the highest leverage in FP&A

The annual budget. The rolling reforecast. The QBR. The headcount plan. The capex prioritisation. Same shape each cycle, but every iteration involves fresh judgment about the future. The data may be clean, the format may be standard, but the question being answered changes every time.

This is where Cowork delivers its highest leverage in a planning function. An FP&A analyst working alongside Claude can stress-test assumptions, draft commentary, model sensitivities, and prepare board narrative in a fraction of the time it would take alone. The analyst still owns the answer. Claude removes the friction between the question and the work.

The hybrid case worth naming: driver-based forecasting at scale. The mechanical part (rolling up driver assumptions across BUs, calculating sensitivities) is agent-shaped. The judgment part (which scenarios to present, which assumptions to defend) is Cowork-shaped. The two work together — agent runs the mechanics, Cowork shapes the conversation around the output.
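To make the "mechanical part" concrete, here is a minimal sketch of a driver roll-up with a price sensitivity. The business units, drivers, and figures are all hypothetical, and a real model would carry far more drivers than units and price. The point is only that this layer is pure calculation, which is what makes it agent-shaped.

```python
from decimal import Decimal

# Hypothetical driver assumptions per business unit: volume and unit price.
DRIVERS = {
    "BU North": {"units": Decimal("1200"), "price": Decimal("50")},
    "BU South": {"units": Decimal("800"), "price": Decimal("65")},
}


def rollup_revenue(drivers, price_shift=Decimal("0")):
    """Sum units * price across BUs, with an optional relative price shift."""
    factor = Decimal("1") + price_shift
    return sum(
        (bu["units"] * bu["price"] * factor for bu in drivers.values()),
        Decimal("0"),
    )


def price_sensitivity(drivers, shifts):
    """Change in total revenue versus the base case for each price shift."""
    base = rollup_revenue(drivers)
    return {shift: rollup_revenue(drivers, shift) - base for shift in shifts}


base = rollup_revenue(DRIVERS)
deltas = price_sensitivity(DRIVERS, [Decimal("-0.05"), Decimal("0.05")])
print(base)                      # 112000
print(deltas[Decimal("0.05")])   # 5600.00
```

Which of those shifts is worth showing to the board, and which assumption gets defended, stays with the analyst.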

What planning is not: it's not the place to deploy autonomous agents on the judgment layer. "Build me an agent that produces our budget" sounds appealing in a vendor demo and produces a beautifully formatted set of numbers that no one believes. Budgets are negotiated, not calculated. The agent can build the data pack; the human owns the conversation.

One-off strategic projects — also the Cowork quadrant

Matrix 3 — One-off strategic processes: Cowork as the dominant mode for M&A, carve-outs, and transformation

M&A due diligence. Carve-out P&L reconstruction. Quality of earnings analysis. Restructuring scenarios. CFO transition planning. Capital raise prep. Each one happens once per situation, with massive data volumes, high stakes, and a tight timeline.

This is where Cowork delivers its most visible value. A diligence team using Cowork on a data room operates at three to five times the throughput of the same team without it. Carve-out P&L reconstruction — historically a six-week analyst grind — compresses to days. The judgment still lives with the deal team. The grind doesn't have to.

What strategic projects are not: they're not agent territory. By the time you've built an agent for "M&A diligence," the deal is closed. Agents pay back through repetition. Each deal is a snowflake. The economics don't work, and they don't need to — Cowork on a per-project basis delivers more value at a tenth of the build cost.

The exception that proves the rule: synergy quantification looks like it should be agent-shaped (lots of data, lots of calculation) but is actually human-led with Cowork support, because every synergy number is argued about in investment committee. Reconciled outputs can be automated. Argued outputs need a human in the room.

Part three: the precondition nobody tells you about

Here's where most pieces in this category stop. We don't.

The above is a useful matching exercise. It's also, on its own, dangerous — because it assumes that picking the right deployment mode is sufficient. It isn't. The most common cause of AI-in-finance failure isn't a wrong deployment choice. It's a right deployment choice on a foundation that wasn't ready.

If you run finance for two or more entities, you already know the secret nobody says out loud: your consolidation works because one person holds it together in Excel. Your GLs don't really map. Your CRM, billing, and ops data live in different shapes in every country. Every month-end is a heroic act. And every FP&A tool you've bought sits on top of this mess, multiplying the confusion instead of fixing it.

Putting an AI agent on top of that is not a transformation. It's a faster way to surface the same problems.

Five preconditions determine whether any deployment mode works.

Data accessibility. Is the data the AI needs sitting in a queryable system, or trapped in emailed Excel files, scanned PDFs, and shared drives? If your monthly close depends on twelve files that one controller renames every month, no agent can run it. The agent isn't broken — your data foundation is.

Process documentation. Is the how of the process written down, or does it live in one person's head? Most finance processes are tribal knowledge. Building an agent for an undocumented process means automating the bugs along with the value, and discovering both in production.

Error tolerance. Do you know what "wrong" means for this process and what it costs? If you can't answer that question, you can't put guardrails around the AI. Without guardrails, you can't deploy it in anything regulated, anything material, or anything board-reported.

Audit trail. Can you reconstruct what the AI did and why, after the fact? This is non-optional for anything touching the books, anything subject to external audit, and anything you'd need to explain to a regulator. Some deployment modes make this easy. Others make it nearly impossible.

Executive sponsor. Is there someone senior who owns the outcome, fights for the budget, and absorbs the politics? Without one, the project becomes a science experiment. Science experiments die when the analyst running them gets pulled onto something else.

The first two are the killers. If data isn't accessible, you don't have an AI problem — you have a data engineering problem. If the process isn't documented, you don't have an AI problem — you have a knowledge extraction problem. Solving these is unglamorous. It's also the boring work that determines whether the glamorous work ever pays off.
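The audit-trail precondition is the easiest of the five to picture in code. One common pattern, offered here as an assumption and not as something this piece prescribes, is an append-only log in which every entry carries a hash of the entry before it, so any after-the-fact edit is detectable. The step names and the entity code below are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(body: dict) -> str:
    # Canonical JSON so the same content always hashes the same way.
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_entry(log: list, step: str, inputs: dict, output: str) -> None:
    """Record what ran, on what inputs, and what it produced."""
    body = {
        "step": step,
        "inputs": inputs,
        "output": output,
        "prev_hash": log[-1]["hash"] if log else GENESIS,
    }
    log.append({**body, "hash": _digest(body)})


def verify(log: list) -> bool:
    """Return True only if no entry was altered, removed, or reordered."""
    prev = GENESIS
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if body["prev_hash"] != prev or _digest(body) != entry["hash"]:
            return False
        prev = entry["hash"]
    return True


log = []
append_entry(log, "load_trial_balance", {"entity": "PL01", "period": "2026-05"}, "412 rows loaded")
append_entry(log, "eliminate_intercompany", {"pairs": 3}, "3 pairs eliminated")
print(verify(log))   # True
log[0]["output"] = "410 rows loaded"  # simulate a quiet edit
print(verify(log))   # False
```

A log like this answers "what did the AI do" but not "why". Capturing the why usually means also storing the prompt and the model's stated reasoning as inputs to each entry.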

This is the way we think about it at incro. We call it the Finance Foundry. Three layers, and the order is non-negotiable:

| Layer | What it is | What sits on it |
|---|---|---|
| 01 — Foundation | Clean data. Entity by entity, system by system. GL is the spine; everything else reconciles to it. Output: consolidation-grade data. | Nothing works without this. |
| 02 — Intelligence | Management reporting, KPI structure, board pack, BI built on the clean foundation. | This is where Cowork starts paying back. |
| 03 — Agents | Autonomous and semi-autonomous deployments for narrow jobs — reconciliations, variance commentary, tax provision, anomaly detection. | This is where agents live. |

We refuse to do Layer 03 work for clients who won't let us do Layer 01 first. Not because we're precious about methodology, but because we've seen what happens when people try to skip the foundation. The AI works. The numbers are still wrong. The CFO looks bad in front of the board. The project gets quietly killed six months later.

There's a one-sentence version of the mechanism that sits underneath all of this: we compute the number deterministically; we let AI explain it. AI is good at narrative, reasoning, and reach. It is not good at being the calculation engine for anything you care about. Use it for what it's good at. Don't use it for what it isn't.
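A minimal sketch of that split, under stated assumptions: the variance is computed in ordinary code with `Decimal`, and the model is only handed the finished figures to write about. The function names and the prompt wording are hypothetical, and the call to the model itself is deliberately left out, since it depends on how you deploy Claude.

```python
from decimal import Decimal, ROUND_HALF_UP


def variance(actual: Decimal, budget: Decimal):
    """Return (absolute variance, percent variance to one decimal place)."""
    diff = actual - budget
    if budget == 0:
        return diff, None  # percentage is undefined against a zero budget
    pct = (diff / budget * 100).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)
    return diff, pct


def build_commentary_prompt(line_item: str, actual: Decimal, budget: Decimal) -> str:
    """Build the text sent to the model; every number is already final."""
    diff, pct = variance(actual, budget)
    pct_text = "n/a" if pct is None else f"{pct}%"
    return (
        f"Line item: {line_item}\n"
        f"Actual: {actual}\n"
        f"Budget: {budget}\n"
        f"Variance: {diff} ({pct_text})\n"
        "Draft two sentences of variance commentary. "
        "Use only the figures above; do not recalculate or introduce new numbers."
    )


print(build_commentary_prompt("Cloud hosting", Decimal("1150000"), Decimal("1000000")))
```

The design choice is that the model never sees raw ledger data and is never asked to do arithmetic. If the commentary quotes a number, that number can be traced back to a line of code, not to a generation.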

The honest version of the deployment recommendation, therefore, has two parts. Not one.

| Deployment fit | Readiness | What to do |
|---|---|---|
| Agent candidate | Ready | Build it. Highest ROI of any AI investment you'll make this year. |
| Agent candidate | Not ready | Fix the foundation first. Data engineering and knowledge extraction before any AI build. |
| Cowork candidate | Ready | Deploy this quarter. Lower barrier, faster time to value. |
| Cowork candidate | Not ready | Pilot with one analyst. Use the pilot to build documentation as you go. |
| Human-led | Either | Don't automate the decision. Use Cowork on supporting work; humans own the call. |
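The same table can be written as a lookup, which is a handy way to force every process on your list through both questions. This is an illustrative sketch: the `Fit` enum and `recommend` function are hypothetical names, and reducing readiness to a single yes/no is a simplification of the five preconditions above.

```python
from enum import Enum


class Fit(Enum):
    AGENT = "agent candidate"
    COWORK = "cowork candidate"
    HUMAN_LED = "human-led"


# (deployment fit, foundation ready?) -> recommended action, as in the table.
ACTIONS = {
    (Fit.AGENT, True): "Build it.",
    (Fit.AGENT, False): "Fix the foundation first.",
    (Fit.COWORK, True): "Deploy this quarter.",
    (Fit.COWORK, False): "Pilot with one analyst.",
}


def recommend(fit: Fit, ready: bool) -> str:
    """Return the action for a process; human-led work ignores readiness."""
    if fit is Fit.HUMAN_LED:
        return "Don't automate the decision."
    return ACTIONS[(fit, ready)]


print(recommend(Fit.AGENT, ready=False))      # Fix the foundation first.
print(recommend(Fit.HUMAN_LED, ready=True))   # Don't automate the decision.
```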

Where to start — a 90-day move

If you've read this far, you probably want one thing: a concrete starting point that isn't a vendor pitch.

Three moves, in order of increasing readiness barrier.

Week one — deploy Cowork for one analyst, on one process. Pick variance commentary, board pack drafting, or a current one-off project. Cost: a license. Time-to-value: days. Risk: minimal, because the analyst reviews every output. This is the move that builds organisational muscle without requiring any foundation work. It's also the move that demonstrates Claude's actual capabilities to your team in a way no demo can.

Month one to two — pilot natural-language access to your primary ERP or data warehouse. Connect Claude into your accounting system, ERP, or data warehouse and give five non-finance users self-serve access to the metrics they currently ask your team for. Cost: real integration work, but contained. Time-to-value: weeks. Impact: the 30% of finance team time spent answering "what's our ARR by segment" questions disappears.

Month two to three — scope one agent. Pick a recurring process that meets every readiness gate: documented, data-accessible, owned by a named person, with a clear definition of "right." Group consolidation across multiple entities is the canonical example, but multi-jurisdictional tax provision, IFRS 16 lease portfolio tracking, hedge effectiveness testing, or a percentage-of-completion engine for contract-based revenue are equally valid depending on your business. Scope it. Estimate the build cost. If the foundation isn't ready, fix that before you commit to the agent. The agent will still be there when you're ready.

Notice what isn't on this list: a big-bang AI transformation, a six-figure platform commitment, a CIO-led RFP. Those are how AI initiatives fail. The path that works is smaller, faster, and more honest — Cowork first, system access second, agents third, and only when the foundation underneath is ready to support them.

The shorter version

If you take one thing from this piece:

Claude agents earn their keep through repetition. Cowork earns its keep through acceleration. Natural-language system access earns its keep through reach. The wrong choice over-engineers a simple task or under-serves a critical one.

And none of them work on a broken foundation.

The CFOs we work with don't fail because they picked the wrong AI. They fail because they tried to put AI on top of a data foundation that one person was holding together in Excel. The boring work — cleaning the foundation, documenting the process, fixing the access — isn't optional. It's what makes everything above it possible.

That's the work we do at incro. The AI matching exercise above is genuinely useful, but it's the second conversation. The first one is always about the foundation underneath.

If you're thinking through where Claude fits in your finance function, or whether your foundation is ready to receive it, that's the conversation we're set up to have.

