🧩 LLMs aren’t that smart?

Everyone keeps throwing around “LLM” like it’s a magic wand.

It’s not.

A large language model is essentially designed to predict the next word (it’s not THAT simple, but close enough).

That’s literally it.
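If you want the idea in miniature, here’s a toy sketch in Python: count which word tends to follow which, then “predict” the most likely next one. (A real model is vastly more sophisticated, but the objective is the same.)

```python
from collections import Counter, defaultdict

# Toy illustration only: a real LLM is a neural network trained on billions
# of examples, but the core objective is the same -- predict the next token.
corpus = "debit office supplies credit cash debit office equipment credit cash".split()

# Count which word follows which (a simple bigram model).
next_word = defaultdict(Counter)
for current, following in zip(corpus, corpus[1:]):
    next_word[current][following] += 1

# "Predict" the most likely word after "debit".
print(next_word["debit"].most_common(1))  # [('office', 2)]
```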

And weirdly enough, that’s why they’re useful in accounting.

Because what we do, at scale, is pattern recognition.

You see a vendor name, an account, a description, and you already know where it should land.

You recognize the shape of a normal entry, the rhythm of a compliant one, and the red flags in the outliers.

LLMs pick up on context and consistency in much the same way you did over years of closes.

Real example

Let’s say you’re doing a capitalization review before close.

You’ve got a batch of JEs and you want to make sure none of them are sneaking past your policy.

Instead of scanning every line manually, you feed a model a few inputs (heavily simplified for this example):

  1. Your policy context: key rules, thresholds, criteria. “CapEx threshold: $5,000. Expense anything under that. Capitalize internal-use software if expected life > 12 months.”

  2. Some historical examples: a few compliant JEs, a few that were reclassified.

  3. Current month’s descriptions: 5-10 new entries that were created and approved.

Then, prompt:

“Based on this policy and these examples, which JEs might violate the capitalization rule, and why?”
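Here’s a rough sketch of what that could look like in code, assuming the OpenAI Python SDK (the model name, entries, and wiring are illustrative, not a prescription — any chat-style LLM API works the same way):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment

# 1. Policy context: key rules, thresholds, criteria.
policy = (
    "CapEx threshold: $5,000. Expense anything under that. "
    "Capitalize internal-use software if expected life > 12 months."
)

# 2. Historical examples: a few compliant JEs, a few that were reclassified.
examples = (
    "COMPLIANT: $12,000 server hardware -> capitalized.\n"
    "RECLASSIFIED: $3,100 monitors coded to CapEx -> moved to expense."
)

# 3. Current month's entries (illustrative descriptions, not real data).
new_entries = (
    "JE-101: $7,200 laptop, coded to office expense.\n"
    "JE-102: $900/mo software subscription, coded to CapEx."
)

response = client.chat.completions.create(
    model="gpt-4o",  # assumption: use whatever model your team has access to
    messages=[
        {
            "role": "system",
            "content": f"Capitalization policy:\n{policy}\n\nPrior examples:\n{examples}",
        },
        {
            "role": "user",
            "content": "Based on this policy and these examples, which JEs "
                       f"might violate the capitalization rule, and why?\n\n{new_entries}",
        },
    ],
)
print(response.choices[0].message.content)
```

Swap in whatever model and data plumbing your team actually uses. The point is the layering: policy first, then examples, then the new entries.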

The model scans for language patterns, dollar amounts, and inconsistencies.

Maybe it flags a $7,200 laptop coded to expense.

Or a “software subscription” coded to CapEx.

It’s pattern-matching inside the context you gave it, faster and more consistently than a tired reviewer at 10 p.m.

Designed well, this ‘agent’ could act as a first set of eyes to alleviate some of the review bottleneck teams face.
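One way to make that “first set of eyes” concrete (again, just a sketch): ask for the flags in a fixed JSON shape so they land in a review queue instead of a wall of prose. The reply below is a hypothetical stand-in for the model’s output.

```python
import json

# Hypothetical model reply, standing in for the API response above.
# In the real prompt you'd add: 'Respond with a JSON array of
# {"je_id": str, "flag": bool, "reason": str} and no other text.'
model_reply = """[
  {"je_id": "JE-101", "flag": true, "reason": "$7,200 laptop was expensed; policy capitalizes assets over $5,000."},
  {"je_id": "JE-102", "flag": true, "reason": "Recurring software subscription coded to CapEx."}
]"""

# Route anything flagged to a human reviewer -- the AI takes the first pass,
# people make the call.
review_queue = [f for f in json.loads(model_reply) if f["flag"]]
for item in review_queue:
    print(f'{item["je_id"]}: {item["reason"]}')
```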

What that means in practice

This is where people overcomplicate AI.

They expect it to know accounting. It doesn’t.

It applies what you feed it… consistently.

The better you frame the context (thresholds, account types, historical edge cases), the smarter it performs.

If your inputs are unclear, the output will be too.

When it does get confused, that’s actually useful (believe it or not): it means your policy isn’t as clear as you thought.

Give it structure, and it becomes a reliable reviewer that never gets tired or bored.

Try it

Grab five recent JEs and your capitalization policy.

Prompt:

“Using this policy, identify any entries that might violate it and explain why.”

You’ll see immediately how well it can follow rules when you give it good context, and how quickly it falls apart when you don’t.

Either way, it teaches you something.

The directional focus

Accounting needs context-aware workflows: systems that can apply logic the same way your senior would, just without the fatigue.

LLMs are the first step in that direction.

Human-in-the-loop is a trendy buzzword… but I’d almost call this “AI-in-the-loop”: you’re embedding AI into your current processes.