Armaan Kapoor | March 14, 2026 | 5 min read

Why lending runs on PDFs and probably always will

The document is not going away. Not because the technology isn’t there. Because the document is the format that preserves optionality about what the data means.

Every few years someone announces the death of the PDF in financial services. Open banking APIs will replace document exchange. Verified data piped straight from the bank to the lender. No more emailing statements. No more scanning. No more forwarding chains where the file gets re-saved three times before anyone looks at it.

And they're right that verified data is better. When the data arrives authenticated from the source, the trust boundary holds from the bank to the decision. Everyone wins.

But here's what actually happens on the ground. A deal shows up and it's both verified data and documents. Three months of verified pulls and nine months of PDF statements the borrower exported from their bank's website. A credit report that came as a PDF. Two years of tax returns that will never not be PDFs because that's what the IRS produces. A loan application someone filled out by hand and scanned on their phone.

Even if every data source had an API tomorrow, someone still has to pour the output into the format their team actually works in. The pricing sheet. The stacking template. The CRM fields. The spreadsheet the underwriter has been using for five years that encodes assumptions nobody documented. So someone writes an integration layer. Then someone else writes a different one. Now two people are looking at dashboards built on the same API and disagreeing about what the numbers mean because they made different choices about what to include in revenue and what to exclude.


This is the thing that never goes away. Not the document. The interpretation. A verified API gives you clean data. It doesn't give you clean meaning. What counts as revenue. What counts as debt service. Whether that internal transfer inflates the deposit total. Whether that owner draw should be excluded. These are judgment calls and they happen downstream of any data source, verified or not.
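To make that concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: the transactions, the amounts, the category labels, and the `total` helper are assumptions, not anyone's real schema. The point is only that the same clean data produces different numbers depending on which judgment calls get applied.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Txn:
    description: str
    amount: Decimal  # positive = deposit, negative = withdrawal
    category: str    # hypothetical label; someone had to assign it


# Made-up month of activity, identical for both analysts.
TXNS = [
    Txn("Card settlements", Decimal("42000"), "sales"),
    Txn("Transfer from savings", Decimal("15000"), "internal_transfer"),
    Txn("Supplier payment", Decimal("-1200"), "expense"),
    Txn("Owner draw", Decimal("-8000"), "owner_draw"),
]


def total(txns, excluded=frozenset(), deposits_only=False):
    """Sum amounts, skipping excluded categories (and withdrawals if asked)."""
    return sum(
        (
            t.amount
            for t in txns
            if t.category not in excluded
            and (t.amount > 0 or not deposits_only)
        ),
        Decimal("0"),
    )


# Analyst A counts every deposit as revenue.
deposits_a = total(TXNS, deposits_only=True)

# Analyst B decides the internal transfer inflates the deposit total.
deposits_b = total(TXNS, excluded={"internal_transfer"}, deposits_only=True)

# The same disagreement shows up in net cash flow once owner draws are excluded.
net_a = total(TXNS)
net_b = total(TXNS, excluded={"internal_transfer", "owner_draw"})

print(deposits_a, deposits_b)  # 57000 42000
print(net_a, net_b)            # 47800 40800
```

Neither analyst is wrong about the data. They differ on the exclusion set, and that choice lives outside anything the API returned.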

The document is just the most honest version of this problem. It doesn't pretend to be structured. It doesn't pretend the interpretation has been done. It's raw and messy and the underwriter knows that when they look at it. An API that returns clean JSON creates the illusion that the interpretation already happened. It didn't. Someone upstream made choices about field mapping and categorization and you inherited those choices without seeing them.

The PDF is annoying precisely because it forces the interpretation to happen at the point of decision. The underwriter reads the page. They see the transaction. They decide what it means. The abstraction layer between reality and decision is as thin as it can possibly be.


The industry keeps waiting for the document to go away. The document is not going away. Not because the technology isn't there. Because the document is the format that preserves optionality about what the data means. The moment you structure it, you've made choices. The document defers those choices to the person who needs to make them.

The companies that build around this reality instead of against it are the ones processing the next generation of deal flow. Not by eliminating documents. By making the interpretation fast, correctable, and traceable back to the page it came from.
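What "correctable and traceable back to the page" could look like as a data structure is sketched below. This is an assumption about one possible shape, not a description of any particular product: the `SourceRef` and `Interpretation` names, their fields, and the file name in the usage lines are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SourceRef:
    document: str  # hypothetical file name of the source PDF
    page: int      # page the value was read from


@dataclass
class Interpretation:
    label: str         # what is being interpreted, e.g. a transaction category
    value: str         # the current judgment call
    source: SourceRef  # where on the page it came from
    history: list = field(default_factory=list)  # (previous value, corrected by)

    def correct(self, new_value: str, by: str) -> None:
        """Override the value and keep the previous one in the history."""
        self.history.append((self.value, by))
        self.value = new_value


# Usage: a first-pass label that an underwriter overrides after reading the page.
item = Interpretation(
    label="category",
    value="sales",
    source=SourceRef(document="statement_jan.pdf", page=3),
)
item.correct("internal_transfer", by="underwriter")

print(item.value)    # internal_transfer
print(item.history)  # [('sales', 'underwriter')]
print(item.source)   # SourceRef(document='statement_jan.pdf', page=3)
```

The design choice worth noticing is that the interpretation never replaces the page reference. The choice stays visible, and so does the place to go and check it.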
