By Mick Hittesdorf, OneTick Senior Cloud Architect
During our recent webinar on the build vs. buy decision for market data infrastructure, one attendee asked a question that cut right to the heart of a debate happening in quant teams across the industry:
"Many — often non-technical — leaders are starting to ask if using GenAI coding has tipped the scale in favour of in-house build. What is your experience, and is there any data you can share?"
It's the smartest objection we've heard in a while. And it deserves a straight answer rather than a marketing deflection.
So here it is: GenAI does not change the build vs. buy calculus for market data infrastructure. Not materially. And once you understand why, the decision becomes significantly clearer.
Let's be fair. GitHub Copilot, Claude, and similar tools are genuinely impressive at accelerating code generation. If you need to write a data pipeline, scaffold a database schema, or produce boilerplate integration code, a skilled engineer using GenAI today can move meaningfully faster than they could three years ago.
So the premise of the question is reasonable: if engineering hours are the bottleneck, and GenAI compresses engineering hours, shouldn't building in-house become more attractive?
The problem is that engineering hours are not the bottleneck.
Building a self-provisioned market data platform requires solving two fundamentally different types of problems. The first is an engineering problem. The second is a data problem. GenAI helps with the first. It does nothing for the second.
The engineering problem includes standing up infrastructure, writing ingestion pipelines, building database schemas, and creating query interfaces. These are real problems, but solvable ones, and GenAI can meaningfully accelerate them.
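The data problem includes sourcing the data; cleaning, validating, and normalising it; managing corporate actions; and keeping symbol mappings current.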
None of these are problems that GenAI can solve. You still need the data, the domain expertise to validate it, and the operational overhead to keep it current.
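As a minimal sketch of why this is a data problem rather than a coding problem, consider split adjustment, one small piece of corporate action management. The function below is a few lines of plain Python. The `Split` record, the prices, and the split date are all made up for illustration; they do not come from any real feed or from OneTick. The code is trivial to write, with or without GenAI. What it cannot supply is the list of splits itself, which has to be sourced, validated, and kept current.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Split:
    """Hypothetical split record: ratio=2.0 means a 2-for-1 split."""
    ex_date: date
    ratio: float


def back_adjust(prices: dict[date, float], splits: list[Split]) -> dict[date, float]:
    """Divide each price dated before a split's ex-date by that split's ratio."""
    adjusted = {}
    for day, price in prices.items():
        factor = 1.0
        for split in splits:
            if day < split.ex_date:
                factor *= split.ratio
        adjusted[day] = price / factor
    return adjusted


# Made-up closing prices around a made-up 2-for-1 split.
raw = {
    date(2024, 3, 1): 200.0,
    date(2024, 3, 4): 202.0,
    date(2024, 3, 5): 101.5,  # first session at the post-split price
}
splits = [Split(ex_date=date(2024, 3, 5), ratio=2.0)]

adjusted = back_adjust(raw, splits)
for day, price in adjusted.items():
    print(day, price)
```

If the split record is missing or carries the wrong ex-date, the same code runs without error and produces a price series with a false 50% drop in it. Catching that depends on the quality of the reference data, not on the quality of the code.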
The attendee also asked about headcount. Here's the honest picture:
Building a self-provisioned market data platform from scratch requires 3–4 highly skilled, expensive engineers for the initial build. It also requires at least 1 dedicated operations resource for ongoing support, maintenance, and vendor management. That doesn't change with GenAI.
What GenAI might do is reduce the initial build time modestly — perhaps from 9 months to 6. But the ongoing operational burden remains the same. The data vendor contracts remain the same. The corporate action management remains the same. The symbol mapping updates remain the same.
Our webinar polling underscored this directly. Every respondent who had built a platform in-house reported a build time of more than 12 months before they could begin backtesting in earnest. That's a consistent data point across different firm types and sizes. We have not yet seen evidence that GenAI has moved that needle in the market data domain specifically.
To be clear: we're not dismissing GenAI as a force in quantitative research. Quite the opposite.
Where GenAI creates real leverage is in what quants do with clean data once they have it — writing strategy code, prototyping factor models, generating and testing hypotheses faster. If your researchers have immediate access to high-quality, pre-normalised data via a Pandas-style Python API, GenAI tools can significantly accelerate the research loop.
The irony is that the 'buy' path — where your data is immediately available and queryable — actually amplifies the benefit of GenAI in your research workflow. The 'build' path, where months are consumed before clean data is accessible, delays the point at which GenAI can add value.
"GenAI can help your engineers write the pipeline code faster. What it can't do is give you 20 years of cleaned, validated tick history, managed corporate actions, or pre-built symbol cross-references across 200 markets. The code was never the hard part."
— Mick Hittesdorf, Head of Cloud Architecture, OneMarketData
If you're having this conversation internally — particularly with non-technical leaders who've heard the GenAI argument — here's a simple framework:
Ask what exactly GenAI is supposed to accelerate. If the answer is "the pipeline code," ask how long the data sourcing, cleaning, validation, and normalisation will take. If the answer is "all of it," probe harder. The data problems are not coding problems.
Separate one-time costs from ongoing costs. GenAI might compress the initial engineering sprint. It doesn't reduce the ongoing cost of maintaining a live data platform — vendor contracts, corporate action feeds, symbol mapping updates, infrastructure monitoring.
Price the opportunity cost. Every month spent building is a month not spent testing strategies. For a quant fund, the expected value of that delay is not trivial. It should be in the model.
Ask what your quants should be doing. Every respondent in our polling reported that at least 20% of quantitative researcher time is lost to data cleaning and management tasks. At some firms, that figure exceeded 40%. The current generation of GenAI tools is not solving this problem.
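The cost-related points in this framework (one-time versus ongoing cost, and opportunity cost) can be put into a rough model. The sketch below is illustrative only. The headcount (3 engineers, 1 operations resource) and the 9-month versus 6-month build times echo the figures quoted earlier in this article. Every currency amount is a hypothetical placeholder, as are the 36-month horizon and the simplification that ongoing costs start only once the build is finished. Substitute your own numbers before drawing any conclusion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildScenario:
    build_months: int                # months before researchers can start backtesting
    build_engineers: int             # engineers on the initial build
    ops_staff: int                   # dedicated operations staff once live
    monthly_cost_per_head: float     # hypothetical fully loaded cost per person
    monthly_vendor_cost: float       # hypothetical data vendor contracts
    monthly_opportunity_cost: float  # hypothetical value of one month of research


def cost_breakdown(s: BuildScenario, horizon_months: int) -> dict:
    """Split total cost over the horizon into one-time, ongoing, and opportunity parts."""
    run_months = max(horizon_months - s.build_months, 0)
    monthly_run_rate = s.ops_staff * s.monthly_cost_per_head + s.monthly_vendor_cost
    one_time = s.build_months * s.build_engineers * s.monthly_cost_per_head
    ongoing = run_months * monthly_run_rate
    opportunity = s.build_months * s.monthly_opportunity_cost
    return {
        "one_time": one_time,
        "monthly_run_rate": monthly_run_rate,
        "ongoing": ongoing,
        "opportunity": opportunity,
        "total": one_time + ongoing + opportunity,
    }


# All monetary values are placeholders, not data from the webinar or from OneTick.
HYPOTHETICAL = dict(
    build_engineers=3,
    ops_staff=1,
    monthly_cost_per_head=20_000.0,
    monthly_vendor_cost=10_000.0,
    monthly_opportunity_cost=50_000.0,
)

without_genai = cost_breakdown(BuildScenario(build_months=9, **HYPOTHETICAL), horizon_months=36)
with_genai = cost_breakdown(BuildScenario(build_months=6, **HYPOTHETICAL), horizon_months=36)

for label, result in (("without GenAI", without_genai), ("with GenAI", with_genai)):
    print(label, result)
```

With these placeholder inputs the monthly run rate is identical in both scenarios. Only the one-time and opportunity terms shrink when the build is shorter, which is the distinction the framework asks you to make explicit.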
GenAI is a genuine accelerant for software engineering. It is not a solution to the market data problem. The sourcing, normalisation, validation, corporate action management, and ongoing operational overhead of a self-provisioned data platform remain what they were — resource-intensive, time-consuming, and fundamentally about data quality, not code quality.
The build vs. buy decision for market data infrastructure should be made on the same terms it always was: time to value, total cost of ownership, and the opportunity cost of your team's time. GenAI doesn't materially shift any of those.
OneTick Cloud was built specifically to eliminate the data problem — pre-normalised, cleaned, adjusted, and symbol-mapped data across 200+ global markets, accessible via Python, SQL, or REST API from day one. Your engineers' time, GenAI-assisted or otherwise, is better spent on the strategies and models that differentiate your firm.
Ready to see what Day 0 looks like? Create a free trial account at onetick.com/cloud-services and explore real sample databases across equities, futures, options, FX, and more — no infrastructure required.
Have a question about your specific use case? Get in touch.
Best wishes,
Mick Hittesdorf
Learn more at onetick.com or request a private demo.