Several years into the Generative AI revolution, adoption shows no signs of slowing down. On the contrary: AI initiatives are being prioritized everywhere, and budget allocations are skyrocketing.
And yet, despite this excitement and apparent growth, Gartner estimates that through 2027, 60% of GenAI projects will be abandoned after proof of concept (POC), and at least 50% will overrun their budgeted costs. What exactly is it about AI deployments that produces these unexpected costs and makes ROI projections so difficult to realize? Why are so many demos and forecasts misleading?
A few obvious factors contribute: it is difficult both to isolate AI's impact from other business factors and to attribute AI costs to the correct sources. Depending on the project, organizations also face long implementation timelines before seeing tangible results. And then there are the hidden costs across the adoption lifecycle that only become apparent along the way. Those hidden costs are exactly what this blog will try to uncover, along with strategic guidance on how to tackle them head-on.
Data readiness is one of the most fundamental but also most underestimated cost drivers. To unlock maximum value from AI initiatives, pretrained models are usually combined with enterprise or domain data. Crucially, as LLMs have shifted the focus from structured to unstructured data (text, image, audio, video) for both training and inference, many organizations struggle with accessibility and availability. This often comes down to a process problem – data that is not ingested in the right way, that doesn't land in the correct spot, or that is not even digitized in the first place. And as with all data, quality and formatting have a significant impact on the performance of the models and applications they feed into ("garbage in, garbage out"). While demos and tutorials show perfect data flows, production systems deal with:
Legacy databases that can’t be easily vectorized
Real-time data that changes while AI is processing
Multiple data sources with different formats
Inconsistent data quality that breaks AI models
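Production pipelines typically need an explicit quality gate before documents ever reach an embedding step or vector store. The sketch below is illustrative only: the field names (`text`, `source`, `updated_at`) and the length threshold are assumptions, not a standard, and real deployments would lean on dedicated data-quality tooling rather than hand-rolled checks.

```python
# Hypothetical pre-ingestion quality gate: quarantine records that would
# otherwise degrade retrieval quality downstream ("garbage in, garbage out").

def validate_record(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    text = record.get("text", "")
    if not isinstance(text, str) or not text.strip():
        problems.append("empty or missing text")
    elif len(text) < 50:  # illustrative threshold: too short to embed usefully
        problems.append("text too short to be meaningful")
    if "source" not in record:
        problems.append("missing source metadata (breaks lineage)")
    if "updated_at" not in record:
        problems.append("missing freshness timestamp (stale-data risk)")
    return problems

def partition(records: list[dict]) -> tuple[list, list]:
    """Split records into ingestable ones and quarantined ones with reasons."""
    clean, quarantined = [], []
    for record in records:
        issues = validate_record(record)
        if issues:
            quarantined.append((record, issues))
        else:
            clean.append(record)
    return clean, quarantined
```

Tracking the quarantine rate over time also gives an early, measurable signal of upstream process problems before they show up as model inaccuracy.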
To ensure there is sufficient data to work with in the first place, many organizations need to retool their data processes and spend on data acquisition or synthetic data generation. This is followed by costs for data cleansing, labelling, integration, governance (e.g. lineage), storage (e.g. vector databases), and data observability. There is no way around these efforts, as poor data quality can break models, increase token usage, reduce accuracy, and thus lead to higher costs down the line. Data readiness is not a one-time cost but requires ongoing maintenance and investment, supported by best-of-breed solutions that automate many of these tasks. Enterprises that prioritize preprocessing – those who treat data pipelines with the same rigor as software engineering – build resilience and reliability that prevent a lot of headaches and expense later on.
Many vendors across the AI value chain adapt their pricing models to pass on higher-than-expected costs to customers – especially those developing, training, and/or running LLMs. These pricing models may result in a negative ROI for many seemingly high-value use cases (e.g., productivity) when deployed at scale. This is despite vendors subsidizing pricing to gain early market share.
Some of the most common pricing models and their challenges:
User-based: priced per seat per month or per year. Look out for vague terms, such as usage limitations layered on top of the per-user cost.
Platform-based: monthly or annual fees to access a platform normally make for predictable costs – but here too, organizations may run into poorly defined limits on compute, storage, or number of seats that drive up the price when exceeded.
Usage- or outcome-based: usage is usually measured either in credits/tokens per unit of product or service consumed (beware of "credit multipliers") or in compute resources, such as the GPU capacity needed to run an AI model, with costs determined by e.g. the number of API calls or the volume of data processed. The rise of AI agents has pushed boundaries further, with fees tied to achieving a pre-specified outcome. The biggest challenge here is that usage and demand are often difficult to determine beforehand, meaning these models require a higher risk tolerance.
The evolution of pricing models forces organizations to map out usage scenarios that take into account typical and extreme usage, and to define early on what success looks like by establishing metrics and governance for outcome-based pricing. While vendor lock-in should always be avoided, long-term partnerships that allow for flexibility and fungibility in credit-based contracts may be favorable. This includes the possibility of credit rollovers and swap rights in case of usage that differs from projections.
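Mapping out typical versus extreme usage scenarios can be as simple as a back-of-the-envelope token model. The sketch below is a minimal illustration: the request volumes, tokens per request, and per-1k-token price are made-up assumptions, not any vendor's actual rates.

```python
# Illustrative cost projection under per-token pricing.
# All volumes and prices below are assumptions for illustration only.

def monthly_cost(requests_per_day: int, tokens_per_request: int,
                 price_per_1k_tokens: float, days: int = 30) -> float:
    """Projected monthly spend for a token-metered service."""
    total_tokens = requests_per_day * tokens_per_request * days
    return total_tokens / 1000 * price_per_1k_tokens

# "Typical" scenario: moderate traffic, short prompts and responses.
typical = monthly_cost(requests_per_day=5_000, tokens_per_request=1_500,
                       price_per_1k_tokens=0.002)

# "Extreme" scenario: a traffic spike combined with longer contexts.
extreme = monthly_cost(requests_per_day=50_000, tokens_per_request=4_000,
                       price_per_1k_tokens=0.002)

print(f"typical month: ${typical:,.0f}")  # prints "typical month: $450"
print(f"extreme month: ${extreme:,.0f}")  # prints "extreme month: $12,000"
```

Even with identical unit pricing, the spread between the two scenarios shows why usage-based contracts demand a higher risk tolerance and explicit spend governance.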
Securing AI systems is an expensive matter. Getting it right requires brainpower and tech investments that add to the existing security stack. Get it wrong and you are looking at even higher costs through breaches, regulatory noncompliance, and reputational damage. Some of the drivers for breaches include shadow AI, expanding attack surfaces through e.g. more APIs and uncontrolled data exchange between AI systems and external tools, as well as autonomous behavior without any guardrails or humans in the loop. Mitigating these is easier said than done, and the dynamic nature of AI brings challenges distinct enough to require dedicated sets of policies and solutions.
While AI platforms now include built-in guardrails to catch problematic queries or responses in real time, they cannot make up for broader governance and security mechanisms across the organization. Data, for example, needs to be secured across its lifecycle - controlling access, masking sensitive information, and ensuring that privacy and compliance are built into every workflow. This blurs the lines between what is security for AI and what is not and makes it difficult to assign costs.
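One concrete piece of that lifecycle control is masking sensitive information before a prompt leaves the organization. The sketch below is a deliberately simplified illustration: the regex patterns are assumptions, and production systems would use dedicated DLP or PII-detection tooling rather than a handful of hand-written expressions.

```python
import re

# Simplified masking pass applied to text before it is sent to an LLM.
# The patterns below are illustrative assumptions, not production-grade PII
# detection; they will miss many formats and over-match some others.

PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "CARD": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def mask_sensitive(text: str) -> str:
    """Replace matches of each pattern with a bracketed placeholder label."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```

Running a pass like this at the boundary also produces an audit trail of what categories of sensitive data users attempt to send, which feeds directly into policy enforcement.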
Organizations securing the AI stack will need to take advantage of evolving offerings of their existing vendors and complement these with emerging solutions that tackle novel threat vectors that are specific to AI. Finally, this will have to be rounded off by internal capabilities to develop strong AI policies that can be enforced across the organization. Cybersecurity, like data, will be a constant effort requiring ongoing investments.
There are multiple different ways to measure the performance of AI systems. Two factors that can get particularly costly are latency and inaccuracy.
While tutorials and demos often show instant responses, production systems face increased response times of several seconds for complex queries, rate limits that throttle requests, regional availability issues, and model capacity constraints during peak hours. This can have far-reaching consequences for real-time use cases and drive up opportunity cost. Any regular user of AI systems can attest that even waiting 10 seconds for a response or an action can be a dealbreaker and lead to e.g. abandoned carts for ecommerce firms.
At the same time, hallucinations are still a big problem, and guaranteeing accurate outputs is difficult. Depending on what AI outputs are used for – false legal advice, a mishandled customer service exchange, flawed investment decisions – hallucinations can carry a hefty price tag if trusted blindly.
Minimizing both latency and inaccuracy will require additional tooling. Observability platforms help by making hallucinations visible and measurable - capturing inputs, outputs, and traces to detect when they occur, linking them to root causes like poor retrieval, and tracking their cost and quality impact. Other solutions such as model routers can evaluate each query in real time and match it with the most appropriate model. This adaptive approach increases accuracy and reduces operational costs by avoiding unnecessary use of expensive resources.
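The routing idea can be reduced to a simple dispatch decision. The sketch below is a toy illustration only: the complexity heuristic, model names, and per-1k-token prices are assumptions, whereas commercial routers score queries with learned classifiers rather than keyword rules.

```python
# Toy model router: send simple queries to a cheap model and complex ones
# to a stronger, more expensive model. Names, prices, and the heuristic
# are illustrative assumptions.

CHEAP_MODEL = ("small-model", 0.0002)    # (name, assumed price per 1k tokens)
STRONG_MODEL = ("large-model", 0.0060)

# Keywords assumed (for illustration) to indicate multi-step reasoning.
COMPLEX_HINTS = ("analyze", "compare", "multi-step", "derive", "legal")

def route(query: str) -> tuple[str, float]:
    """Pick a model based on a crude complexity estimate of the query."""
    lowered = query.lower()
    looks_complex = (
        len(lowered.split()) > 40
        or any(hint in lowered for hint in COMPLEX_HINTS)
    )
    return STRONG_MODEL if looks_complex else CHEAP_MODEL
```

Even this crude split shows the economics: if most traffic is simple, the blended cost per query drops sharply while the strong model stays reserved for queries that need it.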
Over the past couple of years, the narrative that AI systems will fully replace human workers has picked up a lot of momentum. While it may be true that more and more tasks can be automated by AI, this has created a dangerous expectation of quick cost-cutting. It takes out of the equation both the skills and the stack of tools required to power these AI systems.
Costs for skills and knowledge include talent, as well as upskilling and training for orchestration frameworks, business logic design, programming, prompt engineering, AI/ML SDKs, APIs and integrations, etc.
Costs for an AI stack include platforms tackling data readiness, orchestration, foundation models, governance/security, infrastructure, product cost, shadow IT, onboarding, etc. These are expensive solutions whose consumption may be difficult to predict.
In most cases, shifting work from humans to AI means shifting costs, not eliminating them. This highlights both the importance of budgeting accurately and the fact that using AI to scale intelligently beats quick cost-cutting.
Infrastructure costs can vary greatly depending on the method of deployment. Self-hosting LLMs involves a big tradeoff between increased control and data privacy on one hand, and the high costs of the cloud, data center, and edge infrastructure needed to support model customization and inferencing on the other. The most underestimated aspect of self-hosting may be the skills and talent needed to run GenAI applications at scale, given the maturity of today's engineering tools and the deep systems knowledge required.
As described in one of our previous blogs, while emerging solutions such as model compression and acceleration platforms can create efficiencies in production, organizations need to carefully evaluate the delivery method of their AI deployment. Depending on the use case, an off-the-shelf solution that comes as a managed service could be more effective. This is also where end-to-end platforms like Palantir or Dell’s AI factory have showcased value by bringing all relevant processes and data together in a unified product. These decisions should be made by evaluating capacities for upfront investment, ongoing maintenance and expertise, and under the consideration that costs and complexities can escalate with model size and usage volumes.
Achieving meaningful ROI from AI deployments is notoriously difficult. Unlike other more established technologies, there are few proven playbooks—every deployment is shaped by an organization’s unique environment, data posture, and use cases. To complicate matters further, frameworks and best practices continue to evolve rapidly, making it hard to lock in a stable approach.
In the rush to capture the upside potential of AI, many organizations focus on short-term gains while underestimating the long-term, recurring costs. Yet these ongoing costs are what ultimately determine whether an initiative will deliver sustainable value. Some of the biggest drivers are the pace of consumption (tokens, licenses, compute and energy usage), vendor dependency (renewals, multipliers, and lock-in), and data (acquisition, preparation, storage, and governance).
The good news: these challenges can be managed. It requires a deliberate mix of smart forecasting and budgeting, thoughtful product and delivery choices, and an efficient, well-integrated AI stack. Here’s how to tackle the hidden cost problem head-on:
Scale intelligently: Start with simpler implementations where possible, then iterate based on real user feedback and scale up from there.
Measure everything: Collecting data and gaining visibility is key to optimizing system performance. Consider response quality, user satisfaction, cost per interaction, error rates etc.
Plan for failure: AI systems can fail in ways that are difficult to predict. Building comprehensive error handling, fallback mechanisms and graceful degradation from the start will help.
Budget for reality: plan for significantly more than your initial estimates and implement cost controls that can automatically limit spending. Evaluate different pricing models.
Don’t cut corners: Even though it might be tempting to put off security or data efforts, doing so will undoubtedly lead to a messy implementation and expensive setbacks.
Leverage technology: There are solutions emerging that tackle the resource intensive nature of AI. Leveraging these can pay off handsomely in production.
Since ROI is calculated from the business value realized and the costs incurred in GenAI deployment and business adjustments, defining how business value is measured is critical. This will vary by use case and may only materialize over time: e.g. improved productivity leading to enhanced customer engagement, which in turn drives higher conversion rates.
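Put as arithmetic, the calculation is straightforward once those inputs are defined. The figures below are made-up assumptions purely to show the mechanics, not benchmarks for any real deployment.

```python
# Illustrative ROI calculation; all figures are assumptions for illustration.
value_realized = 500_000            # e.g. annualized productivity and conversion gains
deployment_costs = 300_000          # platform, tokens, talent, security, data readiness
business_adjustment_costs = 50_000  # process changes, training, change management

total_costs = deployment_costs + business_adjustment_costs
roi = (value_realized - total_costs) / total_costs

print(f"ROI: {roi:.0%}")  # prints "ROI: 43%"
```

The sensitivity matters more than the point estimate: in this toy example, the recurring cost drivers listed above only need to grow by about 40% for the same initiative to break even.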
If you’re curious to learn more or want to stay on top of the latest enhancements in this space, feel free to reach out to us at innovation@trace3.com.
Lars is an Innovation Researcher on Trace3's Innovation Team, where he is focused on demystifying emerging trends & technologies across the enterprise IT space. By vetting innovative solutions, and combining insights from leading research and the world's most successful venture capital firms, Lars helps IT leaders navigate through an ever-changing technology landscape.