The Ghost of Marketing: Inside the Secret Economy of Dead Data

At a Mumbai-based e-commerce startup, the marketing team recently discovered a sobering truth: their customer database had become a digital ghost town. Millions of old email addresses, inactive user profiles, and outdated leads sat idle on their servers – untouched by any campaign in years, yet quietly racking up costs. Such “ghost data” – the abandoned or unused customer information haunting company systems – is far from unique to one startup. Across the globe, businesses are hoarding troves of data that no one uses, and the consequences are mounting. These forgotten records carry very real financial costs, contribute to an environmental toll, and even bog down operations. In an era where data is touted as the new oil, companies are learning that crude data stockpiling can backfire.

Ghost data goes by many names. Some analysts call it “dark data”: all the information collected and stored during business activities that never sees the light of day in decision-making or analytics. By some estimates, it makes up more than half of all data in existence; one study put the figure at about 52% of stored data worldwide. The world’s data volume is projected to hit 175 zettabytes by 2025 – that’s 175 trillion gigabytes – and at that rate roughly 91 zettabytes will be “dark” information stashed away unused. To put it in perspective, if all that data were burned onto DVDs, the stack would reach the moon multiple times over. Companies have amassed these digital stockpiles under the assumption that more data is inherently valuable. The phrase “data is the new oil” fostered the misconception that every piece of data has value and should therefore be stored, leading to massive data junkyards inside organizations.

The scale of ghost data within marketing technology (MarTech) systems is staggering. Customer relationship management (CRM) databases, email marketing lists, and user analytics platforms are bloated with inactive entries. Email contact lists naturally degrade by about 22–23% each year as people change jobs or abandon addresses. Mobile app data shows an even more dramatic pattern: over 90% of a typical app’s users are dormant or inactive after the initial engagement burst. Yet businesses often keep all those contacts in their systems indefinitely, paying to store and manage far more records than they actually leverage. In other words, companies are sinking money into maintaining data that sits rotting like unused inventory.

The Mounting Financial Costs

Storing data isn’t free – and ghost data can become a silent budget drain. Cloud providers charge for every gigabyte. On Amazon Web Services, for example, the popular S3 storage service costs about $0.023 per GB per month for standard storage. That sounds trivial, but it adds up quickly at scale. One terabyte of data costs about $23 a month to keep in S3 – about $276 per year, or roughly ₹23,000. Many mid-sized firms easily accumulate tens or hundreds of terabytes across logs, customer profiles, and campaign data archives, much of it never accessed again. At 100 TB, the bill comes to about $27,600 a year: tens of thousands of dollars spent annually just to warehouse “ghost” records. In India, where cloud infrastructure usage is soaring, companies are feeling the pinch. Over half of Indian businesses reported that their cloud expenses overshot initial estimates. Part of that surprise comes from the pay-as-you-go model: every forgotten subscriber list on AWS or inactive user bucket on Azure keeps silently accruing fees month after month.
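The arithmetic behind these figures is simple enough to sketch. The snippet below is a rough illustration, not a billing calculator: it uses only the $0.023 per GB per month rate quoted above, assumes decimal terabytes (1 TB = 1,000 GB), and ignores request, retrieval, and transfer charges that a real invoice would include. The function name is made up for the example.

```python
# Illustrative only: the rate is the S3 standard-storage figure quoted in the
# article. Real bills also include request, retrieval, and transfer charges.
S3_STANDARD_USD_PER_GB_MONTH = 0.023
GB_PER_TB = 1000  # assumption: decimal terabytes, matching the $23-per-TB figure


def annual_storage_cost_usd(terabytes: float,
                            usd_per_gb_month: float = S3_STANDARD_USD_PER_GB_MONTH) -> float:
    """Return the yearly cost, in US dollars, of keeping `terabytes` in storage."""
    monthly_cost = terabytes * GB_PER_TB * usd_per_gb_month
    return round(monthly_cost * 12, 2)
```

Calling `annual_storage_cost_usd(1)` reproduces the $276 figure, and a 10 TB archive comes to $2,760 a year under the same assumptions.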

Storing idle data can be even more expensive on specialized platforms. Take CRMs like Salesforce – widely used by marketing and sales teams worldwide. Salesforce includes only a modest amount of data storage in its base subscriptions (often as little as 10 GB), after which the charges skyrocket. Purchasing additional Salesforce data storage can cost around $125 per month for just 0.5 GB. In other words, a single extra gigabyte beyond your license allotment might run $250 per month, or $3,000 a year. Companies that don’t prune old leads and accounts can find themselves facing exorbitant CRM bills as their databases bloat. The hidden cost of data hoarding shows up not only in direct cloud bills but also in software subscription tiers, backup storage, and longer processing times for any operation over the dataset.

Marketing platforms built by Indian firms are not immune to this either. Homegrown MarTech providers like Netcore Cloud, MoEngage, CleverTap, and others host engagement data for thousands of brands – from messaging histories to user attributes. These services often price their products based on the number of user profiles or subscribers a client manages. The more contacts in the system (active or not), the higher the fee. An Indian retailer with 10 million customer profiles in a marketing automation tool might be paying for all 10 million, even if only 2 million are actively clicking or purchasing. The inactive 8 million sit dormant, inflating the database and the invoice. This is why data retention policies are becoming a selling point. For instance, MoEngage’s documentation highlights that beyond a certain window, old data has diminishing returns – for a good number of businesses, user activity data older than 60 days provides minimal value. MoEngage by default archives or deletes anonymous user data, inactive user attributes, and event logs after a few months unless a client requests longer retention. Other platforms like Netcore have similarly encouraged clients to scrub “dormant” contacts that haven’t engaged in long periods, to both improve campaign performance and cut unnecessary costs. Unfortunately, not every company takes that advice. Many marketers continue to blast out emails to huge lists where 85% of addresses show no response, until their sender reputation suffers and they quietly stop – by which time they’ve paid to message ghosts and may even pay again to reacquire some of the same customers via ads.
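To see how much of a per-profile bill is going to ghosts, a marketer could start with something like the sketch below. It is a minimal illustration under stated assumptions: the `Profile` record, the function names, and the 60-day default window are hypothetical choices for this example, not any vendor's schema or API, and a real export would need its own definition of "activity".

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Profile:
    """Hypothetical export row: one billed contact and its last activity date."""
    user_id: str
    last_active: date  # most recent open, click, or purchase


def split_by_activity(profiles, today, window_days=60):
    """Partition profiles into (active, dormant) using an inactivity window."""
    cutoff = today - timedelta(days=window_days)
    active = [p for p in profiles if p.last_active >= cutoff]
    dormant = [p for p in profiles if p.last_active < cutoff]
    return active, dormant


def dormant_share(profiles, today, window_days=60):
    """Fraction of billed profiles with no activity inside the window."""
    if not profiles:
        return 0.0
    _, dormant = split_by_activity(profiles, today, window_days)
    return len(dormant) / len(profiles)
```

For the retailer described above, with 8 million of 10 million profiles inactive, `dormant_share` would return 0.8. The window is the judgment call: a longer one shrinks the dormant list but keeps more ghosts on the invoice.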

The financial waste from ghost data adds up dramatically at the macro level. Reports suggest that a single enterprise can waste up to $2.5 million annually on storage and infrastructure for data it never actually uses. Collectively, businesses are pouring billions of dollars into preserving these digital dead weights. That’s money that could be invested in analytics on valuable data or in new customer acquisition – but instead it’s spent maintaining colossal databases full of “old archives_final_REALLY_final” files and dormant accounts.

An Operational and Legal Time Bomb

Beyond direct costs, ghost data imposes operational burdens. As databases swell with irrelevant records, everything slows. Queries take longer to run. Dashboards choke on sheer volume. Teams spend more time sifting signal from noise. One global bank, for example, found its CRM reports taking hours because the system was crunching through years of stale entries; archiving the oldest 20% of records dramatically sped things up. There are also hidden risks: old data can become a security liability. You can’t secure what you don’t see, and the systems holding forgotten data often go unmonitored and unpatched. In the event of a breach, those forgotten troves can be low-hanging fruit for cybercriminals. And under tightening data protection laws, hanging onto customer data without purpose can be outright illegal.

India is at a critical moment on that front. The country’s new Digital Personal Data Protection Act (DPDPA) 2023 has introduced explicit provisions about data retention and deletion. For the first time, Indian companies are broadly required to erase personal data once the purpose for collecting it is fulfilled, unless law requires continued storage. In the past, firms in India often kept customer data indefinitely by default – there was no strong legal reason to purge it, and “what if we need it later?” was the prevailing attitude. That era is ending. Now, holding onto unnecessary personal data could expose companies to compliance penalties. The DPDPA enforcement regime is still taking shape, but fines for not deleting data could go up to ₹50 crore or more in serious cases. Essentially, Indian marketers will have to justify every bit of personal data they keep. This is pushing companies to implement data lifecycle policies – something many European firms had to do after GDPR. An executive at a Mumbai retail chain noted that they’ve begun automatically deleting the profiles of customers who haven’t engaged in 24 months, to minimize liability. The law is forcing a mindset shift from limitless data retention to thoughtful minimization.

Interestingly, good data hygiene can pay off beyond just avoiding fines. It often boosts marketing outcomes. When you trim out the dead contacts, your email open rates climb (since you’re no longer diluting metrics with unengaged recipients), and CRM workflows run smoother. It can also clarify your true audience size for better campaign targeting. Salesforce itself has published guidance reminding clients that storing unused data isn’t just inefficient, it’s unsustainable. By prioritizing smart data management – keeping what’s useful, cutting what isn’t – businesses can both save money and sharpen their competitive edge.

A Growing Carbon Footprint in the Cloud

One often overlooked cost of ghost data is environmental. Data might seem weightless, but storing and serving it demands electricity – lots of it. Global data centers already consume about 1–4% of the world’s electricity and that share could double by 2030. Every unused gigabyte sitting on a server contributes to this energy draw. The electricity required to keep 1 GB of data in the cloud is around 0.0078 kWh per month, or roughly 0.1 kWh per year. It sounds tiny, but multiply it by the billions of GB in those ghost archives and the power usage becomes enormous. Storing 1 TB of data in a data center for a year emits roughly 40 kg of CO2, given the typical energy mix. So a company passively hoarding 100 TB of stale data is effectively responsible for about 4 tonnes of CO2 emissions yearly – equivalent to the emissions from driving a car over 16,000 kilometers.
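These numbers can be checked against each other with a few lines of arithmetic. The sketch below uses only the two figures quoted above (0.0078 kWh per GB per month and 40 kg of CO2 per TB per year) and assumes decimal terabytes. The function names are illustrative, and the grid intensity it derives is simply what those two figures imply together, not a measured value.

```python
KWH_PER_GB_MONTH = 0.0078   # storage energy figure quoted in the article
KG_CO2_PER_TB_YEAR = 40.0   # emissions figure quoted in the article
GB_PER_TB = 1000            # assumption: decimal terabytes


def annual_energy_kwh(terabytes: float) -> float:
    """Electricity needed to keep `terabytes` stored for one year."""
    return terabytes * GB_PER_TB * KWH_PER_GB_MONTH * 12


def annual_co2_kg(terabytes: float) -> float:
    """CO2 emitted by keeping `terabytes` stored for one year."""
    return terabytes * KG_CO2_PER_TB_YEAR


def implied_kg_co2_per_kwh() -> float:
    """Grid carbon intensity implied by combining the two quoted figures."""
    return KG_CO2_PER_TB_YEAR / annual_energy_kwh(1)
```

One terabyte works out to about 93.6 kWh a year, so the 40 kg figure implies a grid intensity of roughly 0.43 kg of CO2 per kWh. A cleaner or dirtier energy mix would move the total proportionally.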

At the macro scale, digital “dark data” is a significant climate problem. Dark data storage worldwide generates about 5.8 million tonnes of CO2 annually – as much greenhouse gas as 1.2 million cars put out in a year. A typical data-driven company with 100 employees might produce about 2,983 GB of dark data per day, and keeping that for a year has a carbon footprint equal to six flights from London to New York. Aggregated globally, companies are collectively churning out 1.3 billion GB of new dark data every day, equivalent to the emissions of over 3 million transatlantic flights per year. These analogies drive home the point: the cost of idle data isn’t just on the ledger, it’s in the atmosphere.

Data centers in India contribute to this too. India has over 130 major data centers and counting, serving the country’s exploding digital needs. The climate impact of inefficient data storage is drawing notice. Beyond the emissions, these phantom bits also accelerate e-waste, as more hardware is needed to store and cool them. It’s a double whammy: wasted energy and more servers that eventually become scrap. Indian enterprises, many of which pride themselves on corporate sustainability pledges, are starting to realize that cleaning up data clutter is part of being “green.” As one IT manager at a Bangalore tech firm quipped, “We spent years swapping light bulbs for efficiency – meanwhile our data lakes were silently burning power in some far-off cloud.” Now, that firm is aggressively shifting old logs to cold storage and deleting what’s truly not needed, in line with a broader carbon footprint reduction plan.

The irony is that storing data has become so cheap and easy that it encouraged mindless hoarding. Yet those pennies per gigabyte have scaled into huge electric bills and environmental externalities. As one researcher observed, the cost of storing data gets cheaper every year, but implementing good-quality protocols for purging it doesn’t get any easier. Companies haven’t invested the same effort into deleting data as they did into collecting it. The result is an unseen glut with real-world impacts. Now, a combination of cost pressure, climate responsibility, and privacy regulation is forcing a reckoning.

Confronting the Ghosts

Reckoning with ghost data requires a change in both technology and mindset. On the tech side, businesses are adopting tools to automatically archive or delete data after a certain period of inactivity. Cloud providers like AWS, Azure, and Google Cloud offer lifecycle management policies – you can set your storage bucket to auto-delete objects older than 180 days, or move them to ultra-cheap “cold” storage tiers like Amazon S3 Glacier. These colder tiers cost a fraction of standard storage, which suits data you legally need to retain but will likely never touch. Many firms are finally taking up these options to shrink active storage footprints. Salesforce customers increasingly use third-party archiving solutions to offload historical CRM records out of the expensive core CRM and into cheaper databases. Indian SaaS companies are also rolling out features to tackle bloat: for example, MoEngage allows clients to define a Data Retention Policy so that user data beyond, say, one year is automatically purged or anonymized. Such features acknowledge that not all data deserves eternal life in the system.
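As a rough illustration of what such a lifecycle policy looks like on AWS, the sketch below builds the rule structure that S3 accepts for a bucket lifecycle configuration. The helper name, the prefixes, the bucket name, and the 730-day deletion period are assumptions made up for this example; only the 180-day threshold comes from the text. The snippet only constructs the configuration, and the comment at the end shows how it could be applied with boto3, which is not required to run it.

```python
def build_lifecycle_rule(prefix, archive_after_days=None, delete_after_days=None,
                         rule_id="ghost-data-cleanup"):
    """Build one S3 lifecycle rule that archives and/or deletes old objects."""
    if archive_after_days is None and delete_after_days is None:
        raise ValueError("specify archive_after_days, delete_after_days, or both")
    if (archive_after_days is not None and delete_after_days is not None
            and delete_after_days <= archive_after_days):
        raise ValueError("objects must be archived before they are deleted")

    rule = {"ID": rule_id, "Filter": {"Prefix": prefix}, "Status": "Enabled"}
    if archive_after_days is not None:
        # Move objects to the Glacier storage class once they reach this age.
        rule["Transitions"] = [{"Days": archive_after_days, "StorageClass": "GLACIER"}]
    if delete_after_days is not None:
        # Permanently delete objects once they reach this age.
        rule["Expiration"] = {"Days": delete_after_days}
    return rule


# Hypothetical prefixes: delete campaign exports at 180 days; archive logs at
# 180 days and delete them after two years.
config = {
    "Rules": [
        build_lifecycle_rule("campaign-exports/", delete_after_days=180,
                             rule_id="expire-campaign-exports"),
        build_lifecycle_rule("logs/", archive_after_days=180, delete_after_days=730,
                             rule_id="archive-then-expire-logs"),
    ]
}

# With boto3 installed and credentials configured, this could be applied as:
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-marketing-bucket", LifecycleConfiguration=config)
```

Once a policy like this is in place, the provider enforces it on a schedule, so nobody has to remember to clean up. Azure and Google Cloud offer comparable lifecycle rules with their own syntax.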

However, technology is only half the battle – the culture around data value needs an adjustment. Companies must get comfortable with the idea that letting go of data can be beneficial. This runs against years of “store everything, just in case” instinct. It requires collaboration between legal, IT, and marketing teams to define what data truly matters and what has expired. Increasingly, the leading practice is to implement a data governance framework that classifies data types and assigns expiration dates to each. Done well, this kind of knowledge management acts like a smart organizer for all the information an organization holds. It reduces wasted storage, cuts costs, lowers environmental impact, and ensures that the information kept supports better decision-making. In other words, you proactively decide what’s worth keeping and jettison the rest – much as a well-run warehouse discards obsolete inventory to make room for what sells. Some companies are even creating internal “data stewardship” roles tasked with identifying dark data and championing its disposal or reuse.
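In its simplest form, a governance framework of this kind is a lookup table plus a date check. The sketch below is one hypothetical way to express it: the category names and retention periods are invented for illustration, and in practice they would be agreed by legal, IT, and marketing rather than copied from here.

```python
from datetime import date, timedelta

# Hypothetical retention schedule, in days. None means "never auto-expire",
# for records whose retention is fixed by law rather than by marketing.
RETENTION_DAYS = {
    "campaign_event_log": 90,
    "inactive_customer_profile": 730,
    "tax_invoice": None,
}


def is_expired(category: str, last_used: date, today: date) -> bool:
    """Return True if a record in `category` has outlived its retention period."""
    if category not in RETENTION_DAYS:
        # Unclassified data is a governance gap, so fail loudly rather than guess.
        raise KeyError(f"no retention rule defined for category {category!r}")
    days = RETENTION_DAYS[category]
    if days is None:
        return False
    return today - last_used > timedelta(days=days)
```

Raising an error for unclassified categories is a deliberate choice in this sketch: it forces someone to decide how long each new kind of data should live, instead of letting it default to forever.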

The incentives to do this are aligning as never before. Financially, every terabyte you delete is immediate savings on your next cloud bill. Operationally, slimmer databases mean faster software and less risk. Legally, it keeps you on the right side of privacy laws. And environmentally, you’re shrinking your digital carbon footprint. It’s a win-win-win-win, yet it requires overcoming inertia. The fear of deleting anything is still real in many boardrooms – what if an old record could one day unlock a new insight? But if you haven’t touched a dataset in five years and it wasn’t collected for a regulated purpose, the odds are it’s safe to let it go. Some firms are finding ways to extract any latent value before deletion – for instance, running one last analysis on old customer notes or support tickets to glean trends and then discarding them. This way they maximize value and minimize storage load.

Around the world, awareness is growing. Industry groups are talking about “data minimization” as the next big push in sustainable tech. Europe’s GDPR enshrined it; now India’s new law will enforce it locally. Cloud vendors are introducing carbon dashboards to show clients how much CO2 their storage use corresponds to – an implicit nudge to cut down on excess. And marketers are starting to measure list quality over list quantity. Why brag about a million contacts if half are essentially ghosts? Far better to have a lean 500,000 that actually respond, while saving money and energy on the rest.

In the coming years, purging ghost data may become as routine as having antivirus protection. The companies that thrive will treat data as a living asset – to be refreshed, cleaned, and cleared out when it expires. Those that cling to everything will drown in clutter and costs. The hidden expenses of abandoned data are finally coming into the light, and they are too big to ignore. As one Indian CIO put it, “Our data was supposed to be an oil well, but we realized we were paying to store barrels of sludge.” The message is clear: it’s time to exorcise those digital ghosts. Smart marketers will invest in keeping their data repositories clean, current, and compliant, ensuring that every byte they keep has a purpose. By doing so, they’ll save money, reduce risk, and even do a small part in saving the planet – proving that sometimes, the best data strategy is knowing when to let go.