Solving Helm Repository Performance Issues with Automated Chart Cleanup in ChartMuseum

A real-world DevOps troubleshooting guide on reducing Helm repository size, improving CI/CD performance, and automating old chart cleanup in ChartMuseum using retention policies.

Sowmya N | TechGalary

5/7/2026 | 3 min read

Introduction

If your Jenkins pipelines suddenly start failing during Helm operations with errors like: fatal error: runtime: out of memory

while executing commands such as: helm fetch --untar

then your Helm repository itself might be the hidden problem.

We recently faced this exact issue while using ChartMuseum as our internal Helm repository. After investigation, we discovered that the root cause was an oversized index.yaml file, the result of years of accumulated Helm chart versions.

In this blog, I’ll explain:

What caused the issue
Why Helm repositories become slow over time
How this affected Jenkins pipelines
The cleanup strategy we implemented
Best practices for Helm chart retention

The Problem

Our CI/CD pipelines published a new Helm chart version on every deployment.

Example:

my-app-1.0.1.tgz

my-app-1.0.2.tgz

my-app-1.0.3.tgz

Since we were using ChartMuseum, every published chart version was added to the repository metadata file: index.yaml

Over time:

thousands of old chart versions accumulated
index.yaml became extremely large
Helm repository operations became slower
Jenkins agents started consuming excessive memory

Eventually, some Jenkins builds failed while running helm fetch --untar with: fatal error: runtime: out of memory

Root Cause Analysis

Initially, it looked like a Jenkins memory issue.

But after deeper analysis, the real problem turned out to be:

oversized Helm repository metadata
excessive old chart versions
high memory usage while Helm loaded and parsed the repository index

This especially impacts:

Kubernetes-based Jenkins agents
low-memory CI/CD containers
parallel build environments

The Helm repository itself had become bloated over time.

Existing Repository Structure

We were storing:

development charts
QA releases
production releases
snapshot builds

inside ChartMuseum.

Example:

my-app:
1.0.1
1.0.2
1.0.3
...
1.0.742

Most historical versions were no longer required, but they still remained inside the repository metadata.

The Solution

To solve this issue, we decided to implement an automated Helm chart cleanup policy using the ChartMuseum DELETE API.

The plan was simple:

Read all chart versions
Keep only the latest required versions
Delete older unused versions
Reduce the size of index.yaml

Understanding ChartMuseum DELETE API

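ChartMuseum exposes an HTTP API to manage chart versions. Our cleanup relied on three operations:

List all charts: GET /api/charts
Get chart versions: GET /api/charts/<name>
Delete specific chart version: DELETE /api/charts/<name>/<version>

The DELETE operation became the foundation of our cleanup automation.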
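As a rough illustration of how these API calls might be issued from a script, here is a minimal Python sketch using only the standard library. The base URL and helper names are hypothetical, and it assumes a single-tenant ChartMuseum without authentication; a multitenant or secured setup would need a path prefix or credentials, and deletion only works if the server has the delete route enabled.

```python
import json
import urllib.request
from urllib.parse import quote


def _api_url(base_url, *parts):
    # Build <base>/api/charts[/<name>[/<version>]], escaping each path part.
    escaped = [quote(part, safe="") for part in parts]
    return "/".join([base_url.rstrip("/"), "api", "charts", *escaped])


def list_charts_request(base_url):
    # GET /api/charts: every chart with all of its versions.
    return urllib.request.Request(_api_url(base_url), method="GET")


def chart_versions_request(base_url, name):
    # GET /api/charts/<name>: all versions of one chart.
    return urllib.request.Request(_api_url(base_url, name), method="GET")


def delete_version_request(base_url, name, version):
    # DELETE /api/charts/<name>/<version>: remove a single chart version.
    return urllib.request.Request(_api_url(base_url, name, version), method="DELETE")


def send(request, timeout=30):
    # Perform the HTTP call and decode the JSON response body.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Building the request separately from sending it keeps the URL logic easy to check without touching a live repository. A call such as send(delete_version_request(base_url, "my-app", "1.0.1")) would then perform the actual deletion.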

Cleanup Strategy

We introduced retention policies based on environments.

Environment | Retention Policy

Dev Charts | Keep latest 5

QA Charts | Keep latest 10

Production Charts | Keep latest 20
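The retention policy above could be expressed as a small lookup plus a split function, sketched below. The environment keys and the function name are hypothetical, and the sketch assumes the caller already knows which environment a chart belongs to and passes versions ordered newest first.

```python
# Keep counts taken from the retention policy above; the keys are illustrative.
RETENTION = {"dev": 5, "qa": 10, "prod": 20}


def split_by_retention(versions_newest_first, environment):
    # Return (kept, expired) for one chart, given versions ordered newest first.
    keep = RETENTION[environment]
    return versions_newest_first[:keep], versions_newest_first[keep:]


if __name__ == "__main__":
    versions = [f"1.0.{n}" for n in range(12, 0, -1)]  # 1.0.12 down to 1.0.1
    kept, expired = split_by_retention(versions, "dev")
    print("keep:", kept)
    print("delete candidates:", expired)
```

Keeping the counts in one mapping means a retention change is a one-line edit rather than a change to the cleanup logic.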

This helped us:

reduce repository size
improve Helm performance
stabilize Jenkins pipelines
reduce memory usage

Cleanup Pipeline Design

Instead of manually deleting charts, we created a separate cleanup pipeline.

Pipeline Flow

This allowed us to safely clean historical data without affecting active deployments.

Important Safeguards

Before deleting any charts, we added several safety checks.

1. Dry Run Mode

Initially, the cleanup pipeline only printed what would be deleted:

DRY_RUN=true

This helped validate cleanup logic safely.
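One way a dry-run switch like this might be wired into a cleanup script is sketched below. The function names are hypothetical, and defaulting to dry-run when DRY_RUN is unset is an assumption made here for safety, not something stated above.

```python
import os


def dry_run_enabled(environ=os.environ):
    # Treat anything other than an explicit false/0/no as dry-run (assumed default).
    value = environ.get("DRY_RUN", "true").strip().lower()
    return value not in ("false", "0", "no")


def run_cleanup(candidates, delete_fn, dry_run):
    # candidates: iterable of (chart_name, version) pairs selected for deletion.
    # delete_fn: callable that performs the real deletion for one pair.
    deleted = []
    for name, version in candidates:
        if dry_run:
            print(f"[dry-run] would delete {name} {version}")
            continue
        delete_fn(name, version)
        deleted.append((name, version))
    return deleted
```

Passing the delete function in as an argument keeps the loop testable without a live repository.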

2. Protect Active Versions

We ensured the cleanup never deleted:

currently deployed versions
stable production releases
tagged release builds

For example, the chart versions currently deployed across all namespaces can be listed with:

helm list -A
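To turn that listing into a protection list, a script could parse the JSON form of the same command (helm list -A -o json), where each release carries a chart field such as my-app-1.0.3. The sketch below is one possible approach; the function names are hypothetical, and splitting the chart field into name and version with a regular expression is a heuristic rather than an official Helm format guarantee.

```python
import json
import re

# Heuristic: the version is the first "-<digits>.<digits>.<digits>..." suffix.
_CHART_FIELD = re.compile(r"^(?P<name>.+?)-(?P<version>\d+\.\d+\.\d+(?:[-+].*)?)$")


def deployed_versions(helm_list_json):
    # Collect (chart_name, version) pairs from `helm list -A -o json` output.
    protected = set()
    for release in json.loads(helm_list_json):
        match = _CHART_FIELD.match(release.get("chart", ""))
        if match:
            protected.add((match["name"], match["version"]))
    return protected


def drop_protected(candidates, protected):
    # Remove any deletion candidate that is currently deployed.
    return [pair for pair in candidates if pair not in protected]
```

Stable production releases and tagged builds that are not currently deployed would need their own entries added to the protected set.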

3. Semantic Version Sorting

Simple string sorting can produce incorrect results. For example, it places 1.0.10 before 1.0.2:

1.0.10
1.0.2

Instead, we used semantic version sorting to correctly identify older versions.
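A minimal sketch of such a sort key, written with only the Python standard library, is shown below. It is an illustration rather than the exact code from our pipeline: the semver_key name is hypothetical, build metadata is ignored, and anything that is not MAJOR.MINOR.PATCH with an optional pre-release is rejected.

```python
import re

_SEMVER = re.compile(
    r"^(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.-]+))?(?:\+[0-9A-Za-z.-]+)?$"
)


def semver_key(version):
    # Map a version string to a tuple that sorts in semantic-version order.
    match = _SEMVER.match(version)
    if match is None:
        raise ValueError(f"not a semantic version: {version!r}")
    major, minor, patch = (int(part) for part in match.group(1, 2, 3))
    prerelease = match.group(4)
    if prerelease is None:
        # A normal release ranks above any pre-release of the same version.
        return (major, minor, patch, 1, ())
    # Numeric identifiers compare as numbers and rank below alphanumeric ones.
    identifiers = tuple(
        (0, int(part), "") if part.isdigit() else (1, 0, part)
        for part in prerelease.split(".")
    )
    return (major, minor, patch, 0, identifiers)


if __name__ == "__main__":
    versions = ["1.0.10", "1.0.2", "1.0.3-rc.1", "1.0.3"]
    print("string sort:", sorted(versions))
    print("semver sort:", sorted(versions, key=semver_key))
```

Sorting with reverse=True gives the newest-first order that a "keep latest N" rule needs.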

Why We Did Not Integrate Cleanup Directly Into the Publish Pipeline

Initially, we considered deleting old versions immediately after publishing new charts.

Example:

Publish chart
↓
Delete oldest version

However, we decided against this approach.

Reasons

rollback versions may still be required
failed deployments can create operational risks
debugging becomes difficult if versions disappear immediately

Instead, we implemented:

scheduled cleanup jobs
configurable retention policies
safer operational maintenance

Results After Cleanup

After implementing automated chart cleanup:

Helm repository size reduced significantly
index.yaml became much smaller
Jenkins OOM failures disappeared
Helm fetch operations became faster
CI/CD stability improved

Best Practices

If you are managing Helm repositories at scale, I strongly recommend:

Use Retention Policies

Never allow unlimited chart accumulation.

Separate Cleanup Jobs

Avoid cleanup logic inside deployment pipelines.

Instead, run cleanup from:

Jenkins scheduled jobs
Kubernetes CronJobs
maintenance pipelines

Keep Production Releases Longer

Development charts can be aggressively cleaned, but production releases should have longer retention periods.

Monitor Repository Growth

Track:

index.yaml size
repository response time
Helm fetch latency
chart count growth
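As one possible starting point for tracking these numbers, the sketch below summarizes a payload shaped like the ChartMuseum GET /api/charts response (a mapping from chart name to a list of version entries). The function name and the metric keys are hypothetical, and the index.yaml size is assumed to be measured separately and passed in.

```python
def growth_metrics(charts, index_yaml_bytes, top_n=3):
    # charts: mapping of chart name -> list of version entries.
    counts = {name: len(versions) for name, versions in charts.items()}
    # Largest charts first; ties are broken alphabetically for stable output.
    largest = sorted(counts.items(), key=lambda item: (-item[1], item[0]))[:top_n]
    return {
        "chart_count": len(counts),
        "version_count": sum(counts.values()),
        "index_yaml_bytes": index_yaml_bytes,
        "largest_charts": largest,
    }
```

Recording this summary on a schedule makes it possible to spot a chart whose version count is growing faster than the retention policy expects.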

Final Thoughts

Helm repositories are often ignored until they start affecting CI/CD performance.

A simple automated cleanup strategy can:

improve Jenkins stability
reduce memory usage
speed up Helm operations
keep repositories manageable

For teams using ChartMuseum in enterprise environments, implementing Helm chart retention policies should be considered an essential operational practice rather than an optional optimization.
