Our team inherited a tricky legacy project: a monolithic application with configuration management deeply tied to Puppet, inter-service authentication reliant on self-signed JWTs, and a frontend built with the once-popular CSS Modules. The goal was to containerize it and migrate it to Kubernetes. The straightforward approach of packaging it with a Helm Chart or Kustomize was quickly dismissed. The application’s startup and operational logic were far more complex than simple template rendering. For instance, some critical configurations didn’t exist statically in a code repository; they were dynamically pulled and generated by Puppet at runtime from an external configuration hub we’ll call the “Truth Service.” Additionally, the JWT signing keys required periodic rotation. And during frontend deployment, the backend API gateway needed to know the mapping between the hashed class names generated by CSS Modules and the original class names to handle some server-side rendering injections.
If we were to glue this complex operational logic together with a pile of shell scripts and CI/CD pipeline jobs, we would create a brittle and unmaintainable beast. In real-world projects, this kind of glue code is a primary source of technical debt. We needed a solution that could automate these processes and express them declaratively, in a cloud-native way, right inside Kubernetes. Ultimately, we decided to write a dedicated Kubernetes Operator for this "legacy application."
Technical Pain Points and Initial Concept
The core conflict was that the world of Kubernetes is declarative, while our application’s operational knowledge was procedural, scattered across Puppet manifests and the minds of operations engineers.
- Dynamic Configuration Dependency: A key configuration file for the application, legacy_config.ini, was dynamically generated by Puppet based on data from the "Truth Service." In K8s, we couldn't just run a Puppet Agent for the sake of starting a single Pod.
- Key Lifecycle Management: The RSA private key for JWTs needed to be securely generated, stored, and made available for mounting by the application Pod. More importantly, when rotating keys, we had to ensure that both old and new keys coexisted for a certain period to allow for a smooth transition.
- Frontend-Backend Deployment Coupling: The frontend CI process would generate a classmap.json file (e.g., {"button": "Button_button__aB3cD"}). This file needed to be uploaded somewhere and its location communicated to the backend during deployment so that the server could correctly process certain requests.
Using a simple combination of Deployments and ConfigMaps couldn’t meet these dynamic and coordination requirements. We needed a controller—a continuously running reconciliation loop—that would constantly observe the desired state (our custom resource) and the actual state (the Deployments, Secrets, ConfigMaps, etc., in the cluster) and take action to make them consistent. This is the essence of the Operator pattern.
Operator Design: Defining the LegacyApp Resource
Our first step was to design a Custom Resource Definition (CRD) to describe the desired state of this complex application. We named it LegacyApp.
# config/crd/bases/app.techcrafter.io_legacyapps.yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: legacyapps.app.techcrafter.io
spec:
  group: app.techcrafter.io
  names:
    kind: LegacyApp
    listKind: LegacyAppList
    plural: legacyapps
    singular: legacyapp
  scope: Namespaced
  versions:
    - name: v1alpha1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                replicaCount:
                  type: integer
                  minimum: 0
                  description: Number of desired pods.
                backend:
                  type: object
                  properties:
                    image:
                      type: string
                    jwtSecretName:
                      type: string
                      description: "Name of the Kubernetes Secret to store JWT signing keys."
                    truthServiceEndpoint:
                      type: string
                      description: "Endpoint for the legacy truth service."
                  required: ["image", "jwtSecretName", "truthServiceEndpoint"]
                frontend:
                  type: object
                  properties:
                    image:
                      type: string
                    cssModuleMapName:
                      type: string
                      description: "Name of the ConfigMap storing the CSS Modules class map."
                  required: ["image", "cssModuleMapName"]
              required: ["replicaCount", "backend", "frontend"]
            status:
              type: object
              properties:
                conditions:
                  type: array
                  items:
                    type: object
                    properties:
                      type:
                        type: string
                      status:
                        type: string
                      lastTransitionTime:
                        type: string
                      reason:
                        type: string
                      message:
                        type: string
                observedGeneration:
                  type: integer
                activeJwtKeyId:
                  type: string
This CRD defines all the configuration items we care about: the backend image, the name of the Secret for JWT keys, the address of the legacy configuration service, the frontend image, and the name of the ConfigMap for the CSS Modules mapping. The .status field is used by the Operator to write back the actual state of the application, which is crucial for implementing a robust control loop.
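For orientation, the Go types behind this schema live in api/v1alpha1/legacyapp_types.go. The sketch below shows roughly what they look like; the exact field names (JWTSecretName, CSSModuleMapName, and so on) are my assumption of how the kubebuilder types would be written to match the JSON tags above, and the controller snippets later in this article use the same names.

// api/v1alpha1/legacyapp_types.go (sketch; field names are assumptions matching the JSON tags above)
import (
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

type BackendSpec struct {
    Image                string `json:"image"`
    JWTSecretName        string `json:"jwtSecretName"`
    TruthServiceEndpoint string `json:"truthServiceEndpoint"`
}

type FrontendSpec struct {
    Image            string `json:"image"`
    CSSModuleMapName string `json:"cssModuleMapName"`
}

type LegacyAppSpec struct {
    ReplicaCount int32        `json:"replicaCount"`
    Backend      BackendSpec  `json:"backend"`
    Frontend     FrontendSpec `json:"frontend"`
}

type LegacyAppStatus struct {
    Conditions         []metav1.Condition `json:"conditions,omitempty"`
    ObservedGeneration int64              `json:"observedGeneration,omitempty"`
    ActiveJwtKeyId     string             `json:"activeJwtKeyId,omitempty"`
}

type LegacyApp struct {
    metav1.TypeMeta   `json:",inline"`
    metav1.ObjectMeta `json:"metadata,omitempty"`

    Spec   LegacyAppSpec   `json:"spec,omitempty"`
    Status LegacyAppStatus `json:"status,omitempty"`
}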
Implementing the Core Reconciler Logic
We used Kubebuilder to scaffold the Operator project. The core work lies in the Reconcile method within the internal/controller/legacyapp_controller.go file. This method is our reconciliation loop; it gets triggered whenever a LegacyApp resource or any of its owned child resources change.
A common mistake is to write the Reconcile function as one massive, sequential script. In a real project, this becomes incredibly difficult to maintain. A better approach is to break it down into a series of independent, idempotent reconciliation functions, each responsible for a single child resource.
// internal/controller/legacyapp_controller.go
// ... imports ...
import (
    "context"
    "crypto/rand"
    "crypto/rsa"
    "crypto/x509"
    "encoding/pem"
    "fmt"
    "time"

    // ... other imports ...
    "k8s.io/apimachinery/pkg/api/errors"
    "k8s.io/apimachinery/pkg/runtime"
    ctrl "sigs.k8s.io/controller-runtime"
    "sigs.k8s.io/controller-runtime/pkg/client"
    "sigs.k8s.io/controller-runtime/pkg/log"

    appv1alpha1 "github.com/your-repo/legacy-app-operator/api/v1alpha1"
    corev1 "k8s.io/api/core/v1"
    appsv1 "k8s.io/api/apps/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

type LegacyAppReconciler struct {
    client.Client
    Scheme *runtime.Scheme
}

//+kubebuilder:rbac:groups=app.techcrafter.io,resources=legacyapps,verbs=get;list;watch;create;update;patch;delete
//+kubebuilder:rbac:groups=app.techcrafter.io,resources=legacyapps/status,verbs=get;update;patch
//+kubebuilder:rbac:groups=core,resources=secrets;configmaps,verbs=get;list;watch;create;update;patch;delete
//+kubebuilder:rbac:groups=apps,resources=deployments,verbs=get;list;watch;create;update;patch;delete

func (r *LegacyAppReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    logger := log.FromContext(ctx)

    // 1. Fetch the LegacyApp instance
    var legacyApp appv1alpha1.LegacyApp
    if err := r.Get(ctx, req.NamespacedName, &legacyApp); err != nil {
        if errors.IsNotFound(err) {
            // Resource has been deleted, no need to process
            logger.Info("LegacyApp resource not found. Ignoring since object must be deleted.")
            return ctrl.Result{}, nil
        }
        logger.Error(err, "Failed to get LegacyApp")
        return ctrl.Result{}, err
    }

    // 2. Reconcile the dynamic configuration ConfigMap (fetched from Puppet's "Truth Service")
    configMap, err := r.reconcileDynamicConfig(ctx, &legacyApp)
    if err != nil {
        logger.Error(err, "Failed to reconcile dynamic configmap")
        // Update status to an error state and requeue
        // ... status update logic ...
        return ctrl.Result{}, err
    }

    // 3. Reconcile the JWT key Secret
    jwtSecret, err := r.reconcileJWTSecret(ctx, &legacyApp)
    if err != nil {
        logger.Error(err, "Failed to reconcile JWT secret")
        return ctrl.Result{}, err
    }

    // 4. Reconcile the backend Deployment
    if _, err := r.reconcileBackendDeployment(ctx, &legacyApp, configMap, jwtSecret); err != nil {
        logger.Error(err, "Failed to reconcile backend deployment")
        return ctrl.Result{}, err
    }

    // 5. Reconcile the frontend Deployment
    if _, err := r.reconcileFrontendDeployment(ctx, &legacyApp); err != nil {
        logger.Error(err, "Failed to reconcile frontend deployment")
        return ctrl.Result{}, err
    }

    // ... update status to Ready ...
    logger.Info("Successfully reconciled LegacyApp")
    return ctrl.Result{}, nil
}

// SetupWithManager sets up the controller with the Manager.
func (r *LegacyAppReconciler) SetupWithManager(mgr ctrl.Manager) error {
    return ctrl.NewControllerManagedBy(mgr).
        For(&appv1alpha1.LegacyApp{}).
        Owns(&appsv1.Deployment{}).
        Owns(&corev1.Secret{}).
        Owns(&corev1.ConfigMap{}).
        Complete(r)
}
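The "... update status to Ready ..." placeholder above hides a step worth showing: writing a condition back into .status. A minimal sketch of that helper, using meta.SetStatusCondition, is below. It assumes the status subresource is enabled (the usual +kubebuilder:subresource:status marker) and that Conditions is a []metav1.Condition, as in the types sketch earlier; the helper name and file are mine, not part of the scaffolded project.

// internal/controller/status.go (sketch of the elided status update)
import (
    "context"

    "k8s.io/apimachinery/pkg/api/meta"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

    appv1alpha1 "github.com/your-repo/legacy-app-operator/api/v1alpha1"
)

// setReadyCondition records the outcome of a reconcile pass in .status.conditions
// and bumps observedGeneration so users can tell which spec revision the status refers to.
func (r *LegacyAppReconciler) setReadyCondition(ctx context.Context, app *appv1alpha1.LegacyApp, status metav1.ConditionStatus, reason, message string) error {
    meta.SetStatusCondition(&app.Status.Conditions, metav1.Condition{
        Type:    "Ready",
        Status:  status,
        Reason:  reason,
        Message: message,
    })
    app.Status.ObservedGeneration = app.Generation
    return r.Status().Update(ctx, app)
}

At the end of a successful pass, Reconcile would then call something like r.setReadyCondition(ctx, &legacyApp, metav1.ConditionTrue, "ReconcileSucceeded", "all child resources reconciled") in place of the placeholder comment.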
Step 1: Interfacing with Puppet’s “Truth Service”
The pitfall here is letting the Operator's health hinge on an unstable external service. When interacting with external services, you must have robust timeout, retry, and error-handling mechanisms. We encapsulate this logic within the reconcileDynamicConfig function.
// internal/controller/reconcile_configmap.go
func (r *LegacyAppReconciler) reconcileDynamicConfig(ctx context.Context, app *appv1alpha1.LegacyApp) (*corev1.ConfigMap, error) {
    logger := log.FromContext(ctx)
    configMapName := fmt.Sprintf("%s-dynamic-config", app.Name)

    // Assume we have a client for fetching data from the truth service
    // truthServiceClient := NewTruthServiceClient(app.Spec.Backend.TruthServiceEndpoint)
    // configData, err := truthServiceClient.GetConfig()
    // For this example, we'll simulate a hardcoded return value
    configData := `
[database]
host = db.prod.svc.cluster.local
user = legacy_user
`

    cm := &corev1.ConfigMap{
        ObjectMeta: metav1.ObjectMeta{
            Name:      configMapName,
            Namespace: app.Namespace,
        },
    }

    // Use controller-runtime's CreateOrUpdate for an idempotent operation
    op, err := ctrl.CreateOrUpdate(ctx, r.Client, cm, func() error {
        // Update the ConfigMap's data here
        cm.Data = map[string]string{
            "legacy_config.ini": configData,
        }
        // Set an OwnerReference so this ConfigMap gets garbage collected when the LegacyApp is deleted
        return ctrl.SetControllerReference(app, cm, r.Scheme)
    })
    if err != nil {
        logger.Error(err, "Failed to create or update dynamic ConfigMap", "Operation", op)
        return nil, err
    }

    logger.Info("Dynamic ConfigMap reconciled", "Operation", op)
    return cm, nil
}
Note the use of ctrl.SetControllerReference. This is a best practice in Kubernetes Operator development. It establishes a parent-child relationship between the LegacyApp and the ConfigMap it manages, which means the ConfigMap is garbage collected when the LegacyApp is deleted and any change to it re-triggers reconciliation through the Owns(&corev1.ConfigMap{}) watch we registered in SetupWithManager.
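The truthServiceClient hinted at in the comments above is not part of the scaffolded project. A minimal sketch of what it might look like is below, with a hard HTTP timeout and a small bounded retry loop so a hung Truth Service cannot stall the reconcile loop; the NewTruthServiceClient and GetConfig names, the response shape, and the retry counts are all assumptions.

// internal/controller/truth_client.go (sketch; names and endpoint shape are assumptions)
import (
    "context"
    "fmt"
    "io"
    "net/http"
    "time"
)

type TruthServiceClient struct {
    endpoint   string
    httpClient *http.Client
}

func NewTruthServiceClient(endpoint string) *TruthServiceClient {
    return &TruthServiceClient{
        endpoint: endpoint,
        // Hard timeout so a slow Truth Service fails fast instead of blocking the reconciler.
        httpClient: &http.Client{Timeout: 5 * time.Second},
    }
}

// GetConfig fetches the rendered legacy_config.ini content, retrying a few times
// with a short pause before giving up and letting the reconcile loop requeue.
func (c *TruthServiceClient) GetConfig(ctx context.Context) (string, error) {
    var lastErr error
    for attempt := 0; attempt < 3; attempt++ {
        if attempt > 0 {
            time.Sleep(time.Duration(attempt) * time.Second)
        }
        req, err := http.NewRequestWithContext(ctx, http.MethodGet, c.endpoint, nil)
        if err != nil {
            return "", err
        }
        resp, err := c.httpClient.Do(req)
        if err != nil {
            lastErr = err
            continue
        }
        body, readErr := io.ReadAll(resp.Body)
        resp.Body.Close()
        if readErr == nil && resp.StatusCode == http.StatusOK {
            return string(body), nil
        }
        if readErr != nil {
            lastErr = readErr
        } else {
            lastErr = fmt.Errorf("truth service returned status %d", resp.StatusCode)
        }
    }
    return "", fmt.Errorf("failed to fetch config after retries: %w", lastErr)
}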
Step 2: Managing the JWT Key Lifecycle
This part is security-sensitive. Key generation must be done inside the Operator, not passed in from an external source. The generated keys are stored in a Kubernetes Secret, which the application Pod consumes via a Volume Mount, avoiding the exposure of keys in environment variables.
// internal/controller/reconcile_secret.go
func (r *LegacyAppReconciler) reconcileJWTSecret(ctx context.Context, app *appv1alpha1.LegacyApp) (*corev1.Secret, error) {
    logger := log.FromContext(ctx)
    secretName := app.Spec.Backend.JWTSecretName

    secret := &corev1.Secret{
        ObjectMeta: metav1.ObjectMeta{
            Name:      secretName,
            Namespace: app.Namespace,
        },
    }

    // Check if the Secret already exists
    err := r.Get(ctx, client.ObjectKey{Name: secretName, Namespace: app.Namespace}, secret)
    if err == nil {
        // It exists, so we're done
        logger.Info("JWT Secret already exists, skipping creation.")
        return secret, nil
    }
    if !errors.IsNotFound(err) {
        // Another error occurred while getting the secret
        logger.Error(err, "Failed to get JWT Secret")
        return nil, err
    }

    // The Secret doesn't exist, so let's create it
    logger.Info("JWT Secret not found, creating a new one.")
    privateKey, err := rsa.GenerateKey(rand.Reader, 2048)
    if err != nil {
        logger.Error(err, "Failed to generate RSA private key")
        return nil, err
    }

    privateKeyBytes := x509.MarshalPKCS1PrivateKey(privateKey)
    privateKeyPEM := pem.EncodeToMemory(&pem.Block{
        Type:  "RSA PRIVATE KEY",
        Bytes: privateKeyBytes,
    })

    publicKeyBytes, err := x509.MarshalPKIXPublicKey(&privateKey.PublicKey)
    if err != nil {
        logger.Error(err, "Failed to marshal public key")
        return nil, err
    }
    publicKeyPEM := pem.EncodeToMemory(&pem.Block{
        Type:  "PUBLIC KEY",
        Bytes: publicKeyBytes,
    })

    secret.Data = map[string][]byte{
        "private.pem": privateKeyPEM,
        "public.pem":  publicKeyPEM,
    }

    // We still need to set the OwnerReference
    if err := ctrl.SetControllerReference(app, secret, r.Scheme); err != nil {
        return nil, err
    }

    if err := r.Create(ctx, secret); err != nil {
        logger.Error(err, "Failed to create new JWT Secret")
        return nil, err
    }

    logger.Info("Successfully created new JWT Secret")
    return secret, nil
}
The logic here is simple: if the Secret doesn’t exist, create it. If it exists, do nothing. A more complete implementation would include key rotation logic: for example, checking the key’s creation timestamp and, if it exceeds a certain age, generating a new key, placing it in the Secret under a different key (e.g., private-new.pem), updating the activeJwtKeyId in LegacyApp.Status, and then removing the old key after a grace period.
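To make that concrete, here is a rough sketch of what such a rotation check could look like. It is not part of the current implementation; the annotation name, the 30-day threshold, and the maybeRotateJWTKeys helper are assumptions used purely for illustration, and the cleanup of the old key after the grace period is still left out.

// internal/controller/reconcile_secret.go (rotation sketch, not implemented in the current Operator)
const (
    keyCreatedAtAnnotation = "app.techcrafter.io/key-created-at" // assumed annotation name
    keyMaxAge              = 30 * 24 * time.Hour                 // assumed rotation threshold
)

// maybeRotateJWTKeys checks the age of the active key and, if it is too old,
// writes a new key pair next to the old one so both remain valid during the grace period.
func (r *LegacyAppReconciler) maybeRotateJWTKeys(ctx context.Context, app *appv1alpha1.LegacyApp, secret *corev1.Secret) error {
    createdAt, ok := secret.Annotations[keyCreatedAtAnnotation]
    if !ok {
        return nil // Secret predates the rotation scheme; leave it alone
    }
    created, err := time.Parse(time.RFC3339, createdAt)
    if err != nil || time.Since(created) < keyMaxAge {
        return err
    }

    // Generate the replacement key pair and store it under distinct keys, so Pods
    // that still mount private.pem keep working during the transition window.
    newKey, err := rsa.GenerateKey(rand.Reader, 2048)
    if err != nil {
        return err
    }
    secret.Data["private-new.pem"] = pem.EncodeToMemory(&pem.Block{
        Type:  "RSA PRIVATE KEY",
        Bytes: x509.MarshalPKCS1PrivateKey(newKey),
    })
    pubBytes, err := x509.MarshalPKIXPublicKey(&newKey.PublicKey)
    if err != nil {
        return err
    }
    secret.Data["public-new.pem"] = pem.EncodeToMemory(&pem.Block{Type: "PUBLIC KEY", Bytes: pubBytes})
    secret.Annotations[keyCreatedAtAnnotation] = time.Now().UTC().Format(time.RFC3339)
    if err := r.Update(ctx, secret); err != nil {
        return err
    }

    // Record which key is now authoritative; the old one is removed after a grace period.
    app.Status.ActiveJwtKeyId = "private-new.pem"
    return r.Status().Update(ctx, app)
}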
Step 3: Reconciling the Backend and Frontend Deployments
This is where all the pieces come together. The backend Deployment needs to mount the dynamic config ConfigMap and the JWT Secret. The frontend Deployment needs to mount the ConfigMap containing the CSS Modules map.
graph TD
A[Reconcile Loop Triggered] --> B{LegacyApp CR};
B --> C[reconcileDynamicConfig];
C --> D[Fetch from Truth Service];
D --> E[Create/Update ConfigMap: dynamic-config];
B --> F[reconcileJWTSecret];
F --> G{Secret Exists?};
G -- No --> H[Generate RSA Key];
H --> I[Create Secret: jwt-keys];
G -- Yes --> J[Done];
I --> K;
J --> K;
subgraph "Backend Reconciliation"
K[reconcileBackendDeployment];
K --> L[Construct Deployment Spec];
L --> M[Mount dynamic-config];
L --> N[Mount jwt-keys];
M & N --> O[Create/Update Backend Deployment];
end
subgraph "Frontend Reconciliation"
B --> P[reconcileFrontendDeployment];
P --> Q[Find ConfigMap: css-module-map];
Q --> R[Construct Deployment Spec];
R --> S[Mount css-module-map];
S --> T[Create/Update Frontend Deployment];
end
Below is a snippet of the backend Deployment construction code, showing how dependencies are injected into the Pod.
// internal/controller/reconcile_deployment.go
func (r *LegacyAppReconciler) reconcileBackendDeployment(ctx context.Context, app *appv1alpha1.LegacyApp, configMap *corev1.ConfigMap, secret *corev1.Secret) (*appsv1.Deployment, error) {
// ...
deployment := &appsv1.Deployment{
ObjectMeta: metav1.ObjectMeta{
Name: fmt.Sprintf("%s-backend", app.Name),
Namespace: app.Namespace,
},
}
_, err := ctrl.CreateOrUpdate(ctx, r.Client, deployment, func() error {
// ... (set Replicas, Selector, etc.)
// Key part: define the Volumes
deployment.Spec.Template.Spec.Volumes = []corev1.Volume{
{
Name: "dynamic-config-vol",
VolumeSource: corev1.VolumeSource{
ConfigMap: &corev1.ConfigMapVolumeSource{
LocalObjectReference: corev1.LocalObjectReference{
Name: configMap.Name,
},
},
},
},
{
Name: "jwt-key-vol",
VolumeSource: corev1.VolumeSource{
Secret: &corev1.SecretVolumeSource{
SecretName: secret.Name,
},
},
},
}
// Key part: define the container and its Volume Mounts
deployment.Spec.Template.Spec.Containers = []corev1.Container{
{
Name: "backend-app",
Image: app.Spec.Backend.Image,
Ports: []corev1.ContainerPort{{ContainerPort: 8080}},
VolumeMounts: []corev1.VolumeMount{
{
Name: "dynamic-config-vol",
MountPath: "/etc/app/config", // Mount config file into the container
ReadOnly: true,
},
{
Name: "jwt-key-vol",
MountPath: "/etc/app/keys", // Mount JWT keys
ReadOnly: true,
},
},
// Tell the app where to find the key file via an environment variable
Env: []corev1.EnvVar{
{
Name: "JWT_PRIVATE_KEY_PATH",
Value: "/etc/app/keys/private.pem",
},
},
},
}
return ctrl.SetControllerReference(app, deployment, r.Scheme)
})
// ... error handling ...
return deployment, nil
}
The logic for the frontend Deployment is similar, except it mounts the ConfigMap specified by app.Spec.Frontend.CSSModuleMapName; a condensed sketch follows below. The assumption here is that the CI/CD process is responsible for pushing the classmap.json into this ConfigMap. The Operator's job is to ensure this ConfigMap is correctly mounted into the frontend service Pod so it can be read at runtime (e.g., for server-side rendering).
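A condensed sketch of that function, following the same CreateOrUpdate pattern as the backend, is shown here. Replicas, selectors, labels, and probes are elided just as in the backend snippet, and the container port and mount path are assumptions.

// internal/controller/reconcile_deployment.go (frontend sketch, same pattern as the backend)
func (r *LegacyAppReconciler) reconcileFrontendDeployment(ctx context.Context, app *appv1alpha1.LegacyApp) (*appsv1.Deployment, error) {
    deployment := &appsv1.Deployment{
        ObjectMeta: metav1.ObjectMeta{
            Name:      fmt.Sprintf("%s-frontend", app.Name),
            Namespace: app.Namespace,
        },
    }

    _, err := ctrl.CreateOrUpdate(ctx, r.Client, deployment, func() error {
        // ... (set Replicas, Selector, labels, etc.)

        // Mount the ConfigMap that CI populated with classmap.json
        deployment.Spec.Template.Spec.Volumes = []corev1.Volume{{
            Name: "css-module-map-vol",
            VolumeSource: corev1.VolumeSource{
                ConfigMap: &corev1.ConfigMapVolumeSource{
                    LocalObjectReference: corev1.LocalObjectReference{
                        Name: app.Spec.Frontend.CSSModuleMapName,
                    },
                },
            },
        }}
        deployment.Spec.Template.Spec.Containers = []corev1.Container{{
            Name:  "frontend-app",
            Image: app.Spec.Frontend.Image,
            Ports: []corev1.ContainerPort{{ContainerPort: 3000}}, // assumed port
            VolumeMounts: []corev1.VolumeMount{{
                Name:      "css-module-map-vol",
                MountPath: "/etc/app/cssmap", // classmap.json is read from here for SSR
                ReadOnly:  true,
            }},
        }}
        return ctrl.SetControllerReference(app, deployment, r.Scheme)
    })
    if err != nil {
        return nil, err
    }
    return deployment, nil
}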
Final Outcome and the Shift in Operational Paradigm
With this Operator, we completely transformed the operational model for this application.
Before:
- Ops manually triggered Puppet or used scripts.
- Puppet pulled configs from the Truth Service and wrote them to server files.
- Ops manually generated or rotated JWT keys and distributed them to application servers.
- After a frontend release, a backend config file had to be updated manually or via a script.
- The entire process was risky and difficult to roll back.
Now:
- Developers/SREs only need to maintain a single LegacyApp YAML file.
- They commit this YAML to a Git repository, and a GitOps tool (like ArgoCD) automatically applies it to the cluster with kubectl apply.
- The Operator automatically handles all the coordination: pulling dynamic configs, creating and managing keys, and deploying and configuring the frontend and backend applications.
- Updating image versions, replica counts, or even changing the name of the key Secret is just a matter of changing a field in the YAML file. The system automatically converges to the new desired state.
# A developer would apply this file to deploy/update the application
apiVersion: app.techcrafter.io/v1alpha1
kind: LegacyApp
metadata:
  name: my-legacy-service
  namespace: prod
spec:
  replicaCount: 3
  backend:
    image: my-registry/legacy-backend:v1.2.1
    jwtSecretName: legacy-service-jwt-keys
    truthServiceEndpoint: "http://truth-service.internal:8080/config/legacy-app"
  frontend:
    image: my-registry/legacy-frontend:v2.5.0
    cssModuleMapName: frontend-v2.5.0-css-map
A single kubectl apply -f command now triggers a complex, codified, and testable set of operational logic executed by the Operator.
Lingering Issues and Future Iterations
This Operator solution is not a silver bullet. First, it still depends on that legacy “Truth Service.” In the long run, the configuration logic from this external service should be gradually migrated into Kubernetes ConfigMaps or dedicated CRDs, eventually eliminating the external dependency altogether.
Second, the current JWT key management only handles the initial creation and doesn’t implement automatic rotation. A complete implementation would require adding timed logic to the reconcile loop to check key age and perform a safe, buffered rotation process, which would significantly increase the code’s complexity.
Finally, the robustness of the Operator itself needs continuous refinement. For example, failure handling for calls to external services requires more sophisticated backoff strategies (like exponential backoff). Monitoring and reporting the status of child resources back into the LegacyApp.Status field also needs more detailed design so users can clearly see each stage of the deployment and any potential issues. Unit and integration testing for the Operator are also critical to ensure the reconciliation logic works correctly under all edge cases.
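As one concrete direction for the backoff point: controller-runtime's workqueue already applies exponential backoff whenever Reconcile returns an error, so a lighter-weight option is to classify Truth Service outages as transient and requeue with an explicit delay instead of surfacing them as hard failures. The fragment below is a sketch of that idea; the errTruthServiceUnavailable sentinel is hypothetical, and stderrors aliases the standard library "errors" package to avoid clashing with the k8s.io/apimachinery/pkg/api/errors import already used in the controller file.

// internal/controller/legacyapp_controller.go (sketch: treating Truth Service outages as transient)
// errTruthServiceUnavailable would be returned by reconcileDynamicConfig when the Truth Service
// times out or responds with an error; it is a hypothetical sentinel, not existing code.
var errTruthServiceUnavailable = stderrors.New("truth service unavailable")

// Inside Reconcile, step 2 would then become:
configMap, err := r.reconcileDynamicConfig(ctx, &legacyApp)
if err != nil {
    if stderrors.Is(err, errTruthServiceUnavailable) {
        // Keep the last known-good ConfigMap and retry later instead of hammering the
        // legacy service; the delay could be made to grow with consecutive failures.
        logger.Info("Truth Service unavailable, requeueing")
        return ctrl.Result{RequeueAfter: 30 * time.Second}, nil
    }
    return ctrl.Result{}, err
}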