
OOP Fundamentals for AI Applications

Why Your AI Code Needs Better Structure


Your AI app has 15 different LLM calls scattered across 8 services. Product wants to add cost tracking per user. You start digging through the code and realize there's no single place to instrument these calls. They're embedded directly in business logic, each with slightly different error handling, different timeout values, different retry strategies.

You're looking at touching every service, every endpoint, every integration. What should've been a one-line configuration change becomes three days of hunting down call sites and praying you didn't miss any.

Everyone's building AI features. Nobody's thinking about the structure that makes those features maintainable.

Before you learn SOLID principles, before you apply design patterns, there's something more fundamental: how you organize code so changes don't cascade into full rewrites.

That's what Object-Oriented Programming gives you.


The Problem: AI Apps Have Complexity in Every Direction

AI applications aren't like typical CRUD apps. They have complexity stacked in multiple dimensions.

You're juggling multiple model types.

  1. LLMs for chat

  2. Embedding models for search

  3. Vision models for images

  4. Speech models for audio

Each has different input formats, output shapes, and failure modes.

You're integrating multiple vendors. OpenAI for production. Anthropic as a fallback. Google for specific use cases. Maybe local models for sensitive data. Each vendor has different APIs, different rate limits, different pricing.

You're supporting multiple integration patterns. Synchronous calls for chat. Streaming for real-time responses. Batch processing for bulk operations. Each pattern needs different error handling and timeout strategies.

And all of this changes rapidly. Models are deprecated with 90 days' notice. APIs introduce breaking changes. Pricing shifts. What worked last quarter might not work next quarter.

Here's what happens without proper structure:

1) Scattered logic everywhere

Your retry logic is copy-pasted across 12 files. When you need to change the backoff strategy, you edit 12 places. You miss 3. Production breaks in subtle ways.

2) No boundaries between concerns

Your prompt engineering code directly manipulates HTTP clients. A bug in error handling crashes your prompt builder. You spend an hour debugging why a typo in a header breaks template rendering.

3) Leaky abstractions

Your business logic knows whether it's calling GPT-4 or Claude. It knows about token limits and context windows. A simple model swap requires changing orchestration code across your entire pipeline.

4) Copy-paste maintenance hell

You built OpenAI integration. It works great. Now you need to add Anthropic. You duplicate 200 lines of code and maintain two nearly identical versions forever. A bug fix in one doesn't automatically apply to the other.

There's this idea floating around that AI code is fundamentally different, that traditional programming principles don't apply. That's backwards. AI code has more moving parts than typical applications. More providers. More models. More ways things can fail. Without structure, you're building a house of cards where every change risks collapsing the entire stack.

Object-oriented programming gives you tools to manage this complexity. Not as academic theory. As practical engineering.


What is OOP? (The Practical Version)

Object-Oriented Programming is about organizing code into objects that bundle data and behavior together. Four core concepts give you the leverage you need:

Encapsulation:

Hide internal state and expose clean interfaces. Your LLM client has complex retry logic, rate limiting, and token tracking inside. But from the outside? Just a simple .complete() method. Callers don't need to know how it works, just what it does.

Abstraction:

Show only what matters and hide how it works. Your code calls chatService.complete(prompt). It doesn't care if that's hitting OpenAI, Claude, or a local model. It doesn't care about HTTP clients or JSON parsing. It just wants an answer.

Inheritance:

Share behavior across related classes. All your AI model integrations need rate limiting, exponential backoff, timeout handling, and circuit breaking. Write that once in a base class. Every specific integration inherits it automatically.

Polymorphism:

Same interface, different implementations. Your code calls model.predict(input). At runtime that might be GPT-4, Claude, or a fallback mock during testing. Same method call, different behavior based on the actual object type.

These aren't about following "proper OOP style" or making your code look pretty. They're tools for managing change. And AI applications? They change constantly. Models update. Vendors shift. Requirements evolve. These concepts make change cheap instead of expensive.


Encapsulation: Hide Complexity Behind Clean Interfaces

The core idea:

Bundle related data and behavior together. Hide the messy details. Expose only what callers actually need.

In AI systems, this shows up with cross-cutting concerns. Token counting, rate limiting, cost tracking, retry logic: each is complex, and every caller needs all of them. If every place that calls an LLM has to handle these concerns itself, you've got duplication and fragility everywhere.

Here's what happens without encapsulation:

// ❌ Every caller handles complexity
@Service
public class ChatService {
    private final OpenAI openAI;
    private final TokenCounter tokenCounter;
    private final CostTracker costTracker;
    private final RateLimiter rateLimiter;

    public String generateResponse(String userId, String prompt) {
        // Every caller does this manually
        rateLimiter.waitForCapacity();

        int inputTokens = tokenCounter.count(prompt);
        String response = openAI.complete(prompt);
        int outputTokens = tokenCounter.count(response);

        double cost = (inputTokens * 0.00003) + (outputTokens * 0.00006);
        costTracker.record(userId, cost);

        return response;
    }
}

@Service  
public class SummaryService {
    // Same pattern duplicated
    public String summarize(String userId, String text) {
        rateLimiter.waitForCapacity();
        int inputTokens = tokenCounter.count(text);
        // ... repeated logic
    }
}

Now product wants per-user cost tracking. You're touching every service. Then they want to add a spending cap. Another round of edits. Then they want detailed token analytics. You're editing the same 15 files for the third time this month.

Here's the encapsulated version:

// ✅ All complexity hidden inside LLMClient
@Component
public class LLMClient {
    private final OpenAI openAI;
    private final TokenCounter tokenCounter;
    private final CostTracker costTracker;
    private final RateLimiter rateLimiter;

    public LLMResponse complete(String userId, String prompt) {
        rateLimiter.waitForCapacity();

        int inputTokens = tokenCounter.count(prompt);
        String response = openAI.complete(prompt);
        int outputTokens = tokenCounter.count(response);

        double cost = calculateCost(inputTokens, outputTokens);
        costTracker.record(userId, cost, inputTokens, outputTokens);

        return new LLMResponse(response, inputTokens, outputTokens, cost);
    }

    private double calculateCost(int input, int output) {
        // Per-token rates live in one place; update here when pricing changes
        return (input * 0.00003) + (output * 0.00006);
    }

    public UsageStats getUsageStats(String userId) {
        return costTracker.getStats(userId);
    }
}

// Now callers are simple
@Service
public class ChatService {
    private final LLMClient llmClient;

    public String generateResponse(String userId, String prompt) {
        return llmClient.complete(userId, prompt).getText();
    }
}

All the complexity lives in one place. Rate limiting? Inside LLMClient. Token counting? Inside LLMClient. Cost tracking? Inside LLMClient. When you need to add spending caps or detailed analytics, you change one class. Every caller automatically gets the new behavior.

Quick win:

Next time you're about to copy-paste infrastructure logic (retries, logging, metrics), stop. Create a class that encapsulates that logic. Make callers use the class instead of reimplementing it.
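Here's a minimal, framework-free sketch of that quick win, using cost accounting as the example. The class name and the per-token rates are illustrative, not real pricing; the point is that callers record usage through one method instead of repeating the arithmetic at every call site.

```java
import java.util.HashMap;
import java.util.Map;

// Infrastructure logic (cost accounting) pulled into one class
// instead of being copy-pasted across services.
class CostMeter {
    // Illustrative per-token rates; real pricing varies by model and vendor.
    private static final double INPUT_RATE = 0.00003;
    private static final double OUTPUT_RATE = 0.00006;

    private final Map<String, Double> spendByUser = new HashMap<>();

    // The only method callers need: record one call's usage, get its cost back.
    double record(String userId, int inputTokens, int outputTokens) {
        double cost = inputTokens * INPUT_RATE + outputTokens * OUTPUT_RATE;
        spendByUser.merge(userId, cost, Double::sum);
        return cost;
    }

    double totalFor(String userId) {
        return spendByUser.getOrDefault(userId, 0.0);
    }
}
```

When product asks for spending caps or rate changes, this class is the single place to edit, and every caller picks up the new behavior for free.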

When to skip it:

Single-use scripts or prototype code where you're just testing if something works. But the moment you have two call sites? Encapsulate.


Abstraction: Hide Implementation Details

The core idea:

Define what something does without specifying how it does it. Callers depend on the interface, not the implementation.

In AI systems, this is your defense against vendor lock-in and API churn. Your business logic should care about "moderate this content" not about "call the OpenAI Moderation API endpoint with these specific headers and parse this specific JSON response format."

Here's the coupling problem:

// ❌ Business logic knows too much about OpenAI
@Service
public class ContentPipeline {
    private final RestTemplate restTemplate;

    public void processUserContent(String content) {
        // Business logic coupled to OpenAI API details
        HttpHeaders headers = new HttpHeaders();
        headers.setBearerAuth(openAIKey);
        headers.setContentType(MediaType.APPLICATION_JSON);

        Map<String, Object> request = Map.of("input", content);
        HttpEntity<Map<String, Object>> entity = new HttpEntity<>(request, headers);

        ResponseEntity<Map> response = restTemplate.postForEntity(
            "https://api.openai.com/v1/moderations",
            entity,
            Map.class
        );

        Map<String, Object> result = response.getBody();
        List<Map<String, Object>> results = (List<Map<String, Object>>) result.get("results");
        boolean flagged = (boolean) results.get(0).get("flagged");

        if (flagged) {
            rejectContent(content);
        } else {
            publishContent(content);
        }
    }
}

This code knows about HTTP clients. It knows about OpenAI's exact endpoint structure. It knows how to parse their JSON response. Now OpenAI changes their API. Or you want to try a different moderation service. Or you want to use a custom fine-tuned model. Every change means editing this business logic.

Here's the abstracted version:

// ✅ Business logic depends on abstraction
public interface ContentModerationService {
    ModerationResult moderate(String content);
}

public class ModerationResult {
    private final boolean safe;
    private final List<String> categories;
    private final double confidence;

    // constructor, getters
}

@Component
public class OpenAIModerationService implements ContentModerationService {
    private final RestTemplate restTemplate;
    private final String apiKey;

    @Override
    public ModerationResult moderate(String content) {
        HttpHeaders headers = new HttpHeaders();
        headers.setBearerAuth(apiKey);

        Map<String, Object> request = Map.of("input", content);
        HttpEntity<Map<String, Object>> entity = new HttpEntity<>(request, headers);

        ResponseEntity<Map> response = restTemplate.postForEntity(
            "https://api.openai.com/v1/moderations",
            entity,
            Map.class
        );

        // Parse OpenAI-specific response format
        Map<String, Object> result = response.getBody();
        List<Map<String, Object>> results = (List<Map<String, Object>>) result.get("results");
        boolean flagged = (boolean) results.get(0).get("flagged");

        return new ModerationResult(
            !flagged,
            extractCategories(result),
            extractConfidence(result)
        );
    }
}

@Service
public class ContentPipeline {
    private final ContentModerationService moderationService;

    public void processUserContent(String content) {
        ModerationResult result = moderationService.moderate(content);

        if (result.isSafe()) {
            publishContent(content);
        } else {
            rejectContent(content);
        }
    }
}

Now your business logic is clean. It calls .moderate() and gets a result. It doesn't know anything about HTTP or JSON or OpenAI. Want to swap providers? Write a new implementation of ContentModerationService. Change one line in your Spring configuration. Done. Want to test without API calls? Inject a mock implementation. Your content pipeline code never changes.
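The "inject a mock" claim is easy to demonstrate. Below is a stripped-down sketch with the Spring and HTTP details removed; the class and method names mirror the article's shapes but are otherwise illustrative. The pipeline only sees the interface, so a test double (or a second vendor) drops in without the pipeline changing.

```java
import java.util.ArrayList;
import java.util.List;

interface ContentModerationService {
    ModerationResult moderate(String content);
}

class ModerationResult {
    private final boolean safe;
    ModerationResult(boolean safe) { this.safe = safe; }
    boolean isSafe() { return safe; }
}

// Test double: no network calls, no API credits burned.
class StubModerationService implements ContentModerationService {
    public ModerationResult moderate(String content) {
        return new ModerationResult(!content.contains("spam"));
    }
}

class ContentPipeline {
    private final ContentModerationService moderation;
    private final List<String> published = new ArrayList<>();

    ContentPipeline(ContentModerationService moderation) { this.moderation = moderation; }

    void processUserContent(String content) {
        // Business logic knows "is it safe?", nothing about vendors or JSON.
        if (moderation.moderate(content).isSafe()) published.add(content);
    }

    List<String> published() { return published; }
}
```

Swapping in a real vendor implementation is the same one-line change: pass a different ContentModerationService to the constructor.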

Quick win:

If your services import vendor SDKs or HTTP clients directly, extract an interface. Move all the messy integration details into an implementation class.

When to skip it:

If you know with absolute certainty you'll never change providers and the API is stable, the indirection might not be worth it. But APIs change. Vendors sunset products. Plan accordingly.


Inheritance: Share Behavior Across Related Classes

The core idea:

Define common behavior in a parent class. Child classes inherit that behavior and add their own specifics.

In AI systems, this shows up with reliability patterns. Every AI model integration needs exponential backoff when rate limited. Every integration needs timeout handling. Every integration needs circuit breaking to prevent cascading failures. You don't want to implement this 5 times.

Here's the duplication:

// ❌ Every client reimplements retry logic
@Component
public class OpenAIClient {
    public String complete(String prompt) {
        int attempts = 0;
        while (attempts < 3) {
            try {
                return callOpenAI(prompt);
            } catch (RateLimitException e) {
                attempts++;
                sleep((long) (Math.pow(2, attempts) * 1000));
            } catch (TimeoutException e) {
                attempts++;
                sleep(1000);
            }
        }
        throw new AIServiceException("Max retries exceeded");
    }
}

@Component
public class ClaudeClient {
    public String complete(String prompt) {
        // Same retry logic duplicated
        int attempts = 0;
        while (attempts < 3) {
            try {
                return callClaude(prompt);
            } catch (RateLimitException e) {
                attempts++;
                sleep((long) (Math.pow(2, attempts) * 1000));
            } catch (TimeoutException e) {
                attempts++;
                sleep(1000);
            }
        }
        throw new AIServiceException("Max retries exceeded");
    }
}

You've got the same 20 lines in multiple classes. Then you discover a bug in the backoff calculation. Now you're fixing it in 5 places. Or you want to add jitter to prevent thundering herd. Another round of edits everywhere.

Here's the shared behavior:

// ✅ Common behavior in base class
public abstract class BaseAIClient {
    private static final int MAX_RETRIES = 3;
    private static final long BASE_DELAY_MS = 1000;

    protected String executeWithRetry(Supplier<String> operation) {
        int attempts = 0;
        while (attempts < MAX_RETRIES) {
            try {
                return operation.get();
            } catch (RateLimitException e) {
                attempts++;
                if (attempts >= MAX_RETRIES) throw new AIServiceException("Max retries exceeded");
                sleep(calculateBackoff(attempts));
            } catch (TimeoutException e) {
                attempts++;
                if (attempts >= MAX_RETRIES) throw new AIServiceException("Max retries exceeded");
                sleep(BASE_DELAY_MS);
            }
        }
        throw new AIServiceException("Max retries exceeded");
    }

    private long calculateBackoff(int attempt) {
        long exponentialDelay = (long) Math.pow(2, attempt) * BASE_DELAY_MS;
        long jitter = (long) (Math.random() * BASE_DELAY_MS);
        return exponentialDelay + jitter;
    }

    private void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new AIServiceException("Interrupted while backing off");
        }
    }

    protected abstract String callModel(String prompt);
}

@Component
public class OpenAIClient extends BaseAIClient {
    @Override
    protected String callModel(String prompt) {
        // Only OpenAI-specific logic
        return openAI.chat()
            .model("gpt-4")
            .message(prompt)
            .execute()
            .getContent();
    }

    public String complete(String prompt) {
        return executeWithRetry(() -> callModel(prompt));
    }
}

@Component
public class ClaudeClient extends BaseAIClient {
    @Override
    protected String callModel(String prompt) {
        // Only Claude-specific logic
        return anthropic.messages()
            .model("claude-sonnet-4")
            .userMessage(prompt)
            .execute()
            .getText();
    }

    public String complete(String prompt) {
        return executeWithRetry(() -> callModel(prompt));
    }
}

Now all the reliability logic lives in one place. Every client automatically gets retries, exponential backoff, and jitter. Fix a bug in BaseAIClient? Every child class inherits the fix. Add circuit breaking? One implementation, universal benefit.
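Because the backoff math lives in one method, it can also be verified in isolation. Here's that calculation extracted into a tiny standalone class (the class name is mine; the constants and formula mirror BaseAIClient): delay doubles per attempt, plus up to one base delay of random jitter.

```java
// The backoff math from the base class, isolated for direct testing.
class Backoff {
    static final long BASE_DELAY_MS = 1000;

    static long calculate(int attempt) {
        // Exponential growth: 2s, 4s, 8s, ... for attempts 1, 2, 3, ...
        long exponential = (long) Math.pow(2, attempt) * BASE_DELAY_MS;
        // Random jitter in [0, BASE_DELAY_MS) spreads out retries so
        // many clients rate-limited at once don't all retry together.
        long jitter = (long) (Math.random() * BASE_DELAY_MS);
        return exponential + jitter;
    }
}
```

A test can't pin the exact value because of the jitter, but it can pin the bounds: attempt 1 always waits between 2 and 3 seconds, attempt 3 between 8 and 9.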

Quick win:

If you're copy-pasting infrastructure patterns across similar classes, extract a base class. Put the common behavior there. Let child classes focus on what's actually different.

When to skip it:

If the classes aren't actually related or the shared behavior is trivial (like a single utility method), composition might be cleaner than inheritance. Use inheritance when there's real shared behavior and a clear "is-a" relationship.
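For completeness, here's what that composition alternative can look like: instead of extending a base class, each client receives a retry helper as a collaborator. Same reliability behavior, no "is-a" relationship required. The names here are illustrative, and exceptions are simplified to RuntimeException.

```java
import java.util.function.Supplier;

// Retry behavior as a reusable collaborator rather than a superclass.
class RetryExecutor {
    private final int maxRetries;

    RetryExecutor(int maxRetries) { this.maxRetries = maxRetries; }

    <T> T execute(Supplier<T> operation) {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxRetries; attempt++) {
            try {
                return operation.get();
            } catch (RuntimeException e) {
                last = e; // remember the failure, try again
            }
        }
        throw new RuntimeException("Max retries exceeded", last);
    }
}
```

A client would hold a RetryExecutor field and wrap its vendor call in executor.execute(...), which also lets unrelated classes (a payment client, a webhook sender) reuse the same retries without joining an AI-client hierarchy.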


Polymorphism: Same Interface, Different Behavior

The core idea:

Write code that works with a type, then at runtime provide any implementation of that type. Same method calls, different behavior based on the actual object.

In AI systems, this is how you build extensible agents and tool systems. Your agent shouldn't have hardcoded if-else chains for every tool. It should work with a Tool interface. Adding new tools means adding new classes, not editing the core orchestration logic.

Here's the brittle approach:

// ❌ Hardcoded tool dispatch
@Service
public class AgentOrchestrator {
    private final BingSearchService bingSearch;
    private final CalculatorService calculator;
    private final WeatherService weather;

    public String executeTool(String toolName, Map<String, Object> params) {
        if (toolName.equals("search")) {
            String query = (String) params.get("query");
            return bingSearch.search(query);
        } else if (toolName.equals("calculator")) {
            String expression = (String) params.get("expression");
            return calculator.evaluate(expression);
        } else if (toolName.equals("weather")) {
            String city = (String) params.get("city");
            return weather.getForecast(city);
        } else {
            throw new IllegalArgumentException("Unknown tool: " + toolName);
        }
    }
}

Product wants to add a database query tool. You edit AgentOrchestrator. Then they want a code execution tool. Another edit. Then an email tool. You're constantly modifying core orchestration logic. Every change risks breaking existing tools.

Here's the polymorphic version:

// ✅ Tool interface enables extension
public interface Tool {
    String getName();
    String getDescription();
    ToolResult execute(Map<String, Object> params);
}

public class ToolResult {
    private final boolean success;
    private final String output;
    private final String error;

    // constructor, getters
}

@Component
public class SearchTool implements Tool {
    private final BingSearchService bingSearch;

    @Override
    public String getName() {
        return "search";
    }

    @Override
    public String getDescription() {
        return "Search the web for information";
    }

    @Override
    public ToolResult execute(Map<String, Object> params) {
        try {
            String query = (String) params.get("query");
            String results = bingSearch.search(query);
            return new ToolResult(true, results, null);
        } catch (Exception e) {
            return new ToolResult(false, null, e.getMessage());
        }
    }
}

@Component
public class CalculatorTool implements Tool {
    @Override
    public String getName() {
        return "calculator";
    }

    @Override
    public String getDescription() {
        return "Evaluate mathematical expressions";
    }

    @Override
    public ToolResult execute(Map<String, Object> params) {
        try {
            String expression = (String) params.get("expression");
            double result = evaluateExpression(expression);
            return new ToolResult(true, String.valueOf(result), null);
        } catch (Exception e) {
            return new ToolResult(false, null, e.getMessage());
        }
    }
}

@Service
public class AgentOrchestrator {
    private final List<Tool> tools;

    public AgentOrchestrator(List<Tool> tools) {
        this.tools = tools;
    }

    public String executeTool(String toolName, Map<String, Object> params) {
        Tool tool = tools.stream()
            .filter(t -> t.getName().equals(toolName))
            .findFirst()
            .orElseThrow(() -> new IllegalArgumentException("Unknown tool: " + toolName));

        ToolResult result = tool.execute(params);
        if (result.isSuccess()) {
            return result.getOutput();
        } else {
            throw new RuntimeException("Tool execution failed: " + result.getError());
        }
    }

    public List<String> listAvailableTools() {
        return tools.stream()
            .map(t -> t.getName() + ": " + t.getDescription())
            .collect(Collectors.toList());
    }
}

Now adding a new tool is just adding a new class that implements Tool. Spring's autowiring automatically injects it into the list. The orchestrator never changes. No if-else chains. No risk of breaking existing tools. Your agent scales from 3 tools to 30 tools without touching core logic.
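The dispatch above can be sketched without Spring in a few lines. This version builds a name-to-tool map up front instead of scanning a list per call; the echo and upper tools are placeholders standing in for real integrations, and the simplified Tool signature here is mine.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

interface Tool {
    String name();
    String execute(String input);
}

class Orchestrator {
    private final Map<String, Tool> tools;

    Orchestrator(List<Tool> tools) {
        // Index tools by name once; lookup is then O(1) per call.
        this.tools = tools.stream()
            .collect(Collectors.toMap(Tool::name, Function.identity()));
    }

    String executeTool(String name, String input) {
        Tool tool = tools.get(name);
        if (tool == null) throw new IllegalArgumentException("Unknown tool: " + name);
        return tool.execute(input); // polymorphic dispatch: no if-else chain
    }
}

// Placeholder tools; a real one might wrap a search API or a database.
class EchoTool implements Tool {
    public String name() { return "echo"; }
    public String execute(String input) { return input; }
}

class UpperTool implements Tool {
    public String name() { return "upper"; }
    public String execute(String input) { return input.toUpperCase(); }
}
```

Adding a tool means adding one class to the constructor's list (or, with Spring, just annotating it @Component); Orchestrator itself never changes.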

Quick win:

If you're writing if-else chains or switch statements to handle different implementations, replace them with polymorphism. Define an interface. Make each case an implementation. Let the type system handle dispatch.

When to skip it:

If you truly have only 2-3 cases that will never grow, a simple conditional might be clearer. But the moment you're adding cases frequently, refactor to polymorphism.


How Each Concept Protects Your AI System

| Concept | What It Protects Against | Velocity Gain | Cost Savings |
| --- | --- | --- | --- |
| Encapsulation | Duplicated infrastructure logic across services | Add cost tracking in 1 place, not 15 | Centralized optimization of token usage |
| Abstraction | Vendor API changes breaking business logic | Swap providers via config, not rewrites | Test with mocks, not real API credits |
| Inheritance | Re-implementing reliability patterns everywhere | Fix retry bugs once, all clients benefit | Less code means fewer production incidents |
| Polymorphism | Brittle if-else chains for extensibility | Add AI tools/models as plugins, zero edits to core | A/B test providers without branching logic |

Each concept reduces the blast radius of change. Fewer files to touch. Less risk. Faster shipping. That's the math that matters.


When This Actually Matters

OOP isn't about building perfect class hierarchies. It's about containing change. And AI applications have more volatility than typical software.

Models update quarterly. Claude Opus becomes Claude Sonnet 4. GPT-4 becomes GPT-5. Each update changes pricing, context windows, and behavior. Your code needs to adapt without a full rewrite.

Vendors change APIs. OpenAI deprecates endpoints. Anthropic introduces new parameters. Google changes authentication. If these changes ripple through your entire codebase, you're spending more time on maintenance than features.

Requirements shift constantly. Marketing wants per-user cost caps. Sales wants usage analytics. Product wants A/B testing between models. Each requirement should be a localized change, not a system-wide refactor.

Here's the honest breakdown. Building a weekend prototype to validate an AI feature? Write flat procedural code. Get it working. Learn fast. Structure doesn't matter yet.

But if you're running in production with real users and real costs, you need boundaries. Because without encapsulation, adding cost tracking touches 15 files. Without abstraction, swapping models requires rewriting business logic. Without inheritance, you're duplicating reliability patterns and introducing bugs. Without polymorphism, your agent system becomes an unmaintainable if-else nightmare.

The real test is simple. Can you add detailed token analytics in under an hour? Can you swap from OpenAI to Claude by changing one config file? Can you add a new agent tool without touching orchestration code?

If the answer is no, you're fighting your own architecture. These four concepts fix that. Not as theory. As practical tools that make change cheap instead of expensive.

Harsh


Need help with your AI architecture? Let’s talk

harsh@pragmaticbyharsh.com