Feb 2026/25 min

Building a Production-Grade IT Ticketing System: Django, React, Celery, and AI-Powered Ticket Analysis

A comprehensive technical walkthrough of designing and building an enterprise IT support platform with async task processing, 4-language i18n, RAG-powered ticket analysis, and a full observability stack with Prometheus and Grafana.

Django · React · Celery · Docker · Full-Stack

Enterprise IT support platforms are deceptively complex. What appears to be a simple CRUD application — create a ticket, assign it, resolve it — quickly balloons into a system that must handle asynchronous processing, multi-language support, AI-powered classification, comprehensive audit trails, and full-stack observability. This post documents the architecture, implementation decisions, and hard-won lessons from building such a system from scratch.

Why Build This?

Most IT ticketing demos are glorified todo lists. They demonstrate CRUD operations and call it a day. I wanted to build something that reflects the actual complexity of enterprise software: background task processing that doesn't block the user, internationalisation that goes beyond string replacement, AI analysis that adds genuine value rather than being a gimmick, and monitoring that gives operators real visibility into system health.

The result is a 10-service Docker Compose stack that handles the full lifecycle of IT support tickets — from submission through AI analysis to resolution — with production-grade observability and a four-language frontend.

Architecture Overview

┌─────────────────────────────────────────────────┐
│              Frontend (React + MUI)              │
│  Port 80 │ Vite build │ nginx │ 4-lang i18n     │
└──────────────────────┬──────────────────────────┘
                       │ REST API
┌──────────────────────▼──────────────────────────┐
│           Django + DRF Backend                    │
│  Port 8000 │ ViewSets │ Serializers │ Signals    │
├──────────────┬──────────────┬───────────────────┤
│  PostgreSQL  │    Redis     │   Celery Worker    │
│  Port 5432   │  Port 6379   │  AI Analysis       │
│  5 Models    │  Msg Queue   │  Embeddings        │
│  Audit Trail │  Results     │  LLaMA 3.2 / GPT   │
└──────────────┴──────────────┴───────────────────┘
┌─────────────────────────────────────────────────┐
│           Monitoring Stack                        │
│  Prometheus (9090) │ Grafana (3001)               │
│  Node Exporter │ Postgres Exporter │ Redis Exp.   │
└─────────────────────────────────────────────────┘

10 Docker services: PostgreSQL 16, Redis 7, Django backend, Celery worker, React/nginx frontend, Prometheus, Grafana, Node Exporter, PostgreSQL Exporter, Redis Exporter.

Deep Dive: Data Model Design

The Ticket Model

The Ticket model is the centrepiece, and its design reflects several real-world requirements that simple demos ignore.

Auto-generated ticket numbers: Each ticket gets a human-readable identifier in the format `TKYYYYMMDDxxx` (e.g., TK20260228001). This is generated atomically in the serializer's `create()` method. The format encodes the creation date, making tickets sortable by date from the number alone, which is useful for phone support where agents reference tickets verbally.
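
As a rough illustration of the format only (not the serializer's actual code), the number can be built from the creation date plus a zero-padded daily sequence. The function names and the in-memory list of already-issued numbers are assumptions made for this sketch. In a real deployment the sequence would have to come from the database, inside a transaction or with a retry on the unique constraint, for the generation to stay atomic.

```python
from datetime import date


def format_ticket_number(day: date, sequence: int) -> str:
    """Build a TKYYYYMMDDxxx identifier from a date and a 1-based daily sequence."""
    if not 1 <= sequence <= 999:
        # Three digits leave room for at most 999 tickets per day.
        raise ValueError("daily sequence must be between 1 and 999")
    return f"TK{day:%Y%m%d}{sequence:03d}"


def next_ticket_number(existing_numbers: list[str], day: date) -> str:
    """Return the next free number for `day`, given the numbers already issued."""
    prefix = f"TK{day:%Y%m%d}"
    todays = [int(n[len(prefix):]) for n in existing_numbers if n.startswith(prefix)]
    return format_ticket_number(day, max(todays, default=0) + 1)
```

One consequence of the three-digit suffix is a ceiling of 999 tickets per day, which the sketch surfaces as an explicit error rather than silently producing a longer number.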

Snapshot fields: The model includes `employee_name_snapshot` and `department_snapshot` alongside the `employee_id` reference. Why? Because employee names and departments change. If Sarah Chen from Engineering submits a ticket, gets transferred to Product a month later, and IT then resolves the ticket, the ticket should still show "Sarah Chen, Engineering". Snapshot fields capture the state at creation time, not the current state.

python

class Ticket(models.Model):
    ticket_number = models.CharField(max_length=20, unique=True)
    employee_id = models.CharField(max_length=50)
    employee_name_snapshot = models.CharField(max_length=100)
    department_snapshot = models.CharField(max_length=100)
    # ... status, category, priority, timestamps
    embedding = models.JSONField(null=True, blank=True)

Embedding field: The JSONField stores a vector representation of the ticket's title and description. This enables semantic similarity search for finding related historical tickets. I chose to store embeddings directly in PostgreSQL rather than using a separate vector database — the corpus is small enough that cosine similarity computation on retrieval is fast, and it avoids the operational complexity of an additional service.

Database indexes: Strategic indexes on `employee_id`, `status`, `category`, and `created_at` optimise the most common query patterns: "show me all my tickets", "show me all open tickets", "show me all hardware tickets this month".

The TicketHistory Model — Complete Audit Trail

Every state change is recorded:

python

class TicketHistory(models.Model):
    ticket = models.ForeignKey(Ticket, on_delete=models.CASCADE)
    changed_field = models.CharField(max_length=100)
    old_value = models.TextField(null=True)
    new_value = models.TextField(null=True)
    changed_by = models.CharField(max_length=50)
    comment = models.TextField(blank=True)
    created_at = models.DateTimeField(auto_now_add=True)

This isn't just for compliance; it's operationally essential. When a user complains "my ticket was closed without resolution", IT managers can see the complete history: who changed what, when, and why. The `comment` field allows staff to annotate changes ("Closing per user confirmation via email").
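
To make the "who changed what" idea concrete, here is a minimal sketch of turning a before/after comparison into history entries shaped like the TicketHistory fields. The helper name `diff_to_history` and the use of plain dictionaries are assumptions for illustration; the post does not show how the real system records changes.

```python
def diff_to_history(old: dict, new: dict, changed_by: str, comment: str = "") -> list[dict]:
    """Return one history entry per field whose value differs between old and new."""
    entries = []
    for field in sorted(old.keys() | new.keys()):
        before, after = old.get(field), new.get(field)
        if before != after:
            entries.append({
                "changed_field": field,
                # TicketHistory stores values as text, so stringify non-null values.
                "old_value": None if before is None else str(before),
                "new_value": None if after is None else str(after),
                "changed_by": changed_by,
                "comment": comment,
            })
    return entries
```

Each returned dictionary maps directly onto one TicketHistory row, so unchanged fields never generate noise in the audit trail.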

The AIResponse Model — AI Analysis Results

Each ticket can have an associated AI analysis:

python

class AIResponse(models.Model):
    ticket = models.ForeignKey(Ticket, on_delete=models.CASCADE)
    suggested_category = models.CharField(max_length=50)
    confidence_score = models.FloatField()
    suggested_solution = models.TextField()
    similar_tickets = models.JSONField(default=list)
    model_used = models.CharField(max_length=100)
    processing_time_ms = models.IntegerField(null=True)

Key design decision: storing the `model_used` field. When you're running AI analysis across hundreds of tickets, you need to know which model produced each suggestion, especially when evaluating whether switching models improves quality. The `processing_time_ms` field enables performance monitoring without diving into logs.

The KnowledgeBase Model

The knowledge base stores reusable IT solutions and documentation:

python

class KnowledgeBase(models.Model):
    title = models.CharField(max_length=200)
    content = models.TextField()
    category = models.CharField(max_length=50)
    embedding = models.JSONField(null=True, blank=True)
    usage_count = models.IntegerField(default=0)
    success_rate = models.FloatField(default=0.0)
    tags = models.JSONField(default=list)

The `usage_count` and `success_rate` fields track which knowledge base articles actually help resolve tickets. Over time, this data can inform which articles need updating and which are most valuable, a feedback loop that most ticketing systems lack.
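
One way such a feedback loop could be maintained, sketched here as an assumption rather than the project's actual logic, is an incremental mean: each time an article is used, the outcome is folded into the running success rate without storing every past outcome. The function name is hypothetical.

```python
def record_article_outcome(usage_count: int, success_rate: float, resolved: bool) -> tuple[int, float]:
    """Fold one outcome into a running success rate without storing past outcomes."""
    new_count = usage_count + 1
    # Incremental mean: previous successes plus this outcome, over the new total.
    new_rate = (success_rate * usage_count + (1.0 if resolved else 0.0)) / new_count
    return new_count, new_rate
```

The two return values correspond to the `usage_count` and `success_rate` columns, so an update touches only a single row.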

Deep Dive: Asynchronous Task Processing with Celery

Why Celery?

AI analysis of a ticket — generating embeddings, searching for similar tickets, querying the knowledge base, calling the LLM — takes 10-20 seconds. If this ran synchronously in the HTTP request handler, the user would stare at a loading spinner for 20 seconds after submitting a ticket. That's unacceptable.

Celery decouples the submission from the analysis. The user submits a ticket, gets an immediate HTTP 201 response, and the AI analysis runs asynchronously in a background worker. When the user navigates to the ticket detail page, the AI suggestions appear (or show "Analysis in progress" if still running).

The Analysis Pipeline

python

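from celery import shared_task

from .models import AIResponse, Ticket  # module path assumed


@shared_task
def analyze_ticket_task(ticket_id):
    ticket = Ticket.objects.get(id=ticket_id)

    # 1. Generate embedding from title + description
    embedding = generate_embedding(f"{ticket.title} {ticket.description}")
    ticket.embedding = embedding
    # Write only the embedding column so edits made meanwhile aren't overwritten
    ticket.save(update_fields=["embedding"])

    # 2. Find similar historical tickets as (ticket, similarity) pairs.
    #    This ticket now has an embedding too, so the search can return it
    #    unless find_similar_tickets filters it out.
    similar = find_similar_tickets(embedding, threshold=0.5, top_n=3)

    # 3. Search knowledge base for relevant articles
    kb_articles = find_similar_knowledge(embedding, category=ticket.category)

    # 4. Call LLM with context
    analysis = analyze_with_llm(ticket, similar, kb_articles)

    # 5. Store results
    AIResponse.objects.create(
        ticket=ticket,
        suggested_category=analysis["category"],
        confidence_score=analysis["confidence"],
        suggested_solution=analysis["solution"],
        # Unpack each (ticket, similarity) pair to get at the ticket number
        similar_tickets=[t.ticket_number for t, _score in similar],
        model_used="llama3.2-3b-local",
    )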

Embedding Generation and Similarity Search

I use `sentence-transformers` to generate dense vector embeddings. The embedding captures the semantic meaning of the ticket content, enabling similarity search that goes beyond keyword matching.

Similarity search uses cosine similarity computed directly in Python:

python

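def find_similar_tickets(query_embedding, threshold=0.5, top_n=3):
    # Use isnull here: on a JSONField, exclude(embedding=None) matches JSON null
    # rather than SQL NULL, so tickets without an embedding would slip through.
    tickets_with_embeddings = Ticket.objects.filter(embedding__isnull=False)
    similarities = []
    for ticket in tickets_with_embeddings:
        sim = cosine_similarity(query_embedding, ticket.embedding)
        if sim > threshold:
            similarities.append((ticket, sim))
    # Highest similarity first; each result is a (ticket, similarity) pair
    return sorted(similarities, key=lambda x: x[1], reverse=True)[:top_n]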

For the current dataset size (hundreds of tickets), this brute-force approach is fast enough. For production at scale, I'd migrate to a proper vector index (pgvector extension for PostgreSQL, or a dedicated vector database).
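
The `cosine_similarity` helper called in the search function is not shown in the post. A minimal pure-Python sketch, assuming embeddings are stored as plain lists of floats (as they would be when read back from a JSONField), could look like this:

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors, in [-1, 1]."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as unrelated.
        return 0.0
    return dot / (norm_a * norm_b)
```

For larger corpora a vectorised NumPy version would be the natural next step before reaching for a vector index, but the arithmetic is the same.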

LLM Integration with Fallback Strategy

The primary AI model is LLaMA 3.2 3B running locally via Ollama. This is a deliberate architectural choice:

No API costs — every analysis is free after the initial model download
Data privacy — ticket content never leaves the local network
No external dependency — the system doesn't break when an API provider has an outage

The fallback strategy is equally important: if the LLM is unavailable (Ollama not running, GPU out of memory), the system falls back to rule-based category detection using keyword matching:

python

def analyze_category(title, description):
    text = f"{title} {description}".lower()
    if any(w in text for w in ["printer", "monitor", "keyboard", "hardware"]):
        return "hardware"
    if any(w in text for w in ["install", "crash", "update", "software"]):
        return "software"
    if any(w in text for w in ["wifi", "vpn", "network", "internet"]):
        return "network"
    # ... more categories
    return "other"

The fallback generates category-specific solution templates rather than LLM-generated solutions. The confidence score for fallback analysis is fixed at 0.88: high enough to be useful, and recognisable as a constant, whereas LLM-generated scores vary across the 0 to 1 range.

This dual-mode approach means the system always provides suggestions, even when the AI model is unavailable. In enterprise environments, reliability trumps sophistication.
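
The dual-mode behaviour can be sketched as a thin wrapper that tries the LLM and drops to keyword rules on any failure. This is an illustrative sketch, not the project's code: `analyze_with_fallback`, the `llm` callable, the `"rule-based-fallback"` label, and the template text are all assumptions, while the keyword lists and the 0.88 score come from the description above.

```python
from typing import Callable

FALLBACK_CONFIDENCE = 0.88  # fixed score for rule-based results, per the text above

KEYWORDS = {
    "hardware": ["printer", "monitor", "keyboard", "hardware"],
    "software": ["install", "crash", "update", "software"],
    "network": ["wifi", "vpn", "network", "internet"],
}


def rule_based_category(title: str, description: str) -> str:
    """Keyword matching; the first matching category wins."""
    text = f"{title} {description}".lower()
    for category, words in KEYWORDS.items():
        if any(word in text for word in words):
            return category
    return "other"


def analyze_with_fallback(title: str, description: str, llm: Callable[[str, str], dict]) -> dict:
    """Try the LLM first; on any failure, return a rule-based result instead."""
    try:
        return llm(title, description)
    except Exception:
        # Hypothetical failure modes: connection refused, timeout, malformed output.
        return {
            "category": rule_based_category(title, description),
            "confidence": FALLBACK_CONFIDENCE,
            "solution": "See the standard troubleshooting template for this category.",
            "model_used": "rule-based-fallback",
        }
```

Recording a distinct model label on fallback results keeps them identifiable later, independent of the confidence value.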

Deep Dive: 4-Language Internationalisation

Architecture

The i18n implementation uses `react-i18next` with the i18next HTTP backend for lazy-loading translation files. Translations are organised into namespaces: `common` (shared UI strings), `ticket` (ticket-specific terms), `dashboard` (analytics labels), and `knowledge` (knowledge base UI).

javascript

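import i18n from 'i18next';
import HttpBackend from 'i18next-http-backend';
import LanguageDetector from 'i18next-browser-languagedetector';
import { initReactI18next } from 'react-i18next';

i18n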
  .use(HttpBackend)
  .use(LanguageDetector)
  .use(initReactI18next)
  .init({
    fallbackLng: 'en',
    supportedLngs: ['en', 'zh', 'fr', 'nl'],
    ns: ['common', 'ticket', 'dashboard', 'knowledge'],
    defaultNS: 'common',
    backend: {
      loadPath: '/locales/{{lng}}/{{ns}}.json',
    },
  });

Why These Four Languages?

The language selection is strategic for the European IT market:

English — the lingua franca of IT
French — primary language in Belgium (Wallonia), France, Luxembourg
Dutch — primary language in Belgium (Flanders), the Netherlands
Chinese — my native language, demonstrating CJK character handling

This isn't decorative. In Belgium, where many IT companies operate, a ticketing system that supports French, Dutch, and English covers the vast majority of the workforce. Adding Chinese demonstrates that the i18n architecture handles non-Latin scripts such as CJK characters.

Translation Structure

Each language has four JSON files (one per namespace). Example for the ticket namespace:

json

// en/ticket.json
{
  "status": {
    "pending": "Pending",
    "in_progress": "In Progress",
    "resolved": "Resolved",
    "closed": "Closed"
  },
  "priority": {
    "low": "Low",
    "medium": "Medium",
    "high": "High",
    "urgent": "Urgent"
  }
}

In the React components, translations are consumed via the `useTranslation` hook:

jsx

const { t } = useTranslation('ticket');
<Chip label={t(`status.${ticket.status}`)} color={statusColor} />

Language Switching

The `LanguageSwitcher` component allows runtime language changes without page reload. The selected language is persisted in `localStorage` and automatically detected on return visits.

Deep Dive: Frontend Architecture

MUI + Vite + React 19

The frontend uses Material-UI for component design, Vite for bundling, and React 19 for rendering. This combination provides:

MUI — a professional component library that looks like enterprise software (not a startup landing page)
Vite — instant hot module replacement during development, optimised production builds
React Router — client-side routing for SPA experience

Component Architecture

The application has 7 core components:

App.jsx — Root layout with sidebar navigation, language switcher, notification container
Dashboard.jsx — KPI overview with charts, recent tickets, and summary statistics
TicketList.jsx — Filterable, searchable ticket table with pagination and status chips
CreateTicket.jsx — Form with category, priority, and employee info fields
TicketDetail.jsx — Full ticket view with AI analysis, history timeline, and action buttons
KnowledgeBase.jsx — Article management with category filtering and search
LanguageSwitcher.jsx — Runtime language selection component

API Integration

All API calls go through an Axios service layer that handles base URL configuration, error formatting, and response parsing:

javascript

import axios from 'axios';

const api = axios.create({
  baseURL: '/api',
  headers: { 'Content-Type': 'application/json' },
});

export const ticketAPI = {
  getTickets: (params) => api.get('/tickets/', { params }),
  getTicket: (id) => api.get(`/tickets/${id}/`),
  createTicket: (data) => api.post('/tickets/', data),
  assignTicket: (id, data) => api.post(`/tickets/${id}/assign/`, data),
  resolveTicket: (id) => api.post(`/tickets/${id}/resolve/`),
  closeTicket: (id) => api.post(`/tickets/${id}/close/`),
};

The nginx configuration proxies `/api` requests to the Django backend, so the frontend doesn't need to know the backend's host or port.

Deep Dive: Docker Compose Orchestration

Service Dependencies and Health Checks

The `docker-compose.yml` defines clear service dependencies with health checks:

yaml

services:
  postgres:
    image: postgres:16-alpine
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5

  backend:
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
    command: >
      sh -c "python manage.py migrate &&
             python manage.py runserver 0.0.0.0:8000"

Health checks prevent race conditions during startup. Without them, the Django backend might try to run migrations before PostgreSQL is ready, causing a crash-and-restart loop.

Volume Mounts for Development

The backend uses a volume mount (`./backend:/app`) for hot-reload during development. Django's `runserver` detects file changes and restarts automatically. The frontend, however, is built into a static nginx image, so changes require a rebuild:

bash

docker-compose build frontend && docker-compose up -d frontend

This asymmetry is intentional. Backend development requires rapid iteration (change code, see result), while frontend changes are less frequent and benefit from the production-like nginx serving.

Multi-Stage Frontend Build

The frontend Dockerfile uses a two-stage build:

dockerfile

# Stage 1: Build with Node 20
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Stage 2: Serve with nginx
FROM nginx:alpine
COPY --from=builder /app/dist /usr/share/nginx/html
COPY nginx.conf /etc/nginx/conf.d/default.conf

This produces a tiny production image (< 30MB) containing only the compiled static files and nginx, with no Node.js runtime, no node_modules, and no development dependencies.

Deep Dive: Monitoring and Observability

Why Monitoring from Day One?

Most portfolio projects skip monitoring entirely. I included it because:

It demonstrates production thinking — monitoring isn't something you bolt on later
It provides real operational insight — during development, Grafana dashboards helped me identify slow queries and memory leaks
It's table stakes for enterprise software — no IT manager will deploy a system they can't monitor

The Monitoring Stack

Prometheus scrapes metrics from five targets every 15 seconds:

Django backend via django-prometheus middleware — HTTP request count, latency percentiles, error rates, database query metrics
Node Exporter — CPU utilisation, memory usage, disk I/O, network throughput
PostgreSQL Exporter — active connections, transaction rates, cache hit ratios, table sizes
Redis Exporter — memory usage, operations per second, key count, eviction rates
Prometheus itself — scrape duration, target health

Grafana visualises these metrics with pre-provisioned datasources:

yaml

# grafana-datasources.yml
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    url: http://prometheus:9090
    isDefault: true

Key Metrics I Track

The most useful metrics during development and demo:

HTTP request latency (p95) — identifies slow API endpoints before users complain
PostgreSQL cache hit ratio — should be > 99%. A drop indicates the database needs more memory or the indexes need tuning
Celery task duration — tracks AI analysis processing time across different models
Redis memory usage — ensures the message queue doesn't grow unboundedly
Container CPU/memory — catches resource leaks early
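
For reference, the cache hit ratio mentioned in the list is conventionally derived from two counters in PostgreSQL's `pg_stat_database` view, `blks_hit` and `blks_read`. The helper below is a small sketch of that arithmetic; the function name is an assumption, and in the actual stack the equivalent calculation would be done in a Grafana query against the exporter's metrics rather than in Python.

```python
def cache_hit_ratio(blks_hit: int, blks_read: int) -> float:
    """Share of block requests served from PostgreSQL's buffer cache."""
    total = blks_hit + blks_read
    if total == 0:
        return 1.0  # no block requests yet, so nothing has missed the cache
    return blks_hit / total
```

With 990 hits and 10 reads the ratio is 0.99, which sits right at the threshold quoted above.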

REST API Design

The API follows REST conventions with Django REST Framework ViewSets:

GET    /api/tickets/                List tickets (paginated, filterable)
POST   /api/tickets/                Create ticket (triggers async AI analysis)
GET    /api/tickets/{id}/           Get ticket with AI analysis
POST   /api/tickets/{id}/assign/    Assign to IT staff
POST   /api/tickets/{id}/resolve/   Mark as resolved
POST   /api/tickets/{id}/close/     Close ticket

GET    /api/knowledge-base/         List knowledge articles
POST   /api/knowledge-base/         Create article (triggers async embedding)
GET    /api/knowledge-base/{id}/    Get article
PUT    /api/knowledge-base/{id}/    Update article (re-generates embedding)
DELETE /api/knowledge-base/{id}/    Delete article

GET    /api/employees/              List employees
POST   /api/employees/              Add employee
GET    /api/employees/{id}/         Get employee
PATCH  /api/employees/{id}/         Update employee

Pagination

All list endpoints use page-based pagination with 10 items per page. Following DRF conventions, the response includes a `count` of all matching items, `next` and `previous` links, and the current page in `results`.
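
The shape of a paginated response can be illustrated without Django at all. The sketch below builds the same envelope from an in-memory list; the `paginate` function and its link format are simplifications for illustration (DRF's own paginator builds absolute URLs and handles the first-page link slightly differently).

```python
from typing import Optional


def paginate(items: list, page: int, base_url: str, page_size: int = 10) -> dict:
    """Build a DRF-style page envelope: count, next, previous, results."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    start = (page - 1) * page_size
    results = items[start:start + page_size]
    has_next = start + page_size < len(items)

    def link(n: int) -> str:
        return f"{base_url}?page={n}"

    next_url: Optional[str] = link(page + 1) if has_next else None
    previous_url: Optional[str] = link(page - 1) if page > 1 else None
    return {"count": len(items), "next": next_url, "previous": previous_url, "results": results}
```

A client can walk the whole collection by following `next` until it is null, without ever computing page numbers itself.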

Serializer Design

I use separate serializers for list and detail views. The list serializer excludes heavy fields (embeddings, full AI response text) to keep the response payload small. The detail serializer includes nested AIResponse objects:

python

from rest_framework import serializers


class TicketSerializer(serializers.ModelSerializer):
    class Meta:
        model = Ticket
        exclude = ['embedding']  # Exclude large vector field from list views


class TicketDetailSerializer(serializers.ModelSerializer):
    # Assumes AIResponse.ticket declares related_name='ai_responses';
    # without it, pass source='airesponse_set' here.
    ai_responses = AIResponseSerializer(many=True, read_only=True)

    class Meta:
        model = Ticket
        fields = '__all__'

Data Flow: Ticket Creation End-to-End

Here's the complete flow when a user creates a ticket:

1. User fills form in React CreateTicket component
2. Frontend sends POST /api/tickets/ with title, description, category, priority, employee info
3. Django ViewSet validates via TicketSerializer, auto-generates ticket_number (TK20260228001), creates Ticket record in PostgreSQL, returns HTTP 201
4. Signal triggers Celery task: `analyze_ticket_task.delay(ticket.id)`
5. Celery worker picks up the task from Redis queue
6. Embedding generation: sentence-transformers encodes title+description into a vector
7. Similar ticket search: cosine similarity against all existing ticket embeddings
8. Knowledge base search: same similarity search filtered by category
9. LLM analysis: LLaMA 3.2 receives ticket + similar tickets + KB articles, generates category suggestion, confidence score, and detailed solution
10. AIResponse record created in PostgreSQL with all results
11. User navigates to the ticket detail page, where the frontend loads and displays the AI suggestions

The user experiences step 3 as instantaneous (< 200ms). Steps 4-10 happen in the background while the user continues working. When they return to the ticket detail page, the AI analysis is ready.

Technical Decisions and Trade-offs

Django vs FastAPI

I chose Django for this project (and FastAPI for RAG Talk) to demonstrate proficiency in both. Django's strengths for this use case:

ORM — complex data models with relationships, migrations, and query optimisation
Admin interface — immediate back-office access without building a separate admin panel
DRF — mature, well-documented REST framework with serializers, viewsets, and pagination
Ecosystem — django-prometheus, django-celery-results, django-cors-headers all just work

FastAPI would have been fine too, but Django's batteries-included approach saved significant development time for a CRUD-heavy application.

PostgreSQL JSON Fields vs Separate Tables

I store embeddings and similar_tickets as JSONField rather than creating dedicated tables with foreign keys. This is a trade-off:

Pros: Simpler schema, fewer joins, embeddings are tightly coupled to their parent record. Cons: No database-level indexing on embedding values, no foreign key integrity for similar_tickets references.

For the current scale, this is the right trade-off. If I needed to query embeddings directly (e.g., "find all tickets within 0.1 cosine distance of this vector"), I'd migrate to pgvector.

Local LLM vs Cloud API

Running LLaMA locally adds deployment complexity (Ollama must be installed and the model downloaded), but eliminates API costs and external dependencies. For an enterprise IT system handling potentially sensitive internal data, local inference is a meaningful privacy advantage.

The fallback to rule-based analysis ensures the system never fails just because the AI model is unavailable. This resilience is critical for production systems.

What I'd Do Differently

1. Add WebSocket Notifications

Currently, the frontend doesn't know when AI analysis completes — it only sees results when the user navigates to the ticket detail page. WebSocket notifications would push analysis completion events to the client in real time.

2. Implement Role-Based Access Control

The current system doesn't enforce permissions — any user can close any ticket. A production system needs proper RBAC: employees can only view and create, IT staff can assign and resolve, managers can view reports and close.
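
A first cut at such a permission matrix could be as simple as a role-to-actions mapping. The role names, action names, and the choice to make higher roles inherit lower ones are assumptions for this sketch; in Django REST Framework the check would normally live in a custom permission class.

```python
ROLE_PERMISSIONS = {
    "employee": {"view", "create"},
    "it_staff": {"view", "create", "assign", "resolve"},
    "manager": {"view", "create", "assign", "resolve", "view_reports", "close"},
}


def can(role: str, action: str) -> bool:
    """Return True if the role is allowed to perform the action; unknown roles get nothing."""
    return action in ROLE_PERMISSIONS.get(role, set())
```

Denying by default for unknown roles means a misconfigured account fails closed rather than open.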

3. Add End-to-End Tests

Unit tests cover individual components, but the async nature of the Celery pipeline means the most interesting bugs occur at integration boundaries. I'd add end-to-end tests that create a ticket and verify the complete flow through to AI analysis completion.

4. Use pgvector for Embeddings

Storing embeddings in JSONField works but doesn't scale. PostgreSQL's pgvector extension provides proper vector indexing with approximate nearest neighbour search, which would handle millions of tickets efficiently.

Conclusion

Building this system reinforced a belief I hold about software engineering: the interesting problems aren't in the individual components (Django is well-documented, React has excellent tutorials, Docker Compose is straightforward) — they're in the integration. Making Celery tasks fire reliably when Django creates a record, ensuring the frontend handles the async gap gracefully, configuring Prometheus to scrape metrics from five services simultaneously, getting nginx to proxy API requests while serving static files — these integration challenges are where real engineering happens.

The system currently runs as a 10-service Docker Compose stack that starts with a single command. It handles ticket creation, AI-powered analysis, 4-language UI, and full-stack monitoring. It's the kind of system I'd be confident deploying in a real IT department — and that confidence comes from the production practices baked in from day one: health checks, async processing, monitoring, audit trails, and graceful degradation.

Source code on GitHub
