Building a Production-Grade IT Ticketing System: Django, React, Celery, and AI-Powered Ticket Analysis
A comprehensive technical walkthrough of designing and building an enterprise IT support platform with async task processing, 4-language i18n, RAG-powered ticket analysis, and a full observability stack with Prometheus and Grafana.
Enterprise IT support platforms are deceptively complex. What appears to be a simple CRUD application — create a ticket, assign it, resolve it — quickly balloons into a system that must handle asynchronous processing, multi-language support, AI-powered classification, comprehensive audit trails, and full-stack observability. This post documents the architecture, implementation decisions, and hard-won lessons from building such a system from scratch.
Why Build This?
Most IT ticketing demos are glorified todo lists. They demonstrate CRUD operations and call it a day. I wanted to build something that reflects the actual complexity of enterprise software: background task processing that doesn't block the user, internationalisation that goes beyond string replacement, AI analysis that adds genuine value rather than being a gimmick, and monitoring that gives operators real visibility into system health.
The result is a 10-service Docker Compose stack that handles the full lifecycle of IT support tickets — from submission through AI analysis to resolution — with production-grade observability and a four-language frontend.
Architecture Overview
┌─────────────────────────────────────────────────┐
│             Frontend (React + MUI)              │
│   Port 80 │ Vite build │ nginx │ 4-lang i18n    │
└──────────────────────┬──────────────────────────┘
                       │ REST API
┌──────────────────────▼──────────────────────────┐
│              Django + DRF Backend               │
│  Port 8000 │ ViewSets │ Serializers │ Signals   │
├──────────────┬──────────────┬───────────────────┤
│ PostgreSQL   │ Redis        │ Celery Worker     │
│ Port 5432    │ Port 6379    │ AI Analysis       │
│ 5 Models     │ Msg Queue    │ Embeddings        │
│ Audit Trail  │ Results      │ LLaMA 3.2 / GPT   │
└──────────────┴──────────────┴───────────────────┘
┌─────────────────────────────────────────────────┐
│                Monitoring Stack                 │
│       Prometheus (9090) │ Grafana (3001)        │
│ Node Exporter │ Postgres Exporter │ Redis Exp.  │
└─────────────────────────────────────────────────┘
10 Docker services: PostgreSQL 16, Redis 7, Django backend, Celery worker, React/nginx frontend, Prometheus, Grafana, Node Exporter, PostgreSQL Exporter, Redis Exporter.
Deep Dive: Data Model Design
The Ticket Model
The Ticket model is the centrepiece, and its design reflects several real-world requirements that simple demos ignore.
Auto-generated ticket numbers: Each ticket gets a human-readable identifier in the format TKYYYYMMDDxxx (e.g., TK20260228001). This is generated atomically in the serializer's create() method. The format encodes the creation date, making tickets sortable by date from the number alone, which is useful for phone support where agents reference tickets verbally.
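The numbering scheme can be sketched in plain Python. This is a minimal illustration, not the post's serializer code: it assumes the xxx suffix is a per-day sequence, and the function name next_ticket_number is hypothetical. In a real create() method the "last number issued today" lookup would also need a row lock, or a retry on the unique constraint, to stay atomic under concurrent submissions.

```python
from datetime import date
from typing import Optional


def next_ticket_number(today: date, last_number_today: Optional[str]) -> str:
    """Return the next TKYYYYMMDDxxx identifier for the given day.

    last_number_today is the highest number already issued that day,
    or None if no ticket has been created yet (assumed per-day sequence).
    """
    prefix = f"TK{today:%Y%m%d}"
    if last_number_today is None:
        sequence = 1
    else:
        # Everything after the date prefix is the running sequence.
        sequence = int(last_number_today[len(prefix):]) + 1
    # Zero-pad to three digits; a 1000th ticket in one day would widen the
    # suffix and break the lexical sort order.
    return f"{prefix}{sequence:03d}"


first = next_ticket_number(date(2026, 2, 28), None)
second = next_ticket_number(date(2026, 2, 28), first)
```

Because the date prefix is fixed-width and the suffix is zero-padded, plain string sorting matches creation order up to 999 tickets per day.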
Snapshot fields: The model includes employee_name_snapshot and department_snapshot alongside the employee_id reference (stored as a plain CharField, not a database-level foreign key). Why? Because employee names and departments change. If Sarah Chen from Engineering submits a ticket, gets transferred to Product a month later, and IT then resolves the ticket, the ticket should still show "Sarah Chen, Engineering". Snapshot fields capture the state at creation time, not the current state.
from django.db import models
class Ticket(models.Model):
    ticket_number = models.CharField(max_length=20, unique=True)
    employee_id = models.CharField(max_length=50)
    employee_name_snapshot = models.CharField(max_length=100)
    department_snapshot = models.CharField(max_length=100)
    # ... status, category, priority, timestamps
    embedding = models.JSONField(null=True, blank=True)
Embedding field: The JSONField stores a vector representation of the ticket's title and description. This enables semantic similarity search for finding related historical tickets. I chose to store embeddings directly in PostgreSQL rather than using a separate vector database: the corpus is small enough that cosine similarity computation on retrieval is fast, and it avoids the operational complexity of an additional service.
Database indexes: Strategic indexes on employee_id, status, category, and created_at optimise the most common query patterns: "show me all my tickets", "show me all open tickets", "show me all hardware tickets this month".
The TicketHistory Model — Complete Audit Trail
Every state change is recorded:
class TicketHistory(models.Model):
    ticket = models.ForeignKey(Ticket, on_delete=models.CASCADE)
    changed_field = models.CharField(max_length=100)
    old_value = models.TextField(null=True)
    new_value = models.TextField(null=True)
    changed_by = models.CharField(max_length=50)
    comment = models.TextField(blank=True)
    created_at = models.DateTimeField(auto_now_add=True)
This isn't just for compliance; it's operationally essential. When a user complains "my ticket was closed without resolution", IT managers can see the complete history: who changed what, when, and why. The comment field allows staff to annotate changes ("Closing per user confirmation via email").
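How the history rows get produced is not shown in the post. One plausible approach, sketched here in plain Python under that assumption, is to diff the ticket's field values before and after an update and emit one entry per changed field. The helper name diff_fields is hypothetical; the dictionary keys mirror the TicketHistory columns above.

```python
def diff_fields(old: dict, new: dict, changed_by: str, comment: str = "") -> list:
    """Build one audit entry per field whose value differs between old and new."""
    entries = []
    for field in sorted(old.keys() | new.keys()):
        before, after = old.get(field), new.get(field)
        if before == after:
            continue  # unchanged fields produce no history row
        entries.append({
            "changed_field": field,
            # TicketHistory stores values as text, so stringify non-null values.
            "old_value": None if before is None else str(before),
            "new_value": None if after is None else str(after),
            "changed_by": changed_by,
            "comment": comment,
        })
    return entries


history = diff_fields(
    {"status": "resolved", "priority": "low"},
    {"status": "closed", "priority": "low"},
    changed_by="it_staff_01",
    comment="Closing per user confirmation via email",
)
```

Each resulting dictionary could then be passed to TicketHistory.objects.create(ticket=..., **entry) inside the same transaction as the ticket update.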
The AIResponse Model — AI Analysis Results
Each ticket can have an associated AI analysis:
class AIResponse(models.Model):
    ticket = models.ForeignKey(Ticket, on_delete=models.CASCADE)
    suggested_category = models.CharField(max_length=50)
    confidence_score = models.FloatField()
    suggested_solution = models.TextField()
    similar_tickets = models.JSONField(default=list)
    model_used = models.CharField(max_length=100)
    processing_time_ms = models.IntegerField(null=True)
Key design decision: storing the model_used field. When you're running AI analysis across hundreds of tickets, you need to know which model produced each suggestion, especially when evaluating whether switching models improves quality. The processing_time_ms field enables performance monitoring without diving into logs.
The KnowledgeBase Model
The knowledge base stores reusable IT solutions and documentation:
class KnowledgeBase(models.Model):
    title = models.CharField(max_length=200)
    content = models.TextField()
    category = models.CharField(max_length=50)
    embedding = models.JSONField(null=True, blank=True)
    usage_count = models.IntegerField(default=0)
    success_rate = models.FloatField(default=0.0)
    tags = models.JSONField(default=list)
The usage_count and success_rate fields track which knowledge base articles actually help resolve tickets. Over time, this data can inform which articles need updating and which are most valuable, a feedback loop that most ticketing systems lack.
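The post does not say how success_rate is maintained. A minimal sketch, assuming it is the fraction of uses that ended in a resolved ticket, updates both counters incrementally so no per-use log is needed. The function name record_usage and the resolved flag are hypothetical.

```python
def record_usage(usage_count: int, success_rate: float, resolved: bool):
    """Return the updated (usage_count, success_rate) after one more use.

    Assumes success_rate is the running mean of resolved outcomes, which is
    an interpretation of the field, not something the post specifies.
    """
    new_count = usage_count + 1
    # Recover the number of past successes, add this outcome, re-average.
    successes = success_rate * usage_count + (1.0 if resolved else 0.0)
    return new_count, successes / new_count


count, rate = record_usage(0, 0.0, resolved=True)
count, rate = record_usage(count, rate, resolved=False)
```

After one successful and one unsuccessful use, the article sits at two uses with a 50% success rate.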
Deep Dive: Asynchronous Task Processing with Celery
Why Celery?
AI analysis of a ticket — generating embeddings, searching for similar tickets, querying the knowledge base, calling the LLM — takes 10-20 seconds. If this ran synchronously in the HTTP request handler, the user would stare at a loading spinner for 20 seconds after submitting a ticket. That's unacceptable.
Celery decouples the submission from the analysis. The user submits a ticket, gets an immediate HTTP 201 response, and the AI analysis runs asynchronously in a background worker. When the user navigates to the ticket detail page, the AI suggestions appear (or show "Analysis in progress" if still running).
The Analysis Pipeline
from celery import shared_task
@shared_task
def analyze_ticket_task(ticket_id):
    ticket = Ticket.objects.get(id=ticket_id)
    # 1. Generate embedding from title + description
    embedding = generate_embedding(f"{ticket.title} {ticket.description}")
    ticket.embedding = embedding
    ticket.save(update_fields=["embedding"])
    # 2. Find similar historical tickets. The search returns (ticket, score)
    #    pairs, and this ticket now has an embedding too, so ask for one extra
    #    result and drop the ticket itself from its own matches.
    candidates = find_similar_tickets(embedding, threshold=0.5, top_n=4)
    similar = [(t, score) for t, score in candidates if t.pk != ticket.pk][:3]
    # 3. Search knowledge base for relevant articles
    kb_articles = find_similar_knowledge(embedding, category=ticket.category)
    # 4. Call LLM with context
    analysis = analyze_with_llm(ticket, similar, kb_articles)
    # 5. Store results
    AIResponse.objects.create(
        ticket=ticket,
        suggested_category=analysis["category"],
        confidence_score=analysis["confidence"],
        suggested_solution=analysis["solution"],
        # Unpack the (ticket, score) pairs to get the ticket numbers.
        similar_tickets=[t.ticket_number for t, _score in similar],
        model_used="llama3.2-3b-local",
    )
Embedding Generation and Similarity Search
I use sentence-transformers to generate dense vector embeddings. The embedding captures the semantic meaning of the ticket content, enabling similarity search that goes beyond keyword matching.
Similarity search uses cosine similarity computed directly in Python:
def find_similar_tickets(query_embedding, threshold=0.5, top_n=3):
    # embedding__isnull=False skips rows whose embedding is SQL NULL; on a
    # JSONField, exclude(embedding=None) only matches a stored JSON null.
    tickets_with_embeddings = Ticket.objects.filter(embedding__isnull=False)
    similarities = []
    for ticket in tickets_with_embeddings:
        sim = cosine_similarity(query_embedding, ticket.embedding)
        if sim > threshold:
            similarities.append((ticket, sim))
    # Highest similarity first; the result is a list of (ticket, score) pairs.
    return sorted(similarities, key=lambda x: x[1], reverse=True)[:top_n]
For the current dataset size (hundreds of tickets), this brute-force approach is fast enough. For production at scale, I'd migrate to a proper vector index (pgvector extension for PostgreSQL, or a dedicated vector database).
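The cosine_similarity helper used by the search is not shown in the post. A minimal pure-Python sketch of what such a helper could look like follows; treating a zero vector as having similarity 0.0 is an assumption made here to avoid a division by zero.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: an all-zero embedding matches nothing
    return dot / (norm_a * norm_b)


same_direction = cosine_similarity([1.0, 0.0], [1.0, 0.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

With hundreds of tickets this per-pair loop is fine; a vectorised NumPy version would be the natural next step before reaching for a vector index.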
LLM Integration with Fallback Strategy
The primary AI model is LLaMA 3.2 3B running locally via Ollama. This is a deliberate architectural choice:
- No API costs — every analysis is free after the initial model download
- Data privacy — ticket content never leaves the local network
- No external dependency — the system doesn't break when an API provider has an outage
The fallback strategy is equally important: if the LLM is unavailable (Ollama not running, GPU out of memory), the system falls back to rule-based category detection using keyword matching:
def analyze_category(title, description):
    # Case-insensitive substring match against per-category keyword lists.
    text = f"{title} {description}".lower()
    if any(w in text for w in ["printer", "monitor", "keyboard", "hardware"]):
        return "hardware"
    if any(w in text for w in ["install", "crash", "update", "software"]):
        return "software"
    if any(w in text for w in ["wifi", "vpn", "network", "internet"]):
        return "network"
    # ... more categories
    return "other"
The fallback generates category-specific solution templates rather than LLM-generated solutions. The confidence score for fallback analysis is fixed at 0.88: high enough to be useful, and, being a constant, easy to tell apart from LLM-generated scores, which vary between 0 and 1.
This dual-mode approach means the system always provides suggestions, even when the AI model is unavailable. In enterprise environments, reliability trumps sophistication.
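The dual-mode behaviour described above can be sketched as a small wrapper. This is an illustration under assumptions, not the post's code: the function name analyze_with_fallback, the exception types treated as "LLM unavailable", the template text, and the "rule-based-fallback" label are all hypothetical. Only the 0.88 confidence value comes from the post.

```python
FALLBACK_CONFIDENCE = 0.88  # fixed score the post assigns to fallback results


def analyze_with_fallback(title, description, llm_analyze, rule_based_category,
                          unavailable_errors=(ConnectionError, TimeoutError)):
    """Try the LLM first; if it is unreachable, fall back to keyword rules."""
    try:
        return llm_analyze(title, description)
    except unavailable_errors:
        category = rule_based_category(title, description)
        return {
            "category": category,
            "confidence": FALLBACK_CONFIDENCE,
            # Hypothetical template; the post's real templates are not shown.
            "solution": f"Standard troubleshooting steps for {category} issues.",
            "model_used": "rule-based-fallback",  # hypothetical label
        }


def offline_llm(title, description):
    raise ConnectionError("Ollama is not reachable")  # simulate an outage


def keyword_rules(title, description):
    text = f"{title} {description}".lower()
    return "hardware" if "printer" in text else "other"


result = analyze_with_fallback("Printer jam", "Tray 2 is stuck",
                               offline_llm, keyword_rules)
```

Recording a distinct model_used value for fallback results keeps them identifiable later, independent of the confidence score.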
Deep Dive: 4-Language Internationalisation
Architecture
The i18n implementation uses react-i18next with the HTTP backend for lazy-loading translation files. Translations are organised into namespaces: common (shared UI strings), ticket (ticket-specific terms), dashboard (analytics labels), and knowledge (knowledge base UI).
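import i18n from 'i18next';
import HttpBackend from 'i18next-http-backend';
import LanguageDetector from 'i18next-browser-languagedetector';
import { initReactI18next } from 'react-i18next';
i18n
  .use(HttpBackend)
  .use(LanguageDetector)
  .use(initReactI18next)
  .init({
    fallbackLng: 'en',
    supportedLngs: ['en', 'zh', 'fr', 'nl'],
    ns: ['common', 'ticket', 'dashboard', 'knowledge'],
    defaultNS: 'common',
    backend: {
      // One JSON file per language and namespace, fetched on demand.
      loadPath: '/locales/{{lng}}/{{ns}}.json',
    },
  });
Why These Four Languages?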
The language selection is strategic for the European IT market:
- English — the lingua franca of IT
- French — primary language in Belgium (Wallonia), France, Luxembourg
- Dutch — primary language in Belgium (Flanders), the Netherlands
- Chinese — my native language, demonstrating CJK character handling
This isn't decorative. In Belgium, where many IT companies operate, a ticketing system that supports French, Dutch, and English covers the entire workforce. Adding Chinese demonstrates that the i18n architecture handles non-Latin (CJK) scripts. Chinese is still laid out left to right, so right-to-left support is not exercised by these four languages.
Translation Structure
Each language has four JSON files (one per namespace). Example for the ticket namespace:
// en/ticket.json
{
  "status": {
    "pending": "Pending",
    "in_progress": "In Progress",
    "resolved": "Resolved",
    "closed": "Closed"
  },
  "priority": {
    "low": "Low",
    "medium": "Medium",
    "high": "High",
    "urgent": "Urgent"
  }
}
In the React components, translations are consumed via the useTranslation hook:
const { t } = useTranslation('ticket');
<Chip label={t(`status.${ticket.status}`)} color={statusColor} />
Language Switching
The LanguageSwitcher component allows runtime language changes without page reload. The selected language is persisted in localStorage and automatically detected on return visits.
Deep Dive: Frontend Architecture
MUI + Vite + React 19
The frontend uses Material-UI for component design, Vite for bundling, and React 19 for rendering. This combination provides:
- MUI — a professional component library that looks like enterprise software (not a startup landing page)
- Vite — instant hot module replacement during development, optimised production builds
- React Router — client-side routing for SPA experience
Component Architecture
The application has 7 core components:
- App.jsx — Root layout with sidebar navigation, language switcher, notification container
- Dashboard.jsx — KPI overview with charts, recent tickets, and summary statistics
- TicketList.jsx — Filterable, searchable ticket table with pagination and status chips
- CreateTicket.jsx — Form with category, priority, and employee info fields
- TicketDetail.jsx — Full ticket view with AI analysis, history timeline, and action buttons
- KnowledgeBase.jsx — Article management with category filtering and search
- LanguageSwitcher.jsx — Runtime language selection component
API Integration
All API calls go through an Axios service layer that handles base URL configuration, error formatting, and response parsing:
import axios from 'axios';
const api = axios.create({
  baseURL: '/api',
  headers: { 'Content-Type': 'application/json' },
});
export const ticketAPI = {
  getTickets: (params) => api.get('/tickets/', { params }),
  getTicket: (id) => api.get(`/tickets/${id}/`),
  createTicket: (data) => api.post('/tickets/', data),
  assignTicket: (id, data) => api.post(`/tickets/${id}/assign/`, data),
  resolveTicket: (id) => api.post(`/tickets/${id}/resolve/`),
  closeTicket: (id) => api.post(`/tickets/${id}/close/`),
};
The nginx configuration proxies /api requests to the Django backend, so the frontend doesn't need to know the backend's host or port.
Deep Dive: Docker Compose Orchestration
Service Dependencies and Health Checks
The docker-compose.yml defines clear service dependencies with health checks:
services:
  postgres:
    image: postgres:16-alpine
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5
  # (redis service, with its own healthcheck, not shown in this excerpt)
  backend:
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
    command: >
      sh -c "python manage.py migrate &&
             python manage.py runserver 0.0.0.0:8000"
Health checks prevent race conditions during startup. Without them, the Django backend might try to run migrations before PostgreSQL is ready, causing a crash-and-restart loop.
Volume Mounts for Development
The backend uses a volume mount (./backend:/app) for hot-reload during development. Django's runserver detects file changes and restarts automatically. The frontend, however, is built into a static nginx image, so changes require a rebuild:
docker-compose build frontend && docker-compose up -d frontend
This asymmetry is intentional. Backend development requires rapid iteration (change code, see result), while frontend changes are less frequent and benefit from the production-like nginx serving.
Multi-Stage Frontend Build
The frontend Dockerfile uses a two-stage build:
# Stage 1: Build with Node 20
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
# Stage 2: Serve with nginx
FROM nginx:alpine
COPY --from=builder /app/dist /usr/share/nginx/html
COPY nginx.conf /etc/nginx/conf.d/default.conf
This produces a tiny production image (< 30MB) containing only the compiled static files and nginx, with no Node.js runtime, no node_modules, and no development dependencies.
Deep Dive: Monitoring and Observability
Why Monitoring from Day One?
Most portfolio projects skip monitoring entirely. I included it because:
- It demonstrates production thinking — monitoring isn't something you bolt on later
- It provides real operational insight — during development, Grafana dashboards helped me identify slow queries and memory leaks
- It's table stakes for enterprise software — no IT manager will deploy a system they can't monitor
The Monitoring Stack
Prometheus scrapes metrics from five targets every 15 seconds:
- Django backend via django-prometheus middleware — HTTP request count, latency percentiles, error rates, database query metrics
- Node Exporter — CPU utilisation, memory usage, disk I/O, network throughput
- PostgreSQL Exporter — active connections, transaction rates, cache hit ratios, table sizes
- Redis Exporter — memory usage, operations per second, key count, eviction rates
- Prometheus itself — scrape duration, target health
Grafana visualises these metrics with pre-provisioned datasources:
# grafana-datasources.yml
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    url: http://prometheus:9090
    isDefault: true
Key Metrics I Track
The most useful metrics during development and demo:
- HTTP request latency (p95) — identifies slow API endpoints before users complain
- PostgreSQL cache hit ratio — should be > 99%. A drop indicates the database needs more memory or the indexes need tuning
- Celery task duration — tracks AI analysis processing time across different models
- Redis memory usage — ensures the message queue doesn't grow unboundedly
- Container CPU/memory — catches resource leaks early
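As a worked example for one of these metrics: PostgreSQL's pg_stat_database view exposes blks_hit (reads served from shared buffers) and blks_read (reads that went to disk), and the cache hit ratio is their quotient. The sketch below computes it in plain Python; the function names and the way the 99% target is applied are illustrative assumptions, not the dashboard's actual query.

```python
from typing import Optional


def cache_hit_ratio(blks_hit: int, blks_read: int) -> Optional[float]:
    """Fraction of block reads served from shared buffers, or None if idle."""
    total = blks_hit + blks_read
    if total == 0:
        return None  # no reads yet, so the ratio is undefined
    return blks_hit / total


def needs_attention(blks_hit: int, blks_read: int, target: float = 0.99) -> bool:
    """True when the ratio has fallen below the target quoted in the post."""
    ratio = cache_hit_ratio(blks_hit, blks_read)
    return ratio is not None and ratio < target


healthy = needs_attention(blks_hit=9999, blks_read=1)
degraded = needs_attention(blks_hit=950, blks_read=50)
```

Because both counters are cumulative, a dashboard would normally compute the ratio over the increase in a recent time window rather than over the lifetime totals.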
REST API Design
The API follows REST conventions with Django REST Framework ViewSets:
GET    /api/tickets/                 List tickets (paginated, filterable)
POST   /api/tickets/                 Create ticket (triggers async AI analysis)
GET    /api/tickets/{id}/            Get ticket with AI analysis
POST   /api/tickets/{id}/assign/     Assign to IT staff
POST   /api/tickets/{id}/resolve/    Mark as resolved
POST   /api/tickets/{id}/close/      Close ticket
GET    /api/knowledge-base/          List knowledge articles
POST   /api/knowledge-base/          Create article (triggers async embedding)
GET    /api/knowledge-base/{id}/     Get article
PUT    /api/knowledge-base/{id}/     Update article (re-generates embedding)
DELETE /api/knowledge-base/{id}/     Delete article
GET    /api/employees/               List employees
POST   /api/employees/               Add employee
GET    /api/employees/{id}/          Get employee
PATCH  /api/employees/{id}/          Update employee
Pagination
All list endpoints use page-based pagination with 10 items per page. Following DRF conventions, the response wraps the page of results with a count field and next and previous links.
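The shape of that response envelope can be illustrated without Django. The sketch below mimics the count / next / previous / results structure of DRF's page-number pagination; the paginate function is hypothetical, and the exact URLs DRF generates (absolute, and with the page parameter dropped for page 1) differ from this simplified version.

```python
import math


def paginate(items, page, base_url, page_size=10):
    """Return a DRF-style page envelope for a list of already-serialised items."""
    count = len(items)
    last_page = max(1, math.ceil(count / page_size))
    if not 1 <= page <= last_page:
        raise ValueError(f"page must be between 1 and {last_page}")
    start = (page - 1) * page_size
    return {
        "count": count,  # total items across all pages, not just this one
        "next": f"{base_url}?page={page + 1}" if page < last_page else None,
        "previous": f"{base_url}?page={page - 1}" if page > 1 else None,
        "results": items[start:start + page_size],
    }


page_two = paginate(list(range(25)), page=2, base_url="/api/tickets/")
```

A client can walk the whole collection by following next until it is null, without needing to know the page size.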
Serializer Design
I use separate serializers for list and detail views. The list serializer excludes heavy fields (embeddings, full AI response text) to keep the response payload small. The detail serializer includes nested AIResponse objects:
class TicketSerializer(serializers.ModelSerializer):
    class Meta:
        model = Ticket
        exclude = ['embedding']  # Exclude large vector field from list views
class TicketDetailSerializer(serializers.ModelSerializer):
    ai_responses = AIResponseSerializer(many=True, read_only=True)
    class Meta:
        model = Ticket
        fields = '__all__'
Data Flow: Ticket Creation End-to-End
Here's the complete flow when a user creates a ticket:
1. User fills form in React CreateTicket component
2. Frontend sends POST /api/tickets/ with title, description, category, priority, employee info
3. Django ViewSet validates via TicketSerializer, auto-generates ticket_number (TK20260228001), creates Ticket record in PostgreSQL, returns HTTP 201
4. Signal triggers Celery task: analyze_ticket_task.delay(ticket.id)
5. Celery worker picks up the task from Redis queue
6. Embedding generation: sentence-transformers encodes title+description into a vector
7. Similar ticket search: cosine similarity against all existing ticket embeddings
8. Knowledge base search: same similarity search filtered by category
9. LLM analysis: LLaMA 3.2 receives ticket + similar tickets + KB articles, generates category suggestion, confidence score, and detailed solution
10. AIResponse record created in PostgreSQL with all results
11. Frontend polls or navigates to ticket detail, sees AI suggestions
The user experiences step 3 as instantaneous (< 200ms). Steps 4-10 happen in the background while the user continues working. When they return to the ticket detail page, the AI analysis is ready.
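The "frontend polls" part of the final step can be sketched as a loop that re-fetches the ticket until the ai_responses list from the detail serializer is non-empty. The real client is React; this Python version is only a stand-in to show the control flow, and wait_for_analysis, fetch_ticket, and the timing defaults are hypothetical.

```python
import time


def wait_for_analysis(fetch_ticket, ticket_id, timeout_s=30.0, interval_s=1.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll the ticket detail endpoint until AI results appear or time runs out.

    fetch_ticket is any callable returning the ticket detail as a dict.
    Returns the ai_responses list, or None if the deadline passes first.
    """
    deadline = clock() + timeout_s
    while True:
        ticket = fetch_ticket(ticket_id)
        if ticket.get("ai_responses"):
            return ticket["ai_responses"]
        if clock() >= deadline:
            return None  # caller can keep showing "Analysis in progress"
        sleep(interval_s)


# Simulated backend: analysis is missing for two polls, then present.
_replies = iter([
    {"ai_responses": []},
    {"ai_responses": []},
    {"ai_responses": [{"suggested_category": "network"}]},
])
found = wait_for_analysis(lambda _id: next(_replies), 42, sleep=lambda _s: None)
```

Injecting sleep and clock keeps the loop testable without real delays.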
Technical Decisions and Trade-offs
Django vs FastAPI
I chose Django for this project (and FastAPI for RAG Talk) to demonstrate proficiency in both. Django's strengths for this use case:
- ORM — complex data models with relationships, migrations, and query optimisation
- Admin interface — immediate back-office access without building a separate admin panel
- DRF — mature, well-documented REST framework with serializers, viewsets, and pagination
- Ecosystem — django-prometheus, django-celery-results, django-cors-headers all just work
FastAPI would have been fine too, but Django's batteries-included approach saved significant development time for a CRUD-heavy application.
PostgreSQL JSON Fields vs Separate Tables
I store embeddings and similar_tickets as JSONField rather than creating dedicated tables with foreign keys. This is a trade-off:
Pros: Simpler schema, fewer joins, embeddings are tightly coupled to their parent record. Cons: No database-level indexing on embedding values, no foreign key integrity for similar_tickets references.
For the current scale, this is the right trade-off. If I needed to query embeddings directly (e.g., "find all tickets within 0.1 cosine distance of this vector"), I'd migrate to pgvector.
Local LLM vs Cloud API
Running LLaMA locally adds deployment complexity (Ollama must be installed and the model downloaded), but eliminates API costs and external dependencies. For an enterprise IT system handling potentially sensitive internal data, local inference is a meaningful privacy advantage.
The fallback to rule-based analysis ensures the system never fails just because the AI model is unavailable. This resilience is critical for production systems.
What I'd Do Differently
1. Add WebSocket Notifications
Currently, the frontend doesn't know when AI analysis completes — it only sees results when the user navigates to the ticket detail page. WebSocket notifications would push analysis completion events to the client in real time.
2. Implement Role-Based Access Control
The current system doesn't enforce permissions — any user can close any ticket. A production system needs proper RBAC: employees can only view and create, IT staff can assign and resolve, managers can view reports and close.
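The role split described above maps naturally onto a permission table. This is a minimal sketch of the idea, not part of the current system: the role and action names are hypothetical, and letting IT staff and managers also view and create tickets is an assumption beyond what the text states.

```python
# Hypothetical role -> allowed actions table based on the split described above.
ROLE_PERMISSIONS = {
    "employee": {"view_ticket", "create_ticket"},
    "it_staff": {"view_ticket", "create_ticket", "assign_ticket", "resolve_ticket"},
    "manager": {"view_ticket", "create_ticket", "view_reports", "close_ticket"},
}


def can(role: str, action: str) -> bool:
    """True if the role is allowed to perform the action; unknown roles get nothing."""
    return action in ROLE_PERMISSIONS.get(role, set())


employee_can_close = can("employee", "close_ticket")
manager_can_close = can("manager", "close_ticket")
```

In DRF this check would typically live in a custom permission class attached to the assign, resolve, and close actions.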
3. Add End-to-End Tests
Unit tests cover individual components, but the async nature of the Celery pipeline means the most interesting bugs occur at integration boundaries. I'd add end-to-end tests that create a ticket and verify the complete flow through to AI analysis completion.
4. Use pgvector for Embeddings
Storing embeddings in JSONField works but doesn't scale. PostgreSQL's pgvector extension provides proper vector indexing with approximate nearest neighbour search, which would handle millions of tickets efficiently.
Conclusion
Building this system reinforced a belief I hold about software engineering: the interesting problems aren't in the individual components (Django is well-documented, React has excellent tutorials, Docker Compose is straightforward) — they're in the integration. Making Celery tasks fire reliably when Django creates a record, ensuring the frontend handles the async gap gracefully, configuring Prometheus to scrape metrics from five services simultaneously, getting nginx to proxy API requests while serving static files — these integration challenges are where real engineering happens.
The system currently runs as a 10-service Docker Compose stack that starts with a single command. It handles ticket creation, AI-powered analysis, 4-language UI, and full-stack monitoring. It's the kind of system I'd be confident deploying in a real IT department — and that confidence comes from the production practices baked in from day one: health checks, async processing, monitoring, audit trails, and graceful degradation.