System Architecture

Overview

JobHive is built on a modern, scalable, microservices-inspired architecture using Django as the primary backend framework. The system is designed for high availability, real-time processing, and seamless scalability to handle thousands of concurrent interviews.

High-Level Architecture

┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   Frontend      │    │   Load Balancer  │    │   CDN/CloudFront│
│   (React)       │◄───┤   (ALB)          │◄───┤   (Static Assets)│
└─────────────────┘    └──────────────────┘    └─────────────────┘
         │                        │
         │                        ▼
         │              ┌──────────────────┐
         │              │   Django Backend │
         │              │   (ECS Cluster)  │
         │              └──────────────────┘
         │                        │
         │                        ▼
         │              ┌──────────────────┐
         │              │   PostgreSQL     │
         │              │   (RDS)          │
         │              └──────────────────┘


┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   WebSocket     │    │   Redis Cache    │    │   S3 Storage    │
│   (Channels)    │◄───┤   (ElastiCache)  │    │   (Media Files) │
└─────────────────┘    └──────────────────┘    └─────────────────┘


┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   LiveKit       │    │   AI/ML Services │    │   Monitoring    │
│   (Video/Audio) │    │   (AWS Services) │    │   (DataDog)     │
└─────────────────┘    └──────────────────┘    └─────────────────┘

Core Components

1. Backend Application Layer

Django Framework (v5.1.3)

Primary Components:
  • API Layer: Django REST Framework for RESTful APIs
  • Authentication: JWT-based authentication with Django Allauth
  • Real-time: Django Channels for WebSocket communication
  • Task Queue: Celery with Redis for background processing
  • Database: PostgreSQL with advanced indexing strategies
Key Django Apps:
# Core business logic modules
INSTALLED_APPS = [
    'jobhive.users',           # User management and authentication
    'jobhive.company',         # Company profiles and management  
    'jobhive.interview',       # Interview sessions and analysis
    'jobhive.billing',         # Subscription and payment processing
    'jobhive.utils',           # Shared utilities and middleware
]

API Architecture

RESTful Design Principles:
  • Resource-based URLs (/api/v1/interviews/, /api/v1/users/)
  • HTTP methods for CRUD operations
  • Consistent response formats with pagination
  • Version-controlled API endpoints
  • Comprehensive error handling and validation
Key API Endpoints:
# Interview Management
POST   /api/interview/interviews/              # Create interview session
GET    /api/interview/interviews/{id}/         # Get interview details
PUT    /api/interview/interviews/{id}/         # Update interview
DELETE /api/interview/interviews/{id}/         # Delete interview

# AI Analysis
GET    /api/interview/interviews/{id}/benchmark-comparison/  # Benchmark data
POST   /api/interview/sentiment-sessions/      # Create sentiment session
GET    /api/interview/skill-assessments/       # Get skill assessments

# User Management  
GET    /api/users/me/                          # Current user profile
POST   /api/auth/login/                        # User authentication
POST   /api/auth/register/                     # User registration

# Billing
GET    /api/billing/plans/                     # Subscription plans
POST   /api/billing/subscriptions/            # Create subscription
GET    /api/billing/analytics/                # Billing analytics
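
The "consistent response formats with pagination" mentioned above can be sketched as a small helper. The envelope shape (`count`/`next`/`previous`/`results`) follows DRF's default PageNumberPagination; the helper itself and its parameter names are illustrative, not production code:

```python
# Sketch of the paginated envelope the API returns; mirrors DRF's
# default PageNumberPagination shape (count/next/previous/results).
def paginate(results, page, page_size, total, base_url):
    """Wrap one page of results in the standard response envelope."""
    last_page = (total + page_size - 1) // page_size  # ceiling division
    return {
        'count': total,
        'next': f'{base_url}?page={page + 1}' if page < last_page else None,
        'previous': f'{base_url}?page={page - 1}' if page > 1 else None,
        'results': results,
    }

envelope = paginate(
    results=[{'id': 1}, {'id': 2}],
    page=1, page_size=2, total=5,
    base_url='/api/interview/interviews/',
)
```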

2. Database Layer

PostgreSQL Configuration

Performance Optimizations:
-- Interview session indexes for fast queries
CREATE INDEX CONCURRENTLY idx_interview_sessions_user_status 
ON interview_interviewsession(user_id, status, start_time);

CREATE INDEX CONCURRENTLY idx_interview_sessions_completion 
ON interview_interviewsession(completion_percentage);

-- Sentiment analysis indexes
CREATE INDEX CONCURRENTLY idx_sentiment_sessions_created 
ON interview_sentimentsession(interview_session_id, created_at);

-- Billing indexes for performance
CREATE INDEX CONCURRENTLY idx_billing_user_subscription 
ON billing_customersubscription(user_id, status);
Database Models Architecture:
# Core Models Relationships
User (1) ──── (1) Company                    # Company profile
User (1) ──── (*) InterviewSession           # Interview sessions
InterviewSession (1) ──── (*) SentimentSession    # Sentiment analysis
InterviewSession (1) ──── (*) SkillAssessment     # Skill evaluations
InterviewSession (1) ──── (1) CulturalFit         # Cultural fit analysis
User (1) ──── (1) CustomerSubscription       # Billing relationship

3. Real-Time Communication Layer

WebSocket Architecture (Django Channels)

Connection Management:
# WebSocket routing configuration
websocket_urlpatterns = [
    path('ws/interview/<session_id>/', InterviewConsumer.as_asgi()),
    path('ws/dashboard/<user_id>/', DashboardConsumer.as_asgi()),
]

# Consumer for real-time interview updates
class InterviewConsumer(AsyncWebsocketConsumer):
    async def connect(self):
        self.session_id = self.scope['url_route']['kwargs']['session_id']
        self.room_group_name = f'interview_{self.session_id}'
        
        # Join the session group, then accept the WebSocket handshake
        # (without accept() the connection is rejected)
        await self.channel_layer.group_add(
            self.room_group_name,
            self.channel_name
        )
        await self.accept()

LiveKit Integration

Video/Audio Processing:
  • Real-time Communication: Low-latency video and audio streaming
  • Recording: Automatic session recording for analysis
  • Transcription: Real-time speech-to-text conversion
  • Quality Adaptation: Dynamic quality adjustment based on connection

4. AI/ML Processing Layer

Sentiment Analysis Engine

Multi-Modal Processing:
class EnhancedSentimentAgent:
    def analyze_sentiment_with_context(self, text, context_factors):
        # Base sentiment analysis
        sentiment_scores = self.base_sentiment_analyzer(text)
        
        # Context enhancement
        context_scores = self.analyze_context_factors(text, context_factors)
        
        # Weighted combination
        final_score = self.calculate_weighted_sentiment(
            sentiment_scores, context_scores
        )
        
        return final_score
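
A standalone sketch of the `calculate_weighted_sentiment` step above. The 0.7/0.3 split between base and context scores is an assumption for illustration; the production weighting is internal to `EnhancedSentimentAgent`:

```python
# Illustrative weighted-sentiment combination; the 0.7/0.3 weights
# are assumptions, not the production values.
def calculate_weighted_sentiment(sentiment_scores, context_scores,
                                 base_weight=0.7, context_weight=0.3):
    """Blend base and context-derived sentiment into one score per label."""
    return {
        label: base_weight * sentiment_scores[label]
               + context_weight * context_scores.get(label, 0.0)
        for label in sentiment_scores
    }

final = calculate_weighted_sentiment(
    {'positive': 0.8, 'negative': 0.1},   # base analyzer output
    {'positive': 0.6, 'negative': 0.3},   # context-factor scores
)
# final['positive'] = 0.7 * 0.8 + 0.3 * 0.6 = 0.74
```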

AI Agent Architecture

Orchestrated Agent System:
# AI Agent Hierarchy
OrchestratorAgent           # Coordinates all AI processes
├── SentimentAnalysisAgent  # Emotional and engagement analysis
├── SkillAssessmentAgent    # Technical skill evaluation
├── CulturalFitAgent        # Company culture alignment
├── RecommendationsAgent    # Improvement suggestions
└── WorkerAgent             # Background processing tasks
Key AI Capabilities:
  • Natural Language Processing: Advanced text analysis and understanding
  • Computer Vision: Facial expression and body language analysis
  • Speech Processing: Audio quality, pace, and filler word detection
  • Behavioral Analysis: Pattern recognition in responses and interactions

5. Caching and Performance Layer

Redis Configuration

Caching Strategy:
# Session data caching
CACHES = {
    'default': {
        'BACKEND': 'django_redis.cache.RedisCache',
        'LOCATION': 'redis://elasticache-endpoint:6379/1',
        'OPTIONS': {
            'CLIENT_CLASS': 'django_redis.client.DefaultClient',
            'SERIALIZER': 'django_redis.serializers.json.JSONSerializer',
        }
    }
}

# Cache usage patterns (cache_result is a project helper, not a
# Django builtin; it wraps cache.get/cache.set on the backend above)
@cache_result(timeout=300)  # 5-minute cache
def get_interview_analytics(user_id, date_range):
    return calculate_interview_metrics(user_id, date_range)
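
A minimal, stdlib-only sketch of what a decorator like `cache_result` does. The real helper would delegate to Django's cache backend rather than an in-process dict:

```python
import functools
import time

def cache_result(timeout):
    """Memoize a function's return value for `timeout` seconds."""
    def decorator(func):
        store = {}  # stands in for Django's cache backend

        @functools.wraps(func)
        def wrapper(*args):
            key = (func.__name__,) + args
            hit = store.get(key)
            if hit and time.monotonic() - hit[1] < timeout:
                return hit[0]  # fresh cached value
            value = func(*args)
            store[key] = (value, time.monotonic())
            return value
        return wrapper
    return decorator

calls = []

@cache_result(timeout=300)
def get_interview_analytics(user_id):
    calls.append(user_id)  # track how often the real work runs
    return {'user': user_id, 'sessions': 12}

get_interview_analytics(1)
get_interview_analytics(1)  # second call is served from the cache
```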

Database Query Optimization

Performance Patterns:
# Optimized queryset patterns
interviews = InterviewSession.objects.select_related(
    'user', 'job', 'job__company'
).prefetch_related(
    'sentiment_sessions',
    'skill_assessments__skill',
    'recommendations'
).filter(
    status='completed',
    start_time__gte=start_date
)

6. Background Processing Layer

Celery Task Queue

Task Organization:
# Celery configuration for background tasks
@shared_task(bind=True, max_retries=3)
def process_interview_analysis(self, interview_session_id):
    try:
        # Fetch the session and fan out the AI analysis pipeline
        session = InterviewSession.objects.get(id=interview_session_id)
        
        sentiment_analysis.delay(session.id)
        skill_assessment.delay(session.id)
        cultural_fit_analysis.delay(session.id)
        
    except Exception as exc:
        # retry() must be raised to re-queue the task (60s delay,
        # up to max_retries attempts)
        raise self.retry(countdown=60, exc=exc)

# Periodic tasks: the legacy @periodic_task decorator was removed in
# Celery 5, so the task is registered on beat's schedule instead
@shared_task
def generate_daily_analytics():
    for company in Company.objects.filter(is_active=True):
        InterviewStatistics.update_statistics(company)

# In the Celery app configuration (dotted task path shown is illustrative)
app.conf.beat_schedule = {
    'generate-daily-analytics': {
        'task': 'jobhive.interview.tasks.generate_daily_analytics',
        'schedule': crontab(minute=0, hour=0),  # daily at midnight
    },
}

Data Flow Architecture

1. Interview Session Lifecycle

1. Session Creation
   ├── User initiates interview
   ├── WebSocket connection established
   ├── LiveKit room created
   └── Database session record created

2. Real-time Processing
   ├── Audio/video stream processing
   ├── Live transcription and sentiment analysis
   ├── WebSocket updates to frontend
   └── Intermediate results cached

3. Post-Interview Analysis
   ├── Comprehensive AI analysis triggered
   ├── Skill assessments calculated
   ├── Cultural fit evaluation
   ├── Recommendations generated
   └── Final results stored and cached

4. Results Delivery
   ├── Dashboard updates via WebSocket
   ├── Email notifications sent
   ├── Analytics data aggregated
   └── Reports generated
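
The four phases above can be modeled as a simple state machine. The state names here are illustrative labels for the phases (creation, real-time processing, post-interview analysis, delivery), not the actual model fields:

```python
# Illustrative state machine for the interview session lifecycle.
TRANSITIONS = {
    'created': {'processing'},     # session created, streams start
    'processing': {'analyzing'},   # interview ends, AI analysis begins
    'analyzing': {'delivered'},    # results stored, dashboards updated
    'delivered': set(),            # terminal state
}

def advance(state, next_state):
    """Move to next_state, rejecting transitions the lifecycle forbids."""
    if next_state not in TRANSITIONS[state]:
        raise ValueError(f'illegal transition {state} -> {next_state}')
    return next_state

state = 'created'
for step in ('processing', 'analyzing', 'delivered'):
    state = advance(state, step)
```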

2. API Request Flow

Client Request
        │
        ▼
Load Balancer (ALB)
        │
        ▼
Django Application
        │
        ▼
Authentication Middleware
        │
        ▼
Permission Checks
        │
        ▼
Business Logic Layer
        │
        ▼
Database Query (with caching)
        │
        ▼
Response Serialization
        │
        ▼
JSON Response to Client
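
The flow above is ultimately function composition: each middleware wraps the next stage, just as Django's middleware stack wraps the view. A toy sketch with illustrative handler names:

```python
# Toy pipeline mirroring the request flow: each stage wraps the next,
# the same way Django middleware wraps the view.
def business_logic(request):
    return {'status': 200, 'body': {'user': request['user']}}

def permission_check(get_response):
    def middleware(request):
        if request.get('role') != 'recruiter':
            return {'status': 403, 'body': {'detail': 'forbidden'}}
        return get_response(request)
    return middleware

def authentication(get_response):
    def middleware(request):
        if 'user' not in request:
            return {'status': 401, 'body': {'detail': 'unauthenticated'}}
        return get_response(request)
    return middleware

# The outermost middleware runs first, exactly as in Django's stack
handler = authentication(permission_check(business_logic))

ok = handler({'user': 'alice', 'role': 'recruiter'})       # 200
denied = handler({'user': 'bob', 'role': 'candidate'})     # 403
```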

Security Architecture

Authentication & Authorization

Multi-layered Security:
# JWT-based authentication
SIMPLE_JWT = {
    'ACCESS_TOKEN_LIFETIME': timedelta(minutes=60),
    'REFRESH_TOKEN_LIFETIME': timedelta(days=7),
    'ROTATE_REFRESH_TOKENS': True,
    'ALGORITHM': 'HS256',
}

# Permission-based access control
class InterviewSessionViewSet(ModelViewSet):
    permission_classes = [IsAuthenticated, IsOwnerOrReadOnly]
    
    def get_queryset(self):
        return InterviewSession.objects.filter(
            user=self.request.user
        )

Data Protection

Encryption and Privacy:
  • Data at Rest: AES-256 encryption for sensitive data
  • Data in Transit: TLS 1.3 for all communications
  • Personal Data: GDPR-compliant data handling
  • Video/Audio: Encrypted storage with access controls

Scalability Design

Horizontal Scaling

Container Architecture:
# ECS Service Configuration
services:
  django-web:
    image: jobhive/backend:latest
    cpu: 512
    memory: 1024
    desired_count: 3
    load_balancer:
      target_group: jobhive-web-tg
      
  celery-worker:
    image: jobhive/backend:latest
    command: celery -A config worker
    cpu: 256  
    memory: 512
    desired_count: 2
    
  celery-beat:
    image: jobhive/backend:latest
    command: celery -A config beat
    cpu: 128
    memory: 256
    desired_count: 1
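
The steady-state footprint of the service definitions above is easy to total (CPU in ECS units, where 1024 = one vCPU; memory in MiB):

```python
# Steady-state resource footprint of the ECS services defined above.
services = {
    'django-web':    {'cpu': 512, 'memory': 1024, 'count': 3},
    'celery-worker': {'cpu': 256, 'memory': 512,  'count': 2},
    'celery-beat':   {'cpu': 128, 'memory': 256,  'count': 1},
}

total_cpu = sum(s['cpu'] * s['count'] for s in services.values())
total_memory = sum(s['memory'] * s['count'] for s in services.values())
# 2176 CPU units (~2.1 vCPU) and 4352 MiB across the cluster
```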

Auto-Scaling Configuration

AWS Auto Scaling:
# CloudWatch metrics for scaling
SCALING_METRICS = {
    'cpu_utilization': {
        'target': 70,
        'scale_up_cooldown': 300,
        'scale_down_cooldown': 600
    },
    'memory_utilization': {
        'target': 80,
        'scale_up_cooldown': 300,
        'scale_down_cooldown': 600
    }
}
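
A sketch of the scale-out decision these metrics drive, including cooldown enforcement. This is purely illustrative of target-tracking behavior; the scale-in threshold (half the target) is an assumption, and the real decisions are made by AWS's policy engine:

```python
def scaling_action(metric, value, now, last_action_at, config):
    """Return 'scale_up', 'scale_down', or None for one metric sample."""
    cfg = config[metric]
    if value > cfg['target']:
        cooldown = cfg['scale_up_cooldown']
        action = 'scale_up'
    elif value < cfg['target'] * 0.5:   # illustrative scale-in threshold
        cooldown = cfg['scale_down_cooldown']
        action = 'scale_down'
    else:
        return None
    # Respect the per-direction cooldown window
    if now - last_action_at < cooldown:
        return None
    return action

SCALING_METRICS = {
    'cpu_utilization': {'target': 70, 'scale_up_cooldown': 300,
                        'scale_down_cooldown': 600},
}

hot = scaling_action('cpu_utilization', 85, now=1000, last_action_at=0,
                     config=SCALING_METRICS)       # scale up: over target
cooling = scaling_action('cpu_utilization', 85, now=1000, last_action_at=900,
                         config=SCALING_METRICS)   # None: still in cooldown
```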

Monitoring and Observability

DataDog Integration

Comprehensive Monitoring:
# Custom metrics tracking
import time

from datadog.dogstatsd import DogStatsd

statsd = DogStatsd(host='localhost', port=8125)

class InterviewMetricsMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        statsd.increment('interview.session.started')
        started = time.monotonic()
        response = self.get_response(request)
        # Report latency in milliseconds plus a per-status-code counter
        statsd.timing('interview.response_time',
                      (time.monotonic() - started) * 1000)
        statsd.increment(f'interview.response.{response.status_code}')
        return response

Logging Strategy

Structured Logging:
# Logging configuration: JSON logs go to stdout, where the DataDog
# agent collects and ships them (no in-process DataDog handler needed)
LOGGING = {
    'version': 1,
    'formatters': {
        'json': {
            # Custom formatter classes use the '()' key in dictConfig
            '()': 'pythonjsonlogger.jsonlogger.JsonFormatter',
            'format': '%(asctime)s %(name)s %(levelname)s %(message)s',
        }
    },
    'handlers': {
        'console': {
            'class': 'logging.StreamHandler',
            'formatter': 'json',
        },
    },
    'root': {
        'handlers': ['console'],
        'level': 'INFO',
    },
}

Deployment Architecture

AWS Infrastructure

Production Environment:
# Infrastructure Components
VPC:
  - Public Subnets (ALB, NAT Gateway)
  - Private Subnets (ECS, RDS)
  - Database Subnets (Multi-AZ RDS)

ECS Cluster:
  - Fargate launch type
  - Auto Scaling Groups
  - Service discovery
  
RDS PostgreSQL:
  - Multi-AZ deployment
  - Read replicas for analytics
  - Automated backups
  
ElastiCache Redis:
  - Cluster mode enabled
  - Multi-AZ replication
  - Automatic failover

CI/CD Pipeline

Automated Deployment:
# GitHub Actions Workflow
name: Deploy to Production
on:
  push:
    branches: [main]
    
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Build Docker Image
        run: docker build -t jobhive/backend:${{ github.sha }} .
        
      - name: Push to ECR
        run: docker push $ECR_REGISTRY/jobhive/backend:${{ github.sha }}
        
      - name: Deploy to ECS
        run: aws ecs update-service --cluster prod --service jobhive-web

Performance Characteristics

Response Time Targets

  • API Endpoints: < 200ms average response time
  • Real-time Updates: < 50ms WebSocket message delivery
  • AI Analysis: < 2 seconds for sentiment analysis
  • Database Queries: < 10ms for indexed queries

Throughput Capabilities

  • Concurrent Interviews: 1000+ simultaneous sessions
  • API Requests: 10,000 requests/minute
  • WebSocket Connections: 5,000 concurrent connections
  • Background Tasks: 500 tasks/minute processing

Availability Targets

  • Uptime: 99.9% availability (8.76 hours downtime/year)
  • Recovery Time: < 5 minutes for service restoration
  • Data Backup: 15-minute RPO, 1-hour RTO
  • Multi-region: Disaster recovery in secondary region
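
The downtime budget behind the 99.9% figure is simple arithmetic:

```python
# Downtime budget implied by an availability target.
def downtime_budget_hours(availability, period_hours=365 * 24):
    """Hours of allowed downtime in a period at a given availability."""
    return (1 - availability) * period_hours

yearly = downtime_budget_hours(0.999)          # ~8.76 hours/year
monthly = downtime_budget_hours(0.999, 730)    # ~0.73 hours (~44 min)/month
```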