System Architecture
Overview
JobHive is built on a modern, scalable, microservices-inspired architecture using Django as the primary backend framework. The system is designed for high availability, real-time processing, and seamless scalability to handle thousands of concurrent interviews.
High-Level Architecture
┌──────────────────┐    ┌──────────────────┐    ┌──────────────────┐
│  Frontend        │    │  Load Balancer   │    │  CDN/CloudFront  │
│  (React)         │◄───┤  (ALB)           │◄───┤  (Static Assets) │
└──────────────────┘    └──────────────────┘    └──────────────────┘
         │                       │
         │                       ▼
         │              ┌──────────────────┐
         │              │  Django Backend  │
         │              │  (ECS Cluster)   │
         │              └──────────────────┘
         │                       │
         │                       ▼
         │              ┌──────────────────┐
         │              │  PostgreSQL      │
         │              │  (RDS)           │
         │              └──────────────────┘
         │
         ▼
┌──────────────────┐    ┌──────────────────┐    ┌──────────────────┐
│  WebSocket       │    │  Redis Cache     │    │  S3 Storage      │
│  (Channels)      │◄───┤  (ElastiCache)   │    │  (Media Files)   │
└──────────────────┘    └──────────────────┘    └──────────────────┘
         │
         ▼
┌──────────────────┐    ┌──────────────────┐    ┌──────────────────┐
│  LiveKit         │    │  AI/ML Services  │    │  Monitoring      │
│  (Video/Audio)   │    │  (AWS Services)  │    │  (DataDog)       │
└──────────────────┘    └──────────────────┘    └──────────────────┘
Core Components
1. Backend Application Layer
Django Framework (v5.1.3)
Primary Components:
- API Layer: Django REST Framework for RESTful APIs
- Authentication: JWT-based authentication with Django Allauth
- Real-time: Django Channels for WebSocket communication
- Task Queue: Celery with Redis for background processing
- Database: PostgreSQL with advanced indexing strategies
Key Django Apps:
# Core business logic modules
INSTALLED_APPS = [
    'jobhive.users',      # User management and authentication
    'jobhive.company',    # Company profiles and management
    'jobhive.interview',  # Interview sessions and analysis
    'jobhive.billing',    # Subscription and payment processing
    'jobhive.utils',      # Shared utilities and middleware
]
API Architecture
RESTful Design Principles:
- Resource-based URLs (/api/v1/interviews/, /api/v1/users/)
- HTTP methods for CRUD operations
- Consistent response formats with pagination
- Version-controlled API endpoints
- Comprehensive error handling and validation
Key API Endpoints:
# Interview Management
POST /api/interview/interviews/ # Create interview session
GET /api/interview/interviews/{id}/ # Get interview details
PUT /api/interview/interviews/{id}/ # Update interview
DELETE /api/interview/interviews/{id}/ # Delete interview
# AI Analysis
GET /api/interview/interviews/{id}/benchmark-comparison/ # Benchmark data
POST /api/interview/sentiment-sessions/ # Create sentiment session
GET /api/interview/skill-assessments/ # Get skill assessments
# User Management
GET /api/users/me/ # Current user profile
POST /api/auth/login/ # User authentication
POST /api/auth/register/ # User registration
# Billing
GET /api/billing/plans/ # Subscription plans
POST /api/billing/subscriptions/ # Create subscription
GET /api/billing/analytics/ # Billing analytics
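The endpoint list above can be captured in a small lookup table so that client code does not hand-build paths. This is a minimal sketch: the `ENDPOINTS` table, its key names, and the `build_url` helper are hypothetical, and the base URL passed in is a placeholder rather than a confirmed JobHive host.

```python
from urllib.parse import urljoin

# Path templates copied from the endpoint list above; the keys are illustrative names.
ENDPOINTS = {
    "interview-list": "/api/interview/interviews/",
    "interview-detail": "/api/interview/interviews/{id}/",
    "interview-benchmark": "/api/interview/interviews/{id}/benchmark-comparison/",
    "current-user": "/api/users/me/",
    "billing-plans": "/api/billing/plans/",
}


def build_url(base_url: str, name: str, **params: object) -> str:
    """Fill a path template and join it onto the base URL."""
    path = ENDPOINTS[name].format(**params)
    return urljoin(base_url, path)
```

For example, `build_url("https://api.example.com", "interview-detail", id=42)` fills the `{id}` placeholder before joining the path onto the host.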
2. Database Layer
PostgreSQL Configuration
Performance Optimizations:
-- Interview session indexes for fast queries
CREATE INDEX CONCURRENTLY idx_interview_sessions_user_status
ON interview_interviewsession(user_id, status, start_time);
CREATE INDEX CONCURRENTLY idx_interview_sessions_completion
ON interview_interviewsession(completion_percentage);
-- Sentiment analysis indexes
CREATE INDEX CONCURRENTLY idx_sentiment_sessions_created
ON interview_sentimentsession(interview_session_id, created_at);
-- Billing indexes for performance
CREATE INDEX CONCURRENTLY idx_billing_user_subscription
ON billing_customersubscription(user_id, status);
Database Models Architecture:
# Core Models Relationships
User (1) ──── (1) Company # Company profile
User (1) ──── (*) InterviewSession # Interview sessions
InterviewSession (1) ──── (*) SentimentSession # Sentiment analysis
InterviewSession (1) ──── (*) SkillAssessment # Skill evaluations
InterviewSession (1) ──── (1) CulturalFit # Cultural fit analysis
User (1) ──── (1) CustomerSubscription # Billing relationship
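To make the cardinalities above concrete, here is a sketch using plain dataclasses rather than Django models. The field names are assumptions; in Django the one-to-one links would presumably be `OneToOneField` and the one-to-many links `ForeignKey`, but the source does not show the model definitions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SentimentSession:
    score: float


@dataclass
class InterviewSession:
    status: str = "pending"
    # One-to-many: a session can hold several sentiment sessions.
    sentiment_sessions: List[SentimentSession] = field(default_factory=list)


@dataclass
class Company:
    name: str


@dataclass
class User:
    email: str
    # One-to-one: at most one company profile per user.
    company: Optional[Company] = None
    # One-to-many: a user owns many interview sessions.
    interview_sessions: List[InterviewSession] = field(default_factory=list)
```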
3. Real-Time Communication Layer
WebSocket Architecture (Django Channels)
Connection Management:
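# WebSocket routing configuration
from channels.generic.websocket import AsyncWebsocketConsumer
from django.urls import path

# Consumer for real-time interview updates
class InterviewConsumer(AsyncWebsocketConsumer):
    async def connect(self):
        self.session_id = self.scope['url_route']['kwargs']['session_id']
        self.room_group_name = f'interview_{self.session_id}'
        # Join the per-interview group so group messages reach this socket
        await self.channel_layer.group_add(
            self.room_group_name,
            self.channel_name
        )
        # Complete the handshake; without accept() the connection is rejected
        await self.accept()

    async def disconnect(self, close_code):
        # Leave the group so no messages are routed to a closed socket
        await self.channel_layer.group_discard(
            self.room_group_name,
            self.channel_name
        )

websocket_urlpatterns = [
    path('ws/interview/<session_id>/', InterviewConsumer.as_asgi()),
    path('ws/dashboard/<user_id>/', DashboardConsumer.as_asgi()),  # defined elsewhere
]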
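The group mechanics behind the consumer can be illustrated without a running Channels stack. The sketch below is a toy in-memory stand-in, not the Channels channel layer: it borrows only the method names `group_add`, `group_discard`, and `group_send`, and the `InMemoryGroupLayer` class and client names are hypothetical.

```python
import asyncio
from collections import defaultdict
from typing import Dict, List, Set


class InMemoryGroupLayer:
    """Toy stand-in for a channel layer: tracks group membership and inboxes."""

    def __init__(self) -> None:
        self.groups: Dict[str, Set[str]] = defaultdict(set)
        self.inboxes: Dict[str, List[dict]] = defaultdict(list)

    async def group_add(self, group: str, channel: str) -> None:
        self.groups[group].add(channel)

    async def group_discard(self, group: str, channel: str) -> None:
        self.groups[group].discard(channel)

    async def group_send(self, group: str, message: dict) -> None:
        # Deliver a copy of the message to every channel currently in the group.
        for channel in self.groups[group]:
            self.inboxes[channel].append(dict(message))


async def demo() -> Dict[str, List[dict]]:
    layer = InMemoryGroupLayer()
    room = "interview_42"  # same naming scheme as room_group_name above
    await layer.group_add(room, "client-a")
    await layer.group_add(room, "client-b")
    await layer.group_discard(room, "client-b")  # client-b disconnects
    await layer.group_send(room, {"type": "sentiment.update", "score": 0.8})
    return dict(layer.inboxes)
```

Running `asyncio.run(demo())` shows that only the channel still in the group receives the update, which is why a consumer should leave its group on disconnect.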
LiveKit Integration
Video/Audio Processing:
- Real-time Communication: Low-latency video and audio streaming
- Recording: Automatic session recording for analysis
- Transcription: Real-time speech-to-text conversion
- Quality Adaptation: Dynamic quality adjustment based on connection
4. AI/ML Processing Layer
Sentiment Analysis Engine
Multi-Modal Processing:
class EnhancedSentimentAgent:
    def analyze_sentiment_with_context(self, text, context_factors):
        # Base sentiment analysis
        sentiment_scores = self.base_sentiment_analyzer(text)
        # Context enhancement
        context_scores = self.analyze_context_factors(text, context_factors)
        # Weighted combination
        final_score = self.calculate_weighted_sentiment(
            sentiment_scores, context_scores
        )
        return final_score
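The source does not show how `calculate_weighted_sentiment` combines the two inputs. One plausible reading is a convex blend of the base score with the mean of the context scores, sketched below. The `combine_scores` name, the `[-1, 1]` score range, and the default weight of 0.3 are all assumptions made for illustration.

```python
from typing import Dict


def combine_scores(sentiment: float, context: Dict[str, float],
                   context_weight: float = 0.3) -> float:
    """Blend a base sentiment score with the mean of the context scores.

    All scores are assumed to lie in [-1, 1]; context_weight is an assumed
    tuning knob, not a value taken from JobHive.
    """
    if not 0.0 <= context_weight <= 1.0:
        raise ValueError("context_weight must be between 0 and 1")
    if not context:
        return sentiment  # nothing to blend in
    context_mean = sum(context.values()) / len(context)
    return (1.0 - context_weight) * sentiment + context_weight * context_mean
```

Because the weights sum to one, the result stays inside the same range as the inputs.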
AI Agent Architecture
Orchestrated Agent System:
# AI Agent Hierarchy
OrchestratorAgent # Coordinates all AI processes
├── SentimentAnalysisAgent # Emotional and engagement analysis
├── SkillAssessmentAgent # Technical skill evaluation
├── CulturalFitAgent # Company culture alignment
├── RecommendationsAgent # Improvement suggestions
└── WorkerAgent # Background processing tasks
Key AI Capabilities:
- Natural Language Processing: Advanced text analysis and understanding
- Computer Vision: Facial expression and body language analysis
- Speech Processing: Audio quality, pace, and filler word detection
- Behavioral Analysis: Pattern recognition in responses and interactions
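The orchestrator pattern in the hierarchy above can be sketched as a registry of callables that each receive the transcript and return a result. This is an illustration under assumptions, not the JobHive implementation: the `Orchestrator` class, its methods, and both sample agents are hypothetical.

```python
from typing import Callable, Dict

Agent = Callable[[str], dict]


class Orchestrator:
    """Runs each registered agent on a transcript and collects results by name."""

    def __init__(self) -> None:
        self._agents: Dict[str, Agent] = {}

    def register(self, name: str, agent: Agent) -> None:
        self._agents[name] = agent

    def run(self, transcript: str) -> Dict[str, dict]:
        results: Dict[str, dict] = {}
        for name, agent in self._agents.items():
            try:
                results[name] = agent(transcript)
            except Exception as exc:  # one failing agent should not sink the rest
                results[name] = {"error": str(exc)}
        return results


def word_count_agent(transcript: str) -> dict:
    # Placeholder analysis: a real agent would call a model here.
    return {"words": len(transcript.split())}


def failing_agent(transcript: str) -> dict:
    raise RuntimeError("model unavailable")
```

Isolating each agent's failure keeps a partial analysis available when one downstream service is unavailable.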
5. Caching and Performance Layer
Redis Configuration
Caching Strategy:
# Session data caching
CACHES = {
    'default': {
        'BACKEND': 'django_redis.cache.RedisCache',
        'LOCATION': 'redis://elasticache-endpoint:6379/1',
        'OPTIONS': {
            'CLIENT_CLASS': 'django_redis.client.DefaultClient',
            'SERIALIZER': 'django_redis.serializers.json.JSONSerializer',
        }
    }
}

# Cache usage patterns
# cache_result is a project-level decorator, not part of Django or django-redis
@cache_result(timeout=300)  # 5-minute cache
def get_interview_analytics(user_id, date_range):
    return calculate_interview_metrics(user_id, date_range)
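The source does not show how `cache_result` is defined. The sketch below is one way such a decorator could work, using an in-process dictionary so that it runs standalone; a production version would more likely store values through Django's cache framework. The `clock` parameter is an assumption added to make expiry testable.

```python
import functools
import time
from typing import Any, Callable, Dict, Tuple


def cache_result(timeout: float, clock: Callable[[], float] = time.monotonic):
    """Cache a function's return value per argument tuple for `timeout` seconds."""

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        store: Dict[Tuple, Tuple[float, Any]] = {}

        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            # Arguments must be hashable to serve as a cache key.
            key = (args, tuple(sorted(kwargs.items())))
            hit = store.get(key)
            now = clock()
            if hit is not None and now - hit[0] < timeout:
                return hit[1]  # still fresh
            value = func(*args, **kwargs)
            store[key] = (now, value)
            return value

        return wrapper

    return decorator
```

Keying on the argument tuple means analytics for different users or date ranges are cached independently.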
Database Query Optimization
Performance Patterns:
# Optimized queryset patterns
interviews = InterviewSession.objects.select_related(
    'user', 'job', 'job__company'
).prefetch_related(
    'sentiment_sessions',
    'skill_assessments__skill',
    'recommendations'
).filter(
    status='completed',
    start_time__gte=start_date
)
6. Background Processing Layer
Celery Task Queue
Task Organization:
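# Celery configuration for background tasks
from celery import shared_task
from celery.schedules import crontab

@shared_task(bind=True, max_retries=3)
def process_interview_analysis(self, interview_session_id):
    try:
        # Run comprehensive AI analysis
        session = InterviewSession.objects.get(id=interview_session_id)
        # Execute analysis pipeline
        sentiment_analysis.delay(session.id)
        skill_assessment.delay(session.id)
        cultural_fit_analysis.delay(session.id)
    except Exception as exc:
        # retry() raises Retry; re-raising it makes the control flow explicit
        raise self.retry(countdown=60, exc=exc)

# Periodic tasks
# The periodic_task decorator was removed in Celery 5; schedule via beat_schedule instead.
@shared_task
def generate_daily_analytics():
    for company in Company.objects.filter(is_active=True):
        InterviewStatistics.update_statistics(company)

# In the Celery app configuration (the dotted task path is an assumption):
app.conf.beat_schedule = {
    'generate-daily-analytics': {
        'task': 'jobhive.interview.tasks.generate_daily_analytics',
        'schedule': crontab(minute=0, hour=0),  # Daily at midnight
    },
}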
Data Flow Architecture
1. Interview Session Lifecycle
1. Session Creation
├── User initiates interview
├── WebSocket connection established
├── LiveKit room created
└── Database session record created
2. Real-time Processing
├── Audio/video stream processing
├── Live transcription and sentiment analysis
├── WebSocket updates to frontend
└── Intermediate results cached
3. Post-Interview Analysis
├── Comprehensive AI analysis triggered
├── Skill assessments calculated
├── Cultural fit evaluation
├── Recommendations generated
└── Final results stored and cached
4. Results Delivery
├── Dashboard updates via WebSocket
├── Email notifications sent
├── Analytics data aggregated
└── Reports generated
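The four lifecycle phases above form a strictly linear progression, which can be modeled as a small state machine. This is a sketch only: the `Stage` enum and its member names are hypothetical and are not taken from the JobHive codebase.

```python
from enum import Enum


class Stage(Enum):
    CREATED = "created"        # 1. Session Creation
    LIVE = "live"              # 2. Real-time Processing
    ANALYZING = "analyzing"    # 3. Post-Interview Analysis
    DELIVERED = "delivered"    # 4. Results Delivery


# Each stage may only advance to the next one in the lifecycle.
NEXT_STAGE = {
    Stage.CREATED: Stage.LIVE,
    Stage.LIVE: Stage.ANALYZING,
    Stage.ANALYZING: Stage.DELIVERED,
}


def advance(stage: Stage) -> Stage:
    """Return the following stage, or raise if the session is already finished."""
    try:
        return NEXT_STAGE[stage]
    except KeyError:
        raise ValueError(f"{stage.name} is a terminal stage") from None
```

Making the transitions explicit prevents, for example, results being delivered for a session whose analysis never ran.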
2. API Request Flow
Client Request
↓
Load Balancer (ALB)
↓
Django Application
↓
Authentication Middleware
↓
Permission Checks
↓
Business Logic Layer
↓
Database Query (with caching)
↓
Response Serialization
↓
JSON Response to Client
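The ordering in the flow above matters: authentication runs before permission checks, and both run before business logic. A sketch of that layering with plain functions follows. The dictionary-based request and response shapes and all function names are assumptions for illustration; Django's own middleware and DRF permissions implement the real behavior.

```python
from typing import Callable, Dict

Request = Dict[str, object]
Response = Dict[str, object]
Handler = Callable[[Request], Response]


def require_auth(next_handler: Handler) -> Handler:
    def handler(request: Request) -> Response:
        if not request.get("user"):
            return {"status": 401, "body": "authentication required"}
        return next_handler(request)
    return handler


def require_owner(next_handler: Handler) -> Handler:
    def handler(request: Request) -> Response:
        if request.get("user") != request.get("owner"):
            return {"status": 403, "body": "forbidden"}
        return next_handler(request)
    return handler


def business_logic(request: Request) -> Response:
    return {"status": 200, "body": {"interview": request["interview_id"]}}


# The outermost check runs first, mirroring the order in the flow above.
pipeline = require_auth(require_owner(business_logic))
```

An unauthenticated request is rejected with 401 before the ownership check is ever evaluated.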
Security Architecture
Authentication & Authorization
Multi-layered Security:
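# JWT-based authentication
from datetime import timedelta

SIMPLE_JWT = {
    'ACCESS_TOKEN_LIFETIME': timedelta(minutes=60),
    'REFRESH_TOKEN_LIFETIME': timedelta(days=7),
    'ROTATE_REFRESH_TOKENS': True,
    'ALGORITHM': 'HS256',
}

# Permission-based access control
from rest_framework.permissions import IsAuthenticated
from rest_framework.viewsets import ModelViewSet

class InterviewSessionViewSet(ModelViewSet):
    # IsOwnerOrReadOnly is a project-defined permission class, not shipped with DRF
    permission_classes = [IsAuthenticated, IsOwnerOrReadOnly]

    def get_queryset(self):
        # Users only ever see their own sessions
        return InterviewSession.objects.filter(
            user=self.request.user
        )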
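To show what the `HS256` algorithm and token lifetime settings mean in practice, the sketch below signs and verifies a token with only the standard library. It is for illustration and is not how Simple JWT is implemented; the `sign` and `verify` helpers are hypothetical, and a real deployment should rely on the library.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    # JWTs use URL-safe base64 without padding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign(payload: dict, secret: bytes) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    digest = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(digest)}"


def verify(token: str, secret: bytes, now: float) -> dict:
    """Return the payload if the signature matches and `exp` is in the future."""
    signing_input, _, signature = token.rpartition(".")
    expected = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(_b64url(expected), signature):
        raise ValueError("bad signature")
    payload = json.loads(_b64url_decode(signing_input.split(".")[1]))
    if payload["exp"] <= now:
        raise ValueError("token expired")
    return payload
```

With a 60-minute access token, `exp` would be set to the issue time plus 3600 seconds, after which `verify` rejects the token and the client must use its refresh token.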
Data Protection
Encryption and Privacy:
- Data at Rest: AES-256 encryption for sensitive data
- Data in Transit: TLS 1.3 for all communications
- Personal Data: GDPR-compliant data handling
- Video/Audio: Encrypted storage with access controls
Scalability Design
Horizontal Scaling
Container Architecture:
# ECS Service Configuration
services:
  django-web:
    image: jobhive/backend:latest
    cpu: 512
    memory: 1024
    desired_count: 3
    load_balancer:
      target_group: jobhive-web-tg
  celery-worker:
    image: jobhive/backend:latest
    command: celery -A config worker
    cpu: 256
    memory: 512
    desired_count: 2
  celery-beat:
    image: jobhive/backend:latest
    command: celery -A config beat
    cpu: 128
    memory: 256
    desired_count: 1
Auto-Scaling Configuration
AWS Auto Scaling:
# CloudWatch metrics for scaling
SCALING_METRICS = {
    'cpu_utilization': {
        'target': 70,
        'scale_up_cooldown': 300,
        'scale_down_cooldown': 600
    },
    'memory_utilization': {
        'target': 80,
        'scale_up_cooldown': 300,
        'scale_down_cooldown': 600
    }
}
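How a target value and a cooldown interact can be sketched with a simple proportional rule. This is a simplification and not AWS's actual target-tracking algorithm; the function names and the `minimum` and `maximum` bounds are assumptions.

```python
import math


def desired_count(current: int, metric: float, target: float,
                  minimum: int = 1, maximum: int = 10) -> int:
    """Proportional rule: scale the task count by metric / target, then clamp."""
    if current < 1 or target <= 0:
        raise ValueError("current must be >= 1 and target must be positive")
    wanted = math.ceil(current * metric / target)
    return max(minimum, min(maximum, wanted))


def may_scale(now: float, last_action: float, cooldown: float) -> bool:
    """A new scaling action is allowed only after the cooldown has elapsed."""
    return now - last_action >= cooldown
```

With three tasks at 90% CPU against a 70% target, the rule asks for four tasks. The longer scale-down cooldown (600 s against 300 s) makes the service quicker to add capacity than to remove it.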
Monitoring and Observability
DataDog Integration
Comprehensive Monitoring:
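# Custom metrics tracking
import time

from datadog.dogstatsd import DogStatsd

statsd = DogStatsd(host='localhost', port=8125)

class InterviewMetricsMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        statsd.increment('interview.session.started')
        started = time.monotonic()
        response = self.get_response(request)
        # Django responses carry no timing attribute, so measure here (milliseconds)
        statsd.timing('interview.response_time', (time.monotonic() - started) * 1000)
        statsd.increment(f'interview.response.{response.status_code}')
        return response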
Logging Strategy
Structured Logging:
# Logging configuration
LOGGING = {
    'version': 1,
    'handlers': {
        'console': {
            'class': 'logging.StreamHandler',
            'formatter': 'json',
        },
        'datadog': {
            # Verify this class path against the installed datadog package
            # before enabling the handler
            'class': 'datadog.DogStatsdLogHandler',
            'level': 'INFO',
        }
    },
    'formatters': {
        'json': {
            'class': 'pythonjsonlogger.jsonlogger.JsonFormatter',
            'format': '%(asctime)s %(name)s %(levelname)s %(message)s'
        }
    }
}
Deployment Architecture
AWS Infrastructure
Production Environment:
# Infrastructure Components
VPC:
  - Public Subnets (ALB, NAT Gateway)
  - Private Subnets (ECS, RDS)
  - Database Subnets (Multi-AZ RDS)
ECS Cluster:
  - Fargate launch type
  - Service Auto Scaling
  - Service discovery
RDS PostgreSQL:
  - Multi-AZ deployment
  - Read replicas for analytics
  - Automated backups
ElastiCache Redis:
  - Cluster mode enabled
  - Multi-AZ replication
  - Automatic failover
CI/CD Pipeline
Automated Deployment:
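# GitHub Actions Workflow
# ECR_REGISTRY and AWS credentials are assumed to be provided by the environment
name: Deploy to Production
on:
  push:
    branches: [main]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Check out source
        uses: actions/checkout@v4
      - name: Build Docker Image
        # Tag with the registry prefix so the push step finds the image
        run: docker build -t $ECR_REGISTRY/jobhive/backend:${{ github.sha }} .
      - name: Push to ECR
        run: docker push $ECR_REGISTRY/jobhive/backend:${{ github.sha }}
      - name: Deploy to ECS
        run: aws ecs update-service --cluster prod --service jobhive-web --force-new-deployment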
Response Time Targets
- API Endpoints: < 200ms average response time
- Real-time Updates: < 50ms WebSocket message delivery
- AI Analysis: < 2 seconds for sentiment analysis
- Database Queries: < 10ms for indexed queries
Throughput Capabilities
- Concurrent Interviews: 1000+ simultaneous sessions
- API Requests: 10,000 requests/minute
- WebSocket Connections: 5,000 concurrent connections
- Background Tasks: 500 tasks/minute processing
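The throughput and latency targets can be related through Little's law, which gives a rough estimate of how many requests are in flight at once. This is a back-of-the-envelope sketch: the helper names are hypothetical, and the calculation assumes the 200 ms average response time from the targets above holds at peak load.

```python
def requests_per_second(requests_per_minute: float) -> float:
    """Convert a per-minute request rate to a per-second rate."""
    return requests_per_minute / 60


def in_flight_requests(rate_per_second: float, latency_seconds: float) -> float:
    """Little's law: average concurrency = arrival rate x average latency."""
    return rate_per_second * latency_seconds
```

At 10,000 requests/minute (about 167 requests/second) and 200 ms per request, roughly 33 requests are being served at any moment across all web tasks.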
Availability Targets
- Uptime: 99.9% availability (8.76 hours downtime/year)
- Recovery Time: < 5 minutes for service restoration
- Data Backup: 15-minute RPO, 1-hour RTO
- Multi-region: Disaster recovery in secondary region
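The downtime figure quoted for the uptime target follows directly from the availability percentage. A small sketch of the arithmetic, with a hypothetical helper name and a non-leap year assumed:

```python
HOURS_PER_YEAR = 365 * 24  # 8,760; ignores leap years


def downtime_hours_per_year(availability_percent: float) -> float:
    """Hours of allowed downtime per year at the given availability."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)
```

At 99.9% this gives 8.76 hours per year; each additional nine cuts the budget by a factor of ten.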