Infrastructure Documentation
Overview
JobHive’s infrastructure runs on AWS and is designed for high availability, performance, and security. The platform uses containerized services on ECS Fargate, managed data stores (RDS PostgreSQL and ElastiCache Redis), and monitoring through CloudWatch and Datadog.
AWS Infrastructure Architecture
High-Level Architecture Diagram
                           Internet
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                       CloudFront CDN                        │
│                    (Global Distribution)                    │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                  Application Load Balancer                  │
│                  (Multi-AZ, Auto Scaling)                   │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                     ECS Fargate Cluster                     │
│      ┌─────────────┐  ┌─────────────┐  ┌─────────────┐      │
│      │   Django    │  │   Celery    │  │   Celery    │      │
│      │  Web Tasks  │  │   Workers   │  │    Beat     │      │
│      └─────────────┘  └─────────────┘  └─────────────┘      │
└─────────────────────────────────────────────────────────────┘
         │                     │                     │
         ▼                     ▼                     ▼
┌─────────────────┐   ┌─────────────────┐   ┌─────────────────┐
│ RDS PostgreSQL  │   │ElastiCache Redis│   │       S3        │
│   (Multi-AZ)    │   │   (Clustered)   │   │ (Media/Static)  │
└─────────────────┘   └─────────────────┘   └─────────────────┘
Core Infrastructure Components
1. Compute Layer - Amazon ECS Fargate
ECS Cluster Configuration
{
"cluster_name": "jobhive-production",
"launch_type": "FARGATE",
"capacity_providers": ["FARGATE", "FARGATE_SPOT"],
"default_capacity_provider_strategy": [
{
"capacity_provider": "FARGATE",
"weight": 4,
"base": 2
},
{
"capacity_provider": "FARGATE_SPOT",
"weight": 1
}
]
}
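To make the strategy above concrete, the sketch below estimates how a given task count would be divided between the two capacity providers: the base is satisfied first, then the remainder is shared by weight. The function name `split_tasks` and the largest-remainder rounding are assumptions made for illustration; the exact rounding ECS applies may differ.

```python
import math

# Values copied from the cluster configuration above.
STRATEGY = [
    {"capacity_provider": "FARGATE", "weight": 4, "base": 2},
    {"capacity_provider": "FARGATE_SPOT", "weight": 1, "base": 0},
]


def split_tasks(desired_count, strategy=STRATEGY):
    """Estimate how many tasks land on each capacity provider."""
    counts = {s["capacity_provider"]: 0 for s in strategy}
    remaining = desired_count
    # Step 1: satisfy each provider's base before applying weights.
    for s in strategy:
        placed = min(s["base"], remaining)
        counts[s["capacity_provider"]] += placed
        remaining -= placed
    # Step 2: share the rest in proportion to the weights.
    total_weight = sum(s["weight"] for s in strategy)
    shares = [(remaining * s["weight"] / total_weight, s["capacity_provider"])
              for s in strategy]
    for share, name in shares:
        counts[name] += math.floor(share)
    # Step 3 (assumed rounding rule): hand leftover tasks to the
    # providers with the largest fractional share.
    leftover = remaining - sum(math.floor(share) for share, _ in shares)
    by_fraction = sorted(shares, key=lambda x: x[0] - math.floor(x[0]), reverse=True)
    for _, name in by_fraction[:leftover]:
        counts[name] += 1
    return counts
```

Under these assumptions, ten tasks split into eight on FARGATE and two on FARGATE_SPOT, while a three-task service stays entirely on FARGATE.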
Service Definitions
Django Web Service
service_name: jobhive-web
task_definition: jobhive-web-task:latest
launch_type: FARGATE
desired_count: 3
platform_version: LATEST
network_configuration:
  subnets:
    - subnet-1a2b3c4d
    - subnet-5e6f7g8h
  security_groups:
    - sg-web-application
  assign_public_ip: DISABLED
load_balancers:
  - target_group_arn: arn:aws:elasticloadbalancing:us-east-1:123456789:targetgroup/jobhive-web
    container_name: django-web
    container_port: 8000
auto_scaling:
  min_capacity: 2
  max_capacity: 10
  target_cpu_utilization: 70
  target_memory_utilization: 80
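As a rough illustration of what the CPU and memory targets imply, the sketch below uses the usual proportional approximation of target tracking: the task count is scaled by the ratio of the observed metric to the target, then clamped to the configured limits. The helper name `desired_tasks` is hypothetical, and the real service also applies cooldowns and its own alarm logic.

```python
import math


def desired_tasks(current_tasks, metric_value, target_value,
                  min_capacity=2, max_capacity=10):
    """Proportional estimate of the task count a target-tracking policy aims for."""
    raw = math.ceil(current_tasks * metric_value / target_value)
    # Clamp to the min_capacity / max_capacity of the web service.
    return max(min_capacity, min(max_capacity, raw))
```

For example, three tasks averaging 90% CPU against the 70% target suggest scaling out to four tasks, and the result never leaves the 2 to 10 range.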
Celery Worker Service
service_name: jobhive-celery-workers
task_definition: jobhive-celery-worker-task:latest
launch_type: FARGATE
desired_count: 2
auto_scaling:
  min_capacity: 1
  max_capacity: 8
  target_cpu_utilization: 75
  custom_metrics:
    - metric_name: celery_queue_length
      target_value: 10
Celery Beat Service
service_name: jobhive-celery-beat
task_definition: jobhive-celery-beat-task:latest
launch_type: FARGATE
desired_count: 1  # Single instance so scheduled tasks are not duplicated
deployment_configuration:
  maximum_percent: 100
  minimum_healthy_percent: 0  # Allow complete replacement during deploys
Task Definitions
Web Application Task
{
"family": "jobhive-web-task",
"network_mode": "awsvpc",
"requires_compatibilities": ["FARGATE"],
"cpu": "512",
"memory": "1024",
"execution_role_arn": "arn:aws:iam::123456789:role/ecsTaskExecutionRole",
"task_role_arn": "arn:aws:iam::123456789:role/jobhiveTaskRole",
"container_definitions": [
{
"name": "django-web",
"image": "123456789.dkr.ecr.us-east-1.amazonaws.com/jobhive/backend:latest",
"port_mappings": [
{
"container_port": 8000,
"protocol": "tcp"
}
],
"environment": [
{"name": "DJANGO_SETTINGS_MODULE", "value": "config.settings.production"},
{"name": "AWS_DEFAULT_REGION", "value": "us-east-1"}
],
"secrets": [
{"name": "DATABASE_URL", "value_from": "arn:aws:secretsmanager:us-east-1:123456789:secret:jobhive/database"},
{"name": "REDIS_URL", "value_from": "arn:aws:secretsmanager:us-east-1:123456789:secret:jobhive/redis"},
{"name": "DJANGO_SECRET_KEY", "value_from": "arn:aws:secretsmanager:us-east-1:123456789:secret:jobhive/django"}
],
"log_configuration": {
"log_driver": "awslogs",
"options": {
"awslogs-group": "/ecs/jobhive-web",
"awslogs-region": "us-east-1",
"awslogs-stream-prefix": "ecs"
}
},
"health_check": {
"command": ["CMD-SHELL", "curl -f http://localhost:8000/health/ || exit 1"],
"interval": 30,
"timeout": 5,
"retries": 3,
"start_period": 60
}
}
]
}
2. Database Layer - Amazon RDS PostgreSQL
RDS Configuration
engine: postgres
engine_version: "15.4"
instance_class: db.r6g.xlarge
allocated_storage: 1000
storage_type: gp3
storage_encrypted: true
kms_key_id: alias/rds-encryption-key
multi_az: true
availability_zones:
- us-east-1a
- us-east-1b
backup_retention_period: 30
backup_window: "03:00-04:00"
maintenance_window: "sun:04:00-sun:05:00"
monitoring_interval: 60
performance_insights_enabled: true
performance_insights_retention_period: 731
parameter_group: jobhive-postgres15-params
option_group: default:postgres-15
vpc_security_group_ids:
- sg-rds-database
db_subnet_group: jobhive-db-subnet-group
Database Performance Configuration
-- PostgreSQL parameter group settings
shared_preload_libraries = 'pg_stat_statements'
max_connections = 200
shared_buffers = '4GB'
effective_cache_size = '12GB'
work_mem = '64MB'
maintenance_work_mem = '512MB'
random_page_cost = 1.1
effective_io_concurrency = 200
wal_buffers = '16MB'
checkpoint_completion_target = 0.9
max_wal_size = '4GB'
min_wal_size = '1GB'
-- Performance optimization settings
log_statement = 'mod'
log_min_duration_statement = 1000
track_activity_query_size = 2048
pg_stat_statements.track = all
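The memory settings above interact: `work_mem` is allocated per sort or hash operation, so the worst case grows with `max_connections`. The sketch below checks that budget against the 32 GiB of RAM on a db.r6g.xlarge instance. The helper names are illustrative, and the model deliberately ignores other consumers such as `maintenance_work_mem` and per-connection overhead.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

INSTANCE_MEMORY = 32 * GIB   # db.r6g.xlarge: 4 vCPUs, 32 GiB
SHARED_BUFFERS = 4 * GIB     # shared_buffers = '4GB'
WORK_MEM = 64 * MIB          # work_mem = '64MB'
MAX_CONNECTIONS = 200        # max_connections = 200


def worst_case_bytes(sorts_per_connection=1):
    """shared_buffers plus one work_mem allocation per sort on every connection."""
    return SHARED_BUFFERS + MAX_CONNECTIONS * WORK_MEM * sorts_per_connection


def fraction_of_ram(sorts_per_connection=1):
    """Share of instance memory consumed in the worst case."""
    return worst_case_bytes(sorts_per_connection) / INSTANCE_MEMORY
```

With one sort per connection the worst case is 16.5 GiB, about half the instance. Two concurrent sorts per connection would push it to 29 GiB, which is one reason to front the database with a connection pooler.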
Read Replicas
read_replica_1:
  instance_class: db.r6g.large
  availability_zone: us-east-1c
  publicly_accessible: false
  auto_minor_version_upgrade: true
read_replica_2:
  instance_class: db.r6g.large
  availability_zone: us-east-1d
  publicly_accessible: false
  auto_minor_version_upgrade: true
3. Caching Layer - Amazon ElastiCache Redis
Redis Cluster Configuration
cache_cluster_id: jobhive-redis-cluster
engine: redis
engine_version: "7.0"
node_type: cache.r7g.large
num_cache_nodes: 3
replication_group_id: jobhive-redis-replication
description: "JobHive Redis cluster for caching and sessions"
port: 6379
parameter_group_name: jobhive-redis7-params
subnet_group_name: jobhive-cache-subnet-group
security_group_ids:
- sg-redis-cache
at_rest_encryption_enabled: true
transit_encryption_enabled: true
auth_token: "stored-in-secrets-manager"
automatic_failover_enabled: true
multi_az_enabled: true
backup_retention_limit: 7
backup_window: "02:00-03:00"
maintenance_window: "sun:01:00-sun:02:00"
notification_topic_arn: "arn:aws:sns:us-east-1:123456789:jobhive-cache-notifications"
Redis Performance Tuning
# Redis configuration parameters
maxmemory-policy allkeys-lru
maxmemory 6gb
save 900 1
save 300 10
save 60 10000
timeout 300
tcp-keepalive 300
databases 16
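On the application side, a minimal sketch of Django settings that would use this cluster for caching and sessions might look as follows. It assumes Django 4.0 or later (for the built-in Redis backend) and that `REDIS_URL` holds a full connection URL; because transit encryption is enabled, that URL would need the `rediss://` scheme. The fallback value is a placeholder for local runs, not a real endpoint.

```python
import os

# REDIS_URL is injected into the container by the task definition.
# The default below is a local placeholder only.
REDIS_URL = os.environ.get("REDIS_URL", "rediss://:auth-token@localhost:6379/0")

CACHES = {
    "default": {
        # Built-in Redis cache backend (Django 4.0+).
        "BACKEND": "django.core.cache.backends.redis.RedisCache",
        "LOCATION": REDIS_URL,
        "TIMEOUT": 300,  # default key lifetime in seconds (illustrative)
    }
}

# Store sessions in the cache defined above.
SESSION_ENGINE = "django.contrib.sessions.backends.cache"
SESSION_CACHE_ALIAS = "default"
```

Note that with `allkeys-lru` any key, including a session, can be evicted under memory pressure. If losing sessions is unacceptable, `django.contrib.sessions.backends.cached_db` keeps a database copy as a fallback.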
4. Storage Layer - Amazon S3
S3 Bucket Configuration
# Primary media bucket
media_bucket:
  name: jobhive-media-production
  region: us-east-1
  versioning: Enabled
  encryption:
    sse_algorithm: AES256
  lifecycle_policy:
    - id: delete_incomplete_multipart_uploads
      status: Enabled
      abort_incomplete_multipart_upload:
        days_after_initiation: 7
    - id: transition_to_ia
      status: Enabled
      transition:
        days: 30
        storage_class: STANDARD_IA
    - id: transition_to_glacier
      status: Enabled
      transition:
        days: 90
        storage_class: GLACIER
# Static assets bucket
static_bucket:
  name: jobhive-static-production
  region: us-east-1
  public_read_access: true
  website_configuration:
    index_document: index.html
    error_document: error.html
  cors_configuration:
    - allowed_origins: ["https://jobhive.com", "https://app.jobhive.com"]
      allowed_methods: ["GET", "HEAD"]
      allowed_headers: ["*"]
      max_age_seconds: 3600
# Backup bucket
backup_bucket:
  name: jobhive-backups-production
  region: us-west-2  # Different region for disaster recovery
  versioning: Enabled
  encryption:
    sse_algorithm: aws:kms
    kms_master_key_id: alias/jobhive-backup-key
  lifecycle_policy:
    - id: delete_old_backups
      status: Enabled
      expiration:
        days: 2555  # 7 years retention
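The lifecycle rules above amount to a simple age-based mapping, sketched below for the media and backup buckets. The function names are illustrative, and S3 runs lifecycle actions asynchronously, so real transitions can lag the nominal day counts.

```python
def media_storage_class(age_days):
    """Nominal storage class of a media object after `age_days` days."""
    if age_days >= 90:
        return "GLACIER"
    if age_days >= 30:
        return "STANDARD_IA"
    return "STANDARD"


# Seven years, counted as 365-day years, gives the 2555-day expiration rule.
BACKUP_RETENTION_DAYS = 7 * 365


def backup_expired(age_days):
    """True once a backup object is old enough for the expiration rule."""
    return age_days >= BACKUP_RETENTION_DAYS
```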
5. Content Delivery - Amazon CloudFront
CloudFront Distribution
distribution:
enabled: true
price_class: PriceClass_All
http_version: http2
is_ipv6_enabled: true
origins:
- id: s3-static-origin
domain_name: jobhive-static-production.s3.amazonaws.com
s3_origin_config:
origin_access_identity: E1234567890ABC
- id: alb-api-origin
domain_name: jobhive-alb-1234567890.us-east-1.elb.amazonaws.com
custom_origin_config:
https_port: 443
origin_protocol_policy: https-only
origin_ssl_protocols: [TLSv1.2]
default_cache_behavior:
target_origin_id: alb-api-origin
viewer_protocol_policy: redirect-to-https
allowed_methods: [GET, HEAD, OPTIONS, PUT, POST, PATCH, DELETE]
cached_methods: [GET, HEAD, OPTIONS]
compress: true
forwarded_values:
query_string: true
headers: [Authorization, Content-Type]
cookies:
forward: whitelist
whitelisted_names: [sessionid, csrftoken]
cache_behaviors:
- path_pattern: "/static/*"
target_origin_id: s3-static-origin
viewer_protocol_policy: redirect-to-https
allowed_methods: [GET, HEAD]
cached_methods: [GET, HEAD]
compress: true
min_ttl: 86400
default_ttl: 31536000
max_ttl: 31536000
restrictions:
geo_restriction:
restriction_type: none
viewer_certificate:
acm_certificate_arn: arn:aws:acm:us-east-1:123456789:certificate/12345678-1234-1234-1234-123456789012
ssl_support_method: sni-only
minimum_protocol_version: TLSv1.2_2021
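CloudFront picks the first cache behavior whose path pattern matches the request and falls back to the default behavior otherwise. The sketch below models that routing for this distribution. It uses Python's `fnmatchcase` as a stand-in for CloudFront's case-sensitive wildcard matching, which is an approximation, and `origin_for` is a hypothetical helper name.

```python
from fnmatch import fnmatchcase

# Ordered cache behaviors from the distribution above.
CACHE_BEHAVIORS = [("/static/*", "s3-static-origin")]
DEFAULT_ORIGIN = "alb-api-origin"


def origin_for(path):
    """Return the origin id that would serve the given request path."""
    for pattern, origin in CACHE_BEHAVIORS:
        if fnmatchcase(path, pattern):
            return origin
    return DEFAULT_ORIGIN
```

Requests under `/static/` go to S3 with long TTLs, while everything else, including API calls, reaches the ALB.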
6. Load Balancing - Application Load Balancer
ALB Configuration
load_balancer:
name: jobhive-alb
scheme: internet-facing
type: application
ip_address_type: ipv4
subnets:
- subnet-1a2b3c4d # Public subnet AZ-a
- subnet-5e6f7g8h # Public subnet AZ-b
- subnet-9i0j1k2l # Public subnet AZ-c
security_groups:
- sg-alb-external
listeners:
- port: 443
protocol: HTTPS
ssl_policy: ELBSecurityPolicy-TLS-1-2-2017-01
certificate_arn: arn:aws:acm:us-east-1:123456789:certificate/12345678-1234-1234-1234-123456789012
default_actions:
- type: forward
target_group_arn: arn:aws:elasticloadbalancing:us-east-1:123456789:targetgroup/jobhive-web
- port: 80
protocol: HTTP
default_actions:
- type: redirect
redirect:
protocol: HTTPS
port: 443
status_code: HTTP_301
target_groups:
- name: jobhive-web
port: 8000
protocol: HTTP
target_type: ip
vpc_id: vpc-12345678
health_check:
enabled: true
healthy_threshold_count: 2
unhealthy_threshold_count: 3
timeout: 5
interval: 30
path: /health/
matcher: 200
protocol: HTTP
port: traffic-port
Network Architecture
VPC Configuration
vpc:
cidr_block: 10.0.0.0/16
enable_dns_hostnames: true
enable_dns_support: true
# Public subnets for ALB and NAT Gateway
public_subnets:
- cidr: 10.0.1.0/24
availability_zone: us-east-1a
- cidr: 10.0.2.0/24
availability_zone: us-east-1b
- cidr: 10.0.3.0/24
availability_zone: us-east-1c
# Private subnets for ECS tasks
private_subnets:
- cidr: 10.0.10.0/24
availability_zone: us-east-1a
- cidr: 10.0.11.0/24
availability_zone: us-east-1b
- cidr: 10.0.12.0/24
availability_zone: us-east-1c
# Database subnets
database_subnets:
- cidr: 10.0.20.0/24
availability_zone: us-east-1a
- cidr: 10.0.21.0/24
availability_zone: us-east-1b
- cidr: 10.0.22.0/24
availability_zone: us-east-1c
# Internet Gateway
internet_gateway:
name: jobhive-igw
# NAT Gateways for private subnet internet access
nat_gateways:
- name: jobhive-nat-1a
subnet: public-subnet-1a
allocation_id: eipalloc-12345678
- name: jobhive-nat-1b
subnet: public-subnet-1b
allocation_id: eipalloc-87654321
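A quick way to sanity-check an address plan like the one above is with Python's standard `ipaddress` module. The sketch below verifies that every subnet sits inside the VPC CIDR and that none overlap, and computes usable addresses per subnet given that AWS reserves five addresses in each. The helper names are illustrative.

```python
import ipaddress

VPC = ipaddress.ip_network("10.0.0.0/16")
SUBNETS = {
    "public": ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"],
    "private": ["10.0.10.0/24", "10.0.11.0/24", "10.0.12.0/24"],
    "database": ["10.0.20.0/24", "10.0.21.0/24", "10.0.22.0/24"],
}


def check_layout():
    """True if all subnets are inside the VPC and pairwise disjoint."""
    nets = [ipaddress.ip_network(c) for tier in SUBNETS.values() for c in tier]
    inside = all(n.subnet_of(VPC) for n in nets)
    disjoint = not any(a.overlaps(b)
                       for i, a in enumerate(nets) for b in nets[i + 1:])
    return inside and disjoint


def usable_addresses(cidr):
    """Addresses left for resources after AWS's five reserved ones."""
    return ipaddress.ip_network(cidr).num_addresses - 5
```

Each /24 therefore offers 251 usable addresses, which bounds how many Fargate tasks (one ENI each in `awsvpc` mode) a single private subnet can host.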
Security Groups
# ALB Security Group
sg_alb_external:
name: jobhive-alb-external
description: Security group for ALB
vpc_id: vpc-12345678
ingress_rules:
- from_port: 80
to_port: 80
protocol: tcp
cidr_blocks: [0.0.0.0/0]
- from_port: 443
to_port: 443
protocol: tcp
cidr_blocks: [0.0.0.0/0]
egress_rules:
- from_port: 0
to_port: 65535
protocol: tcp
cidr_blocks: [10.0.0.0/16]
# ECS Tasks Security Group
sg_web_application:
name: jobhive-web-application
description: Security group for ECS web tasks
vpc_id: vpc-12345678
ingress_rules:
- from_port: 8000
to_port: 8000
protocol: tcp
source_security_group_id: sg-alb-external
egress_rules:
- from_port: 0
to_port: 65535
protocol: tcp
cidr_blocks: [0.0.0.0/0]
# RDS Security Group
sg_rds_database:
name: jobhive-rds-database
description: Security group for RDS PostgreSQL
vpc_id: vpc-12345678
ingress_rules:
- from_port: 5432
to_port: 5432
protocol: tcp
source_security_group_id: sg-web-application
egress_rules: []
# Redis Security Group
sg_redis_cache:
name: jobhive-redis-cache
description: Security group for ElastiCache Redis
vpc_id: vpc-12345678
ingress_rules:
- from_port: 6379
to_port: 6379
protocol: tcp
source_security_group_id: sg-web-application
egress_rules: []
Security & Access Management
IAM Roles and Policies
ECS Task Execution Role
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"ecr:GetAuthorizationToken",
"ecr:BatchCheckLayerAvailability",
"ecr:GetDownloadUrlForLayer",
"ecr:BatchGetImage",
"logs:CreateLogStream",
"logs:PutLogEvents",
"secretsmanager:GetSecretValue"
],
"Resource": "*"
}
]
}
Application Task Role
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:PutObject",
"s3:DeleteObject"
],
"Resource": [
"arn:aws:s3:::jobhive-media-production/*",
"arn:aws:s3:::jobhive-static-production/*"
]
},
{
"Effect": "Allow",
"Action": [
"s3:ListBucket"
],
"Resource": [
"arn:aws:s3:::jobhive-media-production",
"arn:aws:s3:::jobhive-static-production"
]
},
{
"Effect": "Allow",
"Action": [
"secretsmanager:GetSecretValue"
],
"Resource": [
"arn:aws:secretsmanager:us-east-1:123456789:secret:jobhive/*"
]
},
{
"Effect": "Allow",
"Action": [
"ssm:GetParameter",
"ssm:GetParameters"
],
"Resource": [
"arn:aws:ssm:us-east-1:123456789:parameter/jobhive/*"
]
}
]
}
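To reason about what the task role above permits, the following sketch checks an action and resource ARN against its statements. It is a deliberately simplified model: it handles only Allow statements with literal actions and `*` wildcards in resources, and ignores conditions, explicit denies, and other policy types that real IAM evaluation includes. The name `is_allowed` is hypothetical.

```python
from fnmatch import fnmatchcase

# (actions, resources) pairs copied from the task role policy above.
TASK_ROLE_STATEMENTS = [
    (["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
     ["arn:aws:s3:::jobhive-media-production/*",
      "arn:aws:s3:::jobhive-static-production/*"]),
    (["s3:ListBucket"],
     ["arn:aws:s3:::jobhive-media-production",
      "arn:aws:s3:::jobhive-static-production"]),
    (["secretsmanager:GetSecretValue"],
     ["arn:aws:secretsmanager:us-east-1:123456789:secret:jobhive/*"]),
    (["ssm:GetParameter", "ssm:GetParameters"],
     ["arn:aws:ssm:us-east-1:123456789:parameter/jobhive/*"]),
]


def is_allowed(action, resource):
    """True if some Allow statement covers both the action and the resource."""
    return any(
        action in actions and any(fnmatchcase(resource, r) for r in resources)
        for actions, resources in TASK_ROLE_STATEMENTS
    )
```

The split between object-level and bucket-level statements matters: `s3:ListBucket` applies to the bucket ARN, while object actions apply to keys under `/*`. The backup bucket is not reachable from the application at all.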
Secrets Management
AWS Secrets Manager
secrets:
database_credentials:
name: jobhive/database
description: PostgreSQL database credentials
secret_string:
engine: postgres
host: jobhive-db.cluster-xyz.us-east-1.rds.amazonaws.com
username: jobhive_app
password: "auto-generated-secure-password"
dbname: jobhive_production
port: 5432
redis_credentials:
name: jobhive/redis
description: Redis cache credentials
secret_string:
host: jobhive-redis-cluster.abc123.cache.amazonaws.com
port: 6379
auth_token: "auto-generated-auth-token"
django_secrets:
name: jobhive/django
description: Django application secrets
secret_string:
secret_key: "auto-generated-django-secret-key"
jwt_secret: "auto-generated-jwt-secret"
stripe_keys:
name: jobhive/stripe
description: Stripe payment processing keys
secret_string:
publishable_key: pk_live_...
secret_key: sk_live_...
webhook_secret: whsec_...
external_services:
name: jobhive/external
description: External service API keys
secret_string:
openai_api_key: sk-...
datadog_api_key: ...
aws_transcribe_region: us-east-1
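The task definition injects the whole `jobhive/database` secret as `DATABASE_URL`, yet the secret shown above is a structured document rather than a URL. One possible way to bridge the two, sketched here as an assumption rather than the project's actual approach, is to assemble the URL from the secret's fields at startup. The function name and the sample values in the usage note are illustrative.

```python
import json
from urllib.parse import quote


def database_url(secret_string):
    """Build a PostgreSQL URL from a JSON secret with the fields shown above."""
    s = json.loads(secret_string)
    # Percent-encode credentials so characters like '@' or '/' stay valid.
    user = quote(s["username"], safe="")
    password = quote(s["password"], safe="")
    return f"postgres://{user}:{password}@{s['host']}:{s['port']}/{s['dbname']}"
```

For instance, a secret with username `jobhive_app`, password `p@ss/word`, host `db.example.internal`, port 5432, and database `jobhive_production` yields `postgres://jobhive_app:p%40ss%2Fword@db.example.internal:5432/jobhive_production`.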
Monitoring & Observability
CloudWatch Configuration
Log Groups
log_groups:
- name: /ecs/jobhive-web
retention_in_days: 30
kms_key_id: arn:aws:kms:us-east-1:123456789:key/12345678-1234-1234-1234-123456789012
- name: /ecs/jobhive-celery-worker
retention_in_days: 30
kms_key_id: arn:aws:kms:us-east-1:123456789:key/12345678-1234-1234-1234-123456789012
- name: /ecs/jobhive-celery-beat
retention_in_days: 7
kms_key_id: arn:aws:kms:us-east-1:123456789:key/12345678-1234-1234-1234-123456789012
- name: /aws/rds/instance/jobhive-db/postgresql
retention_in_days: 30
- name: /aws/elasticache/jobhive-redis-cluster
retention_in_days: 14
CloudWatch Alarms
alarms:
# Application health alarms
- name: jobhive-web-high-cpu
description: Web service CPU utilization is high
metric_name: CPUUtilization
namespace: AWS/ECS
statistic: Average
period: 300
evaluation_periods: 2
threshold: 80
comparison_operator: GreaterThanThreshold
alarm_actions:
- arn:aws:sns:us-east-1:123456789:jobhive-alerts
- name: jobhive-web-high-memory
description: Web service memory utilization is high
metric_name: MemoryUtilization
namespace: AWS/ECS
statistic: Average
period: 300
evaluation_periods: 2
threshold: 85
comparison_operator: GreaterThanThreshold
# Database alarms
- name: jobhive-db-high-cpu
description: Database CPU utilization is high
metric_name: CPUUtilization
namespace: AWS/RDS
statistic: Average
period: 300
evaluation_periods: 3
threshold: 75
comparison_operator: GreaterThanThreshold
- name: jobhive-db-low-freeable-memory
description: Database freeable memory is low
metric_name: FreeableMemory
namespace: AWS/RDS
statistic: Average
period: 300
evaluation_periods: 2
threshold: 1000000000 # 1GB in bytes
comparison_operator: LessThanThreshold
# Cache alarms
- name: jobhive-redis-high-cpu
description: Redis CPU utilization is high
metric_name: CPUUtilization
namespace: AWS/ElastiCache
statistic: Average
period: 300
evaluation_periods: 2
threshold: 80
comparison_operator: GreaterThanThreshold
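Each alarm above fires only after `evaluation_periods` consecutive datapoints breach the threshold, so with a 300-second period and two evaluation periods the web CPU alarm needs roughly ten minutes of sustained load. The sketch below models that rule. It is a simplification that ignores CloudWatch options such as datapoints-to-alarm and missing-data handling, and `alarm_state` is a hypothetical helper.

```python
def alarm_state(datapoints, threshold, evaluation_periods, greater_than=True):
    """Alarm state from the most recent datapoints (oldest first)."""
    recent = datapoints[-evaluation_periods:]
    if len(recent) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    if greater_than:
        breaching = [value > threshold for value in recent]
    else:
        breaching = [value < threshold for value in recent]
    # Every datapoint in the window must breach for the alarm to fire.
    return "ALARM" if all(breaching) else "OK"
```

The `greater_than=False` form covers the freeable-memory alarm, which uses `LessThanThreshold`.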
Datadog Integration
datadog_agent:
deployment_type: sidecar
api_key_secret: arn:aws:secretsmanager:us-east-1:123456789:secret:jobhive/datadog
configuration:
logs_enabled: true
process_agent_enabled: true
system_probe_enabled: true
integrations:
- postgres:
host: "%%host%%"
port: 5432
username: datadog
password: "%%env_DD_POSTGRES_PASSWORD%%"
tags:
- environment:production
- service:jobhive-db
- redisdb:
host: "%%host%%"
port: 6379
password: "%%env_DD_REDIS_PASSWORD%%"
tags:
- environment:production
- service:jobhive-cache
tags:
- environment:production
- service:jobhive
- version:1.0.0
Auto Scaling Configuration
ECS Service Auto Scaling
auto_scaling_policies:
  # Web service scaling
  - service_name: jobhive-web
    policies:
      - name: cpu-scaling
        policy_type: TargetTrackingScaling
        target_tracking_scaling_policy:
          target_value: 70.0
          predefined_metric_specification:
            predefined_metric_type: ECSServiceAverageCPUUtilization
          scale_out_cooldown: 300
          scale_in_cooldown: 300
      - name: memory-scaling
        policy_type: TargetTrackingScaling
        target_tracking_scaling_policy:
          target_value: 80.0
          predefined_metric_specification:
            predefined_metric_type: ECSServiceAverageMemoryUtilization
          scale_out_cooldown: 300
          scale_in_cooldown: 600
  # Celery worker scaling
  - service_name: jobhive-celery-workers
    policies:
      - name: queue-based-scaling
        policy_type: StepScaling
        step_scaling_policy:
          adjustment_type: ChangeInCapacity
          cooldown: 300
          step_adjustments:
            - lower_bound: 0
              upper_bound: 5
              scaling_adjustment: 1
            - lower_bound: 5
              upper_bound: 15
              scaling_adjustment: 2
            - lower_bound: 15
              scaling_adjustment: 3
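Step adjustment bounds are offsets from the alarm threshold rather than absolute metric values. The sketch below shows how the worker policy's three steps would translate a queue length into added tasks. It assumes the alarm threshold equals the `celery_queue_length` target of 10 from the worker service definition, which the policy itself does not state.

```python
# Assumed alarm threshold: the celery_queue_length target from the service definition.
ALARM_THRESHOLD = 10

# (lower_bound, upper_bound, scaling_adjustment); None means unbounded.
STEP_ADJUSTMENTS = [
    (0, 5, 1),
    (5, 15, 2),
    (15, None, 3),
]


def tasks_to_add(queue_length):
    """Number of worker tasks the step policy would add for this queue length."""
    offset = queue_length - ALARM_THRESHOLD
    if offset < 0:
        return 0  # alarm not breached, no scale-out
    for lower, upper, adjustment in STEP_ADJUSTMENTS:
        # Lower bound is inclusive, upper bound exclusive.
        if offset >= lower and (upper is None or offset < upper):
            return adjustment
    return 0
```

Under that assumption, a queue of 12 adds one worker, 15 to 24 adds two, and 25 or more adds three, subject to the 300-second cooldown and the service's maximum of eight tasks.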
Application Load Balancer Auto Scaling
target_group_scaling:
target_group: jobhive-web
health_check_grace_period: 300
scaling_policies:
- metric_type: request_count_per_target
target_value: 1000
scale_out_cooldown: 300
scale_in_cooldown: 300
Backup & Disaster Recovery
Database Backup Strategy
rds_backup:
automated_backups:
backup_retention_period: 30
backup_window: "03:00-04:00"
copy_tags_to_snapshot: true
delete_automated_backups: true
manual_snapshots:
- schedule: daily
retention: 7_days
encryption: true
- schedule: weekly
retention: 4_weeks
encryption: true
- schedule: monthly
retention: 12_months
encryption: true
cross_region_copy:
destination_region: us-west-2
kms_key_id: alias/jobhive-backup-key-west
point_in_time_recovery:
enabled: true
earliest_restorable_time: current_time - 30_days
Application Backup Strategy
application_backup:
s3_media_backup:
source_bucket: jobhive-media-production
destination_bucket: jobhive-backups-production
schedule: daily
retention: 90_days
cross_region_replication:
destination_bucket: jobhive-backups-west
destination_region: us-west-2
storage_class: STANDARD_IA
configuration_backup:
ecs_task_definitions: daily
cloudformation_templates: on_change
secrets_manager: weekly
parameter_store: weekly
Disaster Recovery Plan
disaster_recovery:
  rto: 4_hours  # Recovery Time Objective
  rpo: 15_minutes  # Recovery Point Objective
  primary_region: us-east-1
  secondary_region: us-west-2
  failover_procedures:
    1_database_failover:
      - create_rds_instance_from_snapshot
      - update_dns_records
      - validate_data_integrity
    2_application_failover:
      - deploy_ecs_services_in_secondary_region
      - update_load_balancer_targets
      - update_cloudfront_origins
    3_data_synchronization:
      - restore_s3_media_from_backup
      - validate_application_functionality
      - update_monitoring_dashboards
  testing_schedule:
    frequency: quarterly
    scope: full_failover_test
    documentation: required
Cost Optimization
Resource Right-Sizing
cost_optimization:
compute:
- use_spot_instances: 30%_of_celery_workers
- scheduled_scaling: scale_down_non_business_hours
- instance_right_sizing: quarterly_review
storage:
- s3_intelligent_tiering: enabled
- lifecycle_policies: automated_transition
- compress_backups: enabled
- delete_unused_snapshots: automated
data_transfer:
- cloudfront_caching: aggressive_static_content
- compression: enabled_for_all_responses
- regional_data_processing: keep_data_close_to_users
reserved_capacity:
rds_reserved_instances: 1_year_term
elasticache_reserved_nodes: 1_year_term
savings_plans: compute_savings_plan_1_year
Resource Tagging Strategy
tagging_strategy:
  mandatory_tags:
    - Environment: [production, staging, development]
    - Service: [web, worker, database, cache]
    - Owner: [team-name]
    - Project: jobhive
    - CostCenter: engineering
    - Backup: [required, not-required]
  optional_tags:
    - Version: application_version
    - Component: [api, frontend, ml-engine]
    - Schedule: [24x7, business-hours, on-demand]
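A tagging strategy is easiest to enforce when it can be checked mechanically. The sketch below validates a resource's tags against the mandatory set above. The function name is illustrative, and `Owner` is treated as free-form because the strategy lists only a `team-name` placeholder.

```python
# Allowed values per mandatory tag; None means any non-missing value is accepted.
MANDATORY_TAGS = {
    "Environment": {"production", "staging", "development"},
    "Service": {"web", "worker", "database", "cache"},
    "Owner": None,
    "Project": {"jobhive"},
    "CostCenter": {"engineering"},
    "Backup": {"required", "not-required"},
}


def tag_problems(tags):
    """List missing or invalid mandatory tags for one resource."""
    problems = []
    for key, allowed in MANDATORY_TAGS.items():
        if key not in tags:
            problems.append(f"missing {key}")
        elif allowed is not None and tags[key] not in allowed:
            problems.append(f"invalid {key}={tags[key]}")
    return problems
```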
Performance Optimization
Application Performance
performance_tuning:
database:
- connection_pooling: pgbouncer_sidecar
- query_optimization: automated_explain_plans
- index_optimization: quarterly_review
- read_replicas: 2_replicas_for_analytics
caching:
- redis_cluster: 3_nodes_with_replication
- application_cache: django_cache_framework
- cdn_cache: cloudfront_edge_caching
- browser_cache: appropriate_cache_headers
application:
- async_processing: celery_task_queue
- static_compression: gzip_and_brotli
- image_optimization: automated_resizing
- api_response_compression: enabled
Monitoring Performance Metrics
performance_metrics:
application_metrics:
- response_time_p95: target_200ms
- throughput: requests_per_second
- error_rate: target_below_0.1%
- availability: target_99.9%
infrastructure_metrics:
- cpu_utilization: target_below_70%
- memory_utilization: target_below_80%
- disk_io: iops_and_throughput
- network_latency: cross_az_communication
business_metrics:
- interview_completion_rate: target_above_95%
- user_satisfaction: response_time_perception
- conversion_funnel: signup_to_active_user
- revenue_impact: performance_correlation
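Two of the application targets above translate directly into arithmetic: the p95 response time of a latency sample, and the downtime that a 99.9% availability target leaves per month. The sketch below computes both with the standard library. The helper names and the 30-day month are assumptions for illustration.

```python
import statistics


def p95(latencies_ms):
    """95th percentile of a latency sample, in the sample's units."""
    # quantiles(n=20) returns 19 cut points; the last is the 95th percentile.
    return statistics.quantiles(latencies_ms, n=20)[-1]


def downtime_budget_minutes(availability_pct, days=30):
    """Minutes of downtime allowed over `days` at the given availability."""
    return days * 24 * 60 * (1 - availability_pct / 100)


def meets_latency_target(latencies_ms, target_ms=200):
    """True if the sample's p95 is within the response-time target."""
    return p95(latencies_ms) <= target_ms
```

At 99.9% availability the budget is about 43 minutes per 30-day month, which is worth comparing against the four-hour recovery time objective in the disaster recovery plan.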
