Pertemuan 4: Manajemen Infrastruktur Cloud Progress: 0%

Manajemen Infrastruktur Cloud

Provisioning, Scaling, dan Monitoring di Multi-Cloud Environment

Panduan Praktikum: Pelajari konsep manajemen infrastruktur cloud modern termasuk automated provisioning, auto-scaling, dan comprehensive monitoring di semua platform.

Tujuan Pembelajaran

Automated Provisioning

Mengimplementasikan automated infrastructure provisioning menggunakan cloud-init dan templates

Scaling Strategies

Menerapkan vertical dan horizontal scaling di berbagai platform cloud

Comprehensive Monitoring

Setup monitoring dan alerting systems untuk infrastructure management

Konsep Manajemen Infrastruktur Cloud

1. Tiga Pilar Utama Manajemen Infrastruktur

Provisioning

Definisi: Proses penyediaan dan penyiapan resource cloud

  • Dapat dilakukan secara manual, scripted, atau automated
  • Membuat VM, storage, network components
  • Infrastructure as Code (IaC) approach
  • Cloud-init untuk automated setup
Automation

Scaling

Definisi: Kemampuan menambah/mengurangi resource berdasarkan beban

  • Vertical Scaling: Menambah spesifikasi instance
  • Horizontal Scaling: Menambah jumlah instance
  • Auto-scaling berdasarkan metrics
  • Load balancing integration
Elasticity

Monitoring

Definisi: Pemantauan performa, kesehatan, dan utilisasi resource

  • Mendeteksi issues sebelum berdampak pada pengguna
  • Real-time metrics collection
  • Alerting dan notification systems
  • Performance analysis dan optimization
Observability

2. Workflow Manajemen Infrastruktur Modern

Provisioning
Automated Setup
Deployment
Application Deployment
Monitoring
Real-time Metrics
Scaling
Auto-scaling
Optimization
Continuous Improvement

Continuous Feedback Loop

Setiap tahap memberikan feedback ke tahap sebelumnya untuk continuous improvement dan optimization berdasarkan real-time metrics dan performance data.

Automated Provisioning dengan Cloud-Init

Cloud-Init: Automated Server Provisioning

Cloud-Init adalah standar industri untuk customizing cloud instances selama boot process. Mendukung semua major cloud platforms dan distribusi Linux.

OpenNebula Contextualization

OpenNebula menggunakan contextualization untuk inject custom data ke VM:

# OpenNebula Contextualization CONTEXT = [ TOKEN = "YES", SSH_PUBLIC_KEY = "$USER[SSH_PUBLIC_KEY]", HOSTNAME = "web-server-$VMID", USER_DATA = "base64:/path/to/cloud-init.yaml" ]
AWS User Data

AWS menggunakan User Data field di EC2 launch configuration:

# AWS User Data (base64 encoded) #!/bin/bash yum update -y yum install -y httpd systemctl start httpd systemctl enable httpd
Huawei Cloud User Data

Huawei Cloud mendukung user data injection saat create ECS:

# Huawei Cloud User Data #!/bin/bash apt-get update apt-get install -y nginx systemctl start nginx systemctl enable nginx

Advanced Cloud-Init Configuration

#cloud-config # Comprehensive cloud-init script for web server provisioning package_update: true package_upgrade: true # Install required packages packages: - nginx - mysql-server - php-fpm - php-mysql - python3-pip - stress-ng - htop - nethogs # Create custom user users: - name: webadmin groups: sudo shell: /bin/bash sudo: ['ALL=(ALL) NOPASSWD:ALL'] ssh-authorized-keys: - ssh-rsa AAAAB3NzaC1yc2E... your-public-key # Write configuration files write_files: - path: /etc/nginx/sites-available/default content: | server { listen 80; server_name _; root /var/www/html; index index.html index.php; location / { try_files $uri $uri/ =404; } location ~ \.php$ { include snippets/fastcgi-php.conf; fastcgi_pass unix:/var/run/php/php8.1-fpm.sock; } } - path: /var/www/html/index.php content: | <?php phpinfo(); ?> - path: /home/webadmin/monitoring-script.sh content: | #!/bin/bash while true; do echo "$(date) - CPU: $(top -bn1 | grep "Cpu(s)" | awk '{print $2}')% | Memory: $(free -h | grep Mem | awk '{print $3"/"$2}')" >> /var/log/system-metrics.log sleep 30 done permissions: '0755' # Run commands runcmd: - systemctl enable nginx - systemctl start nginx - systemctl enable php8.1-fpm - systemctl start php8.1-fpm - ufw allow 'Nginx Full' - chown -R webadmin:webadmin /var/www/html - nohup /home/webadmin/monitoring-script.sh > /dev/null 2>&1 & # Final message final_message: "🎉 System provisioning completed successfully! Run 'sudo systemctl status nginx' to verify."
Package Management
Automatic installation
User Management
Custom user creation
Configuration
File management
Commands
Service startup

Scaling Strategies

Vertical vs Horizontal Scaling

⬆️

Vertical Scaling

Scale up/down - Mengubah kapasitas instance yang ada

Use Cases:
  • Database servers
  • Applications dengan single thread
  • Memory-intensive workloads
↔️

Horizontal Scaling

Scale out/in - Menambah/mengurangi jumlah instances

Use Cases:
  • Web applications
  • Microservices
  • Stateless applications

OpenNebula Scaling

Vertical Scaling
# Resize VM melalui CLI onevm resize [vm-id] --memory 4096 --cpu 4 # Atau melalui Sunstone Dashboard # VM → Update → CPU/Memory Configuration
Horizontal Scaling dengan OneFlow
# Service template untuk auto-scaling NAME = "web-service" SERVICE_TEMPLATE = [ ROLE = [ NAME = "web", CARDINALITY = "2", MIN_VMS = "1", MAX_VMS = "5", ELASTICITY = [ EXPRESSION = "ATTRIBUTE=\"CPU_USAGE\" PERCENTAGE=\"80\"", TYPE = "CHANGE", ADJUST = "1" ] ] ]

AWS Auto Scaling

Vertical Scaling (EC2 Resize)
# Stop instance terlebih dahulu aws ec2 stop-instances --instance-ids i-1234567890abcdef0 # Modify instance type aws ec2 modify-instance-attribute \ --instance-id i-1234567890abcdef0 \ --instance-type "t3.medium" # Start instance kembali aws ec2 start-instances --instance-ids i-1234567890abcdef0
Horizontal Scaling (Auto Scaling Group)
# Create Auto Scaling Group aws autoscaling create-auto-scaling-group \ --auto-scaling-group-name web-asg \ --launch-template LaunchTemplateName=web-template,Version='$Latest' \ --min-size 1 \ --max-size 10 \ --desired-capacity 2 \ --vpc-zone-identifier "subnet-123456,subnet-789012" \ --target-group-arns "arn:aws:elasticloadbalancing:..." # Scaling policy aws autoscaling put-scaling-policy \ --policy-name scale-out-cpu \ --auto-scaling-group-name web-asg \ --scaling-adjustment 1 \ --adjustment-type ChangeInCapacity \ --cooldown 300

Huawei Cloud Scaling

Vertical Scaling (ECS Resize)
# Through Console: # ECS → Instance → More → Modify Specifications # Atau melalui API POST https://ecs.{region}.myhuaweicloud.com/v1/{project_id}/cloudservers/{server_id}/resize { "resize": { "flavorRef": "s6.medium.2" } }
Horizontal Scaling (Auto Scaling)
# Create scaling group { "scaling_group_name": "web-scaling-group", "scaling_configuration_id": "config-123", "desire_instance_number": 2, "min_instance_number": 1, "max_instance_number": 5, "cool_down_time": 300 } # Create scaling policy { "scaling_policy_name": "cpu-scale-out", "scaling_group_id": "group-123", "scaling_policy_type": "ALARM", "alarm_id": "alarm-123", "cool_down_time": 300 }

Load Balancer Configuration

OpenNebula
# Virtual Router dengan load balancing TYPE = "VIRTUALROUTER" SCHED_REQUIREMENTS = "CLUSTER_ID=0" # Forward rules FORWARD = [ PROTOCOL = "TCP", EXTERNAL_PORT = "80", INTERNAL_PORT = "80" ]
AWS ELB
# Create Application Load Balancer aws elbv2 create-load-balancer \ --name web-alb \ --subnets subnet-123 subnet-456 \ --security-groups sg-789 # Create target group aws elbv2 create-target-group \ --name web-targets \ --protocol HTTP \ --port 80
Huawei Cloud ELB
# Create load balancer { "name": "web-elb", "description": "Web application load balancer", "vpc_id": "vpc-123", "bandwidth": 5, "type": "External" } # Add listeners { "protocol": "HTTP", "port": 80, "backend_protocol": "HTTP", "backend_port": 80 }

Monitoring & Alerting

Comprehensive Monitoring Architecture

📊
Metrics Collection
CPU, Memory, Disk, Network
🚨
Alerting
Real-time notifications
📈
Visualization
Dashboards & Reports
🔍
Logging
Centralized log management
OpenNebula Monitoring
  • Built-in Monitoring: VM metrics di Sunstone
  • Zabbix Integration: Enterprise monitoring
  • Custom Scripts: SSH-based monitoring
  • OneMonit: Advanced monitoring solution
AWS CloudWatch
  • CloudWatch: Comprehensive monitoring
  • CloudWatch Agent: Enhanced metrics
  • CloudWatch Logs: Centralized logging
  • CloudWatch Alarms: Automated alerting
Huawei Cloud Eye
  • Cloud Eye: Real-time monitoring
  • CES Agent: Custom metrics
  • Log Service: Log management
  • Alarm Notifications: SMS/Email alerts

Custom Monitoring Script

#!/bin/bash # comprehensive-monitoring.sh # Configuration LOG_FILE="/var/log/system-metrics.log" ALERT_THRESHOLD_CPU=80 ALERT_THRESHOLD_MEMORY=90 ALERT_THRESHOLD_DISK=85 while true; do TIMESTAMP=$(date '+%Y-%m-%d %H:%M:%S') # CPU Usage CPU_USAGE=$(top -bn1 | grep "Cpu(s)" | awk '{print $2}' | cut -d'%' -f1) # Memory Usage MEMORY_USAGE=$(free | grep Mem | awk '{printf "%.1f", $3/$2 * 100.0}') # Disk Usage DISK_USAGE=$(df / | awk 'END{print $5}' | sed 's/%//') # Network Usage NETWORK_RX=$(cat /sys/class/net/eth0/statistics/rx_bytes) NETWORK_TX=$(cat /sys/class/net/eth0/statistics/tx_bytes) # Log metrics echo "$TIMESTAMP - CPU: ${CPU_USAGE}% | Memory: ${MEMORY_USAGE}% | Disk: ${DISK_USAGE}%" >> $LOG_FILE # Check alerts if (( $(echo "$CPU_USAGE > $ALERT_THRESHOLD_CPU" | bc -l) )); then echo "ALERT: High CPU usage: ${CPU_USAGE}%" >> $LOG_FILE # Send notification (email, slack, etc.) fi if (( $(echo "$MEMORY_USAGE > $ALERT_THRESHOLD_MEMORY" | bc -l) )); then echo "ALERT: High Memory usage: ${MEMORY_USAGE}%" >> $LOG_FILE fi if (( $(echo "$DISK_USAGE > $ALERT_THRESHOLD_DISK" | bc -l) )); then echo "ALERT: High Disk usage: ${DISK_USAGE}%" >> $LOG_FILE fi sleep 60 done

AWS CloudWatch Agent Config

{ "agent": { "metrics_collection_interval": 60, "run_as_user": "root" }, "metrics": { "append_dimensions": { "InstanceId": "${aws:InstanceId}" }, "metrics_collected": { "cpu": { "measurement": [ "cpu_usage_idle", "cpu_usage_user", "cpu_usage_system" ], "metrics_collection_interval": 60 }, "disk": { "measurement": [ "used_percent" ], "metrics_collection_interval": 60, "resources": [ "/" ] }, "mem": { "measurement": [ "mem_used_percent" ], "metrics_collection_interval": 60 } } } }

Stress Testing & Performance Analysis

# Install stress tool sudo apt install stress-ng -y # Ubuntu sudo yum install stress-ng -y # CentOS/Amazon Linux # Generate CPU load (80% for 5 minutes) stress-ng --cpu 4 --cpu-load 80 --timeout 300 & # Generate memory load stress-ng --vm 2 --vm-bytes 1G --timeout 300 & # Generate disk I/O load stress-ng --hdd 2 --hdd-bytes 1G --timeout 300 & # Monitor resource usage in real-time watch -n 1 'echo "CPU: $(top -bn1 | grep Cpu | awk \"{print \$2}\")% | Memory: $(free -h | grep Mem | awk \"{print \$3}/\$2*100\" | bc -l)% | Disk: $(df -h / | awk \"NR==2 {print \$5}\")"' # Performance testing with ab (Apache Bench) ab -n 1000 -c 10 http://your-website.com/
CPU Stress
Multi-core testing
Memory Stress
RAM utilization
Disk I/O
Storage performance
Network
Bandwidth testing

Tugas Praktikum & Penilaian

Tugas 1: Automated Provisioning

Objective: Implementasi automated infrastructure provisioning

  • Buat cloud-init script untuk automated web server setup
  • Deploy di minimal 2 platform cloud berbeda
  • Verifikasi konsistensi deployment
  • Dokumentasi provisioning process
Requirements:
  • Package installation (nginx, monitoring tools)
  • User creation dengan SSH keys
  • Configuration file management
  • Service startup dan monitoring
Nilai: 35%

Tugas 2: Scaling Implementation

Objective: Implementasi vertical dan horizontal scaling

  • Lakukan vertical scaling pada satu instance
  • Setup load balancer dan auto scaling group
  • Test horizontal scaling dengan generate load
  • Monitor scaling events dan performance impact
# Scaling test commands stress-ng --cpu 4 --cpu-load 90 --timeout 600 ab -n 5000 -c 50 http://your-load-balancer/
Nilai: 35%

Tugas 3: Monitoring Dashboard

Objective: Setup comprehensive monitoring system

  • Implementasi custom metrics monitoring
  • Setup alarm untuk high CPU/memory usage
  • Buat monitoring dashboard
  • Performance analysis dan reporting
# Monitoring script example #!/bin/bash while true; do CPU=$(top -bn1 | grep "Cpu(s)" | awk '{print $2}' | cut -d'%' -f1) MEM=$(free | grep Mem | awk '{printf "%.1f", $3/$2 * 100.0}') echo "$(date) - CPU: ${CPU}% | Memory: ${MEM}%" sleep 30 done
Nilai: 30%

Kriteria Penilaian

Technical Implementation (70%)

  • Automated provisioning (25%)
  • Scaling implementation (25%)
  • Monitoring setup (20%)

Documentation (20%)

  • Process documentation (10%)
  • Performance analysis (10%)

Participation (10%)

  • Class participation (5%)
  • Team collaboration (5%)

Best Practices

Provisioning Best Practices

  • Infrastructure as Code: Gunakan Terraform, CloudFormation
  • Golden Images: Implementasi custom AMI/Images
  • Configuration Management: Ansible, Chef, Puppet
  • Version Control: Store scripts di Git
  • Testing: Validate scripts sebelum production

Scaling Best Practices

  • Health Checks: Setup proper health checks
  • Graceful Shutdown: Implementasi graceful shutdown
  • Cost Monitoring: Monitor costs saat scaling
  • Performance Testing: Test scaling limits
  • Backup Strategy: Maintain data consistency

Monitoring Best Practices

  • Business Metrics: Monitor business metrics, bukan hanya technical
  • Centralized Logging: Implementasi centralized logging
  • Alerting Thresholds: Setup proper alerting thresholds
  • Dashboard Design: Create meaningful dashboards
  • Capacity Planning: Use metrics untuk capacity planning