Deployment & Operations
Related docs: SDLC/DEPLOYMENT · CI/CD Pipeline · Hosting Infrastructure · Microservices
1. Deployment Process
1.1 Pre-Deployment Checklist
- All tests pass in CI (ci.yml)
- Code review approved (minimum 1 reviewer)
- Staging deployment verified
- E2E tests pass on staging
- Performance tests (k6) pass on staging
- Security scan (OWASP ZAP) passes
- Database migration script reviewed
- Feature flags configured for production
- Rollback plan documented
1.2 Deployment Flow
Feature Branch (feature/*)
│
▼ (PR, review, CI pass)
main (Staging)
│
▼ (verify on staging.pakashop.store)
├── E2E tests (Playwright)
├── Performance tests (k6)
└── Security scan (OWASP ZAP)
│
▼ (PR from main → production)
production (Live)
│
▼ (pre-deploy DB backup)
├── Deploy to EC2
├── Run migrations
├── Restart services
└── Verify health checks
2. Step-by-Step Deploy
2.1 Staging Deploy (Automatic)
Triggered on push to main:
# GitHub Actions runs deploy-staging.yml
# SSH into staging EC2 and executes:
cd /opt/pakashop
sudo ./scripts/deploy.sh staging
2.2 Production Deploy (Manual Gate)
- Create PR from main to production
- Review and approve
- Merge to trigger deploy-production.yml
- Pre-deploy backup runs automatically
- Deploy executes on production EC2
# On production EC2
cd /opt/pakashop
sudo ./scripts/deploy.sh production
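The flow above ties each environment to one branch (main for staging, production for live), while the deploy script deploys whichever branch happens to be checked out. A minimal sketch of a guard that makes that mapping explicit is shown here. The helper names expected_branch and assert_branch are hypothetical and are not part of the documented deploy.sh.

```shell
#!/usr/bin/env bash
# Sketch only: expected_branch and assert_branch are hypothetical helpers,
# not part of the documented deploy.sh.

# Print the branch that the deployment flow associates with an environment.
expected_branch() {
  case "$1" in
    staging)    echo "main" ;;
    production) echo "production" ;;
    *)          return 1 ;;
  esac
}

# Succeed only when the given branch matches the target environment.
assert_branch() {
  local env="$1" branch="$2" want
  if ! want="$(expected_branch "$env")"; then
    echo "Error: unknown environment '$env'" >&2
    return 1
  fi
  if [ "$branch" != "$want" ]; then
    echo "Error: $env deploys from '$want', but '$branch' is checked out" >&2
    return 1
  fi
}
```

One possible use, near the top of a deploy script: assert_branch "$ENV" "$(git rev-parse --abbrev-ref HEAD)" || exit 1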
2.3 Deploy Script (scripts/deploy.sh)
#!/bin/bash
set -e
ENV=$1
BRANCH=$(git rev-parse --abbrev-ref HEAD)
echo "=== Deploying to $ENV (branch: $BRANCH) ==="
# Validate environment
if [[ "$ENV" != "staging" && "$ENV" != "production" ]]; then
  echo "Error: Environment must be 'staging' or 'production'"
  exit 1
fi
# Pull latest code
git fetch origin
git reset --hard "origin/$BRANCH"
# Node.js services
NODE_SERVICES=(
  "gateway"
  "backend"
  "config"
  "notifications"
  "tracking"
  "scheduler"
  "fraud"
  "coupon"
  "loyalty"
  "whatsapp"
  "reports"
  "reconciliation"
  "invoicing"
  "pricing"
  "settlement"
)
for service in "${NODE_SERVICES[@]}"; do
  echo "--- Deploying pakashop-$service ---"
  cd "services/$service"
  npm ci --production  # newer npm versions prefer --omit=dev
  # Apply pending migrations only for services that own a Prisma schema
  if [ -f "prisma/schema.prisma" ]; then
    npx prisma migrate deploy
  fi
  cd ../..
  sudo systemctl restart "pakashop-$service"
  sleep 2
  sudo systemctl is-active --quiet "pakashop-$service" || exit 1
  echo "✓ pakashop-$service is active"
done
# Go services
GO_SERVICES=("search" "analytics")
for service in "${GO_SERVICES[@]}"; do
  echo "--- Deploying pakashop-$service ---"
  cd "services/$service"
  go build -o "bin/$service" ./src
  cd ../..
  sudo systemctl restart "pakashop-$service"
  sleep 2
  sudo systemctl is-active --quiet "pakashop-$service" || exit 1
  echo "✓ pakashop-$service is active"
done
# Python services
PYTHON_SERVICES=("moderation" "recommendations")
for service in "${PYTHON_SERVICES[@]}"; do
  echo "--- Deploying pakashop-$service ---"
  cd "services/$service"
  source venv/bin/activate
  pip install -r requirements.txt
  cd ../..
  sudo systemctl restart "pakashop-$service"
  sleep 2
  sudo systemctl is-active --quiet "pakashop-$service" || exit 1
  echo "✓ pakashop-$service is active"
done
# Reload nginx
sudo systemctl reload nginx
echo "=== Deployment to $ENV completed successfully ==="
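The script checks each unit once, two seconds after the restart, so a service that crashes slightly later would still pass. As a sketch of one way to tighten this, the helper below requires the unit to be active on several consecutive checks. The name stays_active and its default values are assumptions for illustration, not part of deploy.sh.

```shell
#!/usr/bin/env bash
# Sketch only: stays_active is a hypothetical helper; the defaults
# (5 checks, 1 second apart) are illustrative, not tuned values.

# Usage: stays_active UNIT [CHECKS] [INTERVAL_SECONDS]
stays_active() {
  local unit="$1" checks="${2:-5}" interval="${3:-1}" i=1
  while [ "$i" -le "$checks" ]; do
    sleep "$interval"
    # is-active exits non-zero for any state other than "active"
    if ! systemctl is-active --quiet "$unit"; then
      echo "✗ $unit is not active (check $i of $checks)" >&2
      return 1
    fi
    i=$(( i + 1 ))
  done
  echo "✓ $unit stayed active across $checks checks"
}
```

In the deploy loops this could stand in for the sleep and is-active pair, for example: stays_active "pakashop-$service" 5 1 || exit 1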
3. systemd Management
3.1 Service Status
# Check all services
./scripts/pakashop-status.sh
# Check specific service
systemctl status pakashop-backend.service
# View logs
journalctl -u pakashop-backend.service -f
# View logs from the last hour
journalctl -u pakashop-backend.service --since "1 hour ago"
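The contents of scripts/pakashop-status.sh are not shown in this document. As a rough idea of what a status summary can look like, the sketch below prints one line per unit and returns non-zero if any unit is not active. The function name print_status is hypothetical, and the real script may work differently.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a status summary; the real
# scripts/pakashop-status.sh may differ.

# Usage: print_status UNIT [UNIT...]
print_status() {
  local failed=0 unit state
  for unit in "$@"; do
    # is-active prints the state (active, inactive, failed, ...) and
    # exits non-zero unless the unit is active.
    state="$(systemctl is-active "$unit" 2>/dev/null)" || failed=$(( failed + 1 ))
    printf '%-32s %s\n' "$unit" "${state:-unknown}"
  done
  # Return 1 if at least one unit was not active, 0 otherwise.
  return $(( failed > 0 ))
}
```

Example call: print_status pakashop-config pakashop-gateway pakashop-backend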
3.2 Service Operations
# Start a service
sudo systemctl start pakashop-backend.service
# Stop a service
sudo systemctl stop pakashop-backend.service
# Restart a service
sudo systemctl restart pakashop-backend.service
# Enable auto-start on boot
sudo systemctl enable pakashop-backend.service
# Disable auto-start
sudo systemctl disable pakashop-backend.service
# Reload systemd daemon (after unit file changes)
sudo systemctl daemon-reload
3.3 Service Dependency Order
Services must restart in dependency order:
# Phase 1: Infrastructure
sudo systemctl restart postgresql redis meilisearch
# Phase 2: Core Services
sudo systemctl restart pakashop-config
sleep 2
sudo systemctl restart pakashop-gateway
sleep 2
sudo systemctl restart pakashop-backend
sleep 2
# Phase 3: Supporting Services (can restart in parallel)
sudo systemctl restart pakashop-search pakashop-analytics \
pakashop-notifications pakashop-tracking pakashop-moderation \
pakashop-recommendations pakashop-scheduler pakashop-fraud \
pakashop-coupon pakashop-loyalty pakashop-whatsapp \
pakashop-reports pakashop-reconciliation pakashop-invoicing \
pakashop-pricing pakashop-settlement
# Phase 4: Reverse Proxy
sudo systemctl reload nginx
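The core phase above can also be written as a loop that stops as soon as one unit fails to come up, so dependent services are not restarted against a broken dependency. This is a sketch under assumptions: the helpers run and restart_core and the DRY_RUN variable are hypothetical, and the is-active check is an addition that the commands above do not perform.

```shell
#!/usr/bin/env bash
# Sketch only: run, restart_core and DRY_RUN are hypothetical names.

# Execute a command, or just print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Restart the Phase 2 services in the documented order.
restart_core() {
  local unit
  for unit in pakashop-config pakashop-gateway pakashop-backend; do
    run sudo systemctl restart "$unit"
    run sleep 2
    # Stop early: later units depend on this one.
    if ! run systemctl is-active --quiet "$unit"; then
      echo "Error: $unit did not come up; stopping before dependent services" >&2
      return 1
    fi
  done
}
```

Running it with DRY_RUN=1 prints the commands without executing them, which is a cheap way to review the order before touching production.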
4. Health Checks
4.1 Service Health Endpoints
Every service exposes:
| Endpoint | Purpose | Expected Response |
|---|---|---|
| GET /health | Liveness | {"status":"ok","service":"pakashop-backend"} |
| GET /health/ready | Readiness | {"status":"ready","dependencies":{"postgres":"ok","redis":"ok"}} |
| GET /health/metrics | Prometheus metrics | Raw metrics output |
4.2 Manual Health Check
# Check individual services on their local ports
curl -s http://localhost:3080/health | jq
curl -s http://localhost:3005/health | jq
curl -s http://localhost:3120/health | jq
curl -s http://localhost:3110/health | jq
# Check gateway
curl -s http://localhost:8000/health | jq
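When several ports need checking, a loop that reports pass or fail per port is easier to scan than raw JSON. The following is a minimal sketch; check_ports is a hypothetical helper, and the ports passed to it are simply the ones listed above.

```shell
#!/usr/bin/env bash
# Sketch only: check_ports is a hypothetical helper.

# Usage: check_ports PORT [PORT...]
check_ports() {
  local failed=0 port
  for port in "$@"; do
    # -f: treat HTTP errors as failures; --max-time: do not hang on a dead service
    if curl -fsS --max-time 5 "http://localhost:${port}/health" >/dev/null 2>&1; then
      echo "ok ${port}"
    else
      echo "FAIL ${port}"
      failed=$(( failed + 1 ))
    fi
  done
  # Return 1 if at least one port failed, 0 otherwise.
  return $(( failed > 0 ))
}
```

Example call: check_ports 8000 3080 3005 3120 3110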
4.3 Automated Health Checks
GitHub Actions health-check.yml runs every 15 minutes:
name: Health Check
on:
  schedule:
    - cron: '*/15 * * * *'
jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - name: Check Production
        run: |
          curl -f https://pakashop.store/api/v1/health || exit 1
          curl -f https://pakashop.store/api/v1/health/ready || exit 1
      - name: Check Staging
        run: |
          curl -f https://staging.pakashop.store/api/v1/health || exit 1
          curl -f https://staging.pakashop.store/api/v1/health/ready || exit 1
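As written, a single failed request fails the workflow, so a brief network blip can look like an outage. One possible mitigation is a small retry wrapper around each check. The sketch below is an assumption about how this could be done, not what health-check.yml does today, and the name retry is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: retry is a hypothetical helper, not part of health-check.yml.

# Usage: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS is reached.
retry() {
  local attempts="$1" delay="$2" n=1
  shift 2
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "Failed after $n attempts: $*" >&2
      return 1
    fi
    n=$(( n + 1 ))
    sleep "$delay"
  done
}
```

Example call inside a workflow step: retry 3 5 curl -fsS https://pakashop.store/api/v1/health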
5. Rollback Procedures
5.1 Code Rollback
# On the affected EC2 host
cd /opt/pakashop
# Revert the most recent commit
git revert HEAD --no-edit
git push origin "$(git rev-parse --abbrev-ref HEAD)"
# Re-deploy
sudo ./scripts/deploy.sh production
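Because production changes arrive through a pull request, HEAD on the production branch may be a merge commit, and git revert refuses to revert a merge unless a mainline parent is given with -m. The sketch below picks the right form automatically. The function names are hypothetical, and it assumes the first parent is the branch that was merged into, which is the usual case for a merged PR.

```shell
#!/usr/bin/env bash
# Sketch only: is_merge_commit and revert_head are hypothetical helpers.

# Succeeds when the given commit (default HEAD) has a second parent.
is_merge_commit() {
  git rev-parse --verify --quiet "${1:-HEAD}^2" >/dev/null
}

# Revert HEAD, using -m 1 when HEAD is a merge commit.
revert_head() {
  if is_merge_commit HEAD; then
    # -m 1 treats the first parent as the mainline to return to.
    git revert -m 1 --no-edit HEAD
  else
    git revert --no-edit HEAD
  fi
}
```

After revert_head succeeds, the push and re-deploy steps above apply unchanged.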
5.2 Database Rollback
# Restore from backup
aws s3 cp s3://pakashop-backups/production/pakashop-prod-YYYYMMDD-HHMMSS.sql.gz .
gunzip pakashop-prod-YYYYMMDD-HHMMSS.sql.gz
psql "$DATABASE_URL" < pakashop-prod-YYYYMMDD-HHMMSS.sql
# Resolve migration state
npx prisma migrate resolve --rolled-back <migration_name>
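A truncated download or an SQL error halfway through a restore can leave the database in a worse state than before. As a sketch of a more defensive restore, the helper below checks the archive first and asks psql to stop on the first error and wrap the restore in one transaction. The name restore_backup is hypothetical, and whether a single transaction suits a given dump is an assumption to verify on staging first.

```shell
#!/usr/bin/env bash
# Sketch only: restore_backup is a hypothetical helper. It expects
# DATABASE_URL to be set, as in the commands above.

# Usage: restore_backup ARCHIVE.sql.gz
restore_backup() {
  local archive="$1"
  # gzip -t verifies archive integrity without writing any output.
  if ! gzip -t "$archive" 2>/dev/null; then
    echo "Error: $archive is missing or corrupt; not restoring" >&2
    return 1
  fi
  # ON_ERROR_STOP aborts on the first SQL error instead of continuing.
  # --single-transaction requires -f (here "-f -" reads from stdin) and
  # rolls the whole restore back if any statement fails.
  gunzip -c "$archive" |
    psql "$DATABASE_URL" -v ON_ERROR_STOP=1 --single-transaction -f -
}
```

Example call: restore_backup pakashop-prod-YYYYMMDD-HHMMSS.sql.gz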
5.3 Service Rollback
# Roll back a single service
cd services/backend
git checkout production~1 -- .
npm ci --production
sudo systemctl restart pakashop-backend
5.4 Emergency Rollback
If the platform is completely down:
# 1. Revert the latest commit on the production branch
#    (if HEAD is a merge commit, git revert needs -m 1)
git revert HEAD --no-edit
git push origin production
# 2. The CD pipeline will auto-deploy the previous stable state
# 3. If CD is unavailable, manually deploy:
sudo ./scripts/deploy.sh production
# 4. Verify all services
./scripts/pakashop-status.sh
6. Monitoring & Alerting
6.1 Log Aggregation
# View all services logs
journalctl -u 'pakashop-*' -f
# View specific time range
journalctl -u pakashop-backend.service --since "2026-06-09 10:00" --until "2026-06-09 12:00"
# Export logs for analysis
journalctl -u pakashop-backend.service --since "24 hours ago" > /tmp/backend-logs.txt
6.2 Middleware.io Alerts
| Alert condition | Severity | Action |
|---|---|---|
| api_p95_latency > 500ms | Warning | Check DB query performance |
| api_error_rate > 5% | Critical | Investigate failing endpoints |
| payment_error_rate > 5% | Critical | Check gateway status |
| redis_memory > 80% | Warning | Review cache TTLs |
| postgres_connections > 80% | Critical | Check for connection leaks |
| unhandled_exceptions > 0 | Critical | Find and fix bug |
6.3 On-Call Procedures
- Acknowledge alert within 5 minutes
- Assess severity (P1-P4)
- Mitigate (rollback if necessary)
- Communicate in #incidents Slack channel
- Resolve and document in incident log
- Post-mortem within 24 hours for P1/P2
7. Database Operations
7.1 Backup
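# Manual backup
BACKUP_FILE="backup-$(date +%Y%m%d-%H%M%S).sql.gz"
pg_dump "$DATABASE_URL" | gzip > "$BACKUP_FILE"
# Upload exactly the file just created (a backup-*.sql.gz glob can match several files)
aws s3 cp "$BACKUP_FILE" s3://pakashop-backups/manual/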
# Automated (nightly via GitHub Actions)
# See db-backup.yml
7.2 Migration
# Development
npx prisma migrate dev --name add_new_feature
# Production (deploy script handles this)
npx prisma migrate deploy
# Check migration status
npx prisma migrate status
7.3 Recovery
# List available backups
aws s3 ls s3://pakashop-backups/daily/ | sort
# Restore specific backup
aws s3 cp s3://pakashop-backups/daily/pakashop-prod-20260609-020000.sql.gz .
gunzip pakashop-prod-20260609-020000.sql.gz
psql "$DATABASE_URL" < pakashop-prod-20260609-020000.sql
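Picking the newest backup by eye from a long listing is error-prone during an incident. Since the file names embed a sortable timestamp, the newest one can be selected mechanically. The sketch below assumes the pakashop-prod-YYYYMMDD-HHMMSS.sql.gz naming shown above; latest_backup is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Sketch only: latest_backup is a hypothetical helper. It assumes backups
# are named pakashop-prod-YYYYMMDD-HHMMSS.sql.gz, as in the examples above.

# Reads `aws s3 ls` output on stdin and prints the newest backup file name.
latest_backup() {
  # Each object line is: date, time, size, key. Keep only keys that match
  # the naming pattern; timestamped names sort chronologically.
  awk '{print $4}' |
    grep -E '^pakashop-prod-[0-9]{8}-[0-9]{6}\.sql\.gz$' |
    sort |
    tail -n 1
}
```

Example call: aws s3 ls s3://pakashop-backups/daily/ | latest_backup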
For internal use only. Do not distribute outside Pakashop engineering.