SLURP Upgrades¶
Introduction¶
SLURP (Skip Level Upgrade Release Process) lets you skip the intermediate non-SLURP release when upgrading OpenStack, jumping directly from one SLURP release to the next. This section covers planning and executing upgrades with Kolla-Ansible.
Prerequisites¶
- A stable production OpenStack deployment
- Complete, verified backups
- A staging environment for testing
- A scheduled maintenance window
Learning points¶
OpenStack release cycle¶
graph LR
    subgraph "2024"
        caracal["2024.1 Caracal<br/>(SLURP)"]
        dalmatian["2024.2 Dalmatian"]
    end
    subgraph "2025"
        epoxy["2025.1 Epoxy<br/>(SLURP)"]
        future["2025.2 (Future)"]
    end
    caracal --> dalmatian
    dalmatian --> epoxy
    caracal -.->|SLURP skip-level| epoxy
    caracal:::slurp
    epoxy:::slurp
    dalmatian:::nonslurp
    future:::nonslurp
    classDef slurp fill:#90EE90
    classDef nonslurp fill:#FFFFE0
SLURP releases:
- Extended support window
- Skip-level upgrades supported
- Recommended for production

Non-SLURP releases:
- Newest features first
- Sequential upgrades required
- Suited to development/staging
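Before planning a skip-level jump, confirm which release the deployment is actually running. One quick check, assuming a standard Docker-based Kolla install, is to read openstack_release from globals.yml and inspect the running image tags:

# Declared release in the Kolla configuration
grep '^openstack_release' /etc/kolla/globals.yml
# Image tags of the containers actually running
docker ps --format '{{.Image}}' | sort -u | head -5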
Upgrade strategy¶
flowchart TD
    Start([Start]) --> Plan[Planning<br/>- Review release notes<br/>- Identify deprecations<br/>- Schedule maintenance window]
    Plan --> Prep[Prepare environment]
    subgraph "Staging"
        Prep --> Clone[Clone configuration]
        Clone --> Deploy[Deploy target version]
        Deploy --> TestFunc[Functional tests]
        TestFunc --> TestPerf[Performance tests]
    end
    TestPerf --> TestOK{Tests OK?}
    TestOK -->|Yes| PlanProd[Schedule production upgrade]
    TestOK -->|No| Fix[Fix issues]
    Fix --> Retest[Retest]
    Retest --> TestOK
    subgraph "Pre-upgrade Production"
        PlanProd --> Backup[Full backup]
        Backup --> Snapshot[Snapshot critical VMs]
        Snapshot --> Comm1[Notify users]
        Comm1 --> Drain["Drain workloads (optional)"]
    end
    subgraph "Upgrade Production"
        Drain --> UpCtrl[Upgrade control plane]
        UpCtrl --> VerifServ[Verify services]
        VerifServ --> UpCompute[Upgrade compute nodes]
        UpCompute --> VerifVMs[Verify VMs]
    end
    subgraph "Post-upgrade"
        VerifVMs --> TestValid[Validation tests]
        TestValid --> Monitor[Intensive monitoring]
        Monitor --> Comm2[Announce end of maintenance]
    end
    Comm2 --> Critical{Critical<br/>issues?}
    Critical -->|Yes| Rollback[Rollback]
    Rollback --> PostMortem[Post-mortem]
    Critical -->|No| Doc[Document the upgrade]
    Doc --> Archive[Archive backups]
    PostMortem --> End([Stop])
    Archive --> End
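The optional drain step in the flowchart above can be scripted per host: disable nova-compute so the scheduler places nothing new there, then live-migrate the remaining instances. A sketch follows; the hostname compute01 is illustrative:

# Stop scheduling new instances on the host
openstack compute service set --disable --disable-reason "upgrade" compute01 nova-compute

# Live-migrate every instance off the host
for vm in $(openstack server list --all-projects --host compute01 -f value -c ID); do
    openstack server migrate --live-migration "$vm"
done

# Re-enable once the node is upgraded
openstack compute service set --enable compute01 nova-compute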
Preparing the Kolla-Ansible upgrade¶
#!/bin/bash
# prepare-upgrade.sh
set -e

SOURCE_VERSION="2024.1" # Caracal
TARGET_VERSION="2025.1" # Epoxy

echo "=== Preparing upgrade $SOURCE_VERSION → $TARGET_VERSION ==="

# 1. Back up the current configuration
BACKUP_DIR="/backup/upgrade-$(date +%Y%m%d)"
mkdir -p "$BACKUP_DIR"
cp -r /etc/kolla "$BACKUP_DIR/"
cp /etc/ansible/inventory "$BACKUP_DIR/"

# 2. Upgrade kolla-ansible from the stable branch matching the target release
pip install --upgrade "git+https://opendev.org/openstack/kolla-ansible@stable/${TARGET_VERSION}"

# 3. Stage the new example configuration (do NOT copy it over /etc/kolla
#    directly: that would overwrite globals.yml and passwords.yml)
cp -r /usr/share/kolla-ansible/etc_examples/kolla /tmp/kolla-new
# Merge with the old config manually

# 4. Regenerate service configuration
kolla-ansible -i /etc/kolla/inventory genconfig

# 5. Pull the new images
kolla-ansible -i /etc/kolla/inventory pull

echo "Preparation finished. Review configurations before upgrading."
Upgrade configuration¶
# /etc/kolla/globals.yml
# Target release
openstack_release: "2025.1"
kolla_base_distro: "ubuntu"
# Rolling upgrade (minimize downtime): limit how many hosts
# are touched at once during deploy/upgrade runs
kolla_serial: "1" # Upgrade one node at a time
# Optional rolling-upgrade support for Glance (experimental)
glance_enable_rolling_upgrade: "no"
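Before running any playbook against the edited file, a minimal sanity check is to make sure it still parses as YAML (PyYAML ships with Ansible); kolla-ansible prechecks performs the deeper validation later:

python3 -c "import yaml; yaml.safe_load(open('/etc/kolla/globals.yml')); print('globals.yml: OK')"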
Full upgrade script¶
#!/bin/bash
# upgrade-openstack.sh
set -e

LOG_FILE="/var/log/openstack-upgrade-$(date +%Y%m%d).log"
exec > >(tee -a "$LOG_FILE") 2>&1

SOURCE_VERSION="2024.1"
TARGET_VERSION="2025.1"
INVENTORY="/etc/kolla/inventory"

echo "=========================================="
echo "OpenStack Upgrade: $SOURCE_VERSION → $TARGET_VERSION"
echo "Started: $(date)"
echo "=========================================="

# Helper: show the state of all core services
check_services() {
    echo "Checking OpenStack services..."
    openstack service list
    openstack compute service list
    openstack network agent list
    openstack volume service list
}
# === Phase 1: Pre-checks ===
echo -e "\n=== Phase 1: Pre-checks ==="

# Verify current state
check_services

# Verify disk space (image pulls need room)
df -h /var/lib/docker
if [ "$(df /var/lib/docker | tail -1 | awk '{print $5}' | tr -d '%')" -gt 80 ]; then
    echo "ERROR: Docker storage > 80% used"
    exit 1
fi

# Verify a recent backup exists
LATEST_BACKUP=$(ls -t /backup/mariadb/*.tar.gz 2>/dev/null | head -1)
if [ -z "$LATEST_BACKUP" ]; then
    echo "ERROR: No recent backup found"
    exit 1
fi
echo "Latest backup: $LATEST_BACKUP"
# === Phase 2: Pull new images ===
echo -e "\n=== Phase 2: Pull new images ==="
kolla-ansible -i "$INVENTORY" pull

# Verify images
docker images | grep "$TARGET_VERSION" | head -10

# === Phase 3: Pre-upgrade checks ===
echo -e "\n=== Phase 3: Pre-upgrade checks ==="
kolla-ansible -i "$INVENTORY" prechecks

# === Phase 4: Upgrade control plane ===
echo -e "\n=== Phase 4: Upgrade control plane ==="
# Database schema migrations run as part of each service's upgrade
echo "Upgrading control plane services..."
kolla-ansible -i "$INVENTORY" upgrade --tags mariadb,keystone,glance,cinder,neutron,nova,heat,horizon
# === Phase 5: Verify control plane ===
echo -e "\n=== Phase 5: Verify control plane ==="
sleep 30 # Wait for services to stabilize

# Test Keystone
openstack token issue
# Test Nova
openstack server list --all-projects --limit 5
# Test Neutron
openstack network list --limit 5

# === Phase 6: Upgrade compute nodes ===
echo -e "\n=== Phase 6: Upgrade compute nodes ==="

# Rolling upgrade of the compute nodes; resolve the group through
# Ansible instead of parsing the inventory file by hand
for compute in $(ansible -i "$INVENTORY" compute --list-hosts | tail -n +2 | tr -d ' '); do
    echo "Upgrading compute: $compute"

    # Migrate VMs away first (optional)
    # openstack compute service set --disable "$compute" nova-compute
    # (live-migrate instances off the host, then continue)

    # Upgrade this compute node (the nova-cell role manages nova-compute)
    kolla-ansible -i "$INVENTORY" upgrade --limit "$compute" --tags nova,nova-cell

    # Re-enable if it was disabled
    # openstack compute service set --enable "$compute" nova-compute

    # Verify
    sleep 10
    openstack compute service list | grep "$compute"
done
# === Phase 7: Post-upgrade validation ===
echo -e "\n=== Phase 7: Post-upgrade validation ==="
check_services

# Test VM creation
echo "Testing VM creation..."
openstack server create --flavor m1.tiny --image cirros --network internal \
    test-upgrade-vm --wait
openstack server show test-upgrade-vm
openstack server delete test-upgrade-vm

# === Phase 8: Cleanup ===
echo -e "\n=== Phase 8: Cleanup ==="

# Remove old, unreferenced images
docker image prune -f

# Keep only the 5 most recent config backups
ls -t /backup/kolla_config_*.tar.gz | tail -n +6 | xargs -r rm -f

echo -e "\n=========================================="
echo "Upgrade Completed: $(date)"
echo "=========================================="
Rollback procedure¶
#!/bin/bash
# rollback-upgrade.sh
set -e

SOURCE_VERSION="2024.1" # Release we are rolling back to

BACKUP_DIR=$1
if [ -z "$BACKUP_DIR" ]; then
    echo "Usage: $0 <backup_directory>"
    echo "Available backups:"
    ls -d /backup/upgrade-*/
    exit 1
fi

echo "=== Rolling back to backup: $BACKUP_DIR ==="

# 1. Stop all services
echo "Stopping all services..."
kolla-ansible -i /etc/kolla/inventory stop --yes-i-really-really-mean-it

# 2. Restore configuration
echo "Restoring configuration..."
rm -rf /etc/kolla
cp -r "$BACKUP_DIR/kolla" /etc/

# 3. Restore the database
echo "Restoring database..."
LATEST_DB=$(ls -t "$BACKUP_DIR"/mariadb*.tar.gz | head -1)
/opt/scripts/restore-mariadb.sh "$LATEST_DB"

# 4. Downgrade kolla-ansible itself to match the restored release
pip install --force-reinstall "git+https://opendev.org/openstack/kolla-ansible@stable/${SOURCE_VERSION}"

# 5. Pull the old images (openstack_release comes from the restored globals.yml)
echo "Pulling old images..."
kolla-ansible -i /etc/kolla/inventory pull

# 6. Redeploy
echo "Redeploying services..."
kolla-ansible -i /etc/kolla/inventory deploy

# 7. Verify
echo "Verifying services..."
openstack service list
openstack compute service list

echo "Rollback completed"
Upgrade sequence diagram¶
sequenceDiagram
    participant Admin
    participant Kolla-Ansible
    participant Control Plane
    participant Database
    participant Compute Nodes
    participant VMs
    Admin->>Kolla-Ansible: pull new images
    Kolla-Ansible->>Control Plane: Download images
    Kolla-Ansible->>Compute Nodes: Download images
    Admin->>Kolla-Ansible: prechecks
    Kolla-Ansible->>Control Plane: Verify requirements
    Kolla-Ansible->>Database: Check connectivity
    Admin->>Kolla-Ansible: upgrade control plane
    Kolla-Ansible->>Database: Stop MariaDB
    Kolla-Ansible->>Database: Upgrade schema
    Kolla-Ansible->>Database: Start MariaDB
    Kolla-Ansible->>Control Plane: Stop Keystone
    Kolla-Ansible->>Control Plane: Deploy new Keystone
    Kolla-Ansible->>Control Plane: Start Keystone
    loop Each service
        Kolla-Ansible->>Control Plane: Stop service
        Kolla-Ansible->>Database: Run migrations
        Kolla-Ansible->>Control Plane: Deploy new version
        Kolla-Ansible->>Control Plane: Start service
    end
    Admin->>Admin: Verify control plane
    loop Each compute node
        Admin->>Kolla-Ansible: upgrade compute
        Kolla-Ansible->>Compute Nodes: Stop nova-compute
        Note over VMs: VMs continue running
        Kolla-Ansible->>Compute Nodes: Deploy new nova-compute
        Kolla-Ansible->>Compute Nodes: Start nova-compute
        Admin->>Admin: Verify compute
    end
    Admin->>Admin: Full validation
    Admin->>Admin: Cleanup old images
Ceph upgrade (separate)¶
#!/bin/bash
# upgrade-ceph.sh
# Ceph upgrades are performed separately from OpenStack upgrades
set -e

CURRENT="reef"   # 18.x
TARGET="19.2.0"  # squid (cephadm expects a version number, not a codename)

echo "=== Ceph Upgrade: $CURRENT → $TARGET ==="

# 1. Check cluster health
ceph health
ceph osd tree

# 2. Set flags to avoid unnecessary data movement during restarts
ceph osd set noout
ceph osd set norebalance

# 3. Start the orchestrated rolling upgrade; cephadm upgrades daemons
#    in the correct order (MGRs, then MONs, then OSDs, ...)
ceph orch upgrade start --ceph-version "$TARGET"

# 4. Monitor progress
watch ceph orch upgrade status
ceph -s

# 5. Unset the flags once the upgrade is complete
ceph osd unset noout
ceph osd unset norebalance

# 6. Verify that all daemons run the target version
ceph health
ceph versions
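cephadm upgrades can be paused or aborted mid-flight, which is useful if cluster health degrades while daemons restart:

ceph orch upgrade pause     # hold the rolling upgrade
ceph orch upgrade resume    # continue where it left off
ceph orch upgrade stop      # abort; already-upgraded daemons keep the new version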
Practical examples¶
Upgrade checklist¶
## Pre-Upgrade Checklist

### Planning (1 week before)
- [ ] Review OpenStack ${TARGET_VERSION} release notes
- [ ] Identify breaking changes
- [ ] Test in staging
- [ ] Schedule the maintenance window
- [ ] Notify users

### D-1 (day before)
- [ ] Full MariaDB backup (see the command below)
- [ ] Back up configurations
- [ ] Snapshot critical VMs
- [ ] Check disk space
- [ ] Pull the new images

### D-day
- [ ] Open the maintenance window
- [ ] Check cluster state
- [ ] Run the control plane upgrade
- [ ] Verify services
- [ ] Upgrade compute nodes
- [ ] Run validation tests

### Post-Upgrade
- [ ] Intensive monitoring for 24h
- [ ] Announce end of maintenance
- [ ] Document the upgrade
- [ ] Post-mortem if there were incidents
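The D-1 MariaDB backup item can be covered with Kolla-Ansible's built-in backup command, which runs mariabackup inside a container; the --incremental variant is available on recent releases:

kolla-ansible -i /etc/kolla/inventory mariadb_backup
# Incremental backup on top of a previous full one
kolla-ansible -i /etc/kolla/inventory mariadb_backup --incremental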
Post-upgrade validation tests¶
#!/bin/bash
# validate-upgrade.sh

ERRORS=0

echo "=== OpenStack Post-Upgrade Validation ==="

# 1. API endpoints
echo -e "\n[1] Testing API endpoints..."
for service in identity compute network image volume; do
    if openstack endpoint list --service "$service" -f value -c URL | grep -q "http"; then
        echo "✓ $service endpoint OK"
    else
        echo "✗ $service endpoint FAILED"
        ((ERRORS++))
    fi
done

# 2. Service status
echo -e "\n[2] Checking service status..."
if openstack compute service list | grep -q "down"; then
    echo "✗ Some compute services are down"
    openstack compute service list | grep down
    ((ERRORS++))
else
    echo "✓ All compute services up"
fi

# 3. Network agents ("XXX" in the Alive column means the agent is dead)
echo -e "\n[3] Checking network agents..."
if openstack network agent list | grep -q "XXX"; then
    echo "✗ Some network agents are down"
    ((ERRORS++))
else
    echo "✓ All network agents up"
fi

# 4. Create test resources end to end
echo -e "\n[4] Testing resource creation..."
openstack network create test-upgrade-net --provider-network-type vxlan
openstack subnet create test-upgrade-subnet --network test-upgrade-net --subnet-range 192.168.99.0/24
openstack server create --flavor m1.tiny --image cirros --network test-upgrade-net test-upgrade-vm --wait
if openstack server show test-upgrade-vm -f value -c status | grep -q "ACTIVE"; then
    echo "✓ VM creation successful"
else
    echo "✗ VM creation failed"
    ((ERRORS++))
fi

# Cleanup
openstack server delete test-upgrade-vm --wait
openstack subnet delete test-upgrade-subnet
openstack network delete test-upgrade-net

# Summary
echo -e "\n=== Validation Summary ==="
if [ $ERRORS -eq 0 ]; then
    echo "All tests PASSED"
    exit 0
else
    echo "$ERRORS test(s) FAILED"
    exit 1
fi
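During the 24h monitoring window, the script can be looped so regressions surface quickly; the exit code makes failures easy to alert on. The 10-minute interval and log path below are arbitrary:

while true; do
    ./validate-upgrade.sh || echo "VALIDATION FAILED at $(date)" | tee -a /var/log/upgrade-watch.log
    sleep 600
done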
Resources¶
Checkpoint¶
- SLURP strategy understood
- Staging environment prepared
- Upgrade scripts created
- Rollback procedure documented
- Upgrade checklist completed
- Validation tests automated
- First upgrade tested in staging
- Documentation updated