Server move workbook
This is a tidied-up record of the steps carried out during the server move on 2019-12-20.
Pre switch over action plan
- Decide on a switch over date and communicate it to the other sysadmins
- Announce in Support
- Add a sitenotice some days or a week before
- Email hexmode the new IP addresses for the email relay
- ccp: reduce DNS TTL to 900
- web1->web2: Copy over data
sudo -E rsync -ahAX --info=progress2 /etc/webauth root@web2.translatewiki.net:/etc
sudo -E rsync -ahAX --info=progress2 --del /etc/letsencrypt root@web2.translatewiki.net:/etc
sudo -E rsync -ahAX --info=progress2 --del /home /www /srv root@web2.translatewiki.net:/ # 2h20m
sudo -E rsync -ahAX --info=progress2 --del /resources/caches/translatewiki.net root@web2.translatewiki.net:/resources/caches # 20min
sudo -E rsync -ahAX --info=progress2 --del /resources/{abi,amir,nike,raymond,siebrand} root@web2.translatewiki.net:/scratch # ~24h
- web2 preps
cd /resources; for i in /scratch/*; do ln -s "$i" "$(basename "$i")"; done
# <--- can probably be puppetized (or create an ansible playbook?)
# yeah, for each user set it up automatically
cd /resources; mkdir projects; chown betawiki:users projects; ln -s /home/betawiki/config/repoconfig.yaml /resources/projects/; sudo -u betawiki /home/betawiki/config/bin/repomulti update # 20min
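The note about setting up the per-user links automatically could start from a re-runnable version of the symlink loop. This is only a sketch: the function name `link_scratch_dirs` is made up for illustration, and it assumes every directory under /scratch should get a link of the same name.

```shell
# Create one symlink in LINK_DIR for each directory found in SRC_DIR.
# Entries that already exist are skipped, so the function is safe to re-run.
# Usage: link_scratch_dirs SRC_DIR LINK_DIR   (hypothetical helper name)
link_scratch_dirs() {
    src_dir=$1
    link_dir=$2
    for entry in "$src_dir"/*; do
        # An unmatched glob stays literal; skip anything that is not a directory.
        [ -d "$entry" ] || continue
        name=$(basename "$entry")
        if [ -e "$link_dir/$name" ] || [ -L "$link_dir/$name" ]; then
            echo "skip $name (already present)"
        else
            ln -s "$entry" "$link_dir/$name"
            echo "linked $name"
        fi
    done
}

# Example, using the paths from the step above:
#   link_scratch_dirs /scratch /resources
```

Skipping existing names rather than overwriting them means a second run after adding a new user only creates the missing link.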
Pre switch over tests
- web2: arm/test keyholder
- web2: verify mysql access
- web2: verify ttmserver/search index update scripts do not fail
- ccp: ensure rDNS entries for IPv4 and IPv6 (we found that IPv6 was missing)
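The rDNS check can be scripted so that a missing IPv6 entry is caught early. A minimal sketch, assuming `dig` is installed; the helper name `has_rdns` is hypothetical and the addresses in the example are documentation placeholders, not the real server addresses.

```shell
# Look up the PTR record for an address and fail if there is none.
# Usage: has_rdns ADDRESS   (hypothetical helper name)
has_rdns() {
    # dig -x builds the reverse lookup name for both IPv4 and IPv6 addresses.
    ptr=$(dig +short -x "$1")
    if [ -n "$ptr" ]; then
        echo "$1 -> $ptr"
    else
        echo "$1 has no PTR record" >&2
        return 1
    fi
}

# Example with placeholder addresses (replace with the new server's IPs):
#   has_rdns 192.0.2.10
#   has_rdns 2001:db8::10
```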
Switch over action plan
- ccp: reduce DNS TTL to 300
- web1->web2: re-sync user directories
sudo -E rsync -ahAX --info=progress2 --del /resources/{abi,amir,nike,raymond,siebrand} root@web2.translatewiki.net:/scratch # slow
- web1: Disable cron jobs
sudo nano /etc/cron.d/{awstats,backup,certbot,wikimaintenance,wikistats}
sudo crontab -u root -e
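As an alternative to editing each file by hand, the cron.d files could be moved aside and moved back later. This is a sketch of that different approach, not what was done during the move; the helper name `park_cron_files` and the holding directory are hypothetical, and root's own crontab would still need to be handled separately.

```shell
# Move the named files from CRON_DIR into HOLD_DIR so cron stops running them.
# Usage: park_cron_files CRON_DIR HOLD_DIR NAME...   (hypothetical helper name)
park_cron_files() {
    cron_dir=$1
    hold_dir=$2
    shift 2
    mkdir -p "$hold_dir"
    for name in "$@"; do
        if [ -f "$cron_dir/$name" ]; then
            mv "$cron_dir/$name" "$hold_dir/$name"
            echo "parked $name"
        else
            # Report names that were not found instead of failing silently.
            echo "missing $name" >&2
        fi
    done
}

# Example (as root); /root/cron.d.disabled is an assumed location:
#   park_cron_files /etc/cron.d /root/cron.d.disabled awstats backup certbot wikimaintenance wikistats
```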
- web1: Drain the jobqueue
sudo systemctl stop mw-jobrunner
php /srv/mediawiki/workdir/maintenance/runJobs.php --wait
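With `--wait`, runJobs.php keeps waiting for new jobs rather than exiting, so one way to confirm the queue is drained is to poll the job count from a second terminal. A minimal sketch, assuming MediaWiki's maintenance/showJobs.php prints the number of queued jobs; the helper name, the retry count, and the `POLL_INTERVAL` variable are made up for illustration.

```shell
# Run COMMAND repeatedly until it prints 0, giving up after MAX_TRIES attempts.
# Usage: wait_for_empty_queue MAX_TRIES COMMAND [ARG...]   (hypothetical helper name)
wait_for_empty_queue() {
    tries=$1
    shift
    while [ "$tries" -gt 0 ]; do
        # The command is expected to print only the number of queued jobs.
        count=$("$@")
        if [ "$count" = "0" ]; then
            echo "job queue is empty"
            return 0
        fi
        echo "jobs left: $count"
        tries=$((tries - 1))
        # Seconds between polls; defaults to 5 when POLL_INTERVAL is unset.
        sleep "${POLL_INTERVAL:-5}"
    done
    return 1
}

# Example, using the MediaWiki path from the step above:
#   wait_for_empty_queue 60 php /srv/mediawiki/workdir/maintenance/showJobs.php
```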
- web1: update sitenotice to say it's now read only
- web1: Set the old MediaWiki to read only
b nano /home/betawiki/config/TranslatewikiSettings.php # no need to deploy
- web1: Export an SQL dump
mydumper -B translatewiki_net -u root -o dump -e -c # 10m
- web1->web2: rsync SQL dump
sudo -E rsync -ahAX --info=progress2 dump root@web2.translatewiki.net:/root # 1m
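Before importing, the copied dump can be compared against the original with a checksum manifest. A minimal sketch, assuming GNU `sha256sum` is available on both hosts; the helper name `dir_manifest` and the manifest file names in the example are hypothetical.

```shell
# Print "checksum  ./relative/path" for every regular file under DIR,
# sorted by path so that two manifests can be compared with diff.
# Usage: dir_manifest DIR   (hypothetical helper name)
dir_manifest() {
    # The subshell keeps the cd from affecting the caller.
    ( cd "$1" && find . -type f -exec sha256sum {} + | LC_ALL=C sort -k 2 )
}

# Example:
#   dir_manifest dump > dump.web1.sha256          # on web1
#   dir_manifest /root/dump > dump.web2.sha256    # on web2
#   diff dump.web1.sha256 dump.web2.sha256        # after copying one manifest across
```

Paths are recorded relative to the dump directory, so the manifests match even though the dump lives in different locations on the two servers.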
- web2: Import an SQL dump
myloader -B translatewiki_net -d dump -u root # 17m
- web2: add GRANTs
- web1->web2: rsync logs/stats
sudo -E rsync -ahAX --info=progress2 --del /home /www /srv root@web2.translatewiki.net:/
sudo -E rsync -ahAX --info=progress2 --del /resources/caches/translatewiki.net root@web2.translatewiki.net:/resources/caches
sudo /usr/lib/cgi-bin/awstats.pl -update -config=translatewiki.net
sudo -E rsync -ahAX --info=progress2 --del /var/lib/awstats root@web2.translatewiki.net:/var/lib
- web2: restart memcached
sudo systemctl restart memcached
- web2: Test that MediaWiki works
- ccp: Switch over DNS
- web2: Disable read only
b nano /home/betawiki/config/TranslatewikiSettings.php # no need to deploy
- web2: update site notice
- web2: Update the Elasticsearch configuration to point to localhost instead of es.twn.net
- web2: Start search index bootstrap script
screen
cd /www/translatewiki.net/docroot/w/extensions/CirrusSearch/maintenance
php updateSearchIndexConfig.php --startOver
php forceSearchIndex.php --skipLinks --indexOnSkip --buildChunks 1000 --batch-size=100 | nice parallel --eta --joblog ~/reindex-1.log --no-notice
php forceSearchIndex.php --skipParse --buildChunks 1000 --batch-size=100 | nice parallel --eta --joblog ~/reindex-2.log --no-notice
- web2: Start translation memory bootstrap script
screen
nice php /www/translatewiki.net/docroot/w/extensions/Translate/scripts/ttmserver-export.php --threads 4 --reindex
- web1: stop IRC bots
- web2: start IRC bots
- web2: enable cron jobs
Post switch over action plan
- Update News
- web2: re-enable letsencrypt
- ccp: Increase DNS TTL back to normal (set to 7200)
- uptimerobot: remove monitor for es.translatewiki.net
- clean up puppet repo (remove es/web1 legacy) https://gerrit.wikimedia.org/r/#/c/translatewiki/+/559781 Goodbye old servers
- web1: take backups of everything
- shut down server es
- shut down server web1
- archive this workplan on a wiki page