Automating Couchbase Backups - Backup to S3

One of our customers has a five-node Couchbase cluster running on AWS EC2. Although we have EBS snapshots enabled on all of these instances, the recommended way to back up Couchbase buckets is to run the cbbackup tool. cbbackup can back up an entire cluster, a single node, or specific buckets (sample invocations follow the list below). We wanted to automate these backups to S3, and the scripts had to take care of the following:

  • Connect to a specified master node and get the list of nodes in the cluster. (Couchbase clusters are masterless; we say "master" here purely for convenience.)
  • Iterate through the nodes and run a backup; exit after the first successful backup.
  • If a node is down, move on to the next one until a backup succeeds.
  • Upload the backup to an S3 bucket.
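
For reference, cbbackup's whole-cluster, single-node, and single-bucket modes look roughly like this; the host, bucket name, and output directory are placeholders, not taken from this setup:

/opt/couchbase/bin/cbbackup http://10.0.0.10:8091 /tmp/backup -u Administrator -p password
/opt/couchbase/bin/cbbackup http://10.0.0.10:8091 /tmp/backup -u Administrator -p password --single-node
/opt/couchbase/bin/cbbackup http://10.0.0.10:8091 /tmp/backup -u Administrator -p password -b beer-sample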

Below are the scripts we came up with. To use them you need an S3 bucket to hold the .pem key used for SSH access to the nodes, and the paramiko library for SSH from Python; a setup sketch follows the list below.

  • couch-backup.py - connects to a Couchbase node, downloads the cluster definition, loops through the nodes, and triggers the backup script on one of them.
  • master-couch.py - runs cbbackup, zips the backup folder, and uploads it to the S3 bucket. It uses aws s3 mv via the CLI, so each node must have the AWS CLI configured with access to the backup bucket.
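
A rough sketch of the one-time setup, assuming the same bucket and key names the scripts use (adjust to your own):

# on the machine that runs couch-backup.py
pip install paramiko
aws s3 cp s3://couchbasebackups/wgo.pem /tmp/wgo.pem && chmod 600 /tmp/wgo.pem

# on every Couchbase node, so aws s3 mv/cp can reach the backup bucket
aws configure
aws s3 ls s3://couchbasebackups/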

couch-backup.py

from __future__ import print_function
import os

import paramiko

MASTER_NODE = "52.xx.xx.xx"
PORT = "8091"
USERNAME = "Administrator"
PASSWORD = "'supersecret'"  # inner quotes keep the password intact when embedded in shell commands

# SSH key and client used to reach every node in the cluster
k = paramiko.RSAKey.from_private_key_file("/tmp/wgo.pem")
c = paramiko.SSHClient()
c.set_missing_host_key_policy(paramiko.AutoAddPolicy())

# Ask the "master" node for the cluster definition and write one node IP per line
command = ("cd /opt/couch-cli/couchbase-cli/ && "
           "./couchbase-cli server-list -c {0}:{1} -u {2} -p {3} "
           "| cut -f2 -d'@' | awk '{{ print $1 }}' > /tmp/nodelist.txt"
           .format(MASTER_NODE, PORT, USERNAME, PASSWORD))
check = os.system(command)
print(check)

with open("/tmp/nodelist.txt") as f:
    nodes = f.read().splitlines()

# Commands run on the remote node: fetch the backup script from S3 and execute it
scriptrun = [
    "aws s3 cp s3://couchbasebackups/master-couch.py /home/ubuntu/master-couch.py",
    "cd /home/ubuntu/ && chmod 755 master-couch.py",
    "python /home/ubuntu/master-couch.py",
]

for host in nodes:
    print("connecting to " + host)
    try:
        c.connect(hostname=host, username="ubuntu", pkey=k)
    except Exception as exc:
        # Node is down or unreachable; try the next one
        print("could not connect to {0}: {1}".format(host, exc))
        continue
    print("connected to " + host)

    backup_ok = True
    for command in scriptrun:
        print("Executing {0}".format(command))
        stdin, stdout, stderr = c.exec_command(command)
        print(stdout.read())
        print(stderr.read())
        if stdout.channel.recv_exit_status() != 0:
            backup_ok = False
            break
    c.close()

    # One successful backup is enough
    if backup_ok:
        break
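
To run this on a schedule, a cron entry on the machine that holds the key is enough; the path and timing below are just an example:

# /etc/cron.d/couch-backup - nightly backup at 02:00
0 2 * * * ubuntu python /home/ubuntu/couch-backup.py >> /var/log/couch-backup.log 2>&1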

master-couch.py

import os
import time

USERNAME = "Administrator"
PASSWORD = "'supersecret'"  # inner quotes keep the password intact inside the shell command

DATE = time.strftime('%m%d%Y-%H%M%S')
DUMP_PATH = "/tmp/"
BACKUPNAME = DUMP_PATH + DATE
if not os.path.exists(BACKUPNAME):
    os.makedirs(BACKUPNAME)
    print(BACKUPNAME)

# Back up the cluster through the local node into the timestamped directory
backup = ("cd /opt/couchbase/bin && ./cbbackup http://localhost:8091 "
          + BACKUPNAME + " -u " + USERNAME + " -p " + PASSWORD)
os.system(backup)

# Zip the backup directory and move the archive to S3
zipit = "zip -r " + BACKUPNAME + ".zip " + BACKUPNAME
os.system(zipit)
upload = "aws s3 mv " + BACKUPNAME + ".zip s3://couchbasebackups/"
os.system(upload)
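
To restore (or just to verify a backup), copy the archive back from S3, unzip it, and point cbrestore at the extracted backup directory. The paths and bucket name here are placeholders:

aws s3 cp s3://couchbasebackups/<timestamp>.zip /tmp/ && cd /tmp && unzip <timestamp>.zip
cd /opt/couchbase/bin && ./cbrestore <extracted-backup-dir> http://localhost:8091 -u Administrator -p password -b <bucket>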

The scripts are available on GitHub.

Happy backup-ing! :)

Raju Banerjee

Raju is a Cloud Solution Architect. His strengths are AWS and Google Cloud, and all things DevOps. He likes automating boring stuff using Python.
