Archive old Voicemail, Abandoned zero-length Voicemails, renumbering Voicemail messages - bash scripts

Dana Harding, mail address - 2009-06-23

Suggestions/Comments welcome.


Introduction


For one Asterisk installation I had to fix a voicemail problem (abandoned zero-length messages) and add a voicemail feature (archiving old messages). I solved both with shell scripts run from cron jobs. The scripts outlined here work for me in Asterisk 1.4.23.1, should work for 1.2 versions, and will probably continue to function in 1.6 if you are using the 'default' voicemail system.


Both solutions also needed a way to renumber the voicemail files, so there are three scripts: one that removes abandoned voicemails, one that archives old ones, and a renumbering script that the other two call when they are done.

ODBC, IMAP, and other voicemail storage setups are outside the scope of this discussion.

Background


Asterisk's 'default' voicemail system uses a filesystem hierarchy to store voicemail messages. It looks like this:

/var/spool/asterisk/voicemail
    /VM_CONTEXT
        /MAILBOX1
            unavail.gsm
            unavail.WAV
            unavail.wav
            /INBOX
                msg0000.gsm
                msg0000.txt
                msg0000.WAV
                msg0000.wav
                msg0001.gsm
                msg0001.txt
                msg0001.WAV
                msg0001.wav
            /Old
            /Family
            /Friends
            /Work
        /MAILBOX2
            /INBOX
            /Old
        and so forth...


When a message is recorded, and whenever a message is listened to, the modified date of the relevant msgNNNN.* files is touched. The archiving script uses the modified dates to decide whether a file is old enough to archive. If a user keeps listening to an old message, it will never be archived (and will remain in the user's voicemail box until the user elects to delete it).
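
To see which messages the modified-date test would select before anything is moved, a dry run along these lines may help. This is only a sketch: the function name list_old_messages is made up for illustration, and it assumes GNU find (for -daystart), as the archiving script does.

```shell
#!/bin/bash
# Dry run: print the msgNNNN.txt files under directory $1 whose modified
# date is more than $2 days old. Nothing is moved or deleted.
# list_old_messages is a hypothetical helper, not part of the scripts below.
list_old_messages() {
    local dir=$1 daysold=$2
    find "$dir" -type f -name 'msg*.txt' -daystart -mtime +"$daysold" | sort
}

# Example call (mailbox path and age are placeholders):
# list_old_messages /var/spool/asterisk/voicemail/default/105/INBOX 7
```

Running touch on a listed file takes it off the list again, which is the same effect as a user listening to the message.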

Notice the filename layout for the stored messages. There is a .txt file, and then one audio recording for each format specified in voicemail.conf. In this case, voicemail.conf contains format=wav49|wav|gsm, hence the three audio files.

There are INBOX and Old folders by default; Family, Friends, and Work appear if/when the user attempts to access any of them. In my current version of Asterisk running at home (1.4.23.1) it also creates a /tmp, /temp, and /unavail that don't get removed (some debug code left in the source?). Ultimately, the scripts simply crawl all directories to account for any newly created ones.

For the zero-length messages, it was observed that the msgNNNN.txt was created, but the related media files were not. The abandoned_voicemail script takes advantage of this. This occurrence was actually pretty rare, but made an impact when it did happen because it annoyed the users and it would persist until intervention by an administrator (me).




The Meat




renumber.sh
This specific instance will crawl all voicemail boxes under /var/spool/asterisk/voicemail/default/ except 200 and 1234, renumbering the msgNNNN.* files whenever they are out of order.

The messages become out of order when some are removed for archiving, or the msgNNNN.txt is deleted from a zero-length message, making this a necessary cleanup script after doing either.
#!/bin/bash
# bash is required: the script uses "function", (( )) arithmetic and "let".

# Asterisk doesn't handle out of order messages very well.
# Script written assuming worst case scenario that random patches of messages
# have been removed, instead of a solid block.


voicemailroot=/var/spool/asterisk/voicemail/default
process=yes
logging=yes


function logit() {
if [ "$logging" = "yes" ]; then
        echo $1
fi
}

padit() {
case $1 in
        "in")
                case ${#in} in
                        1) instr=000$in;;
                        2) instr=00$in;;
                        3) instr=0$in;;
                        4) instr=$in;;
                esac;;
        "out")
                case ${#out} in
                        1) outstr=000$out;;
                        2) outstr=00$out;;
                        3) outstr=0$out;;
                        4) outstr=$out;;
                esac;;
        esac
}

function renumberfiles() {
out=0
padit out
numberfiles=`find . -type f -name "msg*.txt"|wc -l`
for (( in=0 ; in<=9999 ; in+=1 )); do
padit in
if ( test -e msg$instr.txt ); then
        if ( test -e msg$outstr.txt );
                then
                # Infile number is already lowest possible, no renaming necessary
                [ 1 ]
                else
                for ext in txt WAV wav gsm; do mv msg$instr.$ext msg$outstr.$ext; done
        fi
        let out+=1
        padit out
        if (( $out == $numberfiles )); then break; fi
fi
done

}

echo STARTING RENUMBERING `date`
# awk field 7 is the mailbox name only because voicemailroot is five directories deep.
for maindir in `find $voicemailroot -maxdepth 1 -type d|awk -F / '{print $7}'`; do
processme=$process
if [ "$maindir" = "1234" ]; then
        processme=no
elif [ "$maindir" = "200" ]; then
        processme=no
fi
if [ "$processme" = "no" ]; then
        echo skipping $maindir
        continue
fi
cd $voicemailroot/$maindir
for subdir in `find . -type d|awk -F / '{print $2}'`; do
cd $voicemailroot/$maindir/$subdir
if [ "$(ls -A ./)" ]; then
        logit "Renumbering $maindir $subdir"
        renumberfiles
else
        logit "No messages in $maindir $subdir to renumber"
        continue
fi
done
done
echo FINISHED RENUMBERING `date`
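
As an aside, the zero-padding done by padit can also be handled by bash's printf builtin. The following is a minimal sketch of the same renumbering idea for a single folder, not a replacement for the script above; the function name renumber_dir is hypothetical, and it assumes every message number is exactly four digits.

```shell
#!/bin/bash
# Renumber msgNNNN.* in directory $1 so the numbers run 0000, 0001, ...
# with no gaps. renumber_dir is a hypothetical name used for illustration.
renumber_dir() {
    local dir=$1 out=0 txt src dst f
    # The glob expands in sorted order, which is numeric order for
    # fixed-width four-digit names.
    for txt in "$dir"/msg[0-9][0-9][0-9][0-9].txt; do
        [ -e "$txt" ] || continue          # glob matched nothing: empty folder
        src=$(basename "$txt" .txt)
        printf -v dst 'msg%04d' "$out"     # zero-pad the next free number
        if [ "$src" != "$dst" ]; then
            # Move the .txt and every audio file that shares its base name.
            for f in "$dir/$src".*; do
                mv "$f" "$dir/$dst.${f##*.}"
            done
        fi
        out=$((out + 1))
    done
}

# Example call (path is a placeholder):
# renumber_dir /var/spool/asterisk/voicemail/default/105/INBOX
```

Because it moves whatever msgNNNN.* files exist, this version does not need the list of audio extensions hard coded.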




abandoned_voicemail.sh

This script uses find to locate all msg*.txt files recursively from the voicemailroot directory.
For each msgNNNN.txt file found, it then uses awk to cut up the "./MAILBOX1/INBOX/msgNNNN.txt" string where:
	maindir = voicemail extension number
	subdir = folder (INBOX, Old, Family, etc.)
	file = msgNNNN.txt
	filebase = msgNNNN
It then counts all the msgNNNN.* files in that particular mailbox/folder. If there are 4 files (the .txt and the three audio format files), the message is assumed to be okay. For any other count, the msgNNNN.txt file is deleted. If you do not have the three default audio formats in your voicemail.conf, modify this number. (Replace "rm" with "echo" until you know it works, so you don't wipe out all your voicemails.)
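
The expected count and the "echo instead of rm" dry run can be sketched separately from the real script. The helper names expected_files and list_incomplete below are hypothetical; the format string is passed in by hand here rather than read from voicemail.conf.

```shell
#!/bin/bash
# expected_files FORMATS: number of files a complete message should have,
# i.e. one .txt plus one audio file per format in a format= value such as
# "wav49|wav|gsm". Hypothetical helper for illustration.
expected_files() {
    local IFS='|'
    local formats=($1)                 # split the format list on "|"
    echo $(( ${#formats[@]} + 1 ))
}

# list_incomplete ROOT EXPECTED: print every msgNNNN.txt under ROOT whose
# msgNNNN.* file count differs from EXPECTED. Nothing is deleted.
list_incomplete() {
    local root=$1 expected=$2 txt base files
    find "$root" -type f -name 'msg*.txt' | sort | while read -r txt; do
        base=${txt%.txt}
        files=("$base".*)              # the .txt plus any audio files
        if [ "${#files[@]}" -ne "$expected" ]; then
            echo "$txt"
        fi
    done
}

# Example call (path is a placeholder):
# list_incomplete /var/spool/asterisk/voicemail/default "$(expected_files 'wav49|wav|gsm')"
```

Reviewing the printed list first makes it easier to spot a wrong expected count before any file is removed.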


#!/bin/bash
# Sometimes a zero length voicemail message is recorded,
# The message leaves a .txt file but none of the audio formats
# When the user attempts to listen to this message, there is no audio file
# so the message is blank. The user cannot delete this message, but
# Asterisk still indicates the presence of a message.

# Future iteration could check $maindir value against a list of mailboxes to never touch.

voicemailbasedir=/var/spool/asterisk/voicemail/default
renumberexecute=/home/asterisk/renumber.sh
renumber=0

cd $voicemailbasedir
for str in `find . -type f -name "msg*.txt"`; do
        maindir=`echo "$str"|awk -F / '{print $2}'`
        subdir=`echo "$str"|awk -F / '{print $3}'`
        file=`echo "$str"|awk -F / '{print $4}'`
        filebase=`echo "$file"|awk -F . '{print $1}'`

        cd $voicemailbasedir/$maindir/$subdir
        if (( `find . -type f -name "$filebase.*"|wc -l` == 4 )); then
                [ 1 ]   # file is okay
        else
                echo $maindir $subdir $file BAD - Deleting
                rm $file
                let renumber=1
        fi
done
if (( renumber == 1 )); then
        echo Starting Renumbering: $renumberexecute
        exec $renumberexecute
fi






archive_voicemail.sh

This specific instance crawls voicemail boxes 1234, 100, 105, 107, and 108. It skips 1234 and 100 (the same as if they weren't in the list, except that a "skipping <mailbox>" line shows up on stdout). Extension 105 wanted any messages more than 7 days old cleared, and 107 wanted 14 days. The others (only 108 in this example) get the default: messages more than 60 days old are archived.

After figuring out which messages are old and moving them to a staging directory, they are uuencoded to be sent out in an email and a tarball is created and stored locally.
Then the renumbering script is called to clean up. In the ideal world - the oldest messages would be a sequential block of NNNN numbers at the end of the msgNNNN.* list and not require any kind of renumbering. But we also know that simply listening to a message causes the file modification date to be touched - meaning that user actions in the voicemail system could possibly affect how this archiving script works: it's better to have the extra layer of robustness there.
#!/bin/bash
# Script to automatically archive old voicemail messages
# Based on modified date of files,  when messages are listened to the date is updated.
# Archived messages are also e-mailed to the voicemailbox owner as defined in voicemail.conf
# Interesting/bad things might happen if sendemail=yes and no email address is specified.

# Possible improvements:
#	Read the date of the messages directly from the .txt file instead of from the filesystem.
#	Read options from config file to allow defaults and mailbox specific options that aren't
#		hard coded into the script
#	Use an AGI or other interface to allow users to modify options via the telephone interface.
#	Logging to a specified logfile instead of simply stdout
#	Allow 'sendemail' to be yes or no per mailbox instead of globally


archivefolder=/home/asterisk/oldvoicemail
voicemailroot=/var/spool/asterisk/voicemail/default
process=yes
logging=yes
defaultdaysold=60
sendemail=yes

function logit() {
	if [ "$logging" = "yes" ]; then
	        echo $1
	fi
}


function processold() {
	if ( test -d /tmp/$maindir ); then
		logit "/tmp/$maindir Already exists.  Deleting contents"
		rm /tmp/$maindir/*
	else
		mkdir /tmp/$maindir
	fi

	find . -type f -name 'msg*' -daystart -mtime +$daysold -exec mv {} /tmp/$maindir \;

	# Let's make one .txt file for an e-mail that we will send out.
	# first a header explaining what's going on
	cd /tmp/$maindir
	echo -e "Messages older than $daysold days in voicemail box $maindir folder $subdir are being archived and removed from the phone system.\n\nThese have been previously sent to your e-mail when they were received, and are attached here for reference" >> email.txt
	echo -e "\n\n" >>email.txt

	for txtname in msg*.txt; do
		echo -n "`echo $txtname|awk -F . '{print $1}'` " >> email.txt
		echo -n `grep -e "callerid=" -e "origdate=" -e "duration=" $txtname` >> email.txt
		echo " seconds" >> email.txt
	done

	echo -e "\n\n" >> email.txt
	# now the fun stuff - uuencode all the .WAV files that were previously e-mailed
	for filename in msg*.WAV; do
		uuencode $filename $filename >> email.txt
	done

	# who do we e-mail this to?
	# anchor at line start so that mailbox 105 does not also match e.g. 1105
	sendtoemail=`grep "^$maindir => " /etc/asterisk/voicemail.conf| awk -F , '{print $3}'`

	todaydate=`date +%Y%m%d`

	if [ "$sendemail" = "yes" ]; then
		logit "Sending e-mail to $sendtoemail"
		/usr/bin/mail -s "Archived Voicemail Messages $todaydate" -a 'From: Asterisk PBX <asterisk@asteriskbox>' $sendtoemail < email.txt
	fi

	logit "Tarballing the old messages."
	tarfile=$maindir\_$todaydate\_$subdir.tar.gz
	tar -czf $tarfile --no-recursion --remove-files ./msg*
	chmod 600 $tarfile
	mv ./$tarfile $archivefolder
	rm /tmp/$maindir/*
	rmdir /tmp/$maindir
}


#process all voicemail boxes:
#for maindir in `find $voicemailroot -maxdepth 1 -type d|awk -F / '{print $7}'`; do

#only process specific voicemail boxes:
for maindir in 1234 100 105 107 108; do

	#assign defaults:
	processme=$process
	daysold=$defaultdaysold

	#replace defaults for specific mailboxes
	#any mailboxes not listed receive the default values from above
	case $maindir in
		1234) processme=no;;
		100) processme=no;;
		105) let daysold=7;;
		107) let daysold=14;;
	esac

	if [ "$processme" = "no" ]; then
		echo skipping $maindir
		continue
	fi

	cd $voicemailroot/$maindir

	# to only process a specific voicemail box folder:
	#for subdir in INBOX; do

	# to process them all:
	for subdir in `find . -type d|awk -F / '{print $2}'`; do

		cd $voicemailroot/$maindir/$subdir
		if ( test -e msg0000.txt ); then
			messages=`ls *.WAV|grep -c msg`
			oldmessages=`find . -type f -name '*.WAV' -daystart -mtime +$daysold|grep -c msg`

			if [ $oldmessages = 0 ]; then logit "$maindir $subdir $oldmessages/$messages No old messages to process";continue;fi

			logit "$maindir $subdir $oldmessages/$messages Processing old messages"
			processold
		else
			logit "$maindir $subdir 0/0 No messages to process"
		fi
	done
done
/home/asterisk/renumber.sh
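
One of the possible improvements listed in the script comments is reading the message date from the .txt file instead of from the filesystem. A rough sketch of that idea follows. It assumes the msgNNNN.txt file contains an origtime= line holding the recording time in seconds since the epoch; check one of your own files before relying on it. The function name message_age_days is hypothetical.

```shell
#!/bin/bash
# Print the age of a message in whole days, based on the origtime= line of
# its msgNNNN.txt (assumed to be epoch seconds). $2 optionally overrides
# "now" as epoch seconds. message_age_days is a hypothetical helper.
message_age_days() {
    local txt=$1 now=${2:-$(date +%s)} origtime
    origtime=$(sed -n 's/^origtime=\([0-9][0-9]*\).*/\1/p' "$txt" | head -n 1)
    [ -n "$origtime" ] || return 1     # no usable origtime line found
    echo $(( (now - origtime) / 86400 ))
}

# Example call (path is a placeholder):
# message_age_days /var/spool/asterisk/voicemail/default/105/INBOX/msg0000.txt
```

Unlike the modified date, this value would not change when a user listens to the message, so archiving based on it would behave differently from the script above.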