Steganography is the practice of undetectably altering a work to embed a secret
message, so that the very existence of the message is hidden and only the
sender and the receiver are aware of the communication. The cover medium can be
an image or audio, while the covert message can be audio, an image, or text. In
this thesis, the cover is audio and the covert message is text. A steganography
method should be designed so that intentional manipulation of the cover medium,
which we call an attack, leads to minimal change in the covert message. Some
steganography methods
operate in the wavelet domain, embedding the covert message in the wavelet
coefficients of the cover medium. Examples include embedding the message using
mean wavelet quantization and defining a threshold for embedding the message in
each coefficient. In this thesis we aim to implement
an algorithm for audio steganography that is robust against the MP3 compression
attack. We perform the steganography using a discrete wavelet transform with
three-level decomposition, in two ways. In the first method, we divide the audio
signal into frames and apply the wavelet transform to each frame. We then embed
each bit of the covert data into the largest wavelet coefficient of each frame.
In the
second method, we apply the wavelet transform to the signal directly, without
dividing it into frames, and then choose the largest wavelet coefficients for
embedding the covert bits. In both methods, the embedding takes place in the
third, eighth, tenth, or twelfth bit of the wavelet coefficients of the detail
or approximation sub-bands. To evaluate our method we used a dataset
of ten music files (pop, rock, classical, jazz, and blues), ten speech files (in
English and Chinese), and text messages of 25, 50, and 70 characters. As
evaluation measures we used peak signal-to-noise ratio (PSNR)
and bit error rate (BER). PSNR indicates the similarity between the original
signal and the stego signal, and we seek to maximize it; BER measures the
difference between the original message and the extracted one, and we seek to
minimize it. We also defined a new measure by dividing PSNR by BER; higher
values of this measure indicate better performance of the method. For
constructing the attacks we used MP3 compression at bit rates of 32, 48, 64, 96,
and 128 kilobits per second (kbps). From our experiments we concluded that the most
robust method for speech files is to embed the covert message into the twelfth
bit of the detail sub-band coefficients using framing. For music files, on the
other hand, the most robust method is to embed the covert message into the
twelfth bit of the detail sub-band coefficients without framing. Thus, for both
speech and music files the most robust place for embedding covert bits is the
twelfth bit of the detail sub-band; the only difference is that we achieved
better results without framing for music files and with framing for speech files.
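
The framing and three-level decomposition described above can be sketched as follows. This is a minimal illustration rather than the thesis implementation: the abstract does not name the wavelet family, so the Haar wavelet is assumed here, and the function names `split_frames`, `haar_step`, and `haar_decompose`, as well as any frame length passed to them, are hypothetical.

```python
import math


def split_frames(signal, frame_len):
    """Cut the signal into consecutive frames; a trailing partial frame is dropped (assumption)."""
    last_start = len(signal) - frame_len
    return [signal[i:i + frame_len] for i in range(0, last_start + 1, frame_len)]


def haar_step(signal):
    """One level of the orthonormal Haar transform: returns (approximation, detail)."""
    scale = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal) - 1, 2)]
    return approx, detail


def haar_decompose(signal, levels=3):
    """Multi-level decomposition: returns (cA_levels, [cD_levels, ..., cD_1])."""
    if len(signal) % (1 << levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    # Coarsest detail band first, finest last.
    return approx, details[::-1]
```

In the framed variant, `haar_decompose` would be applied to each element of `split_frames(signal, frame_len)`; in the unframed variant it would be applied once to the whole signal.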
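
Embedding one covert bit in a chosen bit position of the largest coefficient of each frame can be sketched as below. Several details are assumptions, since the abstract does not specify them: coefficients are taken to be already quantized to integers, bit positions are counted from the least significant bit starting at 1, the text is encoded as 8-bit ASCII, and the index of the modified coefficient is passed to the extractor as side information. All function names are hypothetical.

```python
def text_to_bits(text):
    """Encode text as a flat list of bits, 8 bits per ASCII character, MSB first."""
    return [(byte >> shift) & 1 for byte in text.encode("ascii") for shift in range(7, -1, -1)]


def set_bit(value, position, bit):
    """Set bit `position` (1 = least significant) of the magnitude of `value`, keeping its sign."""
    sign = -1 if value < 0 else 1
    mask = 1 << (position - 1)
    magnitude = abs(value)
    magnitude = (magnitude | mask) if bit else (magnitude & ~mask)
    return sign * magnitude


def get_bit(value, position):
    """Read bit `position` (1 = least significant) of the magnitude of `value`."""
    return (abs(value) >> (position - 1)) & 1


def embed_in_frames(frames, bits, position):
    """Embed one bit per frame into the largest-magnitude integer coefficient of that frame."""
    stego, indices = [], []
    for frame, bit in zip(frames, bits):
        idx = max(range(len(frame)), key=lambda i: abs(frame[i]))
        new_frame = list(frame)
        new_frame[idx] = set_bit(frame[idx], position, bit)
        stego.append(new_frame)
        indices.append(idx)
    return stego, indices


def extract_from_frames(stego, indices, position):
    """Recover the embedded bits from the recorded coefficient indices."""
    return [get_bit(frame[idx], position) for frame, idx in zip(stego, indices)]
```

Returning the indices sidesteps one practical issue: clearing a high bit can shrink the chosen coefficient enough that it is no longer the largest in its frame, so an extractor that simply searched for the maximum could read the wrong coefficient.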
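
The evaluation measures can be written out as a short sketch. PSNR is computed with the standard definition, 10 log10(peak^2 / MSE), and BER as the fraction of mismatched bits. The peak value of 32767 (16-bit audio), the expression of BER as a fraction rather than a percentage, and the handling of a zero BER in the combined measure are assumptions, as the abstract does not state them; the function names are hypothetical.

```python
import math


def psnr(original, stego, peak=32767.0):
    """Peak signal-to-noise ratio in dB between two equal-length sample sequences."""
    if len(original) != len(stego) or not original:
        raise ValueError("signals must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(original, stego)) / len(original)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)


def bit_error_rate(sent_bits, received_bits):
    """Fraction of bit positions where the extracted message differs from the original."""
    if len(sent_bits) != len(received_bits) or not sent_bits:
        raise ValueError("bit sequences must be non-empty and of equal length")
    errors = sum(1 for s, r in zip(sent_bits, received_bits) if s != r)
    return errors / len(sent_bits)


def psnr_over_ber(psnr_db, ber):
    """Combined measure PSNR / BER; returns infinity when BER is zero (assumed convention)."""
    return math.inf if ber == 0 else psnr_db / ber
```

Because the combined measure divides by BER, an error-free extraction has no finite score; the sketch maps that case to infinity, but any reporting convention for it would need to be stated alongside the results.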