Performance Comparison between RSA and El-Gamal Algorithms for Speech Data Encryption and Decryption

ABSTRACT

war. Ensuring speech security is quite difficult because a speech file includes a lot of redundant data in comparison to videos, digital images and text messages [1]. With the evolution of powerful computers, traditional techniques have become vulnerable to intruder attacks. The fast growth of data transmission over the Internet and shared networks requires strong and reliable security methods to provide privacy and to prevent illegal access to the transmitted message content. Among many solutions, encryption is employed to safeguard secret data during transfer over an insecure channel [2]. Encryption algorithms convert multimedia data from its readable form (plaintext) to an unreadable form (cipher text) to increase security and secrecy. The decryption operation is applied to restore the original message at the receiver [3]. Generally, encryption can be categorized into two major sorts: symmetric key encryption and asymmetric key encryption. In symmetric key encryption, one unique key is shared privately between the sender and recipient for the encryption and decryption processes. The Data Encryption Standard (DES), Advanced Encryption Standard (AES), Rivest Cipher 4 (RC4) and Blowfish algorithms are examples of symmetric key encryption. In asymmetric key encryption, or public key encryption, two different keys are employed: public and private. The public key is known to everyone and is utilized for encryption, while the private key is kept secret and is utilized for decryption. The RSA, El-Gamal and Diffie-Hellman algorithms are examples of asymmetric key schemes. In public key cryptosystems, anyone can encrypt a message, but only the person who possesses the corresponding private key can decrypt it [4]. Asymmetric key encryption solves the problem of key distribution because there is no need to share a secret key between the communicating participants.
Furthermore, the asymmetric key encryption depends on trap door or one-way functions. These mathematical functions can be simply calculated in one direction, while in the inverse direction, they are very difficult to solve unless the secret key is found. Hence, the asymmetric encryption scheme can provide more security than the symmetric scheme because it employs two keys [5].
Because of the complexity of encryption algorithms, they need to be implemented on flexible platforms to meet real-time speech encryption demands [1,6]. Moreover, there is a risk of speech data leakage during transmission. For this reason, speech file encryption is of great importance, and many researchers have focused on speech cryptographic mechanisms and proposed speech cryptosystems based on different techniques for secure speech communications. For instance, the study in [6] combines chaotic maps and k-means clustering for ciphering speech files. Two permutation steps are utilized in this system. The first step depends on a binary representation shuffling mechanism, whereas the second step relies upon the k-means principle. The introduced cryptosystem is assessed via various speech quality metrics. The input speech data is compressed in [7] to ensure the signal quality. Then, the compressed information is encrypted by utilizing a chaotic map and the fuzzy means method to obtain the final cipher signal. Several chaotic systems are adopted in [8] to achieve the encrypting process at the sender side. Hashing and the Blowfish algorithm are also employed to increase the speech system security. The authors in [9] designed a chaotic speech encryption approach based on three stages. The speech samples are scrambled in the first phase, followed by implementing a Deoxyribonucleic Acid (DNA) code in the second phase in order to flip the sample bits. A substitution process is executed in the final phase to accomplish the encryption procedure. The method is validated by utilizing a variety of measurements. The researchers in [10] developed a speech cryptographic model that relies on voice over Internet protocol and the chaos concept to protect the data during transfer. The chaotic map generates the key stream to encipher each speech sample in the packet. The suggested model is evaluated via various encryption/decryption performance criteria.
The Chen chaotic system and the fast Walsh-Hadamard transform are integrated in [11] to encode the speech signal. The audio content in this method is converted into rectangular functions. After that, a shuffling/diffusion architecture is carried out to realize the encrypting operation. A modified chaotic map is designed in [12] by merging two classical chaotic maps. The random sequence produced by the new map is used to apply the confusion-diffusion encryption structure to the input speech file. The plain speech sample is encrypted/decrypted in [13] via an elliptic chaotic map. The map is first created based on its initial and control parameters. Then, the sequences obtained from the chaotic map are employed for encrypting the speech message. A cryptographic scheme is studied in [14] to encipher digital speech information. The data are scrambled by rearranging the files as a cubic sample of six sides. The next step applies two different maps, the Gingerbread and Hénon chaotic maps, to attain the encryption stage. Many standard statistical tests are performed to measure the method quality. The audio and speech signals are fused in [1] to encrypt the audio frames during communication. After this, the data is encrypted via chaotic mapping by utilizing two layers, substitution and permutation, to acquire the final ciphered signal. Various assessment tests are performed to quantify the cryptosystem performance.
Based on the above literature review, researchers have proposed many solutions to protect important and confidential speech files during transmission. However, adopting chaotic maps, alone or merged with other techniques, is not always the best solution for speech encryption. Speech cryptosystems based on low-dimensional chaotic maps suffer from weak security and a small key space. On the other hand, cryptosystems based upon high-dimensional chaotic maps have flaws such as increased implementation cost and computational complexity, and decreased ciphering speed. Moreover, many chaotic encryption approaches can be broken by cryptanalysis [15,16]. Therefore, a simple method should be developed to encrypt speech signals in order to meet real-time application requirements. To overcome the above shortcomings, an effective method for speech data encryption/decryption using asymmetric key algorithms is introduced in this article. This scheme is partitioned into two parts. The first part deals with encrypting and decrypting the input speech signal via the RSA technique, whereas the second part deals with encrypting and decrypting the input speech signal via the El-Gamal technique. After executing the RSA and El-Gamal mechanisms on the same speech signal samples, a comparative analysis between the two algorithms is presented based on several statistical and experimental encryption/decryption analyses: common quantitative measures, histogram, spectrogram, correlation coefficient, differential, speed performance and noise influence tests. The numerical and visual outcomes confirm that the methods can be used to transmit data securely with a high degree of secrecy. Besides, the RSA mechanism gives better outcomes than the El-Gamal mechanism in most situations.
The main contributions of this work are: (1) Analyzing the two models in order to measure their ability to protect speech data. (2) Evaluating the performance of the two encryption techniques based on standard speech criteria. (3) Finding the suitable technique to encrypt/decrypt speech data through simulation.
This article is arranged as follows. Sections 2 and 3 explain the RSA and El-Gamal public key encryption/decryption mechanisms, respectively, while Section 4 introduces the presented work. The performance measurements are presented in Section 5, followed by the simulation outcomes for both the RSA and El-Gamal algorithms in terms of the encryption and decryption stages in Section 6. Finally, the conclusions based on the given results are discussed in Section 7.

RSA algorithm
In 1977, Ronald Rivest, Adi Shamir and Leonard Adleman invented the most widely used asymmetric key cryptosystem, known as the RSA algorithm. It is an encryption and authentication cryptosystem that has been employed since that time in many cryptographic applications such as e-mail security, banking, e-commerce and digital signatures on the Internet. The security of this algorithm depends on the difficulty of finding the prime factors of large integers. The RSA operation consists of three main stages: key generation, encryption and decryption, which are illustrated briefly as follows [17].
4. Calculate $d$, which is the private exponent, such that $(d \times e) \bmod \phi(n) = 1$, where $\bmod$ symbolizes the modulus operation or the remainder after division. Hence, $(n, e)$ represents the public encryption key, while $(n, d)$ represents the private decryption key [18].

Encryption/Decryption Processes
Let $M$ be a message to be encrypted. The encrypted message $C$ is then calculated via the public key $(n, e)$ using the equation: $C = M^{e} \bmod n$.
To extract the original message $M$, the received encrypted message $C$ is decrypted via the private key $(n, d)$ using the equation: $M = C^{d} \bmod n$ [19].
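The key generation, encryption and decryption stages can be illustrated with a minimal Python sketch. It is not the authors' Matlab implementation; the function names are hypothetical, the symbols `p`, `q`, `n`, `e`, `d` follow conventional RSA notation (assumed here), and the toy values 3, 7 and 5 are the ones reported in Section 6.

```python
from math import gcd


def rsa_keygen(p, q, e):
    """Return ((n, e), (n, d)) for primes p, q and public exponent e."""
    n = p * q
    phi = (p - 1) * (q - 1)          # Euler's totient of n
    if gcd(e, phi) != 1:
        raise ValueError("e must be coprime with phi(n)")
    d = pow(e, -1, phi)              # modular inverse, Python 3.8+
    return (n, e), (n, d)


def rsa_encrypt(m, public_key):
    """Encrypt an integer message 0 <= m < n."""
    n, e = public_key
    return pow(m, e, n)


def rsa_decrypt(c, private_key):
    """Recover the integer message from the cipher value c."""
    n, d = private_key
    return pow(c, d, n)


# Toy parameters taken from the simulation section (assumed order: p, q, e).
public_key, private_key = rsa_keygen(3, 7, 5)
message = 4
cipher = rsa_encrypt(message, public_key)
recovered = rsa_decrypt(cipher, private_key)
```

With these toy values the private exponent happens to equal the public exponent, because 5 × 5 = 25 leaves remainder 1 when divided by 12. Such small parameters only serve to illustrate the arithmetic.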

El-Gamal algorithm
The El-Gamal algorithm provides an alternative to RSA for asymmetric key encryption. Taher El-Gamal first described this algorithm in 1984. It has been used in the free GNU Privacy Guard software, recent versions of PGP and other cryptosystems. The security of this algorithm depends on the difficulty of calculating discrete logarithms modulo a large prime number. If the same plaintext is encrypted several times using this cryptosystem, a different cipher text is obtained each time. The El-Gamal operation can be described in three main steps: key generation, encryption and decryption, which are explained briefly as follows [3].

Key Generation
1. First, select a random prime number $p$ and two other random numbers $g$ and $x$, such that both of them are less than $p$.
2. Calculate $y$ using the formula: $y = g^{x} \bmod p$.
Thus, $(p, g, y)$ represents the public key, which can be shared between a group of users, while $x$ represents the private key, which should be kept secret [20].

Encryption/Decryption Processes
To encrypt a message $M$, firstly, a random integer number $k$ is selected, such that $k$ is relatively prime with $(p - 1)$. Secondly, the cipher text pair $(C_1, C_2)$ is calculated using the equations: $C_1 = g^{k} \bmod p$ and $C_2 = (y^{k} \times M) \bmod p$. Finally, the cipher text $(C_1, C_2)$ is transmitted to the recipient. To decrypt the cipher text pair $(C_1, C_2)$, the private key $x$ is employed to recover the original message using the equation: $M = \left(C_2 \times (C_1^{x})^{-1}\right) \bmod p$.
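The El-Gamal steps can be sketched in Python in the same spirit. This is an illustrative sketch rather than the authors' code: the function names are hypothetical, the symbols `p`, `g`, `x`, `y`, `k` follow the conventional textbook notation (assumed here), and the values 13, 7, 3 and 9 are the ones reported in Section 6. The sketch uses the reported random integer as given and does not check it against `p - 1`.

```python
def elgamal_keygen(p, g, x):
    """Return the public key (p, g, y) and the private key x."""
    y = pow(g, x, p)
    return (p, g, y), x


def elgamal_encrypt(m, public_key, k):
    """Encrypt an integer message 0 < m < p with the random integer k."""
    p, g, y = public_key
    c1 = pow(g, k, p)
    c2 = (pow(y, k, p) * m) % p
    return c1, c2


def elgamal_decrypt(c1, c2, x, p):
    """Recover the message from the cipher pair using the private key x."""
    shared = pow(c1, x, p)                 # equals y**k mod p
    return (c2 * pow(shared, -1, p)) % p   # multiply by the modular inverse


# Toy parameters taken from the simulation section (assumed order: p, g, x, k).
public_key, private_key = elgamal_keygen(13, 7, 3)
message = 6
c1, c2 = elgamal_encrypt(message, public_key, k=9)
recovered = elgamal_decrypt(c1, c2, private_key, p=13)
```

Choosing a different `k` for the same message changes the first component of the cipher pair, which is why repeated encryptions of one plaintext give different cipher texts.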

Presented work
This speech cryptosystem is divided into two parts: (1) speech ciphering/deciphering using the RSA algorithm and (2) speech ciphering/deciphering using the El-Gamal algorithm. These two parts are discussed in detail in this section.

Speech Ciphering/Deciphering Operation Using RSA System
Step 1: Generate the public key $(n, e)$ by the transmitter according to Section 2.1.
Step 3: The speech samples obtained from Step 2 are altered by formulas that use the round function, which rounds to the nearest integer number.
Step 4: Encrypt the altered two-dimensional samples according to Section 2.2 by applying the public key.
Step 5: Convert the encrypted two-dimensional samples into a one-dimensional signal, which represents the final cipher speech signal that will be transferred to the receiver.
Step 6: Generate the private key $(n, d)$ by the receiver according to Section 2.1.
Step 8: Decrypt the two-dimensional cipher samples according to Section 2.2 by employing the secret key.
Step 9: The decrypted speech samples gained from Step 8 are restored by applying a restoration equation.
Step 10: The restored two-dimensional samples are transformed into a one-dimensional signal, which represents the reconstructed original speech signal.
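The sample-wise flow of these steps can be sketched as follows. The exact formulas that alter and restore the samples are not reproduced here, so this sketch assumes a simple linear mapping of samples in [-1, 1] onto the integers 0 to n - 1 with rounding; that mapping, the function names and the omission of the two-dimensional reshaping are all assumptions made for illustration only.

```python
def quantize(samples, levels):
    """Map floats in [-1, 1] to integers in [0, levels - 1] (assumed mapping)."""
    return [round((s + 1.0) / 2.0 * (levels - 1)) for s in samples]


def dequantize(codes, levels):
    """Inverse of the assumed mapping, back to floats in [-1, 1]."""
    return [c / (levels - 1) * 2.0 - 1.0 for c in codes]


def rsa_encrypt_signal(samples, n, e):
    """Quantize each sample, then apply c = m**e mod n element-wise."""
    return [pow(m, e, n) for m in quantize(samples, n)]


def rsa_decrypt_signal(cipher, n, d):
    """Apply m = c**d mod n element-wise, then undo the quantization."""
    return dequantize([pow(c, d, n) for c in cipher], n)


# Five illustrative sample values and the toy keys reported in Section 6.
samples = [-1.0, -0.5, 0.0, 0.5, 1.0]
cipher = rsa_encrypt_signal(samples, n=21, e=5)
restored = rsa_decrypt_signal(cipher, n=21, d=5)
```

Under this assumed mapping, reconstruction is exact only for samples that fall on one of the n quantization levels; other samples are recovered to the nearest level.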

Speech Ciphering/Deciphering Operation Using El-Gamal System
Step 1: Generate the public key by the sender according to Section 3.1.
Step 4: Encrypt the two-dimensional speech samples according to Section 3.2 by applying the public key to acquire the cipher pair $(C_1, C_2)$.
Step 5: Transform the two-dimensional $C_2$ into a one-dimensional signal $C_3$, where $(C_1, C_3)$ represents the cipher text pair which will be sent to the recipient.
Step 6: Produce the secret key by the receiver according to Section 3.1.

Performance metrics
To assess the cryptosystem performance, a number of common quantitative measures are employed for both encrypted and decrypted speech signals using the RSA and El-Gamal cryptosystems. These measures are Signal to Noise Ratio (SNR), Segmental Signal to Noise Ratio (SNRseg), Segmental Spectral Signal to Noise Ratio (SSSNR), Frequency Weighted Segmental Signal to Noise Ratio (fwSNRseg), Log Likelihood Ratio (LLR) and Bit Error Rate (BER). These metrics are explained as follows.

Signal to Noise Ratio (SNR)
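This metric is defined as: $$\mathrm{SNR} = 10 \log_{10} \frac{\sum_{i=1}^{N} x_i^{2}}{\sum_{i=1}^{N} (x_i - y_i)^{2}}$$ where $N$ represents the number of speech samples, while $x_i$ and $y_i$ represent the original and encrypted speech signals, respectively [2].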

Segmental Signal to Noise Ratio (SNRseg)
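SNRseg is computed as: $$\mathrm{SNRseg} = \frac{1}{M} \sum_{m=1}^{M} \mathrm{SNR}_m$$ where $\mathrm{SNR}_m$ is the SNR of the $m$-th frame and $M$ represents the number of frames in the speech signal [3].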
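As a rough illustration of these two measures, the following Python sketch computes the SNR over a whole signal and the SNRseg as the mean of per-frame SNR values. It assumes the common textbook definitions and non-overlapping frames; the function names and the `frame_len` parameter are hypothetical, and no clamping of extreme frame values is applied.

```python
import math


def snr_db(x, y):
    """SNR in dB between a reference signal x and a processed signal y.

    Assumes x has non-zero energy; returns infinity when x and y are identical.
    """
    signal = sum(a * a for a in x)
    noise = sum((a - b) ** 2 for a, b in zip(x, y))
    if noise == 0:
        return math.inf
    return 10.0 * math.log10(signal / noise)


def snr_seg_db(x, y, frame_len):
    """Mean of the per-frame SNR values over non-overlapping frames."""
    values = []
    for start in range(0, len(x) - frame_len + 1, frame_len):
        stop = start + frame_len
        values.append(snr_db(x[start:stop], y[start:stop]))
    return sum(values) / len(values)
```

For an encrypted signal a low value is the desired outcome, whereas for a decrypted signal a high value indicates that the reconstruction is close to the original.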

Segmental Spectral Signal to Noise Ratio (SSSNR)
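Segmental Spectral Signal to Noise Ratio or SSSNR is described as: $$\mathrm{SSSNR} = \frac{1}{M} \sum_{m=1}^{M} 10 \log_{10} \frac{\sum_{k} |X_m(k)|^{2}}{\sum_{k} \left(|X_m(k)| - |Y_m(k)|\right)^{2}}$$ where $M$ is the number of frames, and $X_m(k)$ and $Y_m(k)$ represent the DFT of the $m$-th frame of the original and encrypted speech signals, respectively [7].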

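Frequency Weighted Segmental Signal to Noise Ratio (fwSNRseg)
fwSNRseg is expressed as: $$\mathrm{fwSNRseg} = \frac{10}{M} \sum_{m=1}^{M} \frac{\sum_{j=1}^{K} W(j, m) \log_{10} \dfrac{X(j, m)^{2}}{\left(X(j, m) - \hat{X}(j, m)\right)^{2}}}{\sum_{j=1}^{K} W(j, m)}$$ where $M$ is the number of frames, $K$ is the number of frequency bands, $W(j, m)$ refers to the weight of the $j$-th frequency band in the $m$-th frame, and $X(j, m)$ and $\hat{X}(j, m)$ are the spectra of the input and output speech signals, respectively [4].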

Log Likelihood Ratio (LLR)
LLR can be calculated as: $$\mathrm{LLR} = \left| \log \frac{a_x R_y a_x^{T}}{a_y R_y a_y^{T}} \right|$$ where $a_x$ and $a_y$ indicate the LPC vectors of the plain and the ciphered or deciphered signals, respectively, whilst $R_y$ represents the autocorrelation matrix of the encrypted or decrypted speech signal [9].

Bit Error Rate (BER)
This measurement is represented as: $$\mathrm{BER} = \frac{1}{2} \operatorname{erfc}\left(\sqrt{\frac{E_b}{N_0}}\right)$$ where $E_b$ symbolizes the average energy per bit and $N_0$ symbolizes the spectral density of the noise [4].

Simulation outcomes
In order to quantify and assess the performance of the two systems, several tests are carried out. These tests are designed and implemented in Matlab (R2013a) under Windows 7 on a laptop equipped with an Intel Core i3 processor running at 2.40 GHz and 3.90 GB of RAM. The speech signals utilized in all tests are spoken sentences recorded from different male and female speakers in English, with a sampling rate of 16 kHz for each signal and different durations (from 1 to 5 seconds) after eliminating all silent intervals. The database utilized in this simulation is the TIMIT database. This database is designed to supply speech data for acoustic research, and for the development and assessment of automatic speech recognition systems. It includes wideband recordings of 630 speakers of eight major dialects of American English. TIMIT comprises four groups of samples: phonemes, transcripts, audio and word list [21]. The cryptosystem is first implemented using the RSA algorithm. The variable values $p$ and $q$ in this work are set as 3 and 7, respectively, while the value of $e$ is chosen to be 5. Thus, $n = 21$ and $\phi(n) = 12$, and because $5 \times 5 = 25 \equiv 1 \pmod{12}$, the private exponent $d$ is also 5. The public key $(n, e)$ therefore equals (21, 5) and the private key $(n, d)$ equals (21, 5). The work is then implemented using the El-Gamal algorithm. The variable values $p$, $g$, $x$ and $k$ are set as 13, 7, 3 and 9, respectively. Hence, the public key $(p, g, y)$ equals (13, 7, 5) and the private key $x$ equals 3. Many values have been tested in Matlab to generate the keys for both systems: $(p, q, e)$ of RSA and $(p, g, x, k)$ of El-Gamal. These parameters are specified in this simulation because they produce the best encryption result. The results of implementing the two techniques on the input speech signal are shown in Figure 1. Figure 1 (a) shows the original signal; Figure 1 (b, c) illustrates the encrypted and decrypted speech signals obtained with the RSA, whereas Figure 1 (d, e) illustrates the encrypted and decrypted speech signals obtained with the El-Gamal. This figure demonstrates that the ciphered speech signals of the two techniques are quite different from the input speech. In addition, the deciphered signals resulting from the two algorithms are identical to the input signal. These visual outcomes indicate the high ciphering and deciphering quality of both approaches.
Figure 1. (a) Original speech signal; (b, c) ciphered and deciphered speech signals, respectively, using the RSA system; (d, e) ciphered and deciphered speech signals, respectively, using the El-Gamal system.

Quality of speech encryption
To assess the quality of speech signal encryption, the six aforementioned quality metrics are used: SNR, SNRseg, SSSNR, fwSNRseg, LLR and BER. The quality of encryption is high when the values of LLR and BER increase and the values of SNR, SNRseg, SSSNR and fwSNRseg decrease [1,7]. The numerical outcomes of the presented system for both the RSA and El-Gamal schemes are given in Table 1. The table shows that the LLR and BER scores are high, whereas the SNR, SNRseg, SSSNR and fwSNRseg scores are low for both methods, which means that the encryption quality is high for both systems. However, the RSA technique gives lower SNR, SNRseg, SSSNR and fwSNRseg values and higher LLR and BER values than the El-Gamal technique. This implies that the RSA ciphering performance is better than that of the El-Gamal method for the same input test speech signals.

Quality of speech decryption
The same six quality metrics are used to measure the quality of speech signal decryption: SNR, SNRseg, SSSNR, fwSNRseg, LLR and BER. The quality of decryption is high when the values of LLR and BER decrease and the values of SNR, SNRseg, SSSNR and fwSNRseg increase [3,4]. The results of the proposed method for both the RSA and El-Gamal mechanisms are given in Table 2. The table shows that the LLR and BER outcomes are low, whereas the SNR, SNRseg, SSSNR and fwSNRseg outcomes are high for both systems, which indicates that the decryption quality is high for both cryptosystems. However, the El-Gamal approach gives higher SNR, SNRseg, SSSNR and fwSNRseg values and lower LLR and BER values than the RSA technique. This demonstrates that the El-Gamal deciphering performance is better than that of the RSA method for the same input test speech signals.

Histogram analysis
A histogram is an approximate depiction of the distribution of numerical or categorical data. To resist statistical attacks, the speech sample values should have a roughly flat distribution after the ciphering operation [11]. Figure 2 (a) depicts the input signal histogram; Figure 2 (b, c) depicts the encrypted and decrypted signal histograms after employing the RSA technique, and Figure 2 (d, e) depicts the encrypted and decrypted signal histograms after applying the El-Gamal technique. Figure 2 shows that the histograms resulting from the two algorithms differ from the input file histogram and that the output speech samples have an approximately flat distribution, which confirms the good encryption performance of the two schemes. Moreover, the decrypted file histograms are similar to that of the corresponding original file, which confirms the good decryption performance of the two schemes. However, by comparing Figures 2 (b) and 2 (d), it can be noticed that the histogram of the RSA-encrypted signal is flatter than the histogram of the El-Gamal-encrypted signal. This indicates that the ciphering process is more efficient with the RSA than with the El-Gamal, whilst the deciphering process is equally reliable and efficient with the two methods for the same test speech file.

Spectrogram analysis
A spectrogram is a visual representation of the frequency spectrum of an audio signal as it varies with time [12]. Figure 3 (a) exhibits the input file spectrogram; Figure 3 (b) presents the changes in the input file spectrogram after the RSA encryption and Figure 3 (c) presents the recovered file spectrogram after the RSA decryption, whereas Figure 3 (d) shows the changes in the plain file spectrogram after the El-Gamal encryption and Figure 3 (e) shows the restored file spectrogram after the El-Gamal decryption. It is clear that the spectrograms of the cipher signals for the RSA and El-Gamal systems are entirely different from the original spectrogram. Further, the plain and decrypted signal spectrograms are identical. This indicates the high ciphering and deciphering quality of the two cryptosystems for the same input audio signal.

Correlation coefficient analysis
The Correlation Coefficient (CC) is a major index for assessing the ciphering/deciphering quality of a speech cryptosystem. If the CC value is zero or near zero, the relation between the speech samples in the plain signal and its corresponding cipher signal is weak, which demonstrates a high ciphering effect. Conversely, if the CC value is one or close to one, the relationship between the speech samples in the input and output signals is strong, which indicates a high deciphering effect. This indicator is computed as [13,14,19]: $$CC = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{N} (x_i - \bar{x})^{2}} \sqrt{\sum_{i=1}^{N} (y_i - \bar{y})^{2}}}$$ where $N$ symbolizes the total number of speech samples employed in the calculations, $x_i$ and $y_i$ denote the values of the speech signals for the input and output files, respectively, and $\bar{x}$ and $\bar{y}$ are their mean values. Table 3 exhibits the ciphering and deciphering outcomes of both methods for the test speech files. This table reveals that both mechanisms achieve very small CC scores (almost zero) in the encryption process and CC values of one in the decryption process for all input signals. Hence, the RSA and El-Gamal systems are considered efficient and complicated approaches for ciphering audio samples, and they can produce a deciphered signal that totally corresponds to the original one. Additionally, Figure 4 shows the scatter plots of the input, encrypted and reconstructed audio files for the RSA and El-Gamal schemes. Obviously, the speech samples in Figure 4 [...] It is evident from Table 3 and Figure 4 that both described systems can reduce the CC values between samples in the original and cipher signals and increase the CC values to one in the restored signal; thereby the RSA and El-Gamal cryptosystems can counter this analysis successfully.
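The correlation coefficient between two equal-length signals can be sketched in Python as below. The sketch assumes the usual Pearson form; the function name is hypothetical.

```python
import math


def correlation_coefficient(x, y):
    """Pearson correlation coefficient between two equal-length sequences.

    Assumes neither sequence is constant (non-zero variance).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)
```

Applied to a plain signal and its cipher signal, a value near zero is the desired outcome; applied to a plain signal and its decrypted version, a value of one indicates perfect reconstruction.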

Differential analysis
The Number of Samples Change Rate (NSCR) and Unified Average Changing Intensity (UACI) parameters are generally utilized to analyze the resistance of the encryption process against differential attacks. In this test, two signals that differ by only one sample are enciphered with the same keys. Next, the resultant cipher speech files are compared by applying the NSCR and UACI. These two parameters are given in [11,19], where $C_1$ and $C_2$ denote the two encrypted files whose input signals differ by one sample, and $N$ indicates the overall number of audio samples. The optimal values for NSCR and UACI are 99% and 33%, respectively. A secure ciphering algorithm should possess NSCR and UACI values that are close to these ideal values. The NSCR and UACI outcomes are computed for the test files by executing the RSA and El-Gamal methods, and the obtained results are given in Table 4. Both NSCR and UACI values in this table are near the optimal values, which clearly reveals that the plain and ciphered signals produced by both systems are totally different. Also, it can be observed that the obtained NSCR and UACI scores for the RSA are better than those for the El-Gamal. This indicates the higher degree of security of the RSA in this analysis in comparison with its counterpart, the El-Gamal, for encrypting the same plain signals.
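The two differential measures can be sketched in Python as follows. The definitions used are the widely cited ones (percentage of positions that differ, and mean absolute difference relative to the largest possible sample value). The function names and the `max_value` parameter are hypothetical, and the normalizing constant actually used in the experiments is not stated in the text.

```python
def nscr(c1, c2):
    """Percentage of positions at which the two cipher signals differ."""
    changed = sum(1 for a, b in zip(c1, c2) if a != b)
    return 100.0 * changed / len(c1)


def uaci(c1, c2, max_value):
    """Mean absolute difference as a percentage of max_value.

    max_value is the largest value a cipher sample can take (an assumption
    that depends on how the samples are represented).
    """
    total = sum(abs(a - b) for a, b in zip(c1, c2))
    return 100.0 * total / (max_value * len(c1))
```

For example, with cipher samples restricted to the range 0 to 20, `max_value` would be 20.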

Speed performance analysis
Speed is an important parameter that must be considered when analyzing cryptosystem efficiency [4]. The hardware configuration utilized for the simulation has been mentioned in Section 6. In this analysis, the time (in seconds) required to encrypt and decrypt the input speech files is measured for both techniques. Table 5 contains the total computational times of the ciphering and deciphering procedures for the RSA and El-Gamal methods. According to this table, the execution times of the encryption/decryption operations for the two cryptosystems are quite short and satisfactory.
Further, the encryption/decryption time increases as the input signal length increases for both systems. Besides, deciphering takes more time than enciphering for the two schemes. Also, the encryption/decryption times for the El-Gamal cryptosystem are shorter than the corresponding times for the RSA cryptosystem. Table 5 thus shows that the El-Gamal system is faster than the RSA system at encryption/decryption for different test plain signals.

Noise influence analysis
In this analysis, the deciphered speech signal is assessed at the receiver side in the presence of noise at various SNR levels [15,19]. Additive White Gaussian Noise (AWGN) is added to the ciphered signal at different SNR values (5 dB to 50 dB). The influence of noise on the performance criteria SNR, SNRseg, SSSNR, fwSNRseg, LLR, BER and CC is computed for the decrypted signal, and the obtained outcomes for the RSA and El-Gamal techniques are given in Tables 6 and 7, respectively. Larger SNR, SNRseg, SSSNR, fwSNRseg and CC values, and lower LLR and BER values, between the input and reconstructed signals indicate a good deciphering quality. It can be seen from the tables that the SNR, SNRseg, SSSNR, fwSNRseg and CC scores increase, whilst the LLR and BER scores decrease, as the input SNR of the noise increases gradually for both algorithms. This reflects the robustness of the cryptosystems to noise distortion. Furthermore, Tables 6 and 7 show that the obtained SNR, SNRseg, SSSNR, fwSNRseg and CC results are always greater, whereas the LLR and BER results are always lower, for the RSA than for the El-Gamal at all input SNR values of the noise. According to the outcomes in Tables 6 and 7, the RSA method outperforms the El-Gamal method in noise immunity in terms of the performance metrics.
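Adding white Gaussian noise at a prescribed SNR can be sketched in Python as below. This is a generic illustration, not the authors' Matlab procedure; the function name and the optional `rng` parameter are hypothetical. The noise variance is derived from the measured average power of the signal and the target SNR in dB.

```python
import math
import random


def add_awgn(signal, snr_db, rng=None):
    """Return a copy of signal with white Gaussian noise at the given SNR (dB)."""
    rng = rng or random.Random()
    signal_power = sum(s * s for s in signal) / len(signal)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)       # standard deviation of the noise
    return [s + rng.gauss(0.0, sigma) for s in signal]


# Sweep the SNR range mentioned in the text on an illustrative test tone.
tone = [math.sin(0.01 * i) for i in range(2000)]
noisy_versions = {snr: add_awgn(tone, snr, random.Random(1)) for snr in range(5, 55, 5)}
```

Each noisy version would then be decrypted and compared with the original signal using the metrics of Section 5.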

Conclusions
This work performs a comparative study between the RSA and El-Gamal techniques in order to determine which of the methods is more effective for encrypting speech files. The two schemes are tested and compared via various empirical analyses: SNR, SNRseg, SSSNR, fwSNRseg, LLR and BER for the encryption and decryption processes; histogram, spectrogram, correlation coefficient and differential analyses; the time required for the enciphering/deciphering operations; and finally, the effect of noise. It can be concluded from the empirical and visual outcomes that the two speech cryptosystems are robust and can provide a reliable method to encipher and decipher speech data with a high level of confidentiality, security and privacy. Additionally, the performance of the RSA cryptosystem is superior to that of the El-Gamal cryptosystem in most analyses in terms of the enciphering and deciphering speech quality indicators.