TEXT STEGANOGRAPHY BASED ON FONT TYPE IN MS-WORD DOCUMENTS

With the rapid development of the Internet, safe covert communication in networked environments has become an essential research direction. Steganography embeds secret information imperceptibly into cover data for transmission, so that others cannot easily become aware of it. Text has low redundancy and is bound by natural-language rules, which limits how it can be manipulated; concealing a message in text properly, and detecting such concealment, are therefore both great challenges. This study proposes a novel text steganography method based on font types. The method depends on the similarity of English font types, and we call it the SEFT (Similar English Font Types) technique. It works by replacing a font with closely similar fonts. The secret message is encoded and embedded as similar fonts applied to the capital letters of the cover document. The proposed method works with cover documents of different font types. The stego document is on average about 0.766% larger than the original cover document. The capacity of the method is very high, and the secret message is inconspicuous to an adversary.


INTRODUCTION
Steganography is the ancient art and young science of hidden communication. A broad definition of the subject includes all endeavours to communicate in such a way that the existence of the message cannot be detected. Unlike cryptography, which merely ensures the confidentiality of the message content, steganography adds another layer of secrecy by keeping confidential even the fact that secret communication takes place. The corresponding protection goal is called undetectability (Böhme, 2010).
Earlier information hiding methods merely embed payload (external information) into a cover (e.g., text document, image and audio) and in recent years, specialized data hiding methods are proposed to serve specific purposes. For instance, in steganography, the cover content is carefully manipulated to encode payload while aiming to conceal the very existence of the encoded information (Por et al., 2012).
Steganography literally means "covered writing" and is the art of hiding the very existence of a message. A message is the information to be hidden, anything that can be embedded into a bit stream. Together the cover carrier and the embedded message create a steganocarrier. Hiding information may require a stegano key which is additional secret information, such as a password, required for embedding the information (Khandekar and Dixit, 2012).
Text is used very widely, and large volumes of text are transported over the network every day. However, research on text steganography lags behind the mainstream hiding methods that use images, audio and video as cover data, owing to the lack of redundancy in text (Liu et al., 2009). In this study, a text steganography method for MS-Word documents is proposed, which depends on the similarity of English font types. The rest of the study is organized as follows. Section 2 reviews related works, Section 3 presents the materials and methods and Section 4 reports the experimental results. Finally, conclusions are provided in Section 5.

RELATED WORKS
Elkamchouchi and Negm (2003) proposed an algorithm that applies the principle of watermarking to hide English information in Arabic text. Hassan and Shirali-Shahreza (2008) proposed a method for hiding information in Persian and Arabic Unicode texts. Zhong et al. (2007) proposed a steganography technique for hiding data in a kind of PDF text. Por and Delina (2008) proposed a hybrid approach to information hiding that uses inter-word spacing together with inter-paragraph spacing. Shirali-Shahreza (2008) proposed text steganography by changing the spelling of words: the US and UK spellings of words are substituted for one another in order to hide data in an English text. Khairullah (2009) proposed an approach for steganography in Microsoft Word documents that exploits the fact that a foreground color set on invisible characters, such as the space or the carriage return, is not reflected in the displayed document. Liu et al. (2009) proposed an algorithm for use in online chat. Shakir et al. (2010) developed a text steganography method that uses the diacritics (Harakat) of the Arabic language as a cover medium to hide Chinese stroke text. Yang et al. (2011) proposed a steganography method for MS Excel documents that uses a text-rotation technique.
Bhaya (2011) proposed text hiding in the mobile phone simple message service using fonts: the information (0, 1) is hidden in a cover SMS message by changing the font of each character, using two fonts available on mobile devices. Moraldo (2012) introduced a text steganography method based on Markov chains, together with a reference implementation. This method allows information to be hidden in texts that are automatically generated following a given Markov model.

MATERIALS AND METHODS
In this section, we explain the proposed method in detail, including its encoding, embedding and extraction procedures. An overview of fonts is also presented.

Fonts Overview
Microsoft Word is a popular word-processing program that comes with the Microsoft Office package. One reason for its popularity is its large number of text-formatting features. One of these is font formatting, which offers great capacity, good imperceptibility and a wide application range (Khairullah, 2009).
A font is a graphic design that is applied to a collection of numbers, symbols and characters. A font describes a certain typeface, together with other qualities such as size, spacing and pitch. Fonts are used to display text on the screen and to print text. Fonts have font styles such as italic, bold and bold italic. MS-Word provides a number of standard fonts, including Times New Roman, Courier New, Arial and many others, which can be used to vary the appearance of a document. Some fonts can have the same size but look bigger because of the x-height. The x-height is literally the height of the lowercase letter x in the font family. Different fonts have different x-heights and, as a result, some fonts look larger than others even though they have the same point size. The illustration in Fig. 1 shows how the font size and x-height are measured.
Some newer font families, such as Tahoma and Verdana, have been designed with large x-heights. As a result, among font families set at the same point size, some look bigger because of their larger x-height (Weinschenk, 2011).

Proposed Method (SEFT)
This study is based on text documents, which are today among the most prevalent and indispensable forms of information and are frequently used as cover media. Most text steganography methods are based on the TXT, MS Word, PDF or PPT formats.
The proposed method, called the Similar English Font Types (SEFT) technique, is a new way of writing hidden messages in the text of a document file, a medium that lacks redundancy compared with images or audio. It hides the message by changing the font of selected letters to one of the fonts most similar to it.
In general, any font has several other fonts that closely resemble it. This property is the basis of this study.

Steps of Proposed Method
In this section, we describe the proposed method in detail. It has been implemented in the C#.net language and essentially consists of four main components:
• Create similar font array
• Create code table
• Embedding process
• Extracting process

Create Similar Font Array
This is the most important component of the method. It begins by determining the font of the cover document and then finding the fonts most similar to it. This study assumes 15 cover document fonts, chosen because they are the most widely used in text documents (TXT, MS Word, PDF, PPT). Table 1 lists the cover document fonts and their similar fonts; three similar fonts are used for each cover font.

Create Code Table
Each symbol of the secret message is coded by a sequence of three fonts. With three similar fonts and three positions there are 3 × 3 × 3 = 27 codes, so 27 symbols (the English alphabet plus space) can each be hidden in three letters of the cover. For example, the similar font array of the Century font is: Century = {Century751 BT, CenturyOldStyle, CenturyExpdBT}.
If the code of the current symbol is (1, 1, 1), the fonts (first similar, first similar, first similar) from the similar font array are used. Likewise, (1, 2, 2) means (first similar, second similar, second similar), (3, 1, 1) means (third similar, first similar, first similar) and so on. The message begins at the first capital letter in the document, and its end is marked by the code (0, 0, 0), which denotes the original document font.
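The coding scheme can be illustrated with a short sketch. It is written in Python for illustration only and is not the C#.net implementation. Table 2 is not reproduced in the text, so the symbol order used here (a to z followed by space, counted in base 3) is an assumption; it was chosen because it agrees with the one mapping stated in the worked example, h → (1, 3, 2). The names `SYMBOLS`, `SIMILAR_FONTS`, `symbol_to_code` and `code_to_fonts` are hypothetical, and only the Century entry of the similar font array is taken from the text.

```python
import string

# Assumed symbol order: 'a'..'z' followed by space (27 symbols). This order is
# an assumption, chosen so that h -> (1, 3, 2) as in the worked example.
SYMBOLS = string.ascii_lowercase + " "

# Similar font array; only the Century entry is given in the text.
SIMILAR_FONTS = {
    "Century": ["Century751 BT", "CenturyOldStyle", "CenturyExpdBT"],
}

END_CODE = (0, 0, 0)  # three capital letters left in the original font


def symbol_to_code(symbol):
    """Return the three-digit code (each digit 1..3) of one secret symbol."""
    index = SYMBOLS.index(symbol.lower())
    # Write the index in base 3, then shift each digit from 0..2 to 1..3.
    return (index // 9 + 1, (index // 3) % 3 + 1, index % 3 + 1)


def code_to_fonts(code, cover_font):
    """Map a code to the fonts applied to three consecutive capital letters."""
    similars = SIMILAR_FONTS[cover_font]
    # Digit 0 keeps the original font; digit d selects the d-th similar font.
    return [cover_font if d == 0 else similars[d - 1] for d in code]


code = symbol_to_code("h")
print(code, code_to_fonts(code, "Century"))
```

Under these assumptions every symbol receives a distinct code with digits in 1..3, which leaves (0, 0, 0) free to serve as the end-of-message marker.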

Embedding Process
In this study, the secret message is embedded only in the capital letters of the cover document, because capital letters differ in shape from the lowercase English letters. The embedding process consists of three steps. The first step determines the cover document font in order to retrieve its similar font array. The second step scans the cover document for English capital letters; as described above, three capital letters are needed to hide one symbol.
Finally, the third step changes the font of each group of three capital letters to similar fonts according to the code. The following procedure summarizes these steps.

Embedding Process
• Open the cover document and find its font type
• Scan the cover document to find the English capital letters
• Count the English capital letters to check the embedding capacity
• For each symbol in the secret message:
  • Retrieve its code
  • Change the font of three capital letters to fonts from the similar font array according to the code
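The embedding steps above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the C#.net implementation: the document is modelled as a plain string whose result is a list of (character, font) pairs rather than a real Word file, the code table order (a to z, then space) is assumed, and the function names `embed` and `symbol_to_code` are hypothetical.

```python
import string

SYMBOLS = string.ascii_lowercase + " "  # assumed order of the code table


def symbol_to_code(symbol):
    """Three-digit code (digits 1..3) of a symbol, assuming base-3 ordering."""
    index = SYMBOLS.index(symbol.lower())
    return (index // 9 + 1, (index // 3) % 3 + 1, index % 3 + 1)


def embed(cover_text, cover_font, similars, message):
    """Return a list of (character, font) pairs that carries the message.

    cover_text is the plain text of the cover, cover_font its single font and
    similars the three similar fonts of cover_font.
    """
    # Positions of the English capital letters in the cover text.
    capitals = [i for i, ch in enumerate(cover_text)
                if ch in string.ascii_uppercase]
    # Capacity check: three capital letters are needed per symbol.
    if 3 * len(message) > len(capitals):
        raise ValueError("cover has too few capital letters for this message")
    fonts = [cover_font] * len(cover_text)
    for n, symbol in enumerate(message):
        for k, digit in enumerate(symbol_to_code(symbol)):
            fonts[capitals[3 * n + k]] = similars[digit - 1]
    # Capitals after the message keep the cover font, i.e. code (0, 0, 0).
    return list(zip(cover_text, fonts))


century = ["Century751 BT", "CenturyOldStyle", "CenturyExpdBT"]
stego = embed("The Big Cat Ran", "Century", century, "h")
print([(ch, font) for ch, font in stego if ch in string.ascii_uppercase])
```

Because unused capital letters keep the cover font, the end-of-message code (0, 0, 0) appears implicitly whenever the cover contains at least three more capital letters than the message needs.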

Extracting Process
Every three capital letters determine the code of one hidden symbol. The steps below show the extraction process.

Extracting Process
• Open the stego document
• For each group of three capital letters:
  • Determine the code
  • If the code is (0, 0, 0), the end of the secret message has been reached
  • Else, find the corresponding secret symbol using the code table
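The extraction steps above can be sketched in the same spirit. This Python illustration is an assumption-laden model rather than the C#.net implementation: the stego document is represented as a list of (character, font) pairs, the receiver is assumed to know the cover font and its similar font array, the code table order (a to z, then space) is assumed, and the names `extract` and `code_to_symbol` are hypothetical.

```python
import string

SYMBOLS = string.ascii_lowercase + " "  # assumed order of the code table


def code_to_symbol(code):
    """Inverse of the assumed base-3 code table (digits 1..3)."""
    a, b, c = code
    return SYMBOLS[(a - 1) * 9 + (b - 1) * 3 + (c - 1)]


def extract(stego, cover_font, similars):
    """Recover the hidden message from a list of (character, font) pairs."""
    capital_fonts = [font for ch, font in stego
                     if ch in string.ascii_uppercase]
    message = []
    # Walk over complete groups of three capital letters.
    for i in range(0, len(capital_fonts) - 2, 3):
        code = tuple(0 if font == cover_font else similars.index(font) + 1
                     for font in capital_fonts[i:i + 3])
        if code == (0, 0, 0):
            break  # end-of-message marker: original font three times
        if 0 in code:
            raise ValueError("group mixes original and similar fonts")
        message.append(code_to_symbol(code))
    return "".join(message)


century = ["Century751 BT", "CenturyOldStyle", "CenturyExpdBT"]
stego = [("T", "Century751 BT"), ("o", "Century"), (" ", "Century"),
         ("B", "CenturyExpdBT"), ("e", "Century"), (" ", "Century"),
         ("C", "CenturyOldStyle"), ("R", "Century"), ("S", "Century"),
         ("T", "Century")]
print(extract(stego, "Century", century))
```

If the message fills every available group of capital letters, no (0, 0, 0) group is present and the loop simply stops at the end of the document.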

Explanation of the Proposed Technique by Example
This section gives more details on the software implementation. The GUI for the proposed SEFT technique is shown in Fig. 2.
The block diagram in Fig. 3 explains the GUI operations performed by the sender to carry out the hiding process. The hider chooses a cover document and enters the secret message. The system obtains the font type and checks the hiding capacity of the selected cover file (by comparing the number of capital letters in the cover file with the number of secret characters entered). Finally, the system encodes the secret symbols and hides them. In our example, the input is: Happy day. The number of symbols is 6 (with space). According to the code table in Table 2, the symbol (h) is coded as (1, 3, 2) → (first similar, third similar, second similar), and so on. The cover document font is Times New Roman, so the symbol (h) is coded in the first three capital letters of the cover document by replacing their font with the corresponding similar fonts from Table 1. Figures 4 and 5 show the cover document and the stego document, respectively.

RESULTS
The proposed text steganography method was tested on different cover documents of different font types, hiding the same secret message in some of them. Three capital letters are needed to hide one character of the secret message (one symbol per three capital letters). That is, if the cover file contains six capital letters, two characters can be hidden in it. The results of these experiments are summarized in Table 3. Comparing the sizes of the cover and stego documents shows that the stego document (after hiding) is on average about 0.766% larger than the original.
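The capacity rule stated above (one symbol per three capital letters) amounts to an integer division, as the following Python sketch shows. The function name `capacity` is hypothetical, and the cover is again modelled as a plain string rather than a Word file.

```python
import string


def capacity(cover_text):
    """Number of secret symbols that fit: one per three capital letters."""
    capitals = sum(1 for ch in cover_text if ch in string.ascii_uppercase)
    return capitals // 3


print(capacity("ABCDEF"))       # six capital letters
print(capacity("Hello World"))  # only two capital letters
```

A sender could run such a check before embedding, in the same way that the GUI described earlier compares the number of capital letters with the length of the secret message.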

DISCUSSION
As Table 3 shows, the proposed method has good perceptual transparency owing to the similarity of the font types, high capacity and robustness to digital copy-paste operations. These three dimensions are not independent, but should rather be considered as competing goals, which can be balanced when designing a steganographic system. The increase in stego document size results from the use of several font types.

CONCLUSION
This study proposed a novel method of hiding information in Microsoft Word documents, which are very common in today's digital world. The capacity of the method is very high and depends on the number of capital letters in the cover document. As Table 3 shows, some fonts, such as Arial, increase the document size considerably when replaced by their similar fonts, whereas others, such as Lucida Sans and Century, do not.
Because the stego document does not change during compression or during copy and paste between computer programs, the data hidden in the text remain intact during these operations.