Tue 05 October 2021 in steganography by Regalis
Steganography - brief introduction
In this article, I will introduce the interesting field of steganography. My goal is to show the basics of steganography and to present a few techniques that will allow you to write a clean, simple, yet powerful program.
What is steganography
Steganography is the practice of concealing a message within another message (or a physical object) in such a way that the message does not attract attention to itself as an object of scrutiny.
Whereas cryptography is the practice of protecting the content of a message alone, steganography is concerned with concealing the fact that a secret message is sent.
Steganography in the real world
Let's say Alice wants to send Bob a secret message in such a way that no one else will be able to notice that they are communicating with each other.
They can simply agree to use a local newspaper for this purpose. Let's say communication is started by placing a special ad in the "cat adoptions" section. The cat must be a black female with a limp in her front left leg. Then, the "date of availability" stores the page number of the secret message, and the "phone number" encodes the positions of the individual words on that page. If the "phone number" is too short to encode all the word indexes, the ad should contain an email address in which each pair of digits represents one of the remaining word indexes.
Consider the following ad placed by Alice:
    Quick facts
    -----------
    Gender: Female
    Breed(s): Domestic Shorthair
    Color: Black
    Health: after an accident; she is in the process of healing her front left leg
    Availability: 20 September 2021

    Contact
    -------
    Phone: 1225030830
    Email: alice4590@regalis.tech
The ad above should trigger Bob to decode the message. Based on a prior agreement with Alice, he knows that the secret message contains 7 words and was placed on page number 20. After splitting the phone number and the email address into a sequence of two-digit numbers, he can find the individual words:
    Phone: 1225030830
           12 25 03 08 30
    Email: alice4590@regalis.tech
           45 90
Now, Bob can decode the message by simply finding the words at positions 12, 25, 3, 8, 30, 45 and 90 (counting from the first word on page number 20).
As you can imagine, this form of communication is not particularly convenient.
Steganography in the digital world
Nowadays, when we have the Internet, websites and social media, we can use similar techniques in a much more practical way. Simply by replacing the newspaper with some predetermined internet forums, we can significantly increase the convenience and the speed of communication.
But we might be smarter than this...
I strongly believe you have missed the cat in the picture above, haven't you? There he is and in the next section of this article we will extract him from this picture using a tool that we will build from scratch.
Embedding a secret message inside an image
Digital images are an excellent medium for hiding secret messages. In order to use the images, we need to know their structure.
Image internals
An image is made up of pixels; each pixel has a color, and a color is represented by numbers within a certain range. There are a lot of different ways to represent a color - RGB, RYB, CMYK, HSL etc. For the purpose of this article, we will cover the first one, which is RGB.
Color schemes internals
In the RGB color scheme, each color is expressed as an RGB triplet (r, g, b). Each individual component can vary from zero to a defined maximum value. If all the components are at zero, the result is black. If all are at maximum, the result is the brightest representable white.
Now, let's discuss the accuracy. If each of the R, G and B components can be in the range from 0 to 255, we are talking about 8 bits per channel. This is because 255 can be stored in an 8-bit (single byte) number: 255 in decimal = 11111111 in binary. Or, in other words, 2^8 - 1 = 255.
This means that we need 3 * 8 = 24 bits to represent a color, which is why we sometimes refer to it as 24-bit color depth (the total number of bits used for an RGB color).
24-bit color depth is also known as True color.
How many colors can be used in RGB with 24-bit color depth?

    0 - 255 (256 different values) for R channel
    0 - 255 (256 different values) for G channel
    0 - 255 (256 different values) for B channel
    256 * 256 * 256 = 16777216
Note on color depth
Unfortunately, the term color depth can refer either to the number of bits used to indicate the color of a single pixel or to the number of bits used for each color component of a single pixel.
I'm using the first definition.
Different color depths:
color depth | number of bits per channel | values for each channel | number of possible colors |
---|---|---|---|
8-bit | 3 bits for R and G, 2 bits for B | 0 - 7 for R and G, 0 - 3 for B | 8 * 8 * 4 = 256 |
12-bit | 4 | 0 - 15 | 16 * 16 * 16 = 4096 |
16-bit | 5 (15 of the 16 bits are used) | 0 - 31 | 32 * 32 * 32 = 32768 |
30-bit (Deep color) | 10 | 0 - 1023 | 1024 * 1024 * 1024 = 1073741824 |
48-bit | 16 | 0 - 65535 | 65536 * 65536 * 65536 ≈ 2.814749767×10^14 |
Example images with different color depths:
Some Q&A to sum this up:
- How many bytes are needed to store raw (without any compression) VGA (640x480) image data using RGB with 24-bit color depth?
  - 921600 bytes; 24 bits per pixel = 3 bytes per pixel; 307200 pixels (640 * 480), 3 * 640 * 480 = 921600 bytes = 900 KiB ≈ 0.879 MiB
- How to calculate the number of possible colors in RGB with 2-bit color depth?
  - It is impossible to store separate R, G and B components in a 2-bit number; such a number can hold only four different values (from 0 to 3). With only four colors, this color depth is mainly used with fixed palettes (for example four different shades of gray).
- I am lost, do I have to understand everything to be able to move on?
  - No, the description of the color representation system was intended to warm you up and remind you about the binary number system.
Deep dive into 24-bit RGB
In this section we need to learn how to deal with 24-bit RGB colors because we are going to use them to hide the secret message (and to extract the cat from the previous image).
Let's start with a few examples showing how to represent several colors:
Color | Decimal notation | Hexadecimal notation | Binary notation |
---|---|---|---|
white | (255, 255, 255) | (ff, ff, ff) | (11111111, 11111111, 11111111) |
black | (0, 0, 0) | (00, 00, 00) | (00000000, 00000000, 00000000) |
red | (255, 0, 0) | (ff, 00, 00) | (11111111, 00000000, 00000000) |
orange | (255, 165, 0) | (ff, a5, 00) | (11111111, 10100101, 00000000) |
pink | (255, 192, 203) | (ff, c0, cb) | (11111111, 11000000, 11001011) |
In HTML and CSS we can use several different notations that mean exactly the same thing; for example, we can use background-color: pink or background-color: rgb(255, 192, 203) or background-color: #ffc0cb. Notice that the last one is just a hexadecimal representation of the R, G and B components.
To make everything clear, I have prepared the following animation:
Embedding characters into an image
We've come to a point where we can finally talk about hiding the message in the picture. As you remember, an image is simply a sequence of pixels, and the color of each pixel is represented as RGB components.
Knowing this, how can we imagine a 4x1 (width x height) image with four red pixels?
Well, we can simply write:
    255 0 0 <- first pixel (R is 255, G is 0, B is 0)
    255 0 0 <- second pixel
    255 0 0 <- third pixel
    255 0 0 <- fourth pixel

The same, but in a binary number format:

    11111111 00000000 00000000
    11111111 00000000 00000000
    11111111 00000000 00000000
    11111111 00000000 00000000
What if we store our data in the least significant bits of each RGB component? A single ASCII character (for example a) takes exactly one byte, which is 8 bits. If we use just two bits of each RGB component, we need four components per character, so we can embed the letter in two pixels (the whole RGB of the first pixel plus R of the second one).
ASCII table
Computers can only understand numbers, so an ASCII code is the numerical representation of a character such as a. For each printable character, there is exactly one number which identifies this character. For example a is 97 in decimal and 01100001 in binary, b is 98 in decimal and so on.
Let's try to go through this process:

1. Convert a letter 'a' into a binary form: a = 01100001
2. Split into pairs of two bits: a = 01 10 00 01
3. Store each pair in the last two bits of the first four components of our image (marked here with brackets):

    111111[11] 000000[00] 000000[00]
    111111[11] 00000000 00000000
    11111111 00000000 00000000
    11111111 00000000 00000000

4. Here is the final result:

    11111101 00000010 00000000
    11111101 00000000 00000000
    11111111 00000000 00000000
    11111111 00000000 00000000
Looks as if we still have room for two additional characters... Let's embed b and c:

    a = 97 = 01 10 00 01
    b = 98 = 01 10 00 10
    c = 99 = 01 10 00 11

After embedding:

    11111101 00000010 00000000
    11111101 00000001 00000010
    11111100 00000010 00000001
    11111110 00000000 00000011
Are you wondering if such a change is visible? Let's take a look at our picture before and after embedding:
Try it yourself!
Try to manually extract the characters from the picture above. Use a color picker, write down the decimal or hexadecimal representation of each "pixel", convert it into binary and mark the last two bits of each RGB component. Remember, the picture above is magnified 64 times.
Working with actual images - PPM image format
PPM (portable pixmap format) is probably one of the simplest file formats you'll ever see in your life. The format was designed in the 1980s to allow bitmaps to be transmitted as plain ASCII text. This allows you to literally write pictures by hand.
Syntax for the RGB image is really simple:
    P3
    1 4
    255
    255 0 0
    255 0 0
    255 0 0
    255 0 0

That's all! As you can guess, the image above contains our four red pixels. You can copy and paste the example above into your favorite text editor, save the file as example-image.ppm and open it with your image viewer application.
Let's take a closer look at this format:
    P3       <- P3 is a prefix which corresponds to RGB image
    1 4      <- width and height
    255      <- maximum value for R, G and B components (value 255 defines 24-bit color depth)
    255 0 0  <- RGB components for the first pixel
    255 0 0  <- RGB components for the second pixel
    255 0 0  <- RGB components for the third pixel
    255 0 0  <- RGB components for the fourth pixel
Try it yourself!
Try to recreate the following image:
The P3 image type is not optimal; the data takes up a lot of space. Since each printable character such as 2 or 0 takes up one byte, storing a 1920x1080 image requires about 24883217 bytes (up to three digits plus a separator for each of the 1920 * 1080 * 3 components), which is about 23.7 MiB. On the other hand, you can work effectively with such a format without any additional libraries. Basic knowledge of input/output streams from the C++ iostream header is sufficient. You can load and manipulate PPM images using the standard std::cin and std::cout.
For example, you can go through the entire image with just a few lines of code:
Let's go back to our red pixels with embedded characters a, b and c:

    P3
    1 4
    255
    253 2 0
    253 1 2
    252 2 1
    254 0 3
You can use the program above and see how it goes:
Now, let's convert your favourite image in jpg or png format using the convert tool from the ImageMagick package and pipe it to our program:
Please note that the -compress none and -strip arguments for convert are mandatory. The first one ensures that convert will use the P3 PPM format instead of P6 (pixel data in a binary form); the second one ensures that the output file will not contain any comments.
Try to experiment with convert and prepare a couple of images for the rest of this article:
A brief reminder of bitwise operations
Before we go any further, I need to discuss bitwise operations. We need to know how to read specific bits from a number. We are going to use a bitwise AND operation to read bits, and a bitwise OR operation to set bits (after clearing them with AND).
In many programming languages the AND operation is performed using the & operator, and the OR operation is performed using the | operator.
Below is a reminder of how these operations work:
A | B | A & B |
---|---|---|
0 | 0 | 0 |
0 | 1 | 0 |
1 | 0 | 0 |
1 | 1 | 1 |

A | B | A \| B |
---|---|---|
0 | 0 | 0 |
0 | 1 | 1 |
1 | 0 | 1 |
1 | 1 | 1 |
And some examples in C++:
The operations above work exactly the same in JavaScript, Python and many other languages. If you feel more comfortable with another language, go ahead and try to write everything in your favorite language.
Let's extract something...
I have decided to start with extracting rather than embedding. This is because extracting is a little bit easier to implement and it will be useful for testing our next program.
We can upgrade our previous example so that after reading a given pixel - it reads the last two bits from each R, G and B component.
First of all, we need to reorganize our main loop, which currently goes through pixels. What we need is to read the header and then read exactly X RGB components, where X is the number of components that we need in order to extract a single byte of a hidden message. Since previously we used two bits per component, we need exactly four components to extract a single byte.
Let's try to compile and run the program above. Remember our previous magnified image with four red pixels? We can use this image to feed our program!
Boom! We can see our embedded message: abc!
There was a lot going on there. As you can see, convert is a really powerful tool that can do a lot of things on the fly... Let's split these operations:
Embedding text messages
Please check out the following animation, I hope it will bring you even closer to the principle of how embedding process works:
Try to experiment with colors - instead of two, use the last three or four bits of each RGB component to embed characters and see how this affects the picture. Write it down on a piece of paper and try to embed different characters. Familiarize yourself with this procedure, we will need it to write the final program.
The following source code contains an example implementation of an embedding procedure. It reads a source image from the standard input and the message from a file provided as the first argument. Each RGB component from the source image is modified on the fly and sent directly to the standard output.
Finally we have a complete set of tools. Take a look at example usage:
You can convert your output image to PNG format:
Here is my result:
Can we embed our message "directly" into a PNG image without creating any temporary files? Of course:
You can even work on images from the Internet! Let's embed our message directly into an image from Wikipedia and store it in secret.png:
Here is my result:
Check my results!
Try to extract the secret message from the images above. You can use convert with a URL provided as the input image: convert -compress none IMAGE_URL ppm:- | ./regalis-extract
It is just the beginning of different possibilities
I have only shown you a fraction of what is possible with our simple program. Have you noticed that we are able to work with many different image formats without even digging into it?
We have only leveraged the most basic elements of programming and yet we are able to handle JPG, PNG, SVG and TIFF image formats; we can even use HTTP and HTTPS to embed messages into images from the Internet. If you happen to be a Windows user, or you have just started your programming adventure - this may look a little bit confusing to you. Do not be confused, what you have seen is just the power of the GNU/Linux operating system - an excellent environment without any limitations. In this environment, you can build simple applications which gain superpowers when combined with other tools which are already there. You have seen a great example of using the KISS principle.
KISS
KISS is an acronym for keep it simple, stupid. The KISS principle states that most systems work best if they are kept simple rather than made complicated; therefore, simplicity should be a key goal in design, and unnecessary complexity should be avoided.
Instead of giving you a tool, I have provided you with the idea behind steganography, explained some internals and guided you through the process of implementing it. I have intentionally omitted everything related to file handling and decided to use the standard input/output instead. I have also skipped everything related to handling complex file formats (like JPEG).
Unfortunately, nowadays many programmers who are in their early days get completely overwhelmed by the amount of knowledge required to build something useful. Well, as you have already seen, this is simply not the case in a GNU/Linux environment. You don't need to waste your time learning how to build a fancy GUI, how to manually handle a JPEG file and so on... In fact, I'm fully convinced that you should never allow your environment to take control over your learning curve, especially when there are big corporations behind such an environment. You can build powerful tools right now, just by switching your environment to GNU/Linux - be productive, be powerful - be free.
In the meantime, let's jump back into more advanced usage examples.
Advanced usage examples
Up until now, we have mostly focused on embedding text messages, but what about other types of data? For example audio files, images or even programs? Can we do that?
Do we need to learn how to deal with binary data? It turns out we don't; we can simply use one of the available binary-to-text encoding algorithms, for example base64, which is widely used all over the world (emails, web pages etc.).
Base64
Without unnecessary formalities, let's play with it:
Encoding is not encryption
Do not confuse encoding with encryption. With encoding, we only change the form of the data; there is always a way to bring it back to its original form without any special knowledge such as passwords etc.
Here is how to encode an image in PNG format into text using base64:
There will be a lot of data, so we can redirect it directly to a file image-base64.txt:
To convert it back to its original form, you can do the opposite:
Ready for inception?
What about embedding an image which contains embedded secret message into another image? That shouldn't be a problem:
Believe it or not, the image above contains another image which we previously used as an example of embedding text messages... Can you spot the difference?
You can try it yourself! Prepare three elements:
- some random image of your choice - image.jpg,
- a secret message which will be embedded inside the image above - msg01.txt,
- another random image of your choice (it must be much bigger than the first one) - image-inception.jpg
Then, you can simply compose the operations like this:
What about getting it back? We need to compose the following operations:
- convert image-inception-final.png into PPM image format,
- extract the secret message (this will be msg02.txt),
- decode the secret message (this will be image-with-msg01.png),
- convert the decoded image into PPM image format,
- extract the secret message (this will be msg01.txt).
Let's go then, we can do it all at once:
The exact content of your msg01.txt should be the result of the operation above.
We didn't even touch our programs (regalis-extract and regalis-embed) and yet we are able to handle embedding images inside images. That's exactly the point of the KISS principle.
Since embedding music, archives (like tar, zip) and any other types of file will work exactly the same, I will go ahead and show you something a little bit different.
Combining steganography with encryption
Hiding messages in images gives us some possibilities, but if you want to send something very confidential, you will be in trouble if someone learns our techniques... We can fight that with encryption. All we need is to encrypt the message before hiding it (or hide it even deeper, what about triple inception?).
We will use gpg (GNU Privacy Guard - OpenPGP encryption and signing tool). Let's start with some examples, without touching images:
Since saving secret messages on disk in plain form can leak your data - you can pass it directly into gpg's standard input:
gpg accepts any data as an input - you can encrypt literally everything, from text files to images, compressed archives etc.:
To decrypt your message, just use the --decrypt option:
The result of gpg --symmetric --armor is just text, so you can embed it like any other message:
Extracting encrypted message:
By using symmetric encryption, we run into the problem that someone may guess our password and then successfully decode and decrypt our message...
Asymmetric encryption, also known as public-key encryption, removes the need for a shared password; actually, asymmetric encryption is gpg's default mode of operation. This is beyond the scope of this article, feel free to do your own research.
Other ideas
This article is probably too long by now, so I will stop here and suggest some other interesting possibilities:
- you can wrap regalis-extract and regalis-embed with a custom shell script, which will make the tools easier to use. Consider the following example usage: steg embed msg01.txt input.png output.png, steg extract output.png,
- while playing with "inception", use the file utility to determine the exact type of the file you decoded with base64 (it may be an archive, image etc.),
- try to write a script/program which will tell you how much data you can embed into a specified image,
- the convert utility can handle GIF files as well, so you can wrap the regalis-embed and regalis-extract tools and hide an enormous amount of data inside GIF images,
- try to rewrite regalis-embed and regalis-extract using a different programming language.
Please share your own ideas.
Final thoughts
We've reached the end, haven't we? Even when it comes to images, we have barely touched the tip of the iceberg. The technique I presented can't be used with "analog images": you can't just put a picture in a frame and hang it on the wall - your embedded data won't survive printing. Doing something like this requires much more sophisticated techniques.
What about different media types? Steganography offers great opportunities and we don't have to stop at images - there are also movies and audio files.
This was the first article in the steganography series, there may be more to come. I look forward to receiving your feedback!
By the way... Don't forget to extract the cat from the first picture! Can you find his name?
Stay tuned!