Thursday, June 28, 2012

What Is a Snapshot and How Does It Work?

Hello everybody,

What is a snapshot in the computer world, and how does it work? Have you ever heard about snapshots in the cloud, SAN, or virtualization industry? No? Don’t worry. You can get your answer right now:

A snapshot is something like a backup of your data, but it uses a different method. A traditional backup copies your data bit by bit to another disk or tape, so you end up with a full image of your data somewhere else. A snapshot relies on metadata instead. Metadata is data about data; here, it is a record of which parts of the disk have changed since the snapshot was taken. When data changes, the snapshot software copies the original contents of only the affected tracks to an area reserved for them (the point-in-time, or PiT, area), and it doesn’t touch the data on the original (source) disk that hasn’t changed.

To achieve this, the snapshot software creates and manages a table called the Bitmap of Track Pointers. Assume there is one pointer in the bitmap table for each track on the disk. Of course, this depends on the software design, but for our purposes let’s assume it. The software therefore uses pointer 1 for track 1, pointer 2 for track 2, and so forth.
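To make the table concrete, here is a minimal Python sketch of it, assuming exactly one flag per track as described above. The class name `TrackBitmap` and its methods are made up for illustration; real snapshot software may track changes at a different granularity.

```python
class TrackBitmap:
    """Hypothetical bitmap: one flag per track, True once the track has changed."""

    def __init__(self, track_count):
        # Right after the snapshot is taken, no track has changed yet.
        self.changed = [False] * track_count

    def mark(self, track):
        # Tracks are numbered from 1 in the text, so shift to a 0-based index.
        self.changed[track - 1] = True

    def is_changed(self, track):
        return self.changed[track - 1]

    def changed_tracks(self):
        # List the track numbers whose pointer is marked as changed.
        return [i + 1 for i, flag in enumerate(self.changed) if flag]


bitmap = TrackBitmap(track_count=8)
bitmap.mark(5)
bitmap.mark(6)
print(bitmap.changed_tracks())  # prints [5, 6]
```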

So, when you take a snapshot, the software creates the Bitmap of Track Pointers. Now, please look at Figure 1. If any data changes on the source (original) disk, the software first copies the original value to the PiT area and then writes the new data to the related track. It then marks the corresponding pointer in the bitmap table as a changed track. For example, suppose you write a file that lands on tracks 5 and 6 on disk. Before overwriting them with the new values, the software copies the current contents of tracks 5 and 6 to the PiT area, and only then writes to disk. It also marks pointers 5 and 6 in the Bitmap of Track Pointers table as changed.

Figure 1
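The write path can be sketched in a few lines of Python. This approach is commonly called copy-on-write. The sketch is a simplified model, not any vendor's implementation: the class `SnapshotDisk`, its methods, and the dictionary used as the PiT area are all assumptions made for illustration. It also assumes that only the first write to a track after the snapshot saves a copy, so the PiT area always holds the value from snapshot time.

```python
class SnapshotDisk:
    """Hypothetical model of a disk with one copy-on-write snapshot."""

    def __init__(self, tracks):
        self.tracks = list(tracks)  # source disk; index 0 holds track 1
        self.changed = None         # bitmap, created when a snapshot is taken
        self.pit = {}               # PiT area: track number -> original value

    def take_snapshot(self):
        # Taking a snapshot only creates the bitmap; no data is copied yet.
        self.changed = [False] * len(self.tracks)
        self.pit = {}

    def write(self, track, value):
        i = track - 1
        # On the first change to a track, save its original value first.
        if self.changed is not None and not self.changed[i]:
            self.pit[track] = self.tracks[i]
            self.changed[i] = True
        # Then overwrite the track on the source disk.
        self.tracks[i] = value


disk = SnapshotDisk(["a", "b", "c", "d", "e", "f", "g", "h"])
disk.take_snapshot()
disk.write(5, "E2")
disk.write(6, "F2")
disk.write(5, "E3")  # second write to track 5: the PiT copy is not replaced
print(disk.pit)      # prints {5: 'e', 6: 'f'}
```

Notice that `take_snapshot` does almost no work, which is why taking a snapshot is quick, while each first write to a track pays for one extra copy.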

Now, let’s assume that you want to revert your server for any reason. In other words, you want to restore your server to the state it was in when the snapshot was taken. In this case, the software goes through the Bitmap of Track Pointers table. Let’s go back to our example above. The software looks up the marked pointers in the table and finds tracks 5 and 6. It copies the values of tracks 5 and 6 from the PiT area back to tracks 5 and 6 on the original disk, and it leaves the other tracks alone since nothing changed on them. And that’s all. Your server is restored to the same state it was in when the snapshot was taken.

 Figure 2
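The revert step can be sketched the same way. This is a minimal Python illustration under the same one-flag-per-track assumption; the function `revert` and the variables `tracks`, `changed`, and `pit` are hypothetical names, and the starting state below is simply the example after tracks 5 and 6 were overwritten.

```python
def revert(tracks, changed, pit):
    """Copy saved originals back to the source disk, touching only marked tracks."""
    restored = []
    for i, flag in enumerate(changed):
        if flag:
            track = i + 1              # tracks are numbered from 1
            tracks[i] = pit[track]     # PiT area -> original disk
            restored.append(track)
    return restored


# State after the snapshot, once tracks 5 and 6 have been overwritten.
tracks = ["a", "b", "c", "d", "E2", "F2", "g", "h"]
changed = [False, False, False, False, True, True, False, False]
pit = {5: "e", 6: "f"}

print(revert(tracks, changed, pit))  # prints [5, 6]
print(tracks)                        # prints ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
```

Only two of the eight tracks are copied, which is the reason restoring from a snapshot is quick. The sketch leaves the bitmap and the PiT area as they were; what real software does with them after a revert depends on its design.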

Advantages:

  1. It’s fast. Taking a snapshot takes only a few seconds, unlike an image copy, which can take hours to copy all your data.
  2. Restoration is also fast, since only the changed data is copied back.
  3. You save disk space, because you store only the changes and not all the data.

Disadvantages:

  1. It’s not a complete (image) backup. The snapshot holds only the changed tracks, so if your original disk dies, you will lose your data.
  2. You may experience performance issues while a snapshot exists, because every first change to a track needs an extra copy to the PiT area.

That’s all.
Khosro Taraghi

1 comment:

  1. Hey,
    I liked your explanation. Would really appreciate it if you could give some references as well. Thanks a bunch.