
Monthly Archives: July 2010

23
Jul
2010

What is 95th Percentile?

by Administrator

What is this 95th Percentile (or, the difference between throughput and transfer)?

Many, if not most, hosting companies sell and bill bandwidth based on a method called the 95th percentile. Many, if not most, customers don’t have a clue what the 95th percentile really is. In this article, I’ll try to shed some light on what 95th percentile is.

In order to explain, we must first understand the difference between the two types of bandwidth billing methods. Those two types are TRANSFER (95th percentile billing) and THROUGHPUT (per-gig billing). Let’s look at them individually.

Throughput is the actual total SIZE of the combined files that are sent by the server. Throughput is sold in gigabytes (GB) and is an aggregate monthly total. So, for example, let’s say you have a web page called THISPAGE.HTML and the actual page is 25 kB. On this page you have 3 graphic images that are 25 kB each, for a total of 100 kB. If 100,000 people downloaded that page over the course of a month, then your Throughput would be calculated as 100 kB × 100,000 = 10,000,000 kB, or 10 GB. So for that month your THROUGHPUT would be 10 GB. It makes no difference whether all 100,000 people hit the server at the same time or were evenly spread out over the course of the month; it is still 10 GB of THROUGHPUT for the month.
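The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name `monthly_throughput_gb` is made up for this example, and it uses decimal units (1 GB = 1,000,000 kB) to match the figures in the paragraph.

```python
# Hypothetical helper for illustration; per-gig billing only cares about
# the total amount of data sent, not when it was sent.
KB_PER_GB = 1_000_000  # decimal units, so 10,000,000 kB = 10 GB

def monthly_throughput_gb(page_kb, downloads):
    """Total data sent in a month, in GB."""
    return page_kb * downloads / KB_PER_GB

page_kb = 25 + 3 * 25  # one 25 kB page plus three 25 kB images = 100 kB
total_gb = monthly_throughput_gb(page_kb, 100_000)
print(total_gb)  # 10.0
```

Notice that nothing in the calculation refers to time of day, which is exactly why the result is the same whether the visitors arrive all at once or spread out.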

Now, on to TRANSFER, but before we begin let me state that under *NO* circumstances can you mix Throughput and Transfer. It is physically impossible (it’s like trying to add up gallons and nickels). They are two different things.

TRANSFER is measured in megabits per second (Mbps) and measures how much information is traveling through the Internet “pipe” at any given time. I like to compare TRANSFER to water in a series of water pipes. Imagine that your home PC has a water hose connected to it instead of an Internet connection. The water hose is 1/2″ and is connected to the side of your house, where it meets a 2″ pipe, and your house is connected to the water main, which is a 12″ pipe. In this example your 1/2″ water hose is your home Internet connection, the 2″ pipe to your house is your ISP, and the 12″ water main is the backbone of the Internet. No matter how hard you try, you are only going to get 1/2″ of water into your PC at any given time, because the “pipe” is only a 1/2″ water hose.

Now if I were going to sell you water BY THE GALLON, that would be called Throughput (see above). Or I can sell you a PIPE and just charge you for the amount of water that you push through the pipe at any given time; this is called TRANSFER. For example, say I take a measurement right now and you are pushing 1″ of water through the pipe. I look again in five minutes and you are still pushing 1″, five minutes after that you are pushing 1/2″, and five minutes after that you are pushing 2″. How big a pipe do you need to accommodate your traffic flow without any water backing up like a funnel? You would need a 2″ pipe, but you are not using 2″ all the time, so why should you have to pay for a 2″ pipe all the time? This is where the 95% comes in.

The 95th percentile (which is an industry standard) simply means that the hosting company will look at your pipe every five minutes, take a reading, and add that reading to a long list that they keep for 30 days. At the end of the month that list will contain 8640 readings (there are 12 five-minute intervals in an hour, 24 hours a day, for 30 days). They will then take that list and sort it from the biggest number to the smallest number, so that your largest five-minute reading is on the top, the second largest is next, the third largest is next, and so on. The top 432 entries (the top 5%) are discarded and the 433rd is considered your “95th percentile”; that is the number that you pay for. The 95th percentile was designed to help chop off wild peaks and only bill you for what you are sustaining on a regular basis. This is a rolling 30-day number that is constantly changing. In other words, once you have the 8640 data points, every time a new data point is added and the list is sorted, the oldest data point is dropped off.
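The procedure just described (sort, discard the top 5%, bill on the next reading) can be sketched as follows. This is a minimal illustration, not any hosting company’s actual billing code: the function name `percentile_95` and the sample readings are invented for the example, and the rolling 30-day list is modeled with a fixed-length `deque`.

```python
from collections import deque

SAMPLES_PER_MONTH = 12 * 24 * 30  # 8640 five-minute readings in 30 days

def percentile_95(readings_mbps):
    """Sort highest first, discard the top 5%, return the next reading."""
    ordered = sorted(readings_mbps, reverse=True)
    discard = len(ordered) * 5 // 100  # 432 when there are 8640 readings
    return ordered[discard]            # the 433rd entry in that case

# Made-up month: mostly 10 Mbps, with exactly 432 readings of a 90 Mbps burst.
readings = [10.0] * (SAMPLES_PER_MONTH - 432) + [90.0] * 432
billable = percentile_95(readings)
print(billable)  # 10.0, because the whole burst fits in the discarded 5%

# Rolling window: appending a new reading drops the oldest one.
window = deque(readings, maxlen=SAMPLES_PER_MONTH)
window.append(90.0)  # a 433rd burst reading arrives, an old 10 Mbps one leaves
billable_after = percentile_95(window)
print(billable_after)  # 90.0, the burst no longer fits in the top 5%
```

The second result shows how sharp the cutoff is: peaks are free only as long as they stay within 5% of the readings.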

As for which is more advantageous, it depends on the traffic patterns of your site. TRANSFER (95th percentile) is good for almost all sites, with very few exceptions. THROUGHPUT is recommended for sites whose high spikes last long enough to survive the 5% cut, or that have very inconsistent traffic. For example, if you have very high traffic every Monday but the rest of the week is very low traffic, then being billed on THROUGHPUT may be the best for you. Under 95th percentile billing, you would have lots of big numbers due to that high traffic on Monday (one day in seven is well over 5% of the month), which would create an inflated 95th percentile. However, very few sites have this type of traffic pattern.
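To see why the Monday pattern is a poor fit for 95th percentile billing, here is a small sketch with invented numbers (100 Mbps all day Monday, 5 Mbps otherwise, over four weeks). The helper `percentile_95` is a hypothetical function written for this example.

```python
SAMPLES_PER_DAY = 12 * 24  # 288 five-minute readings per day

def percentile_95(readings_mbps):
    """Sort highest first, discard the top 5%, return the next reading."""
    ordered = sorted(readings_mbps, reverse=True)
    return ordered[len(ordered) * 5 // 100]

# Four weeks of made-up traffic: busy all day Monday, quiet the other six days.
week = [100.0] * SAMPLES_PER_DAY + [5.0] * (6 * SAMPLES_PER_DAY)
monday_heavy = week * 4
print(percentile_95(monday_heavy))  # 100.0: Mondays are about 14% of readings

# Same peak height, but only one busy hour per week (12 readings).
short_week = [100.0] * 12 + [5.0] * (7 * SAMPLES_PER_DAY - 12)
short_spikes = short_week * 4
print(percentile_95(short_spikes))  # 5.0: the spikes fall inside the top 5%
```

Both sites peak at the same rate, but only the one whose peak lasts longer than 5% of the month ends up paying for it.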

With TRANSFER, your host should provide 95th percentile graphs (usually MRTG graphs, which are the industry standard) so you can see your transfer yourself. You should check these graphs every day, as they can indicate problems as well as let you know your traffic patterns. You should see highs and lows each day, and these patterns of highs and lows should follow the sun. If you see a flat line across the top of the graphs, then you know that your hosting company doesn’t have enough bandwidth to handle your needs (and this is much more common than one would think). ***IF YOU ARE BEING BILLED ON 95TH PERCENTILE, MAKE SURE YOUR HOSTING COMPANY PROVIDES YOU WITH THOSE GRAPHS*** If they refuse, they obviously have something to hide.

Hopefully this helps you understand what 95th percentile is.

12
Jul
2010

RAID Simplified

by Administrator

If you’re a webmaster or someone that has ever dealt with a server, you have probably heard the term RAID. RAID, which stands for Redundant Array of Inexpensive Disks, or occasionally Redundant Array of Independent Disks, is a way to put 2 or more hard drives together in different configurations to meet certain criteria, for better redundancy, faster speeds or both. While there are many sites on the internet that explain RAID already, many of them are quite technical in nature so this explanation will simplify it by describing each RAID type, what is required and the pros and cons of each. There are actually 13 different RAID types but only 4 that are commonly used. I will cover these 4 in detail.

RAID 0: RAID 0 is sometimes called striping. RAID 0 requires at least 2 drives. Data is written sequentially to all drives, which means that the pieces of a file will be written across all the drives. Because of this, the file can be read back much faster, as the reads come from all the drives simultaneously. RAID 0 works well for a server where increased disk space is desired but redundancy is not an issue. RAID 0 may be used for file servers where a backup file server is also in place in case of data loss.
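As a rough picture of what striping means, the sketch below deals fixed-size chunks of a file out to the drives in turn. It is a toy model only: the function `stripe`, the chunk size, and the in-memory “drives” are assumptions made for illustration, and real controllers work on disk blocks rather than Python lists.

```python
def stripe(data, num_drives, chunk_size):
    """Deal fixed-size chunks of data out to the drives in round-robin order."""
    drives = [[] for _ in range(num_drives)]
    for offset in range(0, len(data), chunk_size):
        chunk_index = offset // chunk_size
        drives[chunk_index % num_drives].append(data[offset:offset + chunk_size])
    return drives

layout = stripe(b"ABCDEFGH", num_drives=2, chunk_size=2)
print(layout[0])  # [b'AB', b'EF']
print(layout[1])  # [b'CD', b'GH']
```

Each drive holds only every other chunk, so both drives can be read at once, but neither drive alone holds a complete copy of the file.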

Pros:

  • Easy to create
  • Fast reads and writes
  • Can be done with only 2 drives
  • Disk capacity is the combined size of the drives (i.e., 2 200 GB drives would give you 400 GB of capacity)

Cons:

  • No redundancy. If any drive in the RAID set fails, you lose all the data in the array, because every file is spread across all the drives
  • Not a true “RAID” due to the lack of redundancy (remember, RAID stands for “Redundant Array of Inexpensive Disks”)

 

RAID 1: RAID 1 is mirroring. It requires a minimum of 2 drives, and the drives must be installed in pairs (2, 4, 6, etc.). Every 2 drives form a mirrored pair, where all data on each drive is identical to its pair. RAID 1 is perfect for a web server where 95% of the disk access is reads from the drive to deliver web content and the other 5% is FTP uploads, where speed isn’t really an issue. By default, every managed server that NationalNet builds comes with RAID 1 (for speed and redundancy) unless otherwise specified by the customer, or unless the server is a database or some other type of specialized server that requires a different type of RAID.

Pros:

  • Up to twice the read speed of a single drive, since reads can be served from either drive
  • Perfect for a web server where most of the activity is reads from the disk
  • True redundancy in that if a drive fails, you just replace it and the RAID automatically rebuilds

Cons:

  • Slower writes than striped RAID levels, since every write has to go to both drives
  • The capacity is that of the single disk (i.e., 2 200 GB drives in a RAID 1 give you 200 GB of capacity)

 

RAID 5: RAID 5 requires at least 3 drives. The data is written across all drives, with sections of each drive dedicated to parity. Without getting too technical, the best way to explain parity is that it is extra information calculated from the data, which lets the array rebuild whatever was on a failed drive from what remains on the others. Because the data and parity are spread across all the drives, the array keeps working if any one drive fails. The capacity of a RAID 5 is N-1 (i.e., you lose one drive to the RAID), which means that if you have 4 500 GB drives, your capacity would be 1500 GB. RAID 5 works well where more disk space is required than what can be had with a single drive.
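For readers who do want a little of the technical side, parity in RAID 5 is commonly computed as a byte-by-byte XOR of the data blocks. The sketch below is a simplified illustration with invented block contents and a made-up helper name (`xor_blocks`); it ignores how real arrays rotate parity between drives.

```python
def xor_blocks(*blocks):
    """Byte-by-byte XOR of equally sized blocks."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, value in enumerate(block):
            result[i] ^= value
    return bytes(result)

# Two data blocks on two drives, with the parity block on a third drive.
block_a = b"WEB1"
block_b = b"DATA"
parity = xor_blocks(block_a, block_b)

# If the drive holding block_b fails, XOR the survivors to get it back.
rebuilt_b = xor_blocks(block_a, parity)
print(rebuilt_b)  # b'DATA'
```

The same trick works whichever of the three blocks is lost, which is why the array can tolerate the failure of any one drive, at the cost of one drive’s worth of capacity.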

Pros:

  • High READ speed, since reads are spread across all the drives
  • Good use of disk space (only one drive’s worth is lost to parity)
  • Good redundancy

Cons:

  • Disk failure can impact performance
  • Slower write speeds
  • Expensive to implement. Requires at least 3 matching drives and an expensive RAID card

 

RAID 10: RAID 10 is two or more mirrored pairs of drives (see RAID 1) striped together (see RAID 0). It requires a minimum of 4 drives to implement and, like RAID 1, drives must be added in pairs. RAID 10 is very fast for both reads and writes and works well for servers that require high availability as well as fast read and write disk speeds. A database server would be a good example of where you would implement RAID 10.

Pros:

  • Very high disk speeds for both read and write access
  • Given a 4 disk RAID 10, you could lose two drives and not lose any data provided it was one drive from each RAID 1 set in the RAID 10. Given this same 4 disk RAID 10, the failure of one drive would never affect the data
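The two-drive case in the last bullet can be checked by listing every possible pair of failed drives in a 4 disk RAID 10. This is a toy model: the drive labels and the `survives` function are invented for the example.

```python
from itertools import combinations

# A 4 disk RAID 10 modeled as two mirrored pairs (hypothetical drive labels).
mirrors = [("A1", "A2"), ("B1", "B2")]

def survives(failed):
    """The array keeps its data if every mirror still has a working drive."""
    return all(any(drive not in failed for drive in pair) for pair in mirrors)

drives = [drive for pair in mirrors for drive in pair]
outcomes = {pair: survives(set(pair)) for pair in combinations(drives, 2)}

for pair, ok in outcomes.items():
    print(pair, "data intact" if ok else "data lost")

survivable = sum(outcomes.values())
print(survivable, "of", len(outcomes), "two-drive failures are survivable")
```

Only the two cases where both drives of the same mirror fail lose data; the other four combinations are survivable.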

Cons:

  • Expensive to implement. Requires 4 drives and an expensive RAID card
  • Limited scalability
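The capacity figures quoted for the four levels above can be put side by side with a small calculator. This is a sketch under stated assumptions: the function `usable_gb` is made up for illustration, it assumes all drives are the same size, it treats RAID 1 as a single two-drive mirror, and the RAID 10 figure (half the raw total) follows from mirroring rather than from a number given in this article.

```python
def usable_gb(level, drives, size_gb):
    """Usable capacity for equally sized drives (illustrative rules only)."""
    if level == "0" and drives >= 2:
        return drives * size_gb          # striping: all space is usable
    if level == "1" and drives == 2:
        return size_gb                   # mirror: one drive's worth
    if level == "5" and drives >= 3:
        return (drives - 1) * size_gb    # N-1: one drive's worth for parity
    if level == "10" and drives >= 4 and drives % 2 == 0:
        return drives // 2 * size_gb     # striped mirrors: half the total
    raise ValueError(f"unsupported RAID {level} with {drives} drives")

print(usable_gb("0", 2, 200))   # 400
print(usable_gb("1", 2, 200))   # 200
print(usable_gb("5", 4, 500))   # 1500
print(usable_gb("10", 4, 500))  # 1000
```

The first three results match the examples given in the RAID 0, RAID 1 and RAID 5 sections.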

 

These are the 4 most often used RAID types. Here is a condensed list of the other, lesser used, RAID types. These RAID types are rarely used due to the fact that the disadvantages outweigh the advantages or due to cost constraints or both.

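RAID 0+1: Similar to RAID 10 in that it’s a mirror/striping combination, but built the other way around: two striped sets (RAID 0) mirrored against each other. A single drive failure takes its whole striped set offline, leaving the array running on the other set with no redundancy until it is rebuilt.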

RAID 2: Requires expensive specialized disks and uses ECC code. Rarely if ever used.

RAID 3: Uses a parallel disk writing method. Requires a minimum of 3 drives and uses 1 drive dedicated to parity (see RAID 5). Very slow after disk failure and does not use disk space very efficiently (lots of wasted disk space).

RAID 4: Independent data disks with one disk dedicated to parity bit. Requires minimum of 3 drives. Very slow disk writes. Difficult to rebuild after a failure.

RAID 6: Very similar to RAID 5, only with a second set of parity written, which gives it higher fault tolerance in a mission critical situation (it can survive two drive failures). Very complex to implement and very poor write performance. Requires a minimum of 4 drives due to the extra parity.

RAID 7: Unlike the other RAID levels, RAID 7 isn’t an open industry standard; it is really a trademarked marketing term of Storage Computer Corporation, used to describe their proprietary RAID design. RAID 7 is based on concepts used in RAID levels 3 and 4, but greatly enhanced to address some of the limitations of those levels.

RAID 1E: Simply put, RAID 1E is a variation of RAID 10, only with more implementation headaches and less redundancy.

RAID 50: Without getting too technical, a RAID 50 is similar to putting a RAID 5 and a RAID 0 together. Better redundancy, but a high level of complication to implement and maintain.

RAID 53: Despite the name, this is usually described as RAID 3 sets striped together with RAID 0, rather than a combination of RAID 5 and RAID 3.

Hopefully, you found this information helpful and maybe, just maybe…when you’re selecting your web hosting company and they ask you if you need RAID, you’ll now be able to hold your own in that part of the conversation.

NationalNet, Inc., Internet - Web Hosting, Marietta, GA