Every time you decide to trust a vendor, whether it is a cloud storage provider, the company handling your taxes online, or your favorite social network, you put faith in them to live up to the security claims made on their website. Is your online backup vendor really keeping your data secure? When you click “Share my data with only friends” on Facebook, what assures you that this is true?
Due to the expansiveness of this topic I’m starting with two posts describing data security concepts you should understand if you are looking to evaluate vendor claims. The third blog post will be a hit list with examples of language to look for on vendor sites, how to decipher it, and how to tell valuable security statements from the buzzwords and hype that permeate the industry.
Data Security and Encryption Concepts:
In this first section we talk about security and encryption concepts. This post is about defining security terms and understanding important technical components such as what encryption is, the types of encryption, and the differences between encryption keys and password hashes. Don’t worry, there is no associated test.
Understanding the difference between Features and Vulnerabilities
A Feature is placed on purpose. The username and password used to log into your favorite site, known as your authentication method, is a feature they have implemented to differentiate and secure user accounts. The “Share data only with friends” button on Facebook is intentional and under normal circumstances will only allow your friends to see your data; this would also be a feature. A programmer built this feature into the software and it is behaving as expected.
Vulnerabilities are unintentional. They are usually introduced when someone writes a piece of computer code and does not perform the proper checks on that code. By attempting to manipulate the program in unexpected ways, someone might find the vulnerability, which allows them to gain access to other data, crash the system, or load their own programs onto the system. Other types of vulnerabilities can occur at other levels of a computer, including the hardware level, but software vulnerabilities are the most common.
As an example of a vulnerability, suppose a programmer wrote code for a search box on a website and expected people to search using only letters, not numbers or symbols, but did not write the code to check that only letters are entered in the box. They may have unintentionally introduced a vulnerability. Later someone might come along and, by inserting non-letter characters in the search box, learn that they can make the website show them other people’s private data. This newly discovered “capability” would be a vulnerability.
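The search box scenario can be sketched in a few lines. This is only an illustration under stated assumptions: the table layout, the function names (`make_demo_db`, `search_unchecked`, `search_checked`) and the sample rows are all hypothetical, and an in-memory SQLite database stands in for whatever a real website would use.

```python
import sqlite3

def make_demo_db() -> sqlite3.Connection:
    """Build a throwaway in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE posts (owner TEXT, body TEXT, is_private INTEGER)")
    conn.executemany(
        "INSERT INTO posts VALUES (?, ?, ?)",
        [
            ("alice", "hello world", 0),
            ("bob", "secret diary entry", 1),
        ],
    )
    return conn

def search_unchecked(conn: sqlite3.Connection, term: str) -> list:
    """Vulnerable: the search term is pasted straight into the SQL text."""
    sql = (
        "SELECT body FROM posts WHERE is_private = 0 "
        "AND body LIKE '%" + term + "%'"
    )
    return [row[0] for row in conn.execute(sql)]

def search_checked(conn: sqlite3.Connection, term: str) -> list:
    """Safer: reject non-letter input and pass the term as a query parameter."""
    if not term.isalpha():
        raise ValueError("search term must contain letters only")
    sql = "SELECT body FROM posts WHERE is_private = 0 AND body LIKE ?"
    return [row[0] for row in conn.execute(sql, ("%" + term + "%",))]

conn = make_demo_db()
print(search_unchecked(conn, "hello"))        # ordinary search, public posts only
print(search_unchecked(conn, "' OR 1=1 --"))  # symbols change the query itself
```

With the unchecked version, the symbols in the second search rewrite the query so that the private row comes back too. The checked version refuses the same input before it ever reaches the database.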
At a basic level encryption is using math to scramble (encrypt) data so that unless you have the proper key you cannot unscramble (decrypt) the data later. You can expect that fifty years from now the data will be accessible due to increases in computing power, but the confidentiality and value of data decline over time, so from a practical standpoint it can be labeled secure.
The primary encryption algorithm used when I started implementing systems was DES, which was used for 20+ years before a machine was built that could crack the encryption in less than a week for a reasonable amount of money. DES is still in use today in some systems that have not migrated to newer systems such as AES, which is now about 12 years old and the current standard endorsed by the U.S. government for Top Secret data. With that said the hunt for the next standard is always underway.
Single Key Encryption and Public Key Encryption
The two methods of encryption you should know about are called Single Key (symmetric) encryption and Public Key (asymmetric) encryption. Both types of encryption are referenced by other names depending on whom you are talking to.
It is also important to know that they are often used together, since each one serves certain purposes more effectively or efficiently than the other. As an example, Public Key encryption is generally recognized as far more processing intensive, so you might use Public Key encryption to set up and validate a secure channel between entities, then switch to Single Key encryption once you are sending data.
Single Key Encryption:
Single key encryption uses one key for both encryption and decryption of the data. This method is most valuable when the data is only going to be accessed by one individual or group, with 100% trust between the parties or a secure method of sharing the key between them.
An example of single key encryption is a document on your computer that you encrypted using a password. You are the only one who knows the password, and it can be used to both encrypt and decrypt the document. If the password, which in this case is also acting as an encryption key, is lost, the document will be unreadable.
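To make the password-protected document concrete, here is a toy sketch using only the Python standard library. Everything in it is an assumption for illustration: the function names are made up, the password is stretched into a key with PBKDF2, and the "cipher" is a simple XOR against a SHA-256 based keystream. It is not AES and should never protect real data; the point is only that the same password both scrambles and unscrambles the document.

```python
import hashlib
import os

def derive_key(password: str, salt: bytes) -> bytes:
    """Stretch a password into a 32-byte key with PBKDF2."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key, nonce and a counter (illustration only)."""
    stream = bytearray()
    counter = 0
    while len(stream) < length:
        stream += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(stream[:length])

def encrypt(password: str, plaintext: bytes):
    """Return (salt, nonce, ciphertext); salt and nonce are random, not secret."""
    salt = os.urandom(16)
    nonce = os.urandom(16)
    key = derive_key(password, salt)
    stream = _keystream(key, nonce, len(plaintext))
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, stream))
    return salt, nonce, ciphertext

def decrypt(password: str, salt: bytes, nonce: bytes, ciphertext: bytes) -> bytes:
    """Re-derive the same key from the password and reverse the XOR."""
    key = derive_key(password, salt)
    stream = _keystream(key, nonce, len(ciphertext))
    return bytes(c ^ k for c, k in zip(ciphertext, stream))

salt, nonce, scrambled = encrypt("my document password", b"Dear diary...")
print(decrypt("my document password", salt, nonce, scrambled))
```

Decrypting with any other password produces garbage rather than the original text, which is exactly why losing the password means losing the document.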
Public Key Encryption:
In Public Key Encryption there are two keys that are mathematically linked. When someone encrypts data using the Public Key the only person who can decrypt the data is the person with the Private Key. Public Key cryptography is also used to generate and apply digital signatures to files or documents, which is beyond the scope of this blog post.
An example of Public Key Encryption is sending an email that is secured so that only the recipient can decipher it. By having the recipient’s Public Key you can encrypt the email before sending it. When they receive the email they will decrypt it using their Private Key, which only they have. This same scenario works in reverse if they want to send an email back to you; they would use your Public Key to encrypt the email, send it to you, then you would use your Private Key to decrypt it. In this situation, even if someone intercepts the email they are left with unintelligible data since they don’t have the Private Key.
In your web browsing every day you use a combination of these types of encryption. Anytime you browse to a secure website, marked by a URL with “https” at the beginning, you are using both Public Key and Single Key encryption, courtesy of TLS/SSL.
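The "mathematically linked" keys can be shown with a toy version of RSA, one well-known public key algorithm (the post itself does not name one, so treat that choice as an assumption). The primes here are absurdly small so the numbers stay readable; real keys use primes hundreds of digits long plus padding schemes that this sketch leaves out.

```python
# Toy key pair built from two tiny primes. Illustration only.
p, q = 61, 53
n = p * q                  # modulus, shared by both keys
phi = (p - 1) * (q - 1)    # used only while generating the keys
e = 17                     # public exponent
d = pow(e, -1, phi)        # private exponent (modular inverse, Python 3.8+)

public_key = (e, n)        # safe to hand to anyone
private_key = (d, n)       # kept secret by the recipient

def encrypt(message: int, key: tuple) -> int:
    """Anyone holding the public key can do this."""
    exponent, modulus = key
    return pow(message, exponent, modulus)

def decrypt(ciphertext: int, key: tuple) -> int:
    """Only the private key holder can undo it."""
    exponent, modulus = key
    return pow(ciphertext, exponent, modulus)

message = 65               # a message encoded as a number smaller than n
ciphertext = encrypt(message, public_key)
print(ciphertext, decrypt(ciphertext, private_key))
```

An eavesdropper sees only `ciphertext` and the public key. Recovering the private exponent requires factoring `n`, which is trivial for 3233 but impractical at real key sizes.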
The Encryption Key used to encrypt and decrypt data can be anything. Usually it is a long series of random characters, which is used by the encryption algorithm to process the data. The outputs from the encryption process are blobs of scrambled data called ciphertext, which is your data, only encrypted. Think of the key as a really long, secure password.
If you have chosen your key wisely and the encryption algorithm is solid, assuming no one steals your key and data, your secrets will be safe for many years to come.
An important rule when generating a key is that it should be large and random. If the encryption algorithm is secure but the key is non-random or too short someone can use a type of attack called a brute-force attack, where they try every combination of potential keys to access the data. This is also true for passwords, which we will discuss below.
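As a minimal sketch of "large and random", Python’s standard `secrets` module draws from the operating system’s cryptographic random source. The 32-byte (256-bit) default and the function name `generate_key` are illustrative choices, not a recommendation from any particular vendor.

```python
import secrets

def generate_key(num_bytes: int = 32) -> bytes:
    """Return num_bytes of cryptographically strong random data."""
    return secrets.token_bytes(num_bytes)

key = generate_key()
print(len(key) * 8, "bit key:", key.hex())
```

Contrast this with a key typed in by a person or produced by a general-purpose random number generator, either of which shrinks the number of keys an attacker actually has to try.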
It is worth noting that all encryption keys and passwords can be brute-forced given enough computing power and time. If it would take 200 years with modern computers to access a piece of data, it is effectively secure. It is not that the data cannot be accessed; it is that the data cannot be accessed in any practical timeframe.
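The arithmetic behind "any practical timeframe" is easy to sketch. The guessing rate below (one billion guesses per second) is a made-up figure for illustration, not a measurement of any real attacker, and the function names are hypothetical.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

def keyspace(alphabet_size: int, length: int) -> int:
    """Number of possible keys or passwords of a given length."""
    return alphabet_size ** length

def years_to_exhaust(num_keys: int, guesses_per_second: float) -> float:
    """Worst-case time to try every key at a fixed guessing rate."""
    return num_keys / guesses_per_second / SECONDS_PER_YEAR

RATE = 1e9  # assumed guesses per second (hypothetical attacker)

for label, size in [
    ("8 lowercase letters", keyspace(26, 8)),
    ("56-bit key (DES)", 2 ** 56),
    ("128-bit key", 2 ** 128),
]:
    print(f"{label}: {years_to_exhaust(size, RATE):.3g} years")
```

Each extra bit doubles the work, which is why the jump from a 56-bit key to a 128-bit key moves the estimate from a couple of years to a span far longer than anyone will care about the data.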
Good encryption for the average person is a combination of a strong algorithm AND a large, random key. If you are lacking one of these your data will not be secure.
For this reason it is important to know what encryption system people are using to secure your data. Any time I see the word “Proprietary” with no additional details I become concerned. It is highly unlikely that a single organization has built an encryption algorithm stronger than the publicly accepted gold standard, which as of this writing is AES. There are other systems that companies might choose to use, such as Twofish, but they should be willing to reveal what method they are using, and you should be able to find information on that encryption through a search engine.
Keys vs. Passwords
Passwords and encryption keys are very similar in concept; they are unique pieces of data that provide assurance that you are the only one accessing a system or a piece of data.
The difference between a password and an encryption key is defined by what the computer does with it. Above we discussed what an encryption key is used for: to secure your data against use even if the data itself is compromised. Without the key the data is useless.
A password in its standard form is used to authenticate your identity and authorize access to certain data on a system. It is not used to encrypt or otherwise obscure your data so that it is unreadable.
As an example of simple password use we will use Facebook. When you provide a password to Facebook it allows you to access your profile and data. In this case your password is used to validate your identity and grant access to different pages and profiles on the Facebook website. If someone hacked into Facebook and was able to download your profile data that is exactly what they would have – your completely readable profile. It was not encrypted so the data can be read easily.
I will not over-muddle the use of passwords and encryption keys, but these concepts overlap. Sometimes your password will also be used as an encryption key or be tied to an encryption system so that once you log in you also gain access to your encryption keys. This type of system is often used in corporate environments where you cannot trust users to secure their own keys against being stolen or lost. In this situation, when someone logs into the network they are also provided access to their encryption keys.
Who holds the keys to the Kingdom?
When using online services there is a risk in where the keys are stored. Are you holding the keys, is the vendor holding the keys or do both of you have copies? For this example we will assume that we are using single key encryption and you are going to use an online backup service.
Vendor Maintained Encryption Keys:
Most consumer-oriented services hold the encryption keys for you. Why would they do this? They understand that for a home user, keeping encryption keys in a safe location is a major issue, so they set up infrastructure to house the keys.
In this case you will be forced to trust that the vendor is storing your keys and your data securely. If they are not, there is potential for all of your data to be compromised. A positive aspect of this is that when your father’s computer fails you can recover his data even if he does not have his encryption keys. If the encryption keys were stored on the computer and he did not have backups (isn’t that what he is paying the vendor for?) his data becomes useless when the hard drive crashes and takes the keys with it.
A paranoid security expert ignoring ease-of-use or practicality would tell you that this type of service should never be used. From a more practical standpoint, if the consumer of the service cannot be counted on to maintain encryption keys, it is best to identify a vendor to provide the service, or qualified technical help to assure the keys are being secured. I would not typically recommend this type of service for any business except a tiny business without any technical staff, confidential data, or trade secrets, and in that case it would come with a bucketful of disclaimers.
Customer Maintained Encryption Keys:
Backup vendors oriented toward businesses will often rely on the customer to store and maintain the encryption keys. This puts ownership onto the customer to have infrastructure and systems in place to manage and distribute keys in a secure manner but removes the potential for a vendor-side security compromise to reveal unencrypted data.
In this situation the vendor never has visibility into the unencrypted data. The data is encrypted before it is uploaded to the vendor’s servers, making it meaningless to an attacker as long as the backup vendor’s encryption system is solid.
This places complete ownership of the keys onto the customer, which usually includes generating a random key of proper length, then storing it in such a way as to assure that even if the customer’s servers melt down there are copies of the key available elsewhere. In situations like this the customer needs to make an appropriate number of copies of the keys, then distribute them to safe locations where they can be recalled if needed. If the keys are lost, so is the data, since the vendor cannot decrypt it.
Other than staying away from Cloud-based or Hosted vendors, this is the most secure method to handle your encryption keys, but it places ownership onto you, the customer, to keep the keys safe. If they are lost, your backups are worthless.
What is Hash other than a drug?
When your passwords are stored on most websites and servers they are hashed, which is different from encryption since the algorithm that generates hashes does not have the capability to turn the hash back into a password. It is a one-way function only.
If you hear in the news that an organization had a hundred million passwords compromised and they state that unless your password is simple or short it should be safe, that means they have hashed your password before it was stored.
At a basic level a hash is a fixed-length unique value that is returned when you feed a certain input into a hash algorithm. When I use a Hash Generator and feed “bob” into it I receive back “bf8bea686c94bce1a58631cf5a3e9cf9ebabb31e16e353f4caa97f052bb629ff2b945aaa8f8caaf51fdec2c7f874420e45617f6abcbf9407f08ef939c1aa1e11” as the Whirlpool hash. This number is for our purposes unique to the word “bob”. If two inputs generate the same hash it is called a “collision” and from a mathematical perspective is extremely rare with modern hash algorithms.
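You can try this yourself with Python’s standard `hashlib` module. One swap to be aware of: Whirlpool is not guaranteed to be available in every Python build, so this sketch uses SHA-256 instead, and its output will not match the Whirlpool value quoted above. The helper name `sha256_hex` is just for illustration.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of text as a hexadecimal string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

for word in ["bob", "Bob", "a much longer input than the others"]:
    digest = sha256_hex(word)
    print(f"{word!r}: {len(digest)} hex characters: {digest}")
```

Whatever the length of the input, the output is always the same size, the same input always produces the same output, and even a one-letter change such as “bob” to “Bob” gives a completely different value.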
When using hashes you expect someone to know what input it takes to derive the proper hash value, which is why it works well for items like passwords where you are confirming someone knows some piece of data, not trying to recover it. This is also why most websites cannot tell you your old password; they are only storing a hash of your password, not the password itself.
Putting Salt in the Hash
This is not an article on passwords alone but there is another term you will hear thrown around when talking about passwords. That term is “Salt”. When hashing data, Salting is the process of placing an additional bit of data into the hash process to assure the result is unique.
Think of it as a method to make your password stronger. By doing this you thwart certain types of attacks, specifically Rainbow Table attacks, where (loosely defined) a hacker compares a massive dictionary of pre-computed hash values against a customer hash database to learn the passwords used. If a Salt is included when the password is hashed, the dictionary is worthless unless it is a very large dictionary, since the resultant hashes will be different from those in the dictionary. For example, if my password was “bob” and my Salt was “bigbadpasswordstuff” I would pass “bobbigbadpasswordstuff” through the hash, which would generate a different hash than just “bob” itself. A Rainbow Table is likely to contain “bob” but probably is not going to contain “bobbigbadpasswordstuff”.
Salt values are most valuable if they are not revealed, since a secret salt strengthens the password, but even when a salt is discovered it still hinders Rainbow Table attacks, since the entire table would need to be recomputed with the salt to identify the passwords easily. It won’t stop an attacker but may slow them down.
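Here is a minimal sketch of salting as described above, appending the salt to the password before hashing. The function names, the per-user random salt, and the use of SHA-256 are my assumptions for illustration; production systems generally go further and use a deliberately slow password hashing function rather than a single fast hash.

```python
import hashlib
import hmac
import secrets

def hash_password(password: str, salt: str) -> str:
    """Hash password + salt, as in the bob + bigbadpasswordstuff example."""
    return hashlib.sha256((password + salt).encode("utf-8")).hexdigest()

def new_record(password: str):
    """Create what a site would store: a fresh random salt and the salted hash."""
    salt = secrets.token_hex(16)
    return salt, hash_password(password, salt)

def verify(password: str, salt: str, stored_hash: str) -> bool:
    """Re-hash the login attempt with the stored salt and compare."""
    return hmac.compare_digest(hash_password(password, salt), stored_hash)

unsalted = hashlib.sha256(b"bob").hexdigest()
salted = hash_password("bob", "bigbadpasswordstuff")
print(unsalted == salted)  # the salted hash no longer matches a table entry for "bob"

salt, stored = new_record("bob")
print(verify("bob", salt, stored), verify("alice", salt, stored))
```

Because each record gets its own salt, two users who both choose “bob” end up with different stored hashes, so a single pre-computed table cannot crack them both at once.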
And if you survived this far, thanks for coming along. Next time on security concepts of the not-so-rich and not-so-famous we have:
When good Encryption Goes Bad or “What the hell happened to WEP?”
Are there cases where encryption is a liability?
What is Hacking?
The Terms of Service Tightrope