Encrypting and decrypting large data using Java and RSA

Encrypting large data using Java and RSA is not much different from encrypting small data, as long as you know the basics.

Our goal is to encrypt a String of arbitrary length, send it over the Internet and decrypt it again on the other end. We will not discuss key exchange here since that is a rather trivial task.

What we need first is a KeyPair. Where you get it from does not matter in the end; here we will create one on the fly.

KeyPairGenerator kpg = KeyPairGenerator.getInstance("RSA");
kpg.initialize(1024);
this.keypair = kpg.generateKeyPair();

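Now that we have our KeyPair we also need a Cipher that works with our keys. I used a plain RSA Cipher without specifying mode or padding, which means the provider picks its defaults (PKCS#1 v1.5 padding on a standard Oracle/OpenJDK JVM).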

this.cipher = Cipher.getInstance("RSA");
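The two snippets above can be put together roughly like this. It is only a minimal sketch: the class name RsaSetup is made up for illustration, and the transformation "RSA/ECB/PKCS1Padding" is spelled out on the assumption that this is what a plain "RSA" resolves to on your JVM. Other providers may pick a different default, so naming it explicitly avoids surprises.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import javax.crypto.Cipher;

// Hypothetical holder class; the name RsaSetup is not from the article.
class RsaSetup {
    final KeyPair keypair;
    final Cipher cipher;

    RsaSetup(int keyBits) {
        try {
            KeyPairGenerator kpg = KeyPairGenerator.getInstance("RSA");
            kpg.initialize(keyBits);
            this.keypair = kpg.generateKeyPair();
            // Spell out mode and padding instead of relying on the provider default.
            this.cipher = Cipher.getInstance("RSA/ECB/PKCS1Padding");
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("RSA is not available in this JVM", e);
        }
    }

    public static void main(String[] args) {
        RsaSetup setup = new RsaSetup(1024);
        System.out.println(setup.keypair.getPublic().getAlgorithm());
        System.out.println(setup.cipher.getAlgorithm());
    }
}
```

The checked exceptions are wrapped only to keep the sketch short to call; in real code you may prefer to let them propagate.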

Next we would like to have 2 functions that can encrypt and decrypt. Here we will face 2 big problems:

  1. Ciphers do not use Strings, they use byte arrays.
  2. Block ciphers cannot encrypt arbitrarily long byte arrays directly.

Both of those points seem rather trivial to solve – but the devil is in the details.
First of all we have to understand that a String and a String can be different things. At the end of the day, a String is just one encoding of quite a lot of 0s and 1s. Even if our bytes are set in stone, that does not mean that byte -> String -> byte will give us identical byte arrays.
Question: Why not?
Answer: Encodings!
I’m pretty sure you have heard of UTF-8 by now. UTF-8 only defines how a series of bytes is mapped to characters. The problem is that quite a lot of byte patterns are not valid UTF-8 at all, and a decoder silently replaces them with a placeholder character. This is also why German letters like öäü or Japanese symbols sometimes show up as a box, a star etc. when the wrong encoding is used. We have all seen it.
So we need a better representation than “normal” Strings. Especially for transferring the String (or storing it or … well, basically anything) we want a representation that keeps the exact bytes and survives byte -> String -> byte operations.
The 2 most common ways are Base64 encoding and hex encoding. In my example I will use hex encoding since I had to call REST services with encrypted Strings, and Base64 encoders may insert CR/LF line breaks when they see fit, which is something you do not really want in URLs.
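To make the byte -> String -> byte problem concrete, here is a small sketch. The class RoundTripDemo and the sample bytes are made up for illustration; the sample merely stands in for ciphertext, which tends to contain byte sequences that are not valid UTF-8.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; not part of the article's code.
class RoundTripDemo {
    // byte -> String -> byte, going through UTF-8 in both directions
    static byte[] viaUtf8String(byte[] raw) {
        String s = new String(raw, StandardCharsets.UTF_8);
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 0xC3 followed by 0x28 is not a valid UTF-8 sequence, and neither is a lone 0xFF.
        byte[] raw = { (byte) 0xC3, (byte) 0x28, (byte) 0xFF };
        byte[] back = viaUtf8String(raw);
        System.out.println("bytes survived: " + Arrays.equals(raw, back));
    }
}
```

Real text survives this round trip, arbitrary bytes do not. That is the whole reason for putting a hex (or Base64) layer between the cipher and the String.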
But, before I go on, let’s have a look at the encrypt and decrypt functions:

public String encrypt(String plaintext) throws Exception{
	this.cipher.init(Cipher.ENCRYPT_MODE, this.keypair.getPublic());
	byte[] bytes = plaintext.getBytes("UTF-8");

	byte[] encrypted = blockCipher(bytes,Cipher.ENCRYPT_MODE);

	char[] encryptedTransferable = Hex.encodeHex(encrypted);
	return new String(encryptedTransferable);
}

First we init the cipher with encryption mode and our public key. We could also have gotten the key from somewhere else, the only important part is that you need a cipher and a key that work together or you will get exceptions.
After that, we convert the plaintext to a byte array using UTF-8. Leaving out the charset would make getBytes() fall back to the platform default encoding, which can differ between machines and garble the String when it is recreated later. I included the UTF-8 for safety reasons.
Next we call the function blockCipher, which does all the magic of encrypting in blocks (we will come to that later).
Now we encode our new, encrypted byte[] into a Hex based String. For this purpose I used the org.apache.commons.codec.binary.Hex class. If you do not want to import that for any reason, have a look at the source code here: Kickjava.com
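If you would rather not pull in commons-codec, a hex encoder and decoder fits in a few lines. The following is a minimal sketch under that assumption; SimpleHex is a made-up name and not a replacement for the Apache class in every detail.

```java
// Hypothetical helper, a stand-in for org.apache.commons.codec.binary.Hex.
class SimpleHex {
    private static final char[] DIGITS = "0123456789abcdef".toCharArray();

    static String encode(byte[] data) {
        char[] out = new char[data.length * 2];
        for (int i = 0; i < data.length; i++) {
            int v = data[i] & 0xFF;             // treat the byte as unsigned
            out[2 * i] = DIGITS[v >>> 4];       // high nibble
            out[2 * i + 1] = DIGITS[v & 0x0F];  // low nibble
        }
        return new String(out);
    }

    static byte[] decode(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("odd number of hex digits");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not a hex digit at position " + (2 * i));
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] raw = { 0x00, (byte) 0xC3, 0x28, (byte) 0xFF };
        String hex = encode(raw);
        System.out.println(hex);
        System.out.println(java.util.Arrays.equals(raw, decode(hex)));
    }
}
```

Every byte becomes exactly two characters from 0-9 and a-f, so the result is safe to put into a URL, at the price of doubling the size.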

The String is now ready to be saved to the disk, transferred over the Internet or even sent via mail.

Decryption is much the same, just the other way round. This time we go from HexString -> byte[] -> String. Note that we again create a String that is UTF-8 based at the end.

public String decrypt(String encrypted) throws Exception{
	this.cipher.init(Cipher.DECRYPT_MODE, this.keypair.getPrivate());
	byte[] bts = Hex.decodeHex(encrypted.toCharArray());

	byte[] decrypted = blockCipher(bts,Cipher.DECRYPT_MODE);

	return new String(decrypted,"UTF-8");
}

So far so good, but what about the voodoo in blockCipher? Here’s the source:

private byte[] blockCipher(byte[] bytes, int mode) throws IllegalBlockSizeException, BadPaddingException{
	// initialize 2 buffers.
	// scrambled will hold intermediate results
	byte[] scrambled = new byte[0];

	// toReturn will hold the total result
	byte[] toReturn = new byte[0];
	// if we encrypt we use 100 byte long blocks. Decryption requires 128 byte long blocks (because of RSA)
	int length = (mode == Cipher.ENCRYPT_MODE)? 100 : 128;

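	// another buffer. this one will hold the bytes that have to be modified in this step.
	// it must not be longer than the input, or short inputs get padded with zero bytes (see comment 35)
	byte[] buffer = new byte[Math.min(length, bytes.length)];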

	for (int i=0; i< bytes.length; i++){

		// if we filled our buffer array we have our block ready for de- or encryption
		if ((i > 0) && (i % length == 0)){
			//execute the operation
			scrambled = cipher.doFinal(buffer);
			// add the result to our total result.
			toReturn = append(toReturn,scrambled);
			// here we calculate the length of the next buffer required
			int newlength = length;

			// if newlength would be longer than remaining bytes in the bytes array we shorten it.
			if (i + length > bytes.length) {
				 newlength = bytes.length - i;
			}
			// clean the buffer array
			buffer = new byte[newlength];
		}
		// copy byte into our buffer.
		buffer[i%length] = bytes[i];
	}

	// flush whatever is left in the buffer. the loop above never processes the final block itself,
	// so this always runs for the last block, full or not.
	// example: we encrypt 110 bytes. 100 bytes per run means the last 10 bytes are still in the buffer array
	scrambled = cipher.doFinal(buffer);

	// final step before we can return the modified data.
	toReturn = append(toReturn,scrambled);

	return toReturn;
}

I will not comment the source again, just go ahead and read it. The most important part is maybe this: int length = (mode == Cipher.ENCRYPT_MODE)? 100 : 128;
This part tells the code whether we use chunks that are 100 bytes long or 128 bytes long.
Why do we need that?
RSA works on fixed-size blocks. No matter how long (or rather: short) the input, with our 1024 bit key it will produce a 128 byte long output (1024 / 8 = 128). That explains the 128.
But why the 100? Could we not just use the whole byte array?
No we can’t. Most guides will not tell you this part at all since the authors forget that plaintext Strings can get quite large. No block cipher can ever encrypt a bitstring longer than the maximum block size. That’s why they are called block ciphers (as opposed to stream ciphers that encrypt bit by bit or byte by byte).
If you ever find a class that can take arbitrarily long input, uses a block cipher and generates an output, you can be 100% sure that the splitting into blocks is done internally.
So what we do is:
For ENcryption we use a maximum of 100 bytes of plaintext and encrypt each of those byte chunks to exactly 128 byte long ciphertext.
For DEcryption we use 128 bytes long chunks of ciphertext and decrypt each to a (maximal) 100 byte long plaintext.
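The same chunking idea can also be written without the byte-by-byte buffer handling. This is only a sketch of an alternative, not the blockCipher method from above: the class ChunkedRsa and its method names are made up, and it assumes a 1024 bit key with PKCS#1 v1.5 padding so that the 100 and 128 byte chunk sizes apply.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.Key;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import javax.crypto.Cipher;

// Hypothetical sketch; class and method names are not from the article.
class ChunkedRsa {
    // Runs the cipher over the input one chunk at a time and joins the results.
    static byte[] process(byte[] input, int mode, Key key, int chunkSize)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("RSA/ECB/PKCS1Padding");
        cipher.init(mode, key);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int from = 0; from < input.length; from += chunkSize) {
            int len = Math.min(chunkSize, input.length - from);
            // doFinal resets the cipher, so it can be reused for the next chunk
            byte[] result = cipher.doFinal(input, from, len);
            out.write(result, 0, result.length);
        }
        return out.toByteArray();
    }

    // Assumes a 1024 bit key: 100 byte plaintext chunks, 128 byte ciphertext chunks.
    static String roundTrip(String text) {
        try {
            KeyPairGenerator kpg = KeyPairGenerator.getInstance("RSA");
            kpg.initialize(1024);
            KeyPair pair = kpg.generateKeyPair();
            byte[] plain = text.getBytes(StandardCharsets.UTF_8);
            byte[] encrypted = process(plain, Cipher.ENCRYPT_MODE, pair.getPublic(), 100);
            byte[] decrypted = process(encrypted, Cipher.DECRYPT_MODE, pair.getPrivate(), 128);
            return new String(decrypted, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("a short message"));
    }
}
```

Because the last chunk is simply whatever is left, short inputs and inputs that are not a multiple of the chunk size need no special treatment here.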
Note that we do not have to use exactly 100 bytes. We could and maybe should use a slightly bigger chunk. With a 1024 bit key and PKCS#1 v1.5 padding the maximum is 117 bytes (128 bytes minus 11 bytes of padding overhead); anything longer gives you an IllegalBlockSizeException or similar.
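Instead of trial and error, the chunk sizes can be derived from the key itself. The sketch below assumes PKCS#1 v1.5 padding (other paddings have a different overhead) and a provider that reports an oversized chunk with an IllegalBlockSizeException; the class RsaBlockSizes and its methods are made up for illustration.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPairGenerator;
import java.security.interfaces.RSAPublicKey;
import javax.crypto.Cipher;
import javax.crypto.IllegalBlockSizeException;

// Hypothetical helper; names are not from the article.
class RsaBlockSizes {
    // Ciphertext chunk size: the modulus length in bytes (128 for a 1024 bit key).
    static int decryptChunk(RSAPublicKey key) {
        return (key.getModulus().bitLength() + 7) / 8;
    }

    // Largest plaintext chunk, assuming the 11 byte overhead of PKCS#1 v1.5 padding.
    static int encryptChunk(RSAPublicKey key) {
        return decryptChunk(key) - 11;
    }

    // Returns true if a plaintext chunk of the given size encrypts without complaint.
    static boolean fits(RSAPublicKey key, int size) {
        try {
            Cipher cipher = Cipher.getInstance("RSA/ECB/PKCS1Padding");
            cipher.init(Cipher.ENCRYPT_MODE, key);
            cipher.doFinal(new byte[size]);
            return true;
        } catch (IllegalBlockSizeException e) {
            return false; // chunk too long for this key
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    static RSAPublicKey newPublicKey(int bits) {
        try {
            KeyPairGenerator kpg = KeyPairGenerator.getInstance("RSA");
            kpg.initialize(bits);
            return (RSAPublicKey) kpg.generateKeyPair().getPublic();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        RSAPublicKey key = newPublicKey(1024);
        System.out.println("encrypt chunk: " + encryptChunk(key));
        System.out.println("decrypt chunk: " + decryptChunk(key));
    }
}
```

With sizes computed this way the 100 : 128 line no longer has to be touched when the key length changes.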
One method that was used above but not stated yet is the following:

private byte[] append(byte[] prefix, byte[] suffix){
	byte[] toReturn = new byte[prefix.length + suffix.length];
	for (int i=0; i< prefix.length; i++){
		toReturn[i] = prefix[i];
	}
	for (int i=0; i< suffix.length; i++){
		toReturn[i+prefix.length] = suffix[i];
	}
	return toReturn;
}

This only appends 1 byte array to the other.
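The two loops can also be replaced by library calls. A minimal sketch of the same operation follows; ByteArrays is a made-up class name, and the behaviour is meant to match the append method above.

```java
import java.util.Arrays;

// Hypothetical drop-in for append(); same result, using library copies.
class ByteArrays {
    static byte[] append(byte[] prefix, byte[] suffix) {
        // copyOf allocates the full length and copies the prefix in one go
        byte[] joined = Arrays.copyOf(prefix, prefix.length + suffix.length);
        System.arraycopy(suffix, 0, joined, prefix.length, suffix.length);
        return joined;
    }

    public static void main(String[] args) {
        byte[] joined = append(new byte[] { 1, 2 }, new byte[] { 3 });
        System.out.println(Arrays.toString(joined));
    }
}
```

When many blocks are appended in a loop, a ByteArrayOutputStream avoids copying the whole result again for every block.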
And we’re done. With this you should have everything you need to encrypt large data with RSA. Note that it will take a LOT of time to encrypt 1 MB of data with this algorithm (3 minutes and more). But the main goal was to encrypt large Strings, and a String of 1 MB is really HUGE.

Hope you enjoyed the trip! Leave comments if you like it, find bugs etc :)

39 Comments.

  1. Dario Pizzuto & Company

    thanks for the post, you solved a problem that I did not sleep for three nights.
    The idea of using the hexadecimal is excellent.
    I hope to achieve in future versions without conversion hexadecimal.
    Thanks again.

    D. Pizzuto & Company.
    University of Messina (Sicily,Italy)

  2. Very helpful article!!!

  3. Can you also write same thing for DES-X, Triple DES?!

  4. Thanx alot that was very helpful :grin:

  5. hmm the code does not work for german letters like öä…

  6. it should work as long as you make sure you use the right encoding at the encrypt/decrypt functions

  7. I have troubles encrypting data in PHP and decrypting it in Java.
    I wonder if you can help me with my problem.

    Thank you

  8. I copied all the code and just changed the certificate because I use a JKS file (Java KeyStore), but it doesn't work. When I decrypt I get the error
    ==> Bad padding , Data must Start with zero
    Any help please? Thanks in advance.

  9. Hi, really nice code sample.

    I’m having a bit of trouble with it though. That i suspect has to do with padding. I’m getting an ArrayIndexOutOfBoundsException on line 212 when decrypting. I’ve pasted the source here: http://pastebin.com/FWjB7QEc

    Please forgive the less than elegant solution, it's my first crack at encryption. It will be refactored to use hex instead of base64 as well.

    The reason i suspect padding, is the way you define the array lengths, i understand your great explanation, yet i don’t see your cipher instance definition explained.

    Hope I’m not just being dumb as hell, sadly though, i think i am :-)

    In any case, its a lovely guide.

  10. As it turns out, i was being dumb… I was using a 256byte key to encrypt, and then a 128byte to pseudo decrypt.

    This stuff is hard on my tiny mind ;-)

    TY.

  11. @ Anon(DES-X, Triple DES)
    sorry, no
    DES-X, Triple DES are symmetric ciphers, that’s something completely different

    @Max
    sorry, i have not touched php since 10 years, last time i saw it it couldnt even do objects :) save yourself some time and grief and learn ruby

    @Mohammed: Looks like you’re messing with the block length there

    @Peter: glad it worked!

  12. Hello again Florian,

    Do you by chance know of any good sources, for details regarding padding specs? Manual Cross-platform encryption is tricky when you aren’t in the know.

    Best regards,

    Peter

  13. To be honest, the best source for that is wikipedia
    I have some material on that stuff from my studies, but i’m rather sure that i cannot distribute them since they are not publicly available

  14. Hi, I’m trying to use your code for a University project, but I’ve seen that if I encrypt the same String two times (with the SAME keypair) the encrypted text is different. How is that possible? Is there a way to obtain the same encrypted text? Thanks in advance!

  15. That can’t be right. If you pipe the same text through the same implementation twice, it simply cannot change, that’s the point of encryption to begin with.
    Check if there is any Date, Random etc involved.

  16. Hi Florian, thank you for your quick response.
    I found the solution, we have to use:

    cipher = Cipher.getInstance("RSA/NONE/NoPadding");

    using the BouncyCastle provider, because the padding operation results in different ciphertext (even for the same plaintext) to prevent known-ciphertext attacks.

    Thank you

  17. I guess nobody ever stops learning! Thanks for pointing that out.

  18. You can download the jar file (Hex) from this URL: http://www.jarfinder.com/index.php/jars/versionInfo/37220

  19. Joe and Awen

    Thanks a lot for your code !
    We had troubles with this size limit in RSA encryption and you solved it :-)
    You just saved us ;-)

  20. Hey, we need to encrypt a large object which contains a number of fields. Can we do it using “AES”? We are getting an “IllegalBlockSizeException” for this, can you please help us here?

  21. Well, yes you can encrypt a large object with a ton of fields. Would I do it? No. Read up on the memento pattern (you can incorporate encryption into it, no problem there)
    As for the IllegalBlockSizeException: read the part about length = (mode == Cipher.ENCRYPT_MODE)? 100 : 128;
    You are most likely not using the correct block size when encrypting or decrypting.

  22. great tutorial – saved me a lot of time

    big thanks

  23. how to encrypt and decrypt the files using RSA in java

  24. Thanks a lot dude . I was suffering a lot to get this logic.

  25. @prashanth that’s out of the scope of this article, but you could pass a binary stream into the function instead of a string

  26. Hi Florian,
    your method doesn’t return the same data used in input because you return an array of wrong length.

    I suggest you to use CipherInputStream and CipherOutputStream, see below an example:

    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    CipherInputStream cis = new CipherInputStream(new ByteArrayInputStream(bts), cipher);
    byte[] buffer = new byte[32];
    int i;
    while ((i = cis.read(buffer)) != -1) {
        baos.write(buffer, 0, i);
    }
    baos.close();

  27. Thank you for you explanation and example. mainly the encoded part using hex.. this is really helpful.
    thank you again.

  28. Thx man, this is the best solution!
    Fucking business allright ;)

  29. Good article!
    Very helpful.

    Thanks.

  30. @youest i don’t really understand what you mean? the method works fine. it is not supposed to work with the same lengths.
    besides, using a byte buffer that way is prone to buffer overflows, and you should always close all streams etc. in a finally block

  31. I converted your code to work with files as well. It works okay for plain text files, and can encrypt or decrypt large files, but does not do it for PDF files. It does encrypt PDF but decrypt, although completes without exception, creates a PDF with slight changes in embedded characters and thats why is not opening in PDF viewer

    PDF file is converted to encrypted .lic file and .lic converts to _restore.pdf

    private String getFileNameWithoutExtension(String fileName)
            throws Exception {
        String toReturn = null;
        if (fileName != null) {
            int indexOfDot = fileName.indexOf(".");
            toReturn = fileName.substring(0, indexOfDot);
        }
        System.out.println("fileNameWithoutExtension => " + toReturn);
        return toReturn;
    }

    public File encryptFile(String fileToEncrypt) throws Exception {
        File toReturn = null;
        InputStream in = this.getClass().getResourceAsStream(fileToEncrypt);
        URL url = this.getClass().getResource(fileToEncrypt);

        String path = url.getPath();

        path = path.substring(0, path.lastIndexOf("/"));

        System.out.println("path => " + path);

        String fileAsString = new String(this.readInputStream(in), "UTF-8");
        String encryptedText = this.encrypt(fileAsString);

        toReturn = new File(path + "/"
                + getFileNameWithoutExtension(fileToEncrypt) + ".lic");

        if (toReturn.exists()) {
            toReturn.delete();
        }

        toReturn.createNewFile();

        FileWriter fw = new FileWriter(toReturn);

        fw.write(encryptedText);

        fw.flush();
        fw.close();

        return toReturn;
    }

    public File decryptFile(String fileToDecrypt) throws Exception {
        File toReturn = null;
        InputStream in = this.getClass().getResourceAsStream(fileToDecrypt);

        URL url = this.getClass().getResource(fileToDecrypt);
        String path = url.getPath();
        path = path.substring(0, path.lastIndexOf("/"));

        System.out.println("path => " + path);

        String fileAsString = new String(this.readInputStream(in), "UTF-8");
        String decryptedText = this.decrypt(fileAsString);

        toReturn = new File(path + "/"
                + getFileNameWithoutExtension(fileToDecrypt) + "_restore" + ".pdf");

        if (toReturn.exists()) {
            toReturn.delete();
        }

        toReturn.createNewFile();

        FileWriter fw = new FileWriter(toReturn);

        fw.write(decryptedText);

        fw.flush();
        fw.close();

        return toReturn;
    }

  32. sorry missed to submit another required method

    private byte[] readInputStream(InputStream in) throws Exception {
        try {
            // System.out.println("in => " + in);
            byte[] toReturn = new byte[0];
            Vector intermediate = new Vector();
            if (in != null) {
                int b = -1;
                while ((b = in.read()) != -1) {
                    intermediate.add((byte) b);
                }
                // intermediate.add((byte)b);
            }
            // System.out.println("intermediate => " + intermediate);
            // System.out.println("intermediate.size() => " + intermediate.size());
            toReturn = new byte[intermediate.size()];
            for (int i = 0; i < intermediate.size(); i++) {
                toReturn[i] = (byte) intermediate.elementAt(i);
            }
            return toReturn;
        } finally {
            if (in != null)
                in.close();
        }
    }

  33. Hi Florian,
    Thank you so much for your code, it helped a lot when I couldn’t figure out how to encrypt long strings.
    I’m facing an issue now though, when trying to use the private key to decrypt a string previously encrypted with the public key, an exception is being thrown on
    //execute the operation
    scrambled = cipher.doFinal(buffer);

    saying: javax.crypto.BadPaddingException: Data must start with zero

    I haven’t been able to find any solution, except some people saying that the wrong private key is being used, which I don’t think I’m doing. :sad:

  34. Correction to my previous post, the issue was with the keysize of the keystore, when it’s 1024, it works fine. Do I need to change the buffer length according to the keysize coz this doesn’t work for 2048…

  35. Oops, one more comment; where you have

    byte[] buffer = new byte[length];

    it should be:

    byte[] buffer = new byte[(bytes.length > length ? length : bytes.length)];

    as if the text is less than the buffer length, the recalculation of buffer size will not take place and the decryption will result in some garbage characters being appended at the end of the string.

  36. I think if you’re using a 2048-bit key length, then you should be using a 256-length buffer for decryption (2048/8=256) and a 245-length buffer for encryption (256-11=245).

  37. Jose Luis Montes de Oca

    It may be a little late, but I came into the same issue. After a little research I found that the block size depends on the “keyLength”, so the code can be generalized as follows:

    int length = (mode == Cipher.ENCRYPT_MODE) ? (keyLength / 8 ) - 11 : (keyLength / 8 );

    Hope this helps…

  38. thank! about your code!

  39. As Lisa correctly mentioned, if the text is relatively small then encryption followed by decryption will give a different result because of the size of the resulting buffer.
    Lisa, thank you.
