PDF Format Analysis (15): PDF Security (Encryption and Decryption)
PDF currently supports three encryption methods:
1. Password encryption
2. Certificate encryption
3. Adobe LiveCycle Rights Management
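Password encryption:
This is the first-generation PDF security mechanism, and it is still widely used today.
There are two kinds of password: the document open password and the permissions password.
Document open password: the user must enter it to open the file.
Permissions password: opening and reading the PDF does not require it. It is needed only to change the permission settings or to perform a restricted operation (printing, editing, or copying content from the PDF).
If a PDF is protected with both kinds of password, either one opens it, but only the permissions password lets the user change the restricted features.
This method is relatively simple, and the encryption and decryption algorithms are readily available in open-source PDF libraries such as PDFBox.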
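The restrictions guarded by the permissions password are recorded as bit flags in the /P entry of the encryption dictionary. As a minimal sketch, assuming the bit positions from the PDF specification's table of user access permissions for revision 3 and later handlers, the following decodes the /P value -1036 that appears in the sample dictionary later in this article. The class and method names (PdfPermissionFlags, isSet, decode) are made up for illustration and are not PDFBox API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative decoder for the /P permission flags (hypothetical helper, not PDFBox API). */
final class PdfPermissionFlags {

    /** Tests a 1-based bit position, which is how the PDF specification numbers the flags. */
    static boolean isSet(int p, int bit) {
        return (p & (1 << (bit - 1))) != 0;
    }

    /** Maps each documented flag to whether the operation is allowed (assumes revision 3 or later). */
    static Map<String, Boolean> decode(int p) {
        Map<String, Boolean> flags = new LinkedHashMap<>();
        flags.put("print", isSet(p, 3));
        flags.put("modify", isSet(p, 4));
        flags.put("copy", isSet(p, 5));
        flags.put("annotate", isSet(p, 6));
        flags.put("fillForms", isSet(p, 9));
        flags.put("extractForAccessibility", isSet(p, 10));
        flags.put("assemble", isSet(p, 11));
        flags.put("printHighQuality", isSet(p, 12));
        return flags;
    }

    public static void main(String[] args) {
        // -1036 is the /P value from the sample password-encryption dictionary.
        decode(-1036).forEach((name, allowed) -> System.out.println(name + " = " + allowed));
    }
}
```

For -1036, the "modify" and "assemble" bits are cleared and the other listed bits are set. Keep in mind that a viewer enforces these flags only by convention; the flags themselves do not encrypt anything.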
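Certificate encryption:
Certificates are now in widespread use. For example, we visit many HTTPS websites every day, and their web servers use certificate-based SSL/TLS encryption to prevent eavesdropping and tampering.
Certificate encryption can protect PDF files in the same way. A digital signature lets the recipient verify that the file came from its author, while certificate encryption ensures that only the intended recipients can view the content.
When you protect a PDF with a certificate, you specify the recipients and define a file access level for each recipient or group. As with the permissions password in password encryption, this restricts what each recipient may do: for example, one group may sign and fill in forms while another may edit text or delete pages. You can choose certificates from your list of trusted identities, from files on disk, from an LDAP server, or from the Windows certificate store (Windows only). Always include your own certificate in the recipient list so that you can open the document later.
This method is somewhat more complex, but it is also implemented in open-source PDF libraries such as PDFBox.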
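Adobe LiveCycle Rights Management:
Adobe LiveCycle Rights Management ES is a server-based security system that provides dynamic control over PDFs. It can be configured to run with LDAP, ADS, and other enterprise systems. The policies it provides are stored on the server and can be refreshed from the server. Users connect to Adobe LiveCycle Rights Management ES to work with these policies.
Security policies are stored on the server running Adobe LiveCycle Rights Management ES, but the PDFs themselves are not. In some cases, users must connect to the server to open or continue to use a PDF to which a security policy has been applied.
I have not yet looked into this method in depth and plan to spend some time on it later.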
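Encryption algorithms:
Earlier versions use RC4 with a 40- to 128-bit key (in multiples of 8 bits).
Acrobat 6.0 and later (PDF 1.5) encrypts the document with 128-bit RC4.
Acrobat 7.0 and later (PDF 1.6) encrypts the document with 128-bit AES.
Acrobat X and later (PDF 1.7) encrypts the document with 256-bit AES.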
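For the AES variants, the PDF specification stores a 16-byte initialization vector at the start of each encrypted string or stream and uses CBC mode with PKCS#5-style padding. The sketch below illustrates that layout with the JDK's javax.crypto classes. It assumes the key has already been derived, and PdfAesSketch with its two methods is a hypothetical name for illustration, not PDFBox API.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

/** Illustrative AES-CBC helper for PDF strings and streams (hypothetical, not PDFBox API). */
final class PdfAesSketch {

    /** Decrypts data laid out as: 16-byte IV followed by the CBC ciphertext. */
    static byte[] decrypt(byte[] key, byte[] data) {
        try {
            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                    new IvParameterSpec(data, 0, 16));
            return cipher.doFinal(data, 16, data.length - 16);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Encrypts with a fresh random IV and prepends the IV to the ciphertext. */
    static byte[] encrypt(byte[] key, byte[] plain) {
        try {
            byte[] iv = new byte[16];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                    new IvParameterSpec(iv));
            byte[] body = cipher.doFinal(plain);
            byte[] out = Arrays.copyOf(iv, 16 + body.length);
            System.arraycopy(body, 0, out, 16, body.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] key = new byte[16]; // all-zero demo key, for illustration only
        byte[] encrypted = encrypt(key, "hello".getBytes());
        System.out.println(new String(decrypt(key, encrypted)));
    }
}
```

RC4, by contrast, is a stream cipher with no IV and no padding, so the encrypted data has the same length as the plaintext.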
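What gets encrypted:
- Encrypt all document contents: encrypts the document and its metadata. With this option, search engines cannot access the document metadata.
- Encrypt all document contents except metadata
- Encrypt only file attachments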
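The PDF decryption process works as follows:
1. Find the Encrypt and ID entries in the trailer.
2. Parse the Encrypt dictionary to obtain the encryption parameters.
Password encryption: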
46 0 obj<<
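/Length 128 % encryption key length: 128 bits
/Filter/Standard % password encryption (standard security handler)
/O(=ÌhOÖ}R2¾Q¹ðQ{Ò¦áÄ'ââQ´fn) % used to compute the encryption key and to check whether the entered owner password is correct
/P -1036 % permissions: a 32-bit value in which each bit can represent one permission; more than half of the bits are still reserved
/R 3 % revision; together with /V it determines which security algorithm is used
/U(¼&Ž·6 sc¬Õyš‹¤€) % used to check whether the entered user password (or owner password) is correct
/V 2 % selects the encryption algorithm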
>>
endobj
Certificate encryption:
51 0 obj
<<
/CF<<
/DefaultCryptFilter<<
/CFM/AESV2 % AES encryption
/Length 128 % key length: 128 bits
/Recipients[(0‚j *†H†÷\r ‚[0‚W1‚0ÿ0h0Z1\r0Ubill10\nU\nxxx10U\ntechnology10 *†H†÷\r \nxxx@qq.com10 UCN\nú´/Öé£.,0\r *†H†÷\r€\\8¯0þêþ›©¶ý¦ËßúBQ<”=«œJÙ*êî<ùÐ5"†›ÅÕÉÎŽ:º`Ö§fû|¶&Ðè*5œàÔMž{ª$´ø—0ö8Ï]àóúåíY…pÚ0Š-ŸÑ+\)€&ôFÁo;fÈJô\(rïxuÏÅŒ*áÕy|Pæô0L *†H†÷\r0 `†He°ì±XçwmHo ÍÃî€ üãTîÖ;ñ‹_?÷ê¡ÕÞù:£ÏaÍçúM¬)]
% used to determine whether a recipient is allowed to access the PDF
>>
>>
/Filter/Adobe.PubSec % certificate encryption (public-key security handler)
/R 131105 % revision; together with /V it determines which security algorithm is used
/StmF/DefaultCryptFilter
/StrF/DefaultCryptFilter
/SubFilter/adbe.pkcs7.s5 % PKCS#7 format used for the recipient data
/V 4 % selects the encryption algorithm
>>
endobj
3. Compute the encryption key from the parameters in the Encrypt dictionary.
4. While parsing objects, decrypt string and stream objects with the encryption key.
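For password encryption with revisions 2 to 4, step 3 can be sketched roughly as follows. This is a simplified reading of the key computation algorithm in the PDF specification using only JDK classes, not PDFBox's actual code, and it does not cover revisions 5 and 6 (AES-256), which use a different scheme. StandardKeySketch, pad and computeKey are hypothetical names, and the inputs in main are dummy values.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

/** Illustrative key computation for revisions 2-4 (hypothetical helper, not PDFBox API). */
final class StandardKeySketch {

    /** The fixed 32-byte padding string defined by the PDF specification. */
    static final byte[] PAD = {
        (byte) 0x28, (byte) 0xBF, 0x4E, 0x5E, 0x4E, 0x75, (byte) 0x8A, 0x41,
        0x64, 0x00, 0x4E, 0x56, (byte) 0xFF, (byte) 0xFA, 0x01, 0x08,
        0x2E, 0x2E, 0x00, (byte) 0xB6, (byte) 0xD0, 0x68, 0x3E, (byte) 0x80,
        0x2F, 0x0C, (byte) 0xA9, (byte) 0xFE, 0x64, 0x53, 0x69, 0x7A
    };

    /** Truncates or pads the password to exactly 32 bytes. */
    static byte[] pad(byte[] password) {
        byte[] out = new byte[32];
        int n = Math.min(password.length, 32);
        System.arraycopy(password, 0, out, 0, n);
        System.arraycopy(PAD, 0, out, n, 32 - n);
        return out;
    }

    /** Derives the file encryption key from the user password and dictionary entries. */
    static byte[] computeKey(byte[] userPassword, byte[] o, int p, byte[] id0,
                             int revision, int keyLengthBytes, boolean encryptMetadata) {
        MessageDigest md5;
        try {
            md5 = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
        md5.update(pad(userPassword));
        md5.update(o);                                   // the /O entry
        md5.update(new byte[] {                          // /P as 4 bytes, low-order byte first
            (byte) p, (byte) (p >>> 8), (byte) (p >>> 16), (byte) (p >>> 24) });
        md5.update(id0);                                 // first element of the trailer /ID array
        if (revision >= 4 && !encryptMetadata) {
            md5.update(new byte[] { -1, -1, -1, -1 });
        }
        byte[] digest = md5.digest();
        if (revision >= 3) {
            // Revision 3 and later re-hash the first n bytes 50 times.
            for (int i = 0; i < 50; i++) {
                digest = md5.digest(Arrays.copyOf(digest, keyLengthBytes));
            }
        }
        return Arrays.copyOf(digest, keyLengthBytes);
    }

    public static void main(String[] args) {
        byte[] password = "secret".getBytes(StandardCharsets.ISO_8859_1);
        // Dummy /O and /ID values; a real caller reads them from the file.
        byte[] key = computeKey(password, new byte[32], -1036, new byte[16], 3, 16, true);
        System.out.println("key length in bytes: " + key.length);
    }
}
```

The owner password case works the same way once the user password has been recovered from /O, which is what the PDFBox code in the next section does before it computes the key.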
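Decryption algorithm (PDFBox):
Only part of the code is listed here. For details, see the files under /pdfbox/src/main/java/org/apache/pdfbox/pdmodel/encryption/.
Core encryption and decryption classes in PDFBox:
StandardSecurityHandler: password encryption and decryption
PublicKeySecurityHandler: certificate encryption and decryption
SecurityHandler: abstract base class for PDF encryption and decryption
Computing the encryption key: password decryption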
/**
 * Prepares everything to decrypt the document.
 *
 * Only if decryption of single objects is needed this should be called.
 *
 * @param encryption encryption dictionary
 * @param documentIDArray document id
 * @param decryptionMaterial Information used to decrypt the document.
 *
 * @throws InvalidPasswordException If the password is incorrect.
 * @throws IOException If there is an error accessing data.
 */
@Override
public void prepareForDecryption(PDEncryption encryption, COSArray documentIDArray,
                                 DecryptionMaterial decryptionMaterial)
                                 throws InvalidPasswordException, IOException
{
    if (!(decryptionMaterial instanceof StandardDecryptionMaterial))
    {
        throw new IOException("Decryption material is not compatible with the document");
    }
    setDecryptMetadata(encryption.isEncryptMetaData());
    StandardDecryptionMaterial material = (StandardDecryptionMaterial)decryptionMaterial;
    String password = material.getPassword();
    if (password == null)
    {
        password = "";
    }
    int dicPermissions = encryption.getPermissions();
    int dicRevision = encryption.getRevision();
    int dicLength = encryption.getVersion() == 1 ? 5 : encryption.getLength() / 8;
    byte[] documentIDBytes = getDocumentIDBytes(documentIDArray);
    // we need to know whether the meta data was encrypted for password calculation
    boolean encryptMetadata = encryption.isEncryptMetaData();
    byte[] userKey = encryption.getUserKey();
    byte[] ownerKey = encryption.getOwnerKey();
    byte[] ue = null, oe = null;
    Charset passwordCharset = Charsets.ISO_8859_1;
    if (dicRevision == 6 || dicRevision == 5)
    {
        passwordCharset = Charsets.UTF_8;
        ue = encryption.getUserEncryptionKey();
        oe = encryption.getOwnerEncryptionKey();
    }
    AccessPermission currentAccessPermission;
    if (isOwnerPassword(password.getBytes(passwordCharset), userKey, ownerKey,
                        dicPermissions, documentIDBytes, dicRevision,
                        dicLength, encryptMetadata))
    {
        currentAccessPermission = AccessPermission.getOwnerAccessPermission();
        setCurrentAccessPermission(currentAccessPermission);
        byte[] computedPassword;
        if (dicRevision == 6 || dicRevision == 5)
        {
            computedPassword = password.getBytes(passwordCharset);
        }
        else
        {
            computedPassword = getUserPassword(password.getBytes(passwordCharset),
                    ownerKey, dicRevision, dicLength);
        }
        encryptionKey =
            computeEncryptedKey(
                computedPassword,
                ownerKey, userKey, oe, ue,
                dicPermissions,
                documentIDBytes,
                dicRevision,
                dicLength,
                encryptMetadata, true);
    }
    else if (isUserPassword(password.getBytes(passwordCharset), userKey, ownerKey,
                            dicPermissions, documentIDBytes, dicRevision,
                            dicLength, encryptMetadata))
    {
        currentAccessPermission = new AccessPermission(dicPermissions);
        currentAccessPermission.setReadOnly();
        setCurrentAccessPermission(currentAccessPermission);
        encryptionKey =
            computeEncryptedKey(
                password.getBytes(passwordCharset),
                ownerKey, userKey, oe, ue,
                dicPermissions,
                documentIDBytes,
                dicRevision,
                dicLength,
                encryptMetadata, false);
    }
    else
    {
        throw new InvalidPasswordException("Cannot decrypt PDF, the password is incorrect");
    }
    if (dicRevision == 6 || dicRevision == 5)
    {
        validatePerms(encryption, dicPermissions, encryptMetadata);
    }
    if (encryption.getVersion() == 4 || encryption.getVersion() == 5)
    {
        // detect whether AES encryption is used. This assumes that the encryption algo is
        // stored in the PDCryptFilterDictionary
        // However, crypt filters are used only when V is 4 or 5.
        PDCryptFilterDictionary stdCryptFilterDictionary = encryption.getStdCryptFilterDictionary();
        if (stdCryptFilterDictionary != null)
        {
            COSName cryptFilterMethod = stdCryptFilterDictionary.getCryptFilterMethod();
            setAES(COSName.AESV2.equals(cryptFilterMethod) ||
                   COSName.AESV3.equals(cryptFilterMethod));
        }
    }
}
Computing the encryption key: certificate decryption
/**
 * Prepares everything to decrypt the document.
 *
 * @param encryption encryption dictionary, can be retrieved via
 * {@link PDDocument#getEncryption()}
 * @param documentIDArray document id which is returned via
 * {@link org.apache.pdfbox.cos.COSDocument#getDocumentID()} (not used by
 * this handler)
 * @param decryptionMaterial Information used to decrypt the document.
 *
 * @throws IOException If there is an error accessing data. If verbose mode
 * is enabled, the exception message will provide more details why the
 * match wasn't successful.
 */
@Override
public void prepareForDecryption(PDEncryption encryption, COSArray documentIDArray,
                                 DecryptionMaterial decryptionMaterial)
                                 throws IOException
{
    if (!(decryptionMaterial instanceof PublicKeyDecryptionMaterial))
    {
        throw new IOException(
                "Provided decryption material is not compatible with the document");
    }
    setDecryptMetadata(encryption.isEncryptMetaData());
    if (encryption.getLength() != 0)
    {
        this.keyLength = encryption.getLength();
    }
    PublicKeyDecryptionMaterial material = (PublicKeyDecryptionMaterial) decryptionMaterial;
    try
    {
        boolean foundRecipient = false;
        X509Certificate certificate = material.getCertificate();
        X509CertificateHolder materialCert = null;
        if (certificate != null)
        {
            materialCert = new X509CertificateHolder(certificate.getEncoded());
        }
        // the decrypted content of the enveloped data that match
        // the certificate in the decryption material provided
        byte[] envelopedData = null;
        // the bytes of each recipient in the recipients array
        byte[][] recipientFieldsBytes = new byte[encryption.getRecipientsLength()][];
        int recipientFieldsLength = 0;
        int i = 0;
        StringBuilder extraInfo = new StringBuilder();
        for (; i < encryption.getRecipientsLength(); i++)
        {
            COSString recipientFieldString = encryption.getRecipientStringAt(i);
            byte[] recipientBytes = recipientFieldString.getBytes();
            CMSEnvelopedData data = new CMSEnvelopedData(recipientBytes);
            Collection<RecipientInformation> recipCertificatesIt = data.getRecipientInfos()
                    .getRecipients();
            int j = 0;
            for (RecipientInformation ri : recipCertificatesIt)
            {
                // Impl: if a matching certificate was previously found it is an error,
                // here we just don't care about it
                RecipientId rid = ri.getRID();
                if (!foundRecipient && rid.match(materialCert))
                {
                    foundRecipient = true;
                    PrivateKey privateKey = (PrivateKey) material.getPrivateKey();
                    envelopedData = ri.getContent(new JceKeyTransEnvelopedRecipient(privateKey));
                    break;
                }
                j++;
                if (certificate != null)
                {
                    extraInfo.append('\n');
                    extraInfo.append(j);
                    extraInfo.append(": ");
                    if (rid instanceof KeyTransRecipientId)
                    {
                        appendCertInfo(extraInfo, (KeyTransRecipientId) rid, certificate, materialCert);
                    }
                }
            }
            recipientFieldsBytes[i] = recipientBytes;
            recipientFieldsLength += recipientBytes.length;
        }
        if (!foundRecipient || envelopedData == null)
        {
            throw new IOException("The certificate matches none of " + i
                    + " recipient entries" + extraInfo.toString());
        }
        if (envelopedData.length != 24)
        {
            throw new IOException("The enveloped data does not contain 24 bytes");
        }
        // now envelopedData contains:
        // - the 20 bytes seed
        // - the 4 bytes of permission for the current user
        byte[] accessBytes = new byte[4];
        System.arraycopy(envelopedData, 20, accessBytes, 0, 4);
        AccessPermission currentAccessPermission = new AccessPermission(accessBytes);
        currentAccessPermission.setReadOnly();
        setCurrentAccessPermission(currentAccessPermission);
        // what we will put in the SHA1 = the seed + each byte contained in the recipients array
        byte[] sha1Input = new byte[recipientFieldsLength + 20];
        // put the seed in the sha1 input
        System.arraycopy(envelopedData, 0, sha1Input, 0, 20);
        // put each bytes of the recipients array in the sha1 input
        int sha1InputOffset = 20;
        for (byte[] recipientFieldsByte : recipientFieldsBytes)
        {
            System.arraycopy(recipientFieldsByte, 0, sha1Input, sha1InputOffset,
                    recipientFieldsByte.length);
            sha1InputOffset += recipientFieldsByte.length;
        }
        MessageDigest md = MessageDigests.getSHA1();
        byte[] mdResult = md.digest(sha1Input);
        // we have the encryption key ...
        encryptionKey = new byte[this.keyLength / 8];
        System.arraycopy(mdResult, 0, encryptionKey, 0, this.keyLength / 8);
    }
    catch (CMSException e)
    {
        throw new IOException(e);
    }
    catch (KeyStoreException e)
    {
        throw new IOException(e);
    }
    catch (CertificateEncodingException e)
    {
        throw new IOException(e);
    }
}
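The last part of that method, which turns the decrypted 24-byte envelope into a key, can be isolated with only JDK classes. The sketch below mirrors the logic shown above: SHA-1 over the 20-byte seed followed by every entry of the Recipients array, truncated to the key length. PubSecKeySketch, deriveKey and permissionBytes are hypothetical names, the inputs in main are dummy values, and the CMS decryption of the envelope itself is assumed to have happened already.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.List;

/** Illustrative key derivation for certificate encryption (hypothetical, not PDFBox API). */
final class PubSecKeySketch {

    /** Hashes seed + all recipient entries with SHA-1 and keeps the first keyLengthBits / 8 bytes. */
    static byte[] deriveKey(byte[] envelopedData, List<byte[]> recipients, int keyLengthBits) {
        MessageDigest sha1;
        try {
            sha1 = MessageDigest.getInstance("SHA-1");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
        sha1.update(envelopedData, 0, 20);      // the 20-byte seed
        for (byte[] recipient : recipients) {   // raw bytes of each /Recipients string
            sha1.update(recipient);
        }
        return Arrays.copyOf(sha1.digest(), keyLengthBits / 8);
    }

    /** Returns the 4 permission bytes that follow the seed in the envelope. */
    static byte[] permissionBytes(byte[] envelopedData) {
        return Arrays.copyOfRange(envelopedData, 20, 24);
    }

    public static void main(String[] args) {
        byte[] envelope = new byte[24];                   // dummy seed and permissions
        List<byte[]> recipients = Arrays.asList(new byte[] { 1, 2, 3 });
        System.out.println("key length in bytes: " + deriveKey(envelope, recipients, 128).length);
    }
}
```

Because every recipient entry feeds the hash, all recipients end up with the same file key even though each one decrypts a different envelope with a different private key.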
Decrypting with the encryption key
/**
 * This will dispatch to the correct method.
 *
 * @param obj The object to decrypt.
 * @param objNum The object number.
 * @param genNum The object generation Number.
 *
 * @throws IOException If there is an error getting the stream data.
 */
public void decrypt(COSBase obj, long objNum, long genNum) throws IOException
{
    if (!objects.contains(obj))
    {
        objects.add(obj);
        if (obj instanceof COSString)
        {
            decryptString((COSString) obj, objNum, genNum);
        }
        else if (obj instanceof COSStream)
        {
            decryptStream((COSStream) obj, objNum, genNum);
        }
        else if (obj instanceof COSDictionary)
        {
            decryptDictionary((COSDictionary) obj, objNum, genNum);
        }
        else if (obj instanceof COSArray)
        {
            decryptArray((COSArray) obj, objNum, genNum);
        }
    }
}
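The object and generation numbers are passed along because, for RC4 and 128-bit AES, each object is encrypted with its own key derived from the file key. A minimal sketch of that derivation, following the general encryption algorithm in the PDF specification rather than PDFBox's code, is shown here. ObjectKeySketch and objectKey are hypothetical names, and AES-256 (/V 5) is not covered because it uses the file key directly.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

/** Illustrative per-object key derivation for RC4 and AESV2 (hypothetical, not PDFBox API). */
final class ObjectKeySketch {

    /**
     * MD5 over: file key, low 3 bytes of the object number, low 2 bytes of the
     * generation number (both low-order byte first), plus "sAlT" when AES is used.
     * The result is truncated to min(fileKey.length + 5, 16) bytes.
     */
    static byte[] objectKey(byte[] fileKey, long objNum, long genNum, boolean aes) {
        MessageDigest md5;
        try {
            md5 = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
        md5.update(fileKey);
        md5.update(new byte[] {
            (byte) objNum, (byte) (objNum >> 8), (byte) (objNum >> 16),
            (byte) genNum, (byte) (genNum >> 8) });
        if (aes) {
            md5.update(new byte[] { 0x73, 0x41, 0x6C, 0x54 }); // ASCII "sAlT"
        }
        int length = Math.min(fileKey.length + 5, 16);
        return Arrays.copyOf(md5.digest(), length);
    }

    public static void main(String[] args) {
        byte[] fileKey = new byte[16]; // dummy file key, for illustration only
        System.out.println("object key length: " + objectKey(fileKey, 46, 0, false).length);
    }
}
```

One consequence of this design is that identical strings stored in different objects produce different ciphertext.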
Encrypting stream objects
/**
 * This will encrypt a stream, but not the dictionary as the dictionary is
 * encrypted by visitFromString() in COSWriter and we don't want to encrypt
 * it twice.
 *
 * @param stream The stream to encrypt.
 * @param objNum The object number.
 * @param genNum The object generation number.
 *
 * @throws IOException If there is an error getting the stream data.
 */
public void encryptStream(COSStream stream, long objNum, int genNum) throws IOException
{
    byte[] rawData = IOUtils.toByteArray(stream.createRawInputStream());
    ByteArrayInputStream encryptedStream = new ByteArrayInputStream(rawData);
    OutputStream output = stream.createRawOutputStream();
    try
    {
        encryptData(objNum, genNum, encryptedStream, output, false /* encrypt */);
    }
    finally
    {
        output.close();
    }
}
Encrypting string objects
/**
 * This will encrypt a string.
 *
 * @param string the string to encrypt.
 * @param objNum The object number.
 * @param genNum The object generation number.
 *
 * @throws IOException If an error occurs writing the new string.
 */
public void encryptString(COSString string, long objNum, int genNum) throws IOException
{
    ByteArrayInputStream data = new ByteArrayInputStream(string.getBytes());
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    encryptData(objNum, genNum, data, buffer, false /* encrypt */);
    string.setValue(buffer.toByteArray());
}
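The code above covers the main parts of decryption; encryption works in much the same way. For the full code, see the files under /pdfbox/src/main/java/org/apache/pdfbox/pdmodel/encryption/. If anything needs further explanation, please leave a comment.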