The XPath attack
Most Web applications use relational databases to store and retrieve information. For example, if you have a Web site that requires authentication, you might have a table called users with a unique ID, a login name, a password, and perhaps other information such as a role. A SQL query to retrieve a user from the users table might look like Listing 1.
Listing 1. SQL query to retrieve a user from a users table
Select * from users where loginID='foo' and password='bar'
In this query, the user supplies the loginID and password values as input. If an attacker enters ' or '1'='1 in both the loginID field and the password field, the query formed will look like Listing 2.
Listing 2. Query formed from attacker entries
Select * from users where loginID='' or '1'='1' and password='' or '1'='1'
This will always result in a match so that the attacker gains entry to the system. XPath injection works much the same way. Assume, though, that instead of a table called users, you have an XML file that contains user information that looks like Listing 3.
Listing 3. users.xml
<?xml version="1.0" encoding="UTF-8"?>
<users>
  <user>
    <firstname>Ben</firstname>
    <lastname>Elmore</lastname>
    <loginID>abc</loginID>
    <password>test123</password>
  </user>
  <user>
    <firstname>Shlomy</firstname>
    <lastname>Gantz</lastname>
    <loginID>xyz</loginID>
    <password>123test</password>
  </user>
  <user>
    <firstname>Jeghis</firstname>
    <lastname>Katz</lastname>
    <loginID>mrj</loginID>
    <password>jk2468</password>
  </user>
  <user>
    <firstname>Darien</firstname>
    <lastname>Heap</lastname>
    <loginID>drano</loginID>
    <password>2mne8s</password>
  </user>
</users>
Listing 4 shows an XPath expression that is similar to the SQL query.
Listing 4. XPath statement matching the SQL query
//users/user[loginID/text()='abc' and password/text()='test123']
To mount the same sort of attack and bypass authentication, an attacker might aim for an expression like Listing 5. (XPath is case-sensitive, so the element names must match the document exactly.)
Listing 5. Bypassing authentication
//users/user[loginID/text()='' or 1=1 and password/text()='' or 1=1]
You might have a method such as doLogin in your Java application that performs the authentication, this time using the XML document in Listing 3. It might look like Listing 6.
Listing 6. XpathInjectionExample.java
import java.io.IOException;
import org.w3c.dom.*;
import org.xml.sax.SAXException;
import javax.xml.parsers.*;
import javax.xml.xpath.*;

public class XpathInjectionExample {

    public boolean doLogin(String loginID, String password)
            throws ParserConfigurationException, SAXException,
                   IOException, XPathExpressionException {

        DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
        domFactory.setNamespaceAware(true);
        DocumentBuilder builder = domFactory.newDocumentBuilder();
        Document doc = builder.parse("users.xml");

        XPathFactory factory = XPathFactory.newInstance();
        XPath xpath = factory.newXPath();

        // Vulnerable: user input is concatenated directly into the expression
        XPathExpression expr = xpath.compile(
            "//users/user[loginID/text()='" + loginID
            + "' and password/text()='" + password + "' ]/firstname/text()");

        Object result = expr.evaluate(doc, XPathConstants.NODESET);
        NodeList nodes = (NodeList) result;

        // print first names to the console
        for (int i = 0; i < nodes.getLength(); i++) {
            System.out.println(nodes.item(i).getNodeValue());
        }

        // at least one matching user means the login succeeds
        return nodes.getLength() >= 1;
    }
}
For Listing 6, if you pass in a valid login and password such as loginID = 'abc' and password = 'test123', the method returns true (and, in this example, prints the matching first names to the console). If, however, you pass in ' or 1=1 or ''=' for both values, you will always get a return value of true because XPath ends up seeing a predicate like the one shown in Listing 7.
Listing 7. String
//users/user[loginID/text()='' or 1=1 or ''='' and password/text()='' or 1=1 or ''='']
Because and binds more tightly than or in XPath, the standalone 1=1 makes the whole predicate true for every user, so the query always returns a match and always allows the attacker to gain access.
Another, possibly more troubling, attack is the ability of attackers to exploit XPath injection to explore and extract the XML document behind an application on the fly.
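The effect is easy to reproduce in isolation. The following is a minimal sketch, not the article's code: it assumes a hypothetical class name (InjectionDemo), inlines two of the users from Listing 3 so that no file is needed, and builds the expression with the same string concatenation as Listing 6.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class InjectionDemo {
    // Two of the users from Listing 3, inlined for the sketch (assumption).
    static final String USERS_XML =
        "<users>"
      + "<user><firstname>Ben</firstname><lastname>Elmore</lastname>"
      + "<loginID>abc</loginID><password>test123</password></user>"
      + "<user><firstname>Shlomy</firstname><lastname>Gantz</lastname>"
      + "<loginID>xyz</loginID><password>123test</password></user>"
      + "</users>";

    // Deliberately vulnerable: concatenates input exactly as Listing 6 does.
    // Returns how many users the predicate matched.
    static int matchingUsers(String loginID, String password) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(USERS_XML)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        String expr = "//users/user[loginID/text()='" + loginID
            + "' and password/text()='" + password + "']/firstname/text()";
        NodeList nodes = (NodeList) xpath.evaluate(expr, doc, XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        String payload = "' or 1=1 or ''='";
        System.out.println(matchingUsers("abc", "test123")); // 1: genuine login
        System.out.println(matchingUsers("abc", "wrong"));   // 0: bad password
        System.out.println(matchingUsers(payload, payload)); // 2: every user matches
    }
}
```

With the injected values, the predicate matches every user in the document rather than one, which is why a check of the form "at least one node returned" lets the attacker in.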
Extracting the XML document structure
The query used to bypass authentication can also be used to extract information about the XML document. Suppose an attacker guesses that the name of the first child element of a user is loginID and wants to confirm it. The attacker enters the input in Listing 8 as the login ID and leaves the password empty.
Listing 8. Input entered by attacker
nosuchuser' or name(//users/user[1]/*[1])='loginID' or 'a'='b
In place of 1=1 in Listing 7, the expression given in Listing 8 checks whether the first child element's name is loginID. The leading value must not match a real login ID; otherwise the first comparison alone would decide the result. The query formed is shown in Listing 9.
Listing 9. Query
//users/user[loginID/text()='nosuchuser' or name(//users/user[1]/*[1])='loginID' or 'a'='b' and password/text()='']
By trial and error, the attacker can check the various child nodes of the XML document and gather information by seeing whether each XPath expression results in a successful authentication. An attacker could then write a simple script that sends a series of XPath injections and extracts the XML document from a system, as described in Klein's paper.
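The trial-and-error loop can be sketched against an in-memory copy of the document. This is an illustration under stated assumptions, not a reproduction of the script in Klein's paper: the class name StructureProbe, the candidate names, and the placeholder login value nosuchuser are all hypothetical, and the login check is built the same vulnerable way as Listing 6.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class StructureProbe {
    // One user from Listing 3, inlined for the sketch (assumption).
    static final String USERS_XML =
        "<users><user><firstname>Ben</firstname><lastname>Elmore</lastname>"
      + "<loginID>abc</loginID><password>test123</password></user></users>";

    // Deliberately vulnerable login check, concatenating input as in Listing 6.
    static boolean doLogin(String loginID, String password) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(USERS_XML)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        String expr = "//users/user[loginID/text()='" + loginID
            + "' and password/text()='" + password + "']";
        NodeList nodes = (NodeList) xpath.evaluate(expr, doc, XPathConstants.NODESET);
        return nodes.getLength() >= 1;
    }

    // Login value asking: "is the first child of the first user named <guess>?"
    // 'nosuchuser' is a placeholder that must not match any real login ID.
    static String probe(String guess) {
        return "nosuchuser' or name(//users/user[1]/*[1])='" + guess + "' or 'a'='b";
    }

    public static void main(String[] args) throws Exception {
        // A successful "login" confirms the guess; a failure refutes it.
        for (String guess : new String[] {"loginID", "password", "firstname"}) {
            System.out.println(guess + " -> " + doLogin(probe(guess), ""));
        }
    }
}
```

Against the document in Listing 3, only the guess firstname produces a successful login, so the application's yes-or-no answer leaks one fact about the document per request.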
XPath injection prevention
Since XPath injection attacks are much like SQL injection attacks, you can prevent them with many of the same methods used to prevent SQL injection attacks. Not surprisingly, most of these preventative methods are the same ones you can and should use to prevent other typical code injection attacks.
Validation
No matter what the application, environment, or language, you should follow these best practices:
- Assume all input is suspect.
- Validate not only the type of data but also its format, length, range, and contents (for example, a simple regular expression with a character class such as /["*^';&<>()]/ should find most suspect special characters).
- Validate data both on the client and the server because client validation is extremely easy to circumvent.
- Follow a consistent written and [missing word] strategy toward application security based on secure software development best practices (see Apache's excellent list for Web Services in Resources).
- Test your applications for known threats before you release them. The article "Fuzz Testing", available in Resources, shows you how to do this.
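As one way to apply the validation points above on the server, the check can be written as an allow list rather than a list of forbidden characters. The following is a minimal sketch; the class name InputValidator and the policy itself (1 to 32 characters drawn from letters, digits, dot, underscore, and hyphen) are assumptions chosen for illustration, not rules from the article.

```java
import java.util.regex.Pattern;

public class InputValidator {
    // Assumed policy: 1-32 characters; letters, digits, dot, underscore, hyphen.
    private static final Pattern LOGIN_ID = Pattern.compile("[A-Za-z0-9._-]{1,32}");

    // Returns true only when the whole value fits the allow list.
    static boolean isValidLoginID(String value) {
        return value != null && LOGIN_ID.matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidLoginID("abc"));              // true
        System.out.println(isValidLoginID("' or 1=1 or ''='")); // false: quotes, spaces, '='
        System.out.println(isValidLoginID(""));                 // false: too short
    }
}
```

An allow list rejects anything the pattern does not explicitly permit, so it does not depend on anticipating every dangerous character. It suits identifiers such as login IDs better than free-form values such as passwords.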
XPath itself does not define a parameterized-query interface of the kind most database APIs provide, but you can get the same effect through other APIs such as XQuery. Rather than building expressions as strings that are then passed to the XPath parser for dynamic execution at run time, as shown in Listing 10, you can parameterize your query by creating an external file that holds it, like Listing 11.
Listing 10. Strings passed to the XPath parser
"//users/user[loginID/text()='" + loginID + "' and password/text()='" + password + "']"
Listing 11 holds the query in its own file and declares the two values as external variables.
Listing 11. dologin.xq
declare variable $loginID as xs:string external;
declare variable $password as xs:string external;
//users/user[loginID = $loginID and password = $password]
You could then perform the same login check as Listing 6, with slight modification, as shown in Listing 12.
Listing 12. XQuery snippet
Document doc = new Builder().build("users.xml");
XQuery xquery = new XQueryFactory().createXQuery(new File("dologin.xq"));

// Bind values to the external variables declared in dologin.xq;
// the keys must match the variable names exactly (case-sensitive)
Map vars = new HashMap();
vars.put("loginID", "abc");
vars.put("password", "test123");

Nodes results = xquery.execute(doc, null, vars).toNodes();
for (int i = 0; i < results.size(); i++) {
    System.out.println(results.get(i).toXML());
}
This keeps the external variables $loginID and $password from being processed as executable expressions at run time, so your execution logic and data stay separated. Unfortunately, this style of query parameterization is not part of XPath itself, but it is freely available in open source processors such as Saxon (see Resources for a link). Some other parsers allow this sort of functionality, and it can be a solid way to protect against XPath injection.
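A related mechanism is available without leaving the javax.xml.xpath API used in Listing 6: an XPath expression may contain variable references such as $loginID, and an XPathVariableResolver supplies their values when the expression is evaluated. The following is a minimal sketch under stated assumptions, not the article's method: the class name VariableLogin and the inlined XML are hypothetical, and the inputs are assumed to be non-null.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class VariableLogin {
    // Two of the users from Listing 3, inlined for the sketch (assumption).
    static final String USERS_XML =
        "<users>"
      + "<user><firstname>Ben</firstname><lastname>Elmore</lastname>"
      + "<loginID>abc</loginID><password>test123</password></user>"
      + "<user><firstname>Shlomy</firstname><lastname>Gantz</lastname>"
      + "<loginID>xyz</loginID><password>123test</password></user>"
      + "</users>";

    // Assumes loginID and password are non-null.
    static boolean doLogin(String loginID, String password) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(USERS_XML)));

        // Values are bound by name and never become part of the expression text.
        Map<String, Object> vars = new HashMap<>();
        vars.put("loginID", loginID);
        vars.put("password", password);

        XPath xpath = XPathFactory.newInstance().newXPath();
        // The resolver is consulted for $loginID and $password during evaluation.
        xpath.setXPathVariableResolver((QName name) -> vars.get(name.getLocalPart()));

        NodeList nodes = (NodeList) xpath.evaluate(
            "//users/user[loginID/text()=$loginID and password/text()=$password]",
            doc, XPathConstants.NODESET);
        return nodes.getLength() >= 1;
    }

    public static void main(String[] args) throws Exception {
        String payload = "' or 1=1 or ''='";
        System.out.println(doLogin("abc", "test123")); // true: genuine login
        System.out.println(doLogin(payload, payload)); // false: compared as plain text
    }
}
```

Because the expression text is a constant, the payload from Listing 7 is compared as an ordinary string and matches no user.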
Data inspection at the Web server
To protect against both XPath injection and other forms of code injection, you should check all data passed from your Web server to your backend services. For example, with Apache you could use a mod_security filter such as SecFilterSelective THE_REQUEST "(\'|\")" to look for single quotes and double quotes in strings and disallow them. You might use this same approach to filter and disallow other special characters such as ("*^';&<>/), all of which can be used in various injection attacks. This approach might work very well for some applications, such as those that use REST- or SOAP-based XML services, but in other cases it might not be possible. As always, the best approach is intelligent, secure design from the initial design through the implementation of your application.
What if?
Most organizations think of threat detection and threat denial but rarely do they plan, with a qualified security professional, what to do if or when their systems are breached. You should always assume the worst case scenario and plan for it.
The right response depends greatly on your organization and the type of system that is penetrated, but usually the best thing to do is bring your systems offline and wait until a professional forensic engineer can inspect them. Sometimes people immediately take systems offline and reimage their drives, but this wipes the evidence of the crime as well as possible information about other compromises the intruder has made to the system. If possible, always try to preserve the state of the system for a security expert to review.
Summary
Most applications that use XML will not be vulnerable to XPath injection attacks, and XML applications should not be considered more at risk just because a specific vulnerability has been found. At the same time, more and more applications rely heavily on XML for everything from communication with backend services to persistence: Ajax applications, RIA platforms such as Flex and OpenLaszlo, and federated XML services from organizations such as Google. As a developer, you need to stay aware of the threats and risks that these approaches create.
Fortunately, while the specific threats are new, the problems and principles to solve them are not. Following good security best practices will help you protect yourself not only from XPath injection attacks but other forms of attacks as well.