Patents.com Logo
Choose Language:
 

Title:  Apparatus and method for developing secure software
Patent ID: US7207065
Issue Date:  April 17, 2007
Abstract:

A computer readable medium includes executable instructions to analyze program instructions for security vulnerabilities. The executable instructions convert diverse program instruction formats to a common format. A system model is derived from the common format. A static analysis is performed on the system model to identify security vulnerabilities. Security vulnerabilities are then reported.

 

 






Translations
Inventor(s): 
Chess;  Brian  (Mountain View,  CA,  US) , Email and Contact Information
Do;  Arthur  (Danville,  CA,  US) , Email and Contact Information
Fay;  Sean  (San Francisco,  CA,  US) , Email and Contact Information
Thornton;  Roger  (San Jose,  CA,  US) Email and Contact Information
Assignee:  Fortify Software, Inc.;  (Menlo Park,  CA,  US)
Agent:  Cooley Godward Kronish LLP
Application No.:  11/010,146
Filing Date:  December 10, 2004
Primary Class:  726/25
Other Classes:  726/22  726/3 
Field of Search:  726/22,25
Intern'l Class:  G06F 11/00 (20060101)  H04L 9/32 (20060101) 
Primary Examiner:Zand; Kambiz
US Patent Document(s):
  5440723    Arnold et al.    August 01, 1995
  5502815    Cozza    March 01, 1996
  20010027383    Maliszewski    October 01, 2001
  20020066024    Schmall et al.    May 01, 2002
  20020073330    Chandnani et al.    June 01, 2002
  20030120951    Gartside et al.    June 01, 2003
  20030159063    Apfelbaum et al.    August 01, 2003
  20040133777    Kiriansky et al.    July 01, 2004
  20040255163    Swimmer et al.    December 01, 2004
  20040255277    Berg et al.    December 01, 2004
  20040260940    Berg et al.    December 01, 2004
  20040268322    Chow et al.    December 01, 2004
  20050010806    Berg et al.    January 01, 2005
  20050015752    Alpern et al.    January 01, 2005
  20050028002    Christodorescu et al.    February 01, 2005
  20050273860    Chess et al.    December 01, 2005
Foreign Reference(s):
Other References:http://java.sun.com/docs/books/jls/second.sub.--edition/html/expressions.d- oc.html#20448 "Publication date unknown, but prior to Dec. 10, 2004." cited by other .
http://java.sun.com/products/ejb/docs.html "Publication date unknown, but prior to Dec. 10, 2004." cited by other .
http://java.sun.com/products/jdbc/reference/index.html "Publication date unknown, but prior to Dec. 10, 2004." cited by other .
http://java.sun.com/j2se/1.4.2/docs/api/java/lang/reflect/package-summary.- html "Publication date unknown, but prior to Dec. 10, 2004." cited by other .
http://java.sun.com/j2se/1.4.2/docs/api/java/rmi/package-summary.html "Publication date unknown, but prior to Dec. 10, 2004." cited by other.

Parent Case Text: This application claims priority to the U.S. Provisional Patent Application entitled "Apparatus and Method for Developing, Testing and Monitoring Secure Software", Ser. No. 60/577,066, filed Jun. 4, 2004. This application is related to the following commonly owned and concurrently filed patent applications: "Apparatus and Method for Developing, Testing and Monitoring Secure Software", U.S. Ser. No. 11/009,570, filed Dec. 10, 2004; "Apparatus and Method for Testing Secure Software", U.S. Ser. No. 11/009,474, filed Dec. 10, 2004; "Apparatus and Method for Monitoring Secure Software", U.S. Ser. No. 11/009,572, filed Dec. 10, 2004.


Claim(s):

The invention claimed is:

1. A computer readable medium including stored executable instructions to analyze program instructions for security vulnerabilities, comprising instructions executingto: convert diverse program instruction formats of a set of software applications executing on different platforms to a common format, wherein said executable instructions to convert include executable instructions to break down expressions of saiddiverse program instruction formats into a single set of equivalent sequences of simpler statements to support analysis of said diverse program instruction formats; derive a system model from said common format, wherein said model characterizes programinteractions between said diverse program instruction formats; perform a static analysis on said system model to identify security vulnerabilities, wherein said static analysis includes analyzing said system model without executing said system model; and report said security vulnerabilities.

2. The computer readable medium of claim 1 wherein said executable instructions to convert include executable instructions to convert different source or executable code formats executing on different platforms to said common format.

3. The computer readable medium of claim 1 wherein said executable instructions to convert include executable instructions to convert different machine instruction formats to said common format.

4. The computer readable medium of claim 1 wherein said executable instructions to convert include executable instructions to convert different program configuration file formats to said common format.

5. The computer readable medium of claim 1 wherein said executable instructions to convert include executable instructions to convert a program instruction expression into an equivalent sequence of simpler statements defined in said commonformat.

6. The computer readable medium of claim 5 wherein said executable instructions to convert include executable instructions to convert said program instruction expression into an equivalent sequence of simpler statements that includes atemporary variable.

7. The computer readable medium of claim 1 wherein said executable instruction to derive include executable instructions to derive a system model characterizing multiple inter-operative applications.

8. The computer readable medium of claim 1 wherein said executable instructions to perform include executable instructions to identify locations where input is taken from outside the program instruction formats.

9. The computer readable medium of claim 8 wherein said executable instructions to perform include executable instructions to trace the processing of said input throughout said diverse program instruction formats.

10. The computer readable medium of claim 1 wherein said executable instructions to perform include executable instructions to identify at least one of the following security vulnerabilities: stack buffer overflow, heap buffer overflow, formatstring attack, SQL injection, an ordering problem, and protocol misuse.

11. The computer readable medium of claim 1 wherein said executable instructions to perform a static analysis include executable instructions to perform a static analysis selected from a static data flow analysis, a lexical analysis, a semanticanalysis, and a program control flow analysis.

12. The computer readable medium of claim 1 wherein said executable instructions to report include executable instructions to report a vulnerability, a vulnerability entry point, and a vulnerability processing path.

13. The computer readable medium of claim 1 wherein said executable instructions to report include executable instructions to report said security vulnerabilities to a security test module and a security monitoring module.

14. A method of analyzing stored program instructions for security vulnerabilities, comprising: converting diverse program instruction formats of a set of software applications executing on different platforms to a common format, whereinconverting includes breaking down expressions of said diverse program instruction formats into a single set of equivalent sequences of simpler statements to support analysis of said diverse program instruction formats; deriving a system model from saidcommon format, wherein said system model characterizes program interactions between said diverse program instruction formats; performing a static analysis on said system model to identify security vulnerabilities, wherein said static analysis includesanalyzing said system model without executing said system model; and reporting said security vulnerabilities.

15. The method of claim 14 wherein converting includes converting different source or executable code formats executing on different platforms to said common format.

16. The method of claim 14 wherein converting includes converting different machine instruction formats to said common format.

17. The method of claim 14 wherein converting includes converting different program configuration file formats to said common format.

18. The method of claim 14 wherein converting includes converting a program instruction expression into an equivalent sequence of simpler statements defined in said common format.

19. The method of claim 14 wherein deriving includes deriving a system model characterizing multiple inter-operative applications.

20. The method of claim 14 wherein performing a static analysis includes performing a static analysis selected from a static data flow analysis, a lexical analysis, a semantic analysis, and a program control flow analysis.



Description:

BRIEF DESCRIPTION OF THE INVENTION

This invention relates generally to software security. More particularly, this invention relates to comprehensive techniques for identifying software security vulnerabilities during software development, testing and deployment.

BACKGROUND OF THE INVENTION

Businesses are increasingly dependent on information technology. Information systems are becoming increasingly more complex, higher-powered, inter-connected, and openly accessible to partners and customers over vastly distributed networks. Thebusiness environment has increasingly shifted from face-to-face interactions to largely anonymous electronic transactions. Software development itself is becoming more distributed through offshore development arrangements and intra-company collaborativecomputing. These trends strain the ability of organizations to secure and protect digital data from misuse or unauthorized access.

Nearly every major business critical application deployed today contains vulnerabilities that can be exploited to cause considerable harm to the business or the assets it manages. These vulnerabilities can be leveraged to steal importantinformation, sabotage computer systems or influence processing for the profit or malicious intent of the attacker.

For an experienced hacker or rouge insider, manipulating software to this end is made especially easy due to the variety of information and tools available on-line. An attacker's biggest challenge is simply finding the vulnerabilities in thecontext of a large business application. Compounding the problem, mainstream computer security solutions, such as firewalls, are based on the premise that exposed and vulnerable software can be protected by isolating it from the dangers of the outsideworld. Business requirements dictate that few business critical applications can be truly isolated. Most have numerous access points via data transfer interfaces, remote procedure calls, and internal and remote users. Firewalls and othernetwork-oriented security solutions are not configured to block the type of access that business critical applications require. In fact, today's business functions rely on this access so much that they would fail to operate if denied. For example, thestock market would fail to execute trades without the links from brokers to the exchanges, supply chains would break without information flowing between suppliers and producers, and telecommunications would cease without the ability to connect cellphones to the computers that control the network or the billing systems that underlie the business. Attackers make use of these facts to compromise systems every day. The true flaw in the outside-in premise, however, is that vulnerable software can beprotected at all--somehow made un-vulnerable.

Given this background, a question naturally presents itself: Why are network-based computer security solutions applied to what is clearly a software problem? One answer is that most information security practitioners have network securitybackgrounds and are spread thin resolving operational security issues, leaving little time to interact with the core software development process. At the same time, application developers are rewarded for producing new features against tight deadlines,with little room for security considerations. Rarely does any one person own responsibility for the security elements of the application itself. Conventional practice has been that development gets the business critical application shipped, and networkoperation teams will secure it. The dichotomy of these roles creates an extraordinary advantage for the attacker--they are the only ones truly experienced and focused on software security or more precisely business critical application insecurity.

Experts in and around software development have increasingly acknowledged that something must be done about software security. Nevertheless, coherent and practical solutions have not been identified. There are a number of factors that makesolutions difficult to identify. For example, software security vulnerabilities are subtle, logical errors that can span thousands of lines of code, making accurate detection with reasonable performance extremely difficult. At first glance, thetechnology challenges make such a solution appear more akin to compilers or niche development tools. The large software development tools vendors, however, have not made security a core part of their offerings. Their customer base is still largelyfocused on how to improve creation of features and functionality--and the vendors' internal teams cannot easily recognize a changing paradigm while they work to improve the feature sets of their single-purpose products. This is a classic innovatorsdilemma. In addition, the high volume development tool providers are not adept at delivering enterprise-like solutions that a risk management system requires or sustaining the price points needed to support such a solution. Indeed, the current state ofdevelopment tool pricing has generally discouraged the security community from building developer-oriented solutions.

Apart from the downsides inherent in the development tool landscape, software security requires specialized expertise in a constantly changing field. The problem is not just about finding technology to scan code, but includes creating andcontinually updating rules to detect these vulnerabilities. Delivering the rules requires expert knowledge of a constantly growing body of research and real-world architectures, frameworks, use patterns and many other factors that cause vulnerabilitiesin business critical applications. For example, every release of an operating system or library application program interfaces (APIs) introduces new ways to make mistakes that lead to security vulnerabilities. Vendors must deliver solutions thataccount for these cross-boundary, multi-platform architectures.

Finally, it is unlikely that software security can be accomplished by a single point solution. Similarly, it is unlikely that software security can be addressed solely at the developer level. Software security is largely a risk managementproblem. Addressing such a problem requires detailed information collected over time. It requires an approach that keeps software developers as productive as before, yet makes security metrics visible to management during development, testing anddeployment. It requires an enterprise software-like solution for managers and organizations.

In view of the foregoing, it would be highly desirable to provide an improved technique for software security.

SUMMARY OF THE INVENTION

The invention includes a computer readable medium with executable instructions to analyze program instructions for security vulnerabilities. The executable instructions convert diverse program instruction formats to a common format. A systemmodel is derived from the common format. A static analysis is performed on the system model to identify security vulnerabilities. Security vulnerabilities are then reported.

The invention also includes a method of analyzing program instructions for security vulnerabilities. Diverse program instruction formats are converted to a common format. A system model is derived from the common format. A static analysis isperformed on the system model to identify security vulnerabilities. Security vulnerabilities are then reported.

BRIEF DESCRIPTION OF THE FIGURES

The invention is more fully appreciated in connection with the following detailed description taken in conjunction with the accompanying drawings, in which:

FIG. 1 illustrates an apparatus configured in accordance with an embodiment of the invention.

FIG. 2 illustrates processing operations associated with an embodiment of a security development module of the invention.

FIG. 2A illustrates data flow security operations to track taint propagation through an exemplary common code format utilized in accordance with an embodiment of the invention.

FIG. 3 illustrates processing operations associated with an embodiment of a security test module of the invention.

FIG. 4 illustrates processing operations associated with an embodiment of a security monitoring module of the invention.

FIG. 5 illustrates the operation of a security monitoring module configured in accordance with an embodiment of the invention.

FIG. 6 illustrates components of a security monitoring module configured in accordance with an embodiment of the invention.

Like reference numerals refer to corresponding parts throughout the several views of the drawings.

DETAILED DESCRIPTION OF THE INVENTION

FIG. 1 illustrates an apparatus 100 configured in accordance with an embodiment of the invention. The apparatus 100 includes a central processing unit 102 connected to a set of input and output devices 104 over a bus 106. By way of example, theinput and output devices may include a keyboard, mouse, computer monitor, printer, and the like. Also connected to the bus 106 is a network interface 108, which uses standard devices to interface with a network 110, which may be a local area network, anintranet, the Internet, and the like.

A memory 112 is also connected to the bus 106. The memory 112 stores a set of executable instructions to implement the operations of the invention. In one embodiment, the executable instructions include three major modules: a securitydevelopment module 114, a security test module 116, and a security monitoring module 118.

The security development module 114 includes executable instructions to facilitate a static analysis of software in order to identify security vulnerabilities inherent to the structure of the software. The software includes program instructions. As discussed below, the invention is operative with diverse program instruction formats. For example, the program instruction formats may be different source or executable code formats, different machine instruction formats, and/or different programconfiguration file formats. The program instructions form various software applications. A set of software applications define a software system, which is analyzed in accordance with the invention, as discussed below. In one embodiment, the securitydevelopment module 114 is implemented with an interface module 120, a common format generator 122, security development rules 123, an analysis engine 124, and a report generator 126.

The security test module 116 includes executable instructions to test the operation of software for security vulnerabilities. Preferably, the security test module 116 relies upon information gathered by the security development module 114 torefine its testing protocol. In one embodiment, the security test module 116 is implemented with an attack manager module 128, an attack database 130, security test rules 131, a fault injection module 132, and a test report generator 134.

The security monitoring module 118 includes executable instructions to monitor the execution of software in order to identify security vulnerabilities. Preferably, the security monitoring module 118 relies upon information associated with thelocal execution of a program and the global execution of related programs to identify security vulnerabilities. In one embodiment, the security monitoring module 118 is implemented with a sensor insertion module 136, security monitoring rules 137, amonitoring analysis module 138, and a monitoring report generator 140.

The configuration of the executable programs of FIG. 1 is exemplary. It should be appreciated that these modules may be combined in any manner and may otherwise be executed in any manner, such as across a network. Indeed, in many embodiments ofthe invention, these components are distributed across a network. Further, the operations performed by individual sub-modules may be combined in any number of ways.

Now that the primary processing operations of the invention have been introduced, attention turns to a more detailed discussion of these primary processing operations. As shown in FIG. 1, the security development module 114 includes an interfacemodule 120. The interface module 120 includes executable code to handle interface operations. For example, the interface module handles interactions with the user via command lines, pull-down menus, IDE plug-ins and the like. The interface module alsointeracts with the other executable programs of the system, including the security test module 116 and the security monitoring module 118.

FIG. 2 illustrates the primary processing operations associated with the other executable modules of the security development module 114. The first processing operation shown in FIG. 2 is to convert source or executable code into a common format200. This operation may be implemented with the common format generator 122. The common format generator 122 converts all of the source or executable code files for all of the tiers of an application to be analyzed into a common format. The examplecommon format disclosed herein is called the Normalized Syntax Tree (NST) format.

An application system model is then derived from the common format 202. The common format generator 122 may perform this operation as well. In particular, the executable code is used to create a uniform model of the application from the NSTfiles.

A data flow analysis is then performed on the system model to identify security vulnerabilities 204. The analysis engine 124 may be used to implement this operation. The analysis engine 124 identifies possible execution paths through theprogram where user input can reach a dangerous function or construct. The analysis engine 124 invokes security development rules 123. Typically, the security development module 114 is deployed with a set of security development rules 123. These rulesmay be updated on a subscription basis from a remote computer connected to network 110. In addition to these supplied rules, a user may tailor specific security development rules for a particular application. The analysis engine 124 is a separatecomputational kernel and therefore it can be used with a diverse set of standard and customized security development rules 123.

The security vulnerabilities identified by the analysis engine 124 are reported to the user and related modules 206. The report generator 126 may be used to implement this operation.

The security development module 114 performs a form of semantic analysis across multiple tiers and languages to find security vulnerabilities, such as stack buffers, tainted variables, SQL injection and custom-defined security flaws. The tiersrange from the operating system to the database, application server to user interface, in applications that span multiple languages, including Java, C/C++, HTML, JSP and PL/SQL. The invention's analysis of diverse program instruction formats and systemsthat include multiple software applications is a significant advance over prior art systems.

The security development module 114 of the invention may be integrated into commercially available integrated development environments, thus investigating warnings and removing security errors becomes a natural part of the edit-compile-debugsoftware development process.

As shown in FIG. 1, the security development module 114 may be implemented with an analysis engine 124. The analysis engine 124 preferably implements a static analysis. Static analysis is a technique for analyzing software without executing thesoftware. Static analysis has historically suffered from high complexity. In particular, static analysis has gained a reputation for producing a high volume of suspect or hard to interpret results when applied to real world software.

There are a number of challenges associated with using static analysis in software security operations. First, both global dataflow and control flow static analysis techniques must be used to provide accuracy. Second, the myriad languages andframeworks create special cases that must be handled. Third, security analysis must be extensible to cover the large set of application-specific vulnerabilities. Fourth, security analysis requires the study of attacks to define the semanticrepresentation of particular vulnerability classes and the studies must be kept up-to-date since these attacks change over time. Finally, any analysis must be constrained by realistic commercial product requirements. The two most difficult requirementsto satisfy in a commercial setting are scalability and code access. With respect to scalability, the analysis must perform with extremely low overhead at the developer's desktop, yet perform well in a full-scale audit and review over massive code bases. In addition, the global analysis must often be facilitated without access to the entire body of code.

The present invention addresses these challenges that exist in the prior art. The security development module 114 provides a new form of static analysis that is directed solely to software security issues. The security development module 114provides useful information that can be immediately utilized to improve software security. In addition, it provides useful information that is exploited during testing and monitoring phases.

These operations are more fully appreciated in connection with an example. The following example illustrates the steps carried out by the security development module for a simple 2-tier application. The following example is complete in that itprovides sufficient input to identify code vulnerabilities. The example is incomplete in the sense that additional standard tools, support logic, and configuration files are required to actually run the application. These additional elements arestandard in the art and therefore are not subject to further discussion.

The sample application consists of a Java servlet and a PL/SQL package. The purpose of the application is to display an account balance to a user. The application works as follows. The Java servlet accepts an HTTP POST request that contains aparameter named "acct". This is the type of HTTP request typically generated by a web browser when a user fills out and submits a form on a web page. The "acct" parameter might be set, for example, by the user selecting an account name from a drop-downlist. The servlet passes the value of the "acct" parameter to a database query. The query invokes a stored procedure in the database named "ACCT.get_balance". The stored procedure uses the parameter passed from the servlet in order to construct an SQLquery. The query examines a database table named "ACCOUNTS". It returns the value in the "balance" column for the row matching the account name that is passed in. The stored procedure returns the balance value to the servlet, and the servlet in turnreturns the balance value to the user.

A malicious user can exploit vulnerability in the application in order to see account balances that they are not authorized to see. The vulnerability is simple: the application never checks to see whether the user has permission to see thebalance of the account number that they have requested. This type of vulnerability is common in poorly written web-based applications. The problem can be viewed in terms of data flow: the "acct" value provided by the user flows unchecked into the SQLquery in the database. This class of vulnerabilities is known as "SQL injection" because a malicious user can "inject" information of their choosing into a SQL query.

The following is exemplary Java code for an account balance application:

TABLE-US-00001 ************************************************************ import java.sql.*; import javax.servlet.http.*; class AccountView extends HttpServlet { private Connection connection; public void doPost(HttpServletRequest request,HttpServletResponse response) { String acctNumber = request.getParameter("acct"); CallableStatement stmt = null; try { stmt = connection.prepareCall("begin ACCT.get_balance(?, ?); end;"); // Bind parameter types stmt.setString(1, acctNumber); // Bind 1stparameter stmt.registerOutParameter(2, Types.INTEGER); // 2nd is result // Execute the callable statement stmt.execute( ); int balance = stmt.getInt(2); // get result response.getWriter( ).write("Account balance: " + balance); } catch(SQLException ex) {// Trap SQL Errors response.getWriter( ).write("Error: " + ex.toString( )); } finally { try { if(stmt != null) { stmt.close( ); // close the statement } } catch(SQLException ex) { } } } } ************************************************************

Relying upon the same example, the following is PL/SQL code for the Account Balance application:

TABLE-US-00002 ************************************************************ CREATE OR REPLACE PACKAGE ACCOUNT IS TYPE CURSORTYPE IS REF CURSOR; FUNCTION get_balance( NAME VARCHAR2 ) RETURN CURSORTYPE; END; / -- Package body TEST CREATE ORREPLACE PACKAGE BODY TEST IS FUNCTION get_balance( NAME VARCHAR2 ) RETURN CURSORTYPE IS CURSORRET CURSORTYPE; N1 VARCHAR2; BEGIN N1:= NAME; OPEN CURSORRET FOR SELECT balance FROM ACCOUNTS WHERE (ACT_NUMBER = N1); RETURN CURSORRET; END; END; / commit;show errors; exit; ************************************************************

As previously indicated, the initial operation performed by the security development module is to convert all of the source or executable code files for all of the tiers of the application into a common format, called the Normalized Syntax Tree(NST) format.

This step involves parsing each source or executable code file according to the language it is written in and then translating the parsed information into the NST format. This step closely resembles the first phase carried out by a modernhigh-level language compiler (such as the Gnu C compiler, gcc) where the compiler creates a high-level intermediate language from a source or executable file. High-level languages are designed for balance between the freedom and expressiveness given tothe programmer and rules and constraints necessary for a compiler to efficiently translate the language into an executable form. Humans do not write the NST format, so it does not supply the niceties and shortcuts that are usually provided forprogrammers. Because the NST format is created from a number of different high-level languages, it targets the lowest common denominator between the languages. Of course, it must provide enough expressiveness to capture the meaning of all of theconstructs in all of the languages. Compiler researchers have defined well-accepted methods for building program models. For example, see chapters 3, 4, and 8 of Aho, et al., Compilers, Principles, Techniques and Tools, Pearson Higher Education (1985).

Translation from a high-level language into the NST format is governed by a set of translation rules that are specific to the high-level language being translated. For example, some of the rules controlling the translation from Java to NST are:NST does not include a construct like Java's import statement. Java import statements have no explicit representation in the NST. Java does not require programmers to fully qualify variable, class, and method names unless a name is potentiallyambiguous. NST requires all names to be fully qualified so that there is no need to check for ambiguity. When a name is translated from Java to NST, it is translated into a fully qualified form. Type and method resolution in Java is achieved byfollowing the rules and instructions set forth in the Java Language Specification (section 15.12, http://java.sun.com/docs/books/jls/second_edition/html/expressions.doc.ht- ml#20448). In Java, all objects are referenced through pointers. Because thereis only one way to reference an object, no pointer notation is necessary in Java. Because NST is used to represent languages like C and C++ where objects may be referenced directly or through pointers, all Java object references are translated toinclude explicit pointer reference notation in NST. In Java, a member function can operate on its object using the keyword "this". NST has no "this" keyword. Instead, the object associated with a member function is explicitly represented as the firstargument to the function.

The following text describes the NST syntax using grammar-like constructs. The following conventions are used:

TABLE-US-00003 Production - A plain word refers to another production identifiers - A word in italics is an identifier <token> - A word or character surrounded by brackets is a token <token_class> - A word in italics surrounded bybrackets refers to a class of tokens. CompilationUnit: (ClassDecl|VarDecl|FunDecl)* ClassDecl: <modifier>* name (ExtendsList)? (ImplementsList)? <{> (FieldDecl)* (FunDecl)* <}> ExtendsList: <extends> (Type)+ ImplementsList:<implements> (Type)+ FieldDecl : <modifier>* Type name <;> FunDecl : <modifier>* Type name <(> ( ( VarDecl ( <,> VarDecl )* (<,> <...>)? ) | <...> )? <)> <:> unique_name ( Block |<;> ) VarDecl : Type name <;> Type : ( <modifier>* (<primitive_type>|typename) <*>* (<[> numeric_literal? <]>)* |<modifier>* (<primitive_type>|typename) <*>* (<[> numeric_literal?<]>)* <(> ( VarDecl ( <,> VarDecl )* )? <)> ) Statement : ( label <:> ) ? (AssignmentStmt|IfElseStmt|WhileStmt|GotoStmt|DeclStmt|ReturnStmt| CallStmt|Block) <;> Block : <{> (Statement)* <}> AssignmentStmt :(Location) <=> Expression DeclStmt : VarDecl IfElse : <if> <(> Expression <)> Block (<else> Block)? WhileStmt : <while> <(> Expression <)> Block ReturnStmt : <return> Expression CallStmt : FunCallExpression : (Location|FunCall|Allocation|OpExp|TypeCastExp|LiteralExp) |<(> Expression <)> Location : ( (VarAccess|FieldAccess) (Index)* | FunIdentifier ) FunCall : ( <->> unique_name | <-->> Expression In this caseexpression is expected to evaluate to a function pointer ) <(> Arg (<,> Arg)* <)> GotoStmt : <goto> label Arg : (Expression) Allocation : <new> Type (Index)* VarAccess : name FieldAccess : (<[> Type <]>)?(Expression) <.> name note: Type here represents the enclosing type of the field being accessed FunIdentifier : <->> unique_name Index : <[> (Location|LiteralExp) <]> OpExp : ((<unary_op> Expression)|(Expression<bin_op> Expression)) TypeCastExp : <<>Type <>> Expression LiteralExp : <literal> Directive : (A directive can appear on any line by itself) <#> ( <source-type> | <source-file> | <source-line> ) a.Terminals modifier : :public: :private: :protected: :static: :final: :strictfp: :abstract: :transient: :volatile: :vitual: :inline: :extern: :const: primitive_type : :int: :long: :float: :double: :boolean: :short: :byte: :char: :void: :short char::unsigned char: :unsigned short: :unsigned int: :unsigned long: :long long: :unsigned long long: :long double:

The NST is designed to represent programs for the purpose of global analysis. It is not designed to be compiled or to have a convenient, human-readable form. As such it does not support many convenient features of the languages (e.g., C, C++,Java) it represents. Features such as single statement declaration and initialization, assignments in expressions, short-circuit operators, typedefs, etc., are not part of the NST. Rather, the front-end translator is responsible for breaking down thesecomplex expressions into equivalent sequences of simpler statements during the conversion from the source or executable language to NST. The following table contains an exemplary listing of high-level language constructs and their equivalent translationin NST form. Note that many of these translations require the introduction of temporary variables into the NST.

TABLE-US-00004 TABLE 1 High-Level Construct NST Equivalents Language Feature NST Equivalent Initializers VarDecl + AssignmentStmt int a = 10; int a; a = 10; compound expressions simple expressions a = b = 17; a = 17; b = 17; C typedefs typesresolved typedef unsigned int mytpe; unsigned int a; mytype a; continue statements ControlStmt while(b){ while_loop: if(c){ while(b){ continue; if(c){ } goto while_loop; ... } } ... } continue statements ControlStmt while(b){ while(c){ if(c){ if(c){break; goto while_loop_end; } } ... ... } } while_loop_end: compound predicates Statement + Predicate if(test( )){ tmp = test; ... if(tmp){ } ... } short-circuit and nested ifs if(exp1( ) && exp2( )){ t = exp1( ); ... if(t){ } t = exp2( ); if(t){ ... } } short-circuit or nested ifs if(exp1( ) || exp2( )){ t = exp1( ); ... if(!t){ } t = exp2( ); } if(t){ ... } conditional expressions IfElseStmt a = b ? 7 : 3; if(b){ a = 7; } else { a = 3; } for loops WhileStmt for(int i = 0; i < 10; ++i){ int i;... i = 0; } while(i < 10){ ... ++i; } do ... while loops WhileStmt do{ ... ... while(a < 10){ } while (a < 10); ... } switch statements IfElseStmts + ControlStmts swtich(a){ if(a == `a`){ case `a`: ...(1) ...(1) } else { break; if(a ==`b`){ case `b`: ...(2) ...(2) goto case_c; case `c`: } else { ...(3) if(a == `c`){ break; case_c: default: ...(3) ...(4) } else { } ...(4) } } } inner classes, anonymous inner classes named normal classes class A{ class A{ ...(A) ...(A) class B{ } ...(B)class A$B{ } protected final A A$this; } public A$B(A a){ A$this = a; } ...(B) }

The following rules are used to resolve types, variables, fields and functions in the NST back to their corresponding declarations.

TABLE-US-00005 ************************************************************ varDecl resolveVar(VarAccess v) Scope s = v.getScope( ) while(s.getVarDecl(v.name) = null) s = s.getParentScope( ) return s.getDecl(v.name) FieldDeclresolveField(FieldAccess f) return resolveType(f.type) .getFieldDecl(f.fieldName) FunDecl resolveFun(FunCall f) if(f.type != null) return resolveType(f.type) .getFunDecl(f.funSig) else return f.getScope( ) .getRootScope( ) .getFunDecl(f.funSig) TypeDeclresolveType(Type t) return globalScope.getTypeDecl(f.typeName) ************************************************************

Using the foregoing high-level construct NST equivalents and rules, the examplary Java code for the account balance example is transformed into the following exemplary NST listing. Line numbers are used so that individual lines can be referredto in the following discussion.

TABLE-US-00006 ************************************************************ 1 #source-file /home/sean/scratch/patent/AccountView.java 2 #source-type java 3 :class: AccountView :extends: javax.servlet.http.HttpServlet { 4 :private:java.sql.Connection * connection ; 5 :public: void doPost ( AccountView * this

US Patent:  7207065