About Understand:
Understand is a static code analysis tool that gleans metrics from a code base, such as:
- Complexity Metrics (e.g. McCabe Cyclomatic)
- Volume Metrics (e.g. Lines of Code)
- Object-Oriented Metrics (e.g. Coupling Between Object Classes)
Understand has a built-in reverse engineering feature that helps the user understand how the code was developed by providing graphical representations.
There are two methods to mine objects from procedural code using Understand:
1. Identifying global structures that apparently represent the state of some object
2. Identifying dependencies between data types in a program
Clang: Embedded Static Analyzer in Understand
The Clang Static Analyzer is a source code analysis tool that finds bugs in C, C++, and Objective-C programs. Clang performs a number of checks, such as:
- Detection of memory leaks
- Checking of virtual function calls during construction and destruction
- Checks for dead code by looking for idempotent operations and unreachable code blocks
- Checks for dereferences of null pointers
- Checks for division by zero and logic errors in function calls
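To make two of these checks concrete, here is a minimal C++ sketch of the kind of code paths such checkers typically report (shown in comments), next to guarded versions. The function names `average` and `nameLength` are made up for illustration and do not come from any of the analyzed projects.

```cpp
#include <cstddef>
#include <cstring>

// Pattern a division-by-zero check would report if `count` can be 0:
//   int average(int total, int count) { return total / count; }
// Guarded version: the zero case is handled explicitly.
bool average(int total, int count, int& result) {
    if (count == 0) {
        return false;  // no meaningful average for an empty set
    }
    result = total / count;
    return true;
}

// Pattern a null-dereference check would report if `name` can be null:
//   std::size_t nameLength(const char* name) { return std::strlen(name); }
// Guarded version: the pointer is checked before it is used.
std::size_t nameLength(const char* name) {
    if (name == nullptr) {
        return 0;
    }
    return std::strlen(name);
}
```

A static analyzer reasons about these paths without running the program, which is why it can report the unguarded versions even when no test ever passes a zero or a null pointer.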
Time for some real analysis
Once we had downloaded Understand and Clang, we picked the following open source code bases from GitHub to analyze statically.
We tested Audacity against coding standards and for the likelihood of bugs.
Our results are as follows:
1. 281 files had control flow violations (Dangling Else, Single Exit Point at End, Unreachable Code).
2. 344 files contained violations related to Memory Allocation in the form of dynamic heap allocation.
Example code (src/commands/commandType.cpp):
if (mName != NULL) {
    delete mName;
}
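In standard C++, `delete` on a null pointer is a no-op, so the guard in the snippet above is redundant; what the tool reports is the manual heap management itself. Below is a minimal sketch of one common alternative, assuming ownership can be expressed with `std::unique_ptr`. The class `CommandTypeSketch` and its members are hypothetical and are not Audacity's actual code.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a class that owns a heap-allocated name.
class CommandTypeSketch {
public:
    // Allocates the name on the heap; unique_ptr takes ownership of it.
    void setName(std::string name) {
        mName = std::make_unique<std::string>(std::move(name));
    }

    bool hasName() const { return mName != nullptr; }

    // Returns an empty string when no name has been set.
    std::string name() const { return mName ? *mName : std::string(); }

    // No destructor is needed: unique_ptr frees the string automatically.
private:
    std::unique_ptr<std::string> mName;
};
```

With this shape there is no explicit `delete` left for a reviewer or a tool to audit, although the allocation itself still lives on the heap.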
Understand calculated the cyclomatic complexity of each function and listed the functions that ranked as the most complex in the code.
[Figure: Most Complex Functions]
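For a single function, McCabe's cyclomatic complexity is commonly computed as the number of decision points plus one. The sketch below is a rough approximation that counts decision tokens in an already tokenized function body; it is not how Understand computes the metric, and whether short-circuit operators such as `&&` count varies between tools and metric variants. The function name and token list are assumptions made for illustration.

```cpp
#include <string>
#include <vector>

// Rough approximation: complexity = number of decision points + 1.
// Expects a pre-tokenized function body; a real tool parses the code.
int approximateCyclomaticComplexity(const std::vector<std::string>& tokens) {
    static const std::vector<std::string> kDecisionTokens = {
        "if", "for", "while", "case", "catch", "&&", "||", "?"};
    int decisions = 0;
    for (const std::string& token : tokens) {
        for (const std::string& decision : kDecisionTokens) {
            if (token == decision) {
                ++decisions;
                break;  // count each token at most once
            }
        }
    }
    return decisions + 1;
}
```

A function with no branches scores 1, and every additional branch adds one more independent path that a test suite would need to cover.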
Inferences:
- The code base is pretty large, as can be seen from the lines of code. The code also appears to be well documented, as the comment-to-lines ratio is 33%; however, we cannot guarantee that the comments are appropriate, as there is no golden rule and the number of comments depends greatly on the inherent complexity of the code.
- Possibility of refactoring: There are some files that need to be refactored. Refactoring them will make the files easier to debug and maintain.
- We can also add more test cases for the files containing the complex functions, and ensure that these functions are well explained with meaningful comments, so that the code base becomes easier to understand and maintain.
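As a small worked example of the ratio mentioned above, the sketch below assumes it is defined as comment lines divided by code lines; the tool's exact definition may differ, and the function name is hypothetical.

```cpp
// Assumed definition: ratio = comment lines / code lines.
// Returns 0.0 when there are no code lines, to avoid dividing by zero.
double commentToCodeRatio(long commentLines, long codeLines) {
    if (codeLines <= 0) {
        return 0.0;
    }
    return static_cast<double>(commentLines) / static_cast<double>(codeLines);
}
```

Under this definition, a file with 1 comment line for every 4 code lines has a ratio of 0.25; the ratio says nothing about whether those comments are useful.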
2. RethinkDB:
RethinkDB is built to store JSON documents and to scale to multiple machines with very little effort. It has a pleasant query language that supports really useful queries like table joins and group by, and it is easy to set up and learn.
An important component of process improvement is the ability to measure the process.
The four steps involved in the Object Oriented Design process are:
a. Identification of classes and objects
b. Semantics of classes and objects
c. Relationships between classes and objects
d. Implementation of classes and objects
The metrics developed can be listed as follows:
1. Weighted Methods Per Class:
The following viewpoints have been developed in relation to gleaning OOD metrics from the classes and objects:
a. The number of methods and the complexity of the methods involved indicate the time and effort required to maintain the code.
b. The larger the number of methods in a class, the greater the potential impact on its children, since the children inherit all the methods in the class.
c. Classes with a larger number of methods tend to be application-specific, limiting the possibility of reuse.
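Weighted Methods Per Class is usually defined as the sum of the complexities of a class's methods. The sketch below assumes each method's complexity (for example, its cyclomatic complexity) has already been measured; the function name is hypothetical.

```cpp
#include <numeric>
#include <vector>

// WMC = sum of the per-method complexity values of one class.
// If every method is given a weight of 1, WMC is simply the method count.
int weightedMethodsPerClass(const std::vector<int>& methodComplexities) {
    return std::accumulate(methodComplexities.begin(),
                           methodComplexities.end(), 0);
}
```

For example, a class with three methods of complexity 1, 3, and 5 has a WMC of 9, while a class with nine trivial methods gets the same score for a different reason.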
[Figure: Largest Function]
The above figure shows a graphical representation of the functions in order of their cyclomatic complexity. By listing the complex functions, we can refactor the code and develop test cases to ensure all code flows work properly.
2. Depth of Inheritance Tree:
Viewpoints:
a. The deeper a class is in the hierarchy, the greater the number of methods it will inherit.
b. Deeper trees entail greater design complexity due to the number of classes and functions involved.
c. The deeper a class is in the hierarchy, the greater the potential reuse of inherited methods.
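Depth of Inheritance Tree can be sketched as the number of ancestors between a class and the root of its hierarchy. The example below assumes single inheritance and represents the hierarchy as a map from each class to its direct parent; the `ParentMap` alias and the function name are hypothetical.

```cpp
#include <map>
#include <string>

// Maps a class name to the name of its direct parent.
// Root classes have no entry. Single inheritance is assumed.
using ParentMap = std::map<std::string, std::string>;

// Returns the number of ancestors of `cls`, or -1 if the map has a cycle.
int depthOfInheritance(const ParentMap& parents, const std::string& cls) {
    int depth = 0;
    auto it = parents.find(cls);
    while (it != parents.end()) {
        ++depth;
        // A valid hierarchy can never be deeper than the number of entries.
        if (depth > static_cast<int>(parents.size())) {
            return -1;
        }
        it = parents.find(it->second);
    }
    return depth;
}
```

A root class has depth 0, and each level of subclassing adds one, which matches the viewpoint that deeper classes inherit more methods.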
[Figure: Architectural Dependencies Using Understand]
3. Number of children:
Viewpoints:
a. Since inheritance is a form of reuse, the greater the number of children, the greater the reuse.
b. A large number of children in a hierarchy may indicate misuse of subclassing, as it raises the likelihood of improper abstraction of the parent.
c. More testing of methods will be needed if a class has a lot of child classes.
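Number of Children counts only the immediate subclasses of a class. The sketch below assumes the hierarchy is given as a map from each class to its direct parent; the function name and the class names in the usage note are hypothetical.

```cpp
#include <map>
#include <string>

// Counts the classes whose direct parent is `cls`.
// `parents` maps a class name to the name of its direct parent.
int numberOfChildren(const std::map<std::string, std::string>& parents,
                     const std::string& cls) {
    int children = 0;
    for (const auto& entry : parents) {
        if (entry.second == cls) {
            ++children;
        }
    }
    return children;
}
```

If `Circle` and `Square` both derive from `Shape`, and `RoundedSquare` derives from `Square`, then `Shape` has 2 children and `Square` has 1; grandchildren are not counted.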
[Figure: Architectural Browser]
Inferences about the code:
- Certain coding standards were violated, such as the presence of commented-out code and goto statements, which ideally should not be present. Certain functions were also too long, and there were instances of unreachable code and unused functions.
- The metrics summary gives us an idea of the lines of code, classes, and files involved. The architectural browser gives us an idea of the languages and directory structure. This makes it easier for maintainers to seek the expertise they need to maintain and support the code.
- The lines of code are few in comparison to the previous two code bases explored; this is a relatively small application. The comment-to-lines ratio is 0.57, which may indicate well-documented code.
3. JSON Parser:
A fresh approach to JSON loading that speeds up web applications by providing the parsed objects before the response completes.
[Figure: Dependency Graph]
The source code of the JSON parser is written entirely in JavaScript. We started the analysis by applying reverse engineering with dependency graphs to understand the code flows and learn how the different classes were associated with one another. We also attempted to generate UML diagrams for single files to analyze the structural properties of each class.
4. Fast Image Cache:
Fast Image Cache is an efficient, persistent, and fast way to store and retrieve images in any iOS application.
We began the analysis of the Fast Image Cache code by generating a Metrics Treemap. A Metrics Treemap visualizes the code as a hierarchical package structure of nested rectangles, with parent packages encompassing child packages. It helps us understand how the code is structured, whether there are any major issues, and whether they are localized or spread throughout the code base. It also helps us spot complex classes, with the color showing the sum of the cyclomatic complexity of all the methods in the class. File size and complexity usually grow together, and larger rectangles indicate larger files.
The Architecture Browser helps us visualize the different classes in Fast Image Cache and the programming languages used to build them.
Summary
We have evaluated two static analysis tools, Understand and Clang, by running them over the code bases mentioned above. Understand helped us glean metrics and graphical representations to better understand the code structure. Clang, on the other hand, checked the code base for coding standard violations and helped detect memory leaks, dangling pointers, and null pointer dereferences. We have learned that although the tools are really helpful in finding bugs, they can be resource intensive, and running them takes longer than compilation. There are also instances of false positives, i.e., the tool reports bugs in code that otherwise behaves correctly.