Analyzing Code Comments to Boost Program Comprehension
→ Yusuke Shinyama
Yoshitaka Arahori
Katsuhiko Gondow
(Tokyo Tech)
https://euske.github.io/
/*
 * You are not expected to understand this.
 */
- Dennis Ritchie

1. Background

2. Why Code Comments?

3. Our Contribution

To propose a theoretical framework for analyzing source code comments.
q_math.c (Quake-III-Arena)
float Q_rsqrt( float number )
{
    long i;
    float x2, y;
    const float threehalfs = 1.5F;

    x2 = number * 0.5F;
    y  = number;
    i  = * ( long * ) &y;            // evil floating point bit level hacking
    i  = 0x5f3759df - ( i >> 1 );    // what the fuck?
    y  = * ( float * ) &i;
    y  = y * ( threehalfs - ( x2 * y * y ) );  // 1st iteration
//  y  = y * ( threehalfs - ( x2 * y * y ) );  // 2nd iteration, this can be removed

    return y;
}

4. Research Questions

  1. What is the general structure of local comments?
  2. How can we use them for program comprehension?

Methodology

  1. Manually review comments and identify what they have in common.
  2. Build a classifier to test the hypothesis.

5. Key Observation

Comment:  // Do the
          // thing.
Code:     doit();
Role:     postcondition

A comment and its code form a relationship.
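
As a minimal sketch, the comment, code, and role triple from this observation could be modeled as a small record. The class name `CommentRelation` and its field names are hypothetical and chosen only for illustration; they are not taken from the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommentRelation:
    """One comment, the code it refers to, and the role linking them.

    All names here are illustrative assumptions, not the authors' API.
    """
    comment: str  # full comment text, possibly spanning several lines
    code: str     # the code the comment is about
    role: str     # how the comment relates to the code


# The example from the slide: a two-line comment describing what doit() achieves.
example = CommentRelation(
    comment="// Do the\n// thing.",
    code="doit();",
    role="postcondition",
)

print(example.role)
```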

6. Three Elements of Comment

7. Comment Extent

8. Comment Category

Comment Categories

Postcondition (What do)     75
Precondition (Why do)       37
Value Desc. (What is)       15
Instruction (Todo, etc.)    14
Comment Out                 12
Visual Cue                  11
Directive                    8
Guide                        0
Uncategorized                9
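
The category list above maps naturally onto an enumeration. The sketch below is one possible encoding; the names `CommentCategory`, `COUNTS`, and `share` are hypothetical. It also assumes the numbers on the slide are raw counts of reviewed comments, which the slide itself does not state.

```python
from enum import Enum


class CommentCategory(Enum):
    """Comment categories as listed on the slide (identifier names are illustrative)."""
    POSTCONDITION = "Postcondition (What do)"
    PRECONDITION = "Precondition (Why do)"
    VALUE_DESC = "Value Desc. (What is)"
    INSTRUCTION = "Instruction (Todo, etc.)"
    COMMENT_OUT = "Comment Out"
    VISUAL_CUE = "Visual Cue"
    DIRECTIVE = "Directive"
    GUIDE = "Guide"
    UNCATEGORIZED = "Uncategorized"


# Numbers copied from the slide. ASSUMPTION: they are raw counts.
COUNTS = {
    CommentCategory.POSTCONDITION: 75,
    CommentCategory.PRECONDITION: 37,
    CommentCategory.VALUE_DESC: 15,
    CommentCategory.INSTRUCTION: 14,
    CommentCategory.COMMENT_OUT: 12,
    CommentCategory.VISUAL_CUE: 11,
    CommentCategory.DIRECTIVE: 8,
    CommentCategory.GUIDE: 0,
    CommentCategory.UNCATEGORIZED: 9,
}


def share(category: CommentCategory) -> float:
    """Percentage of all listed comments that fall into the given category."""
    total = sum(COUNTS.values())
    return 100.0 * COUNTS[category] / total


most_common = max(COUNTS, key=COUNTS.get)
print(most_common.value, round(share(most_common), 1))
```

Under the raw-count assumption, postconditions and preconditions together account for well over half of the listed comments.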

9. Comment Target

10. Key Observation (Again)

11. Building Classifier

Annotating Comment Extents

Annotating Comment Categories

Reviewer 1 Reviewer 2 Reviewer 3

Classifier Implementation

12. Experiments

Extra Data

13. Results

Cross-language Experiment

Categories by Language

Java
  1. platform_frameworks_base (48229)
  2. hadoop (46544)
  3. camel (32885)
  4. robovm (32195)
  5. intellij-community (31151)
  6. hive (28807)
  7. hbase (25513)
  8. neo4j (24603)
  9. j2objc (22788)
  10. XobotOS (22638)

Python
  1. main (39745)
  2. appscale (33416)
  3. kbengine (25105)
  4. hue (22956)
  5. pyston (17016)
  6. edx-platform (14525)
  7. cpython (12810)
  8. sympy (11537)
  9. nova (11211)
  10. Theano (9090)

Categories shown per project: Post., Pre., Cmt.Out, Visual, Value, Inst.
- /r/ProgrammerHumor

14. Findings

How to Use This? (Future Work)

15. Conclusion

Related Work

Comment Categories

Using Comments

Threats to Validity

Popular Phrases (Java)

do nothing         332
throw exception    170
set default        161
add list           154
do anything        146
set value          140
use default        122
have value         119
create file        119
create list        116

Features (for Comment Extent)

DeltaRows       Distance in lines from the previous comment.
DeltaCols       Difference in columns from the previous comment.
DeltaLeft       Difference in columns between the comment and a syntax element.
LeftSyntax      Syntax element to the left of the comment.
RightSyntax     Syntax element to the right of the comment.
ParentSyntax    Parent syntax element of the comment.
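
The two positional features, DeltaRows and DeltaCols, can be sketched with Python's standard `tokenize` module. This is only an illustration for Python source under assumed definitions (each delta is measured against the immediately preceding comment token); the helper names `comment_positions` and `delta_features` are hypothetical, and the syntax-based features are omitted because they would need a full parser.

```python
import io
import tokenize


def comment_positions(source: str):
    """Return (row, col, text) for every comment token in Python source."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [(t.start[0], t.start[1], t.string)
            for t in tokens if t.type == tokenize.COMMENT]


def delta_features(source: str):
    """Compute DeltaRows and DeltaCols for each comment after the first.

    ASSUMPTION: deltas are taken relative to the immediately preceding comment.
    """
    comments = comment_positions(source)
    features = []
    for prev, cur in zip(comments, comments[1:]):
        features.append({
            "text": cur[2],
            "DeltaRows": cur[0] - prev[0],  # distance in lines
            "DeltaCols": cur[1] - prev[1],  # difference in start columns
        })
    return features


SAMPLE = (
    "# Do the\n"
    "# thing.\n"
    "doit()\n"
    "x = 1  # initial value\n"
)

for row in delta_features(SAMPLE):
    print(row)
```

A comment that starts one line below the previous one at the same column (DeltaRows 1, DeltaCols 0) plausibly continues the same extent, while a larger jump suggests a new one.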

Features (for Target / Category)

LeftSyntax      Syntax element to the left of the comment.
RightSyntax     Syntax element to the right of the comment.
ParentSyntax    Parent syntax element of the comment.
HasSymbol       Does the comment text include a symbol?
PosTagFirst     POS tag of the first word of the comment.
PosTagAny       Does the comment text include a certain POS tag?
WordFirst       First word of the comment text.
WordAny         Does the comment text include a certain word?
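
The purely lexical features (HasSymbol, WordFirst, WordAny) can be approximated with the standard library alone. The sketch below rests on two assumptions that the slide does not confirm: "symbol" is read as an identifier that also appears in the surrounding code, and WordAny is checked against a small fixed vocabulary. The function name `lexical_features` and its parameters are hypothetical. The POS-tag features are left out because they require an external tagger.

```python
import re


def lexical_features(comment_text, code_identifiers, vocabulary):
    """Extract simple word-level features from one comment.

    ASSUMPTIONS: a "symbol" is an identifier found in the nearby code,
    and WordAny is tested against a caller-supplied vocabulary.
    """
    # Drop leading comment markers such as "//", "#", or "*".
    body = comment_text.lstrip("/#* ").strip()
    words = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", body)
    lowered = [w.lower() for w in words]
    return {
        "HasSymbol": any(w in code_identifiers for w in words),
        "WordFirst": lowered[0] if lowered else None,
        "WordAny": sorted(set(lowered) & set(vocabulary)),
    }


feats = lexical_features(
    "// set default value of maxRetries",
    code_identifiers={"maxRetries"},
    vocabulary={"set", "default", "todo"},
)
print(feats)
```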

Bonus: Making a Presentation Meaningful