Code Reading in Practice

5 Dimensions of Code Reading: Structure, Domain, Concepts, Context, and Collaboration.

Structural Examination

Working memory holds only a handful of items at a time. When you have little prior knowledge of a code base, each line of code, or even each individual element, might take up one slot. If you know the code base well, a whole method might occupy a single slot, so you can process much more at once. This grouping is called chunking.

One of the first techniques to apply when reading unfamiliar code is to understand how it is divided into components: classes and methods in an object-oriented project, or functional components in a more functional-style program.

Structural Components and Their Relationship

  • Elements
    • Object components
    • Functional components
    • Singular components
  • Folded view
    • Shortcuts help
    • Size matters as well. Notice the line numbers.
  • Structure view
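
When an editor's structure view is not at hand, a rough equivalent can be produced with Python's standard `ast` module. The sketch below is only an illustration for Python sources: the function name `outline` and the `SAMPLE` snippet are made up for this example. It lists every class and function with its first line and its size in lines, which is the information a folded view shows.

```python
import ast

# Hypothetical sample source, used only for illustration.
SAMPLE = '''\
class Cart:
    def add(self, item):
        self.items.append(item)

    def total(self):
        return sum(i.price for i in self.items)

def main():
    cart = Cart()
    print(cart.total())
'''

def outline(source: str) -> list[tuple[str, str, int, int]]:
    """Return (kind, qualified name, first line, line count) per class/function."""
    rows = []

    def visit(node: ast.AST, prefix: str) -> None:
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)):
                kind = "class" if isinstance(child, ast.ClassDef) else "def"
                name = prefix + child.name
                size = child.end_lineno - child.lineno + 1
                rows.append((kind, name, child.lineno, size))
                visit(child, name + ".")   # nested definitions get a dotted name
            else:
                visit(child, prefix)       # keep looking inside other statements

    visit(ast.parse(source), "")
    return rows

if __name__ == "__main__":
    for kind, name, line, size in outline(SAMPLE):
        print(f"{line:>4}  {kind:<5} {name}  ({size} lines)")
```

Unusually large entries in the output are good candidates for a closer look.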

When you take in new information, your brain tries to relate it to things you already know or understand. In doing so, it forms a mental model that you can use to reason about the new information. Making the connections between pieces of information explicit and visible makes this easier.

  • Relationship between components
    • Inheritance
    • Contains as method
    • Contains as field
    • Used in body
    • Calls and Overrides
    • Used as parameter, or calls a static method
  • Find Usage feature for detecting dependencies
  • Informal relationship
    • Naming conventions
    • Similarities
    • Same parameters used frequently = Data Clump
  • Go for a timeboxed top-to-bottom read
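
One of the informal relationships listed above, the data clump, can be approximated mechanically. The following is a minimal sketch, assuming Python source and assuming that "the same parameters used frequently" means at least `min_size` parameter names that occur together in at least `min_count` functions. The function name `data_clumps`, both thresholds, and the `SAMPLE` snippet are illustrative choices, not an established definition.

```python
import ast
from collections import Counter
from itertools import combinations

# Hypothetical sample source, used only for illustration.
SAMPLE = '''
def draw_line(x, y, color, width): ...
def draw_circle(x, y, color, radius): ...
def log(message): ...
'''

def data_clumps(source: str, min_size: int = 3, min_count: int = 2) -> dict:
    """Find groups of parameter names that appear together in several functions."""
    counts = Counter()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Positional parameters only; self/cls carry no domain meaning.
            params = sorted({a.arg for a in node.args.args} - {"self", "cls"})
            for combo in combinations(params, min_size):
                counts[combo] += 1
    return {combo: n for combo, n in counts.items() if n >= min_count}

if __name__ == "__main__":
    for combo, n in data_clumps(SAMPLE).items():
        print(", ".join(combo), "appear together in", n, "functions")
```

A group reported here is a hint that the parameters may belong together in one object. Whether that is true is still a judgment call for the reader.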

Entry Point

  • Finding entry points
    • Guess by the name
    • Takes few arguments, but calls a lot of functions
  • Place breakpoints and read from the entry point
  • Visualizing the system - draw your own UML diagrams
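
The "few arguments, many calls" heuristic can be turned into a quick ranking. This is a rough sketch for Python sources. The name `entry_point_candidates`, the thresholds `max_params` and `min_calls`, and the `SAMPLE` snippet are assumptions made for the example, and the result is a list of candidates to inspect, not a definitive answer.

```python
import ast

# Hypothetical sample source, used only for illustration.
SAMPLE = '''
def load(path):
    return open(path).read()

def parse(text, strict):
    return text.split()

def main():
    text = load("input.txt")
    words = parse(text, True)
    print(len(words))
'''

def entry_point_candidates(source: str, max_params: int = 1, min_calls: int = 3):
    """Rank functions that take few parameters but make many calls."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            n_params = len(node.args.args)
            # Count every call expression anywhere inside the function.
            n_calls = sum(isinstance(n, ast.Call) for n in ast.walk(node))
            if n_params <= max_params and n_calls >= min_calls:
                found.append((node.name, n_params, n_calls))
    return sorted(found, key=lambda row: -row[2])

if __name__ == "__main__":
    for name, n_params, n_calls in entry_point_candidates(SAMPLE):
        print(f"{name}: {n_params} parameters, {n_calls} calls")
```

A breakpoint on the top candidate is a reasonable place to start reading.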

After the Read

  • Expand documentation
  • Improve comments
  • Discuss code quality
    • Structural code smells
    • Linguistic code smells
    • Is there a historical reason or a good justification?
    • Inappropriate intimacy or Feature envy?
    • Message Chains
      • Merge small methods?
      • Eliminate long chains?
  • In a multi-lingual code base, static source code analysis might not always find dependencies.
  • Refactor
    • Large class: divide into subgroups
    • Small class: merge into other classes
    • Data class: merge into the classes that frequently access the data
    • Long parameter list: move some parameters into class fields? Group parameters into larger objects?
    • One-call functions: often premature generalization, which adds cognitive load. Maybe inline these methods?
    • Dead code

Understanding the Business Domain of Your Code in More Depth

  • Domain model = behavior + data

Learning from Variable Names

  • List all variable names in use
    • Quantifiers = refer to quantities
    • Programming concepts = refer to a specific concept in a language
    • Domain concepts = refer to the domain
    • Text analysis tools: Word Cloud
  • Split the words and examine the domain concepts
  • Determine the concept
    • Objects or object attributes?
    • Type of the component?
    • Analyze in network
  • Define the concepts and behaviors as precisely as possible
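
The first two steps above, listing variable names and splitting them into words, are easy to automate. The sketch below assumes Python source and counts in how many distinct variable names each word occurs. The function name `identifier_words` and the `SAMPLE` snippet are made up for this example, and the splitting rule (snake_case plus camelCase) is one simple choice among several.

```python
import ast
import re
from collections import Counter

# Hypothetical sample source, used only for illustration.
SAMPLE = '''
def invoice_total(invoice_lines, taxRate):
    net_total = 0
    for invoice_line in invoice_lines:
        net_total += invoice_line.amount
    return net_total * (1 + taxRate)
'''

# Splits on underscores and camelCase boundaries; keeps acronyms such as HTTP whole.
WORD = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+")

def identifier_words(source: str) -> Counter:
    """Count, per word, the number of distinct variable names containing it."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):      # variables read or assigned
            names.add(node.id)
        elif isinstance(node, ast.arg):     # function parameters
            names.add(node.arg)
    words = Counter()
    for name in names:
        for part in WORD.findall(name):
            words[part.lower()] += 1
    return words

if __name__ == "__main__":
    for word, count in identifier_words(SAMPLE).most_common():
        print(f"{count:>3}  {word}")
```

Words that occur in many names (here `invoice`) are likely domain concepts. The counts could also feed a word cloud tool.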

After Reading

  • Documenting the Ubiquitous Language
  • Code quality
    • Similar objects
    • Linguistic code smells
    • Name molds = patterns in which elements in a variable name are typically combined

Linguistic Code Smells defined by Arnaoudova

  • Methods that do more than they say
  • Methods that say more than they do
  • Methods that do the opposite of what they say
  • Identifiers that contain more than what they say
  • Identifiers that contain less than what they say
  • Identifiers that contain the opposite of what they say
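
Two of the method-level smells above are easier to recognize with a concrete case. The classes `SmellyInventory` and `Inventory` below are invented for illustration and are not taken from any real code base. The first shows a method that does more than it says and one that does the opposite of what it says; the second shows one possible fix.

```python
class SmellyInventory:
    def __init__(self):
        self._stock = {"apple": 3}
        self.audit_log = []

    # Does more than it says: the name promises a read,
    # but the method also writes to the audit log.
    def get_stock(self, item):
        self.audit_log.append(item)
        return self._stock.get(item, 0)

    # Does the opposite of what it says: returns True
    # when the item is NOT in stock.
    def is_available(self, item):
        return self._stock.get(item, 0) == 0


class Inventory:
    def __init__(self):
        self._stock = {"apple": 3}
        self.audit_log = []

    # A plain read, as the name suggests.
    def get_stock(self, item):
        return self._stock.get(item, 0)

    # The side effect gets its own, honestly named method.
    def record_lookup(self, item):
        self.audit_log.append(item)

    # True when the item is in stock, matching the name.
    def is_available(self, item):
        return self._stock.get(item, 0) > 0
```

Whether to rename the method or change its behavior depends on what the callers expect, which is why reading the call sites comes first.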

Linguistic Anti-Pattern Detector (LAPD) for Java developed by Arnaoudova

Refactor

  • Similar sounding objects
    • Genuinely different: rename to minimize overlap
    • The same: merge into one object
  • Vague-sounding names
  • Smelly names: rename, or reimplement

Conceptual Examination

What are the important programming concepts in this code base? How do they relate to the domain and to the structure of the code?

  • Names and concepts
  • Behavior
  • Syntax
  • Roles of Variables Framework by Jorma Sajaniemi at the University of Eastern Finland
  • Dependencies
    • What is imported
    • Where is it used
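
The two dependency questions above can be answered for a single Python file with a short script. This is a sketch under simplifying assumptions: it only looks at names bound by `import` statements and at direct references to those names, and it ignores scoping and shadowing. The function name `import_usage` and the `SAMPLE` snippet are hypothetical.

```python
import ast

# Hypothetical sample source, used only for illustration.
SAMPLE = '''\
import os
import sys
import json
from collections import Counter

def load(path):
    with open(os.path.join("data", path)) as handle:
        return Counter(json.load(handle))
'''

def import_usage(source: str) -> dict[str, list[int]]:
    """Map each imported name to the line numbers where it is referenced."""
    tree = ast.parse(source)
    usage: dict[str, list[int]] = {}
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # The bound name is the alias if given, else the first dotted part.
                bound = alias.asname or alias.name.split(".")[0]
                usage.setdefault(bound, [])
    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and node.id in usage:
            usage[node.id].append(node.lineno)
    for lines in usage.values():
        lines.sort()
    return usage

if __name__ == "__main__":
    for name, lines in import_usage(SAMPLE).items():
        print(name, "used on lines", lines or "(never)")
```

An import with an empty list, such as `sys` in the sample, is a candidate for removal and worth a second look before deleting.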

Roles of Variables Framework

These 11 roles cover 99% of programs that novices will write:

  • Fixed value
  • Stepper
  • Flag
  • Walker
  • Most recent holder
  • Most wanted holder
  • Gatherer
  • Container
  • Follower
  • Organizer
  • Temporary
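
To make the roles more tangible, here is a small function with each variable labeled by the role it appears to play. The function `summarize` is invented for this example and the labels are one reading of the framework, not an official classification. The walker and temporary roles do not occur in it.

```python
def summarize(readings: list[float]) -> dict:
    """Summarize a non-empty list of sensor readings."""
    limit = 100.0              # fixed value: never changes after initialization
    total = 0.0                # gatherer: accumulates the sum
    highest = readings[0]      # most wanted holder: best value seen so far
    previous = readings[0]     # follower: trails one step behind `reading`
    over_limit = False         # flag: records that something has happened
    rises = []                 # container: collects several values
    for index in range(len(readings)):   # index: stepper
        reading = readings[index]        # most recent holder: latest value read
        total += reading
        if reading > highest:
            highest = reading
        if reading > limit:
            over_limit = True
        if reading > previous:
            rises.append(index)
        previous = reading
    ordered = sorted(readings)  # organizer: same values, rearranged
    return {
        "mean": total / len(readings),
        "highest": highest,
        "over_limit": over_limit,
        "rises": rises,
        "median": ordered[len(ordered) // 2],  # upper median for even lengths
    }

if __name__ == "__main__":
    print(summarize([10.0, 120.0, 50.0, 60.0]))
```

Labeling variables this way while reading, for example in comments or on a printout, is a cheap form of documentation.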

After Reading

  • Explain programming concepts
  • Document roles of variables

Context Examination

  • Read plain text strings
  • Read code comments
  • Search outside the code
    • Create a search plan — this is better than randomly browsing pages
    • Try a generic search engine
    • Try a code search engine as well — search on GitHub
  • Create a concept map as you go