Code Reading in Practice
5 Dimensions of Code Reading: Structure, Domain, Concepts, Context,
and Collaboration.
Structural Examination
When you have only a little prior knowledge, each line of code, or even each individual element, might take up one spot in your working memory. If you know the code base well, a whole method might occupy a single spot, so you can process much more at once. This process is called chunking.
One of the first techniques to apply when reading unfamiliar code is to understand its division into components: classes and methods in an object-oriented project, or functional components in a more functional-style program.
Structural Components and Their Relationship
- Elements
- Object components
- Functional components
- Singular components
- Folded view
- Shortcuts help
- Size matters as well. Notice the line numbers.
- Structure view
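The folded view and structure view described above can be approximated without an IDE. The following is a minimal sketch, assuming the code under study is Python; the `outline` helper and the `SAMPLE` source are hypothetical and only illustrate listing components together with their size in lines.

```python
import ast

def outline(source: str) -> list[tuple[str, str, int]]:
    """Return (kind, qualified name, line count) for each class and function."""
    tree = ast.parse(source)
    rows = []

    def visit(node: ast.AST, prefix: str) -> None:
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)):
                kind = "class" if isinstance(child, ast.ClassDef) else "def"
                name = prefix + child.name
                # Size matters: record how many source lines the component spans.
                size = child.end_lineno - child.lineno + 1
                rows.append((kind, name, size))
                visit(child, name + ".")
            else:
                visit(child, prefix)

    visit(tree, "")
    return rows

# Hypothetical code base used only for illustration.
SAMPLE = '''
class Cart:
    def __init__(self):
        self.items = []

    def add(self, item):
        self.items.append(item)

def main():
    cart = Cart()
    cart.add("book")
'''

for kind, name, size in outline(SAMPLE):
    print(f"{kind:5} {name:15} {size:3} lines")
```

Unusually large entries in such a listing are reasonable places to start a closer read.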
When understanding new information, your brain always tries to relate it to things you already know or understand. As such, your brain forms a mental model that you can use to reason about the new information. Making the connections between pieces of information explicitly visible helps your brain understand them more easily.
- Relationship between components
- Inheritance
- Contains as method
- Contains as field
- Used in body
- Calls and Overrides
- Used as a parameter, or called as a static method
- Find Usage feature for detecting dependencies
- Informal relationship
- Naming conventions
- Similarities
- Same parameters used frequently = Data Clump
- Go for a timeboxed top-to-bottom read
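One informal relationship from the list above, the data clump, can be spotted mechanically. The sketch below assumes Python source and counts pairs of parameter names that show up together in several functions; `parameter_clumps`, its `min_functions` threshold, and the sample functions are hypothetical names chosen for illustration.

```python
import ast
from collections import Counter
from itertools import combinations

def parameter_clumps(source: str, min_functions: int = 2) -> dict[tuple[str, str], int]:
    """Count pairs of parameter names that appear together in several functions."""
    pair_counts: Counter = Counter()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Ignore the conventional receiver names; sort so pairs are comparable.
            names = sorted({a.arg for a in node.args.args if a.arg not in ("self", "cls")})
            pair_counts.update(combinations(names, 2))
    return {pair: n for pair, n in pair_counts.items() if n >= min_functions}

# Hypothetical code base used only for illustration.
CLUMP_SAMPLE = '''
def draw_line(x, y, color): ...
def move_to(x, y): ...
def distance(x, y, other): ...
def log(message): ...
'''

print(parameter_clumps(CLUMP_SAMPLE, min_functions=3))
```

A pair that recurs across many signatures, such as `x` and `y` here, is a candidate for grouping into one object.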
Entry Point
- Finding entry points
- Guess by the name
- Takes few arguments, but calls a lot of functions
- Place breakpoints and read from the entry point
- Visualizing the system - draw your own UML diagrams
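The "few arguments, many calls" heuristic above can be turned into a rough ranking. This is only a sketch under the assumption that the code is Python; `entry_point_candidates` and the sample program are hypothetical, and the ranking is a starting point for reading, not a reliable detector.

```python
import ast

def entry_point_candidates(source: str) -> list[tuple[str, int, int]]:
    """Return (name, parameter count, call count), likely entry points first."""
    rows = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            params = len(node.args.args)  # positional parameters only
            calls = sum(isinstance(n, ast.Call) for n in ast.walk(node))
            rows.append((node.name, params, calls))
    # Many calls first; among ties, fewer parameters first.
    return sorted(rows, key=lambda row: (-row[2], row[1]))

# Hypothetical program; it is parsed, never executed.
ENTRY_SAMPLE = '''
def load(path):
    return open(path).read()

def parse(text, strict):
    return text.split()

def main():
    text = load("in.txt")
    words = parse(text, True)
    print(len(words))
'''

for name, params, calls in entry_point_candidates(ENTRY_SAMPLE):
    print(f"{name}: {params} parameters, {calls} calls")
```

The top-ranked function is a sensible place to set the first breakpoint.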
After the Read
- Expand documentation
- Improve comments
- Discuss code quality
- Structural code smells
- Linguistic code smells
- Is there a historical reason or a good justification?
- Inappropriate intimacy or Feature envy?
- Message Chains
- Merge small methods?
- Eliminate long chains?
- In a multi-language code base, static source code analysis might not always find dependencies.
- Refactor
- Large class → divide into subgroups
- Small class → merge into other classes
- Data class → merge into other classes that frequently access the data
- Long parameter list → move some parameters into class fields? Group parameters into larger objects?
- One-call functions → a sign of premature generalization, which adds cognitive load. Maybe inline these methods?
- Dead code
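As one illustration of the refactorings listed above, a long parameter list can be shortened by grouping related parameters into a larger object. The sketch below is hypothetical: the `Address` class and both `format_label` functions are invented for this example.

```python
from dataclasses import dataclass

# Before: street, city, and postal_code always travel together.
def format_label_before(name: str, street: str, city: str, postal_code: str) -> str:
    return f"{name}\n{street}\n{postal_code} {city}"

# After: the related parameters are grouped into one object.
@dataclass(frozen=True)
class Address:
    street: str
    city: str
    postal_code: str

def format_label(name: str, address: Address) -> str:
    return f"{name}\n{address.street}\n{address.postal_code} {address.city}"

home = Address(street="1 Main St", city="Springfield", postal_code="12345")
print(format_label("Ada", home))
```

The behavior is unchanged; the signature is shorter, and the grouped data now has a name that can be discussed and documented.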
Understanding the Business Domain of Your Code in More Depth
- Domain model = behavior + data
Learning from Variable Names
- List all variable names in use
- Quantifiers = refer to quantities
- Programming concepts = refer to a specific concept in a language
- Domain concepts = refer to the domain
- Text analysis tools: Word Cloud
- Split the words and examine the domain concepts
- Determine the concept
- Objects or object attributes?
- Type of the component?
- Analyze in network
- Define the concepts and behaviors as precisely as possible
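The steps of listing names, splitting them into words, and examining the domain concepts can be sketched as follows. This assumes identifiers in snake_case or camelCase; `split_identifier`, `word_frequencies`, and the sample names are hypothetical. The resulting counts could feed a word cloud tool.

```python
import re
from collections import Counter

def split_identifier(name: str) -> list[str]:
    """Split snake_case and camelCase identifiers into lowercase words."""
    words: list[str] = []
    for part in re.split(r"[_\W]+", name):
        # Acronym runs (HTTP), capitalized or lowercase words, then digit runs.
        words += re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part)
    return [word.lower() for word in words]

def word_frequencies(identifiers: list[str]) -> Counter:
    """Count how often each word occurs across all identifiers."""
    return Counter(w for name in identifiers for w in split_identifier(name))

# Hypothetical variable names collected from a code base.
names = ["customerOrder", "order_total", "orderCount", "i", "maxOrders"]
for word, count in word_frequencies(names).most_common(3):
    print(word, count)
```

Words that occur often and are neither quantifiers nor programming terms are good candidates for domain concepts.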
After Reading
- Documenting the Ubiquitous Language
- Code quality
- Similar objects
- Linguistic code smells
- Name molds = patterns in which elements in a variable name are typically combined
Linguistic Code Smells defined by Arnaoudova
- Methods that do more than they say
- Methods that say more than they do
- Methods that do the opposite of what they say
- Identifiers that contain more than their name says
- Identifiers that contain less than their name says
- Identifiers that contain the opposite of what their name says
Linguistic Anti-Pattern Detector (LAPD) for Java developed by Arnaoudova
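To make the three method-level smells concrete, here is a small hypothetical example. The `Account` class is invented for illustration; it is not taken from Arnaoudova's catalog or from LAPD.

```python
class Account:
    def __init__(self, balance: float) -> None:
        self.balance = balance
        self.audit_log: list[str] = []

    # Does more than it says: a "get" that also mutates state.
    def get_balance(self) -> float:
        self.audit_log.append("balance read")
        return self.balance

    # Says more than it does: the name promises validation, but nothing is checked.
    def validate_and_withdraw(self, amount: float) -> None:
        self.balance -= amount

    # Does the opposite of what it says: True when money is left.
    def is_empty(self) -> bool:
        return self.balance > 0

account = Account(10.0)
print(account.is_empty())        # surprising result for a non-empty account
account.validate_and_withdraw(50.0)
print(account.balance)           # overdrawn, despite the "validate" in the name
```

In each case a reader who trusts the name builds the wrong mental model, which is why such names are worth renaming or the methods worth reimplementing.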
Refactor
- Similar sounding objects
- Genuinely different → rename to minimize overlap
- The same → merge into one object
- Vague-sounding names
- Smelly names → rename, or reimplement
Conceptual Examination
What are the important programming concepts in this code base? How do they relate to the domain and to the structure of the code?
- Names and concepts
- Behavior
- Syntactics
- Roles of Variables Framework by Jorma Sajaniemi at the University of Eastern Finland
- Dependencies
- What are imported
- Where are they used
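The two dependency questions above, what is imported and where it is used, can be answered roughly for a single Python file. The sketch below is an assumption-laden illustration: `import_usage` and the sample source are hypothetical, and only plain name references are tracked.

```python
import ast

def import_usage(source: str) -> dict[str, list[int]]:
    """Map each imported name to the line numbers where it is referenced."""
    tree = ast.parse(source)
    imported: set[str] = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # "import a.b" binds the top-level name "a" unless an alias is given.
            imported.update((a.asname or a.name).split(".")[0] for a in node.names)
        elif isinstance(node, ast.ImportFrom):
            imported.update(a.asname or a.name for a in node.names)
    usage: dict[str, list[int]] = {name: [] for name in sorted(imported)}
    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and node.id in usage:
            usage[node.id].append(node.lineno)
    for lines in usage.values():
        lines.sort()
    return usage

# Hypothetical module; it is parsed, never executed.
IMPORT_SAMPLE = '''
import json
import os
from collections import Counter

def load(path):
    with open(path) as fh:
        return json.load(fh)

def count(words):
    return Counter(words)
'''

for name, lines in import_usage(IMPORT_SAMPLE).items():
    print(name, lines or "unused")
```

An import with no recorded uses, such as `os` here, may be dead code or may be used in a way this simple sketch does not detect.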
Roles of Variables Framework
These 11 roles cover 99% of the variables in programs that novices write:
- Fixed value
- Stepper
- Flag
- Walker
- Most recent holder
- Most wanted holder
- Gatherer
- Container
- Follower
- Organizer
- Temporary
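Several of these roles can be seen side by side in a short function. The example below is a hypothetical sketch: `summarize` and its variable names are invented, the role labels in the comments are one reasonable reading of the framework, and the function assumes a non-empty list.

```python
def summarize(temperatures: list[float]) -> dict[str, object]:
    # temperatures is a container: it holds the elements being processed.
    FREEZING = 0.0               # fixed value: never changes after initialization
    total = 0.0                  # gatherer: accumulates a result over the loop
    highest = temperatures[0]    # most wanted holder: best value seen so far
    below_freezing = False       # flag: records whether something has happened
    previous = temperatures[0]   # follower: trails one step behind current
    biggest_jump = 0.0           # most wanted holder: largest change seen so far

    for index in range(len(temperatures)):   # stepper: walks a fixed sequence
        current = temperatures[index]        # most recent holder: latest value read
        total += current
        if current > highest:
            highest = current
        if current < FREEZING:
            below_freezing = True
        jump = abs(current - previous)       # temporary: needed only briefly
        if jump > biggest_jump:
            biggest_jump = jump
        previous = current

    return {
        "average": total / len(temperatures),
        "highest": highest,
        "below_freezing": below_freezing,
        "biggest_jump": biggest_jump,
    }

print(summarize([1.0, -2.0, 5.0]))
```

Writing the role next to each variable, as in the comments here, is one lightweight way to document roles after reading.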
After Reading
- Explain programming concepts
- Document roles of variables
Context Examination
- Read plain text strings
- Read code comments
- Search outside the code
- Create a search plan; this is better than randomly browsing pages
- Try a generic search engine
- Try a code search engine as well, for example by searching on GitHub
- Create a concept map as you go