Compilers Overview

Tool Chain

Preprocessor

  • Takes source code and header files.
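  • Directives are handled before compilation: #include inserts the contents of header files into the source, #define macros are expanded, and #pragma is mostly passed through to the compiler.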
  • cpp is the default preprocessor.
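
To make the text substitution concrete, here is a minimal Python sketch of a toy preprocessor. It is not how cpp is implemented. It assumes headers live in an in-memory dict (the names limits.h, MAX and clamp are made up for illustration), handles only object-like #define macros, does not rescan expanded text, and does not protect string literals.

```python
import re

def preprocess(source, headers, macros=None):
    """Toy preprocessor: #include from a dict, object-like #define, #pragma kept."""
    macros = {} if macros is None else macros
    out = []
    for line in source.splitlines():
        words = line.split()
        if words[:1] == ["#include"]:
            # Splice the header text in place of the directive.
            name = words[1].strip('"<>')
            out.append(preprocess(headers[name], headers, macros))
        elif words[:1] == ["#define"]:
            # Record the macro; the directive itself produces no output.
            macros[words[1]] = " ".join(words[2:])
        elif words[:1] == ["#pragma"]:
            # Left untouched for the compiler to interpret.
            out.append(line)
        else:
            # Replace whole identifiers that name a macro.
            out.append(re.sub(r"[A-Za-z_]\w*",
                              lambda m: macros.get(m.group(), m.group()),
                              line))
    return "\n".join(out)

# Hypothetical header and source file, for illustration only.
headers = {"limits.h": "#define MAX 10\nint clamp(int x);"}
source = '#include "limits.h"\n#pragma pack(1)\nint a[MAX];'
print(preprocess(source, headers))
```

The printed result contains the header's declaration, the untouched #pragma line, and the array declaration with MAX replaced by 10.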

Compiler

  • cc1 is the default compiler.
  • Outputs assembly source code in a file ending with .s (the suffix stands for “source”, since assembly was the only kind of source code back then).

Assembler

See assembly for assembly language.

  • as is the default assembler.
  • Generates Object Code (.o files)
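
The three stages so far can be run one at a time through the gcc driver, which stops after preprocessing with -E, after compilation with -S, and after assembly with -c. The Python sketch below only builds the command lines and does not run them. The file stem hello is a placeholder, and the helper name stage_commands is made up for this example.

```python
def stage_commands(stem):
    """Return gcc command lines that stop after each of the first three stages."""
    return [
        ["gcc", "-E", f"{stem}.c", "-o", f"{stem}.i"],  # preprocess only
        ["gcc", "-S", f"{stem}.i", "-o", f"{stem}.s"],  # compile to assembly
        ["gcc", "-c", f"{stem}.s", "-o", f"{stem}.o"],  # assemble to object code
    ]

for command in stage_commands("hello"):
    print(" ".join(command))
```

If gcc is installed, each list can be passed to subprocess.run to produce the intermediate .i, .s and .o files for inspection.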

Linker

  • Takes object code and libraries such as libc.a and libm.a.
  • A .a file is an archive of .o files. In static linking, only the archive members that define needed symbols are copied into the executable.
  • Dynamic libraries (such as libc.so) are not copied in; they are located and loaded when the program starts.
  • Dynamically linked programs therefore pay a small startup overhead, since the dynamic loader must find the shared libraries and resolve their symbols.
  • Generates executables.
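
The way static linking pulls in only what is needed can be sketched as a symbol-resolution loop. This is a simplified model in Python, not the algorithm of any particular linker. Each object file is represented as a pair (defined symbols, undefined symbols), and the member names printf.o, write.o and qsort.o are hypothetical, not the real contents of libc.a.

```python
def link_static(objects, archive):
    """Pick the archive members needed to resolve the objects' undefined symbols.

    objects and archive map a file name to (defined symbols, undefined symbols).
    Returns (sorted names of linked files, sorted unresolved symbols).
    """
    linked = dict(objects)
    defined = set().union(*(d for d, _ in linked.values()))
    undefined = set().union(*(u for _, u in linked.values())) - defined
    progress = True
    while undefined and progress:
        progress = False
        for name, (defs, refs) in archive.items():
            if name not in linked and defs & undefined:
                # This member defines something we still need: pull it in,
                # along with any new undefined references it brings.
                linked[name] = (defs, refs)
                defined |= defs
                undefined = (undefined | refs) - defined
                progress = True
    return sorted(linked), sorted(undefined)

# Hypothetical program and archive contents.
objects = {"main.o": ({"main"}, {"printf"})}
archive = {
    "printf.o": ({"printf"}, {"write"}),
    "write.o": ({"write"}, set()),
    "qsort.o": ({"qsort"}, set()),
}
print(link_static(objects, archive))
```

Here qsort.o stays out of the result because nothing references qsort, while write.o is pulled in indirectly through printf.o.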

Within Compiler

  • Input: the source code as raw bytes (a stream of individual characters).
  • Scanner
    • Groups the raw bytes into tokens.
    • Token patterns are written as regular expressions, which can be converted to an NFA and then to a DFA.
    • The tool used is Flex.
  • Parser
    • Produces an AST (= abstract syntax tree)
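    • The syntax is described by a CFG (context-free grammar).
      • General CFGs are too broad to parse efficiently, so we narrow down to restricted classes.
      • LL and LR grammars can be parsed by a finite automaton plus a stack (a pushdown automaton). Bison generates such parsers for LR-family grammars (LALR(1) by default).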
  • Semantic Routines
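    • Make sure that the “sentence” is valid, i.e. that a syntactically correct program also makes sense (for example, types match and names are declared).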
  • Optimizer
  • Code Generator
    • Emits target assembly language.
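
The phases above can be sketched end to end for a tiny arithmetic-expression language. This is a hand-written illustration in Python, not Flex or Bison output: the scanner uses the re module in place of a generated DFA, the parser is recursive descent (an LL(1) technique) instead of a Bison-generated LR parser, and the "assembly" is for a made-up stack machine. All function names and the grammar are assumptions chosen for the example.

```python
import re

TOKEN_RE = re.compile(r"\s*(?:(?P<NUM>\d+)|(?P<ID>[A-Za-z_]\w*)|(?P<OP>[-+*/()]))")

def scan(text):
    """Scanner: character stream -> list of (kind, lexeme) tokens."""
    text, tokens, pos = text.rstrip(), [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"bad character at offset {pos}")
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens + [("EOF", "")]

def parse(tokens):
    """Parser: expr -> term (('+'|'-') term)*, term -> factor (('*'|'/') factor)*,
    factor -> NUM | ID | '(' expr ')'. Returns a nested-tuple AST."""
    pos = 0
    def advance():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]
    def factor():
        kind, lexeme = advance()
        if kind == "NUM":
            return ("num", int(lexeme))
        if kind == "ID":
            return ("var", lexeme)
        if lexeme == "(":
            node = expr()
            if advance()[1] != ")":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {lexeme!r}")
    def binary(operand, ops):
        node = operand()
        while tokens[pos][1] in ops:      # ops is a set, so "" (EOF) never matches
            op = advance()[1]
            node = (op, node, operand())  # left-associative
        return node
    def term():
        return binary(factor, {"*", "/"})
    def expr():
        return binary(term, {"+", "-"})
    tree = expr()
    if tokens[pos][0] != "EOF":
        raise SyntaxError("trailing input")
    return tree

def check(tree, declared):
    """Semantic routine: every variable used must have been declared."""
    if tree[0] == "var" and tree[1] not in declared:
        raise NameError(f"undeclared variable {tree[1]!r}")
    if tree[0] not in ("num", "var"):
        check(tree[1], declared)
        check(tree[2], declared)

def fold(tree):
    """Optimizer: constant folding for +, - and * on two literal operands."""
    if tree[0] in ("num", "var"):
        return tree
    op, left, right = tree[0], fold(tree[1]), fold(tree[2])
    if left[0] == right[0] == "num" and op != "/":
        a, b = left[1], right[1]
        return ("num", {"+": a + b, "-": a - b, "*": a * b}[op])
    return (op, left, right)

def generate(tree):
    """Code generator: post-order walk emitting hypothetical stack-machine code."""
    if tree[0] in ("num", "var"):
        return [f"push {tree[1]}"]
    mnemonic = {"+": "add", "-": "sub", "*": "mul", "/": "div"}[tree[0]]
    return generate(tree[1]) + generate(tree[2]) + [mnemonic]

ast = parse(scan("a + 2 * (5 - 1)"))
check(ast, declared={"a"})
print(generate(fold(ast)))  # → ['push a', 'push 8', 'add']
```

Each function corresponds to one bullet above, and the nested tuples play the role of the AST. A real compiler would use richer node types and carry source positions for error messages.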