COCOMO cost estimates

scc includes a Basic COCOMO 81 model that estimates project effort, schedule, and cost from source lines of code (SLOC). scc emits COCOMO output only through its tabular formatter, never in its JSON output, so glockr opts in by running scc a second time in tabular mode, scraping the COCOMO block, and parsing it into a tibble.
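
The scrape-and-parse step might look roughly like the sketch below. It is not glockr's actual implementation: parse_cocomo_lines() is a hypothetical helper, and the sample lines are an assumed rendering of the COCOMO block in scc's tabular output, filled in with the values shown later on this page.

```r
# Assumed shape of the COCOMO lines in scc's tabular output (illustrative only)
cocomo_lines <- c(
  "Estimated Cost to Develop (organic) $1,790,851",
  "Estimated Schedule Effort (organic) 17.16 months",
  "Estimated People Required (organic) 9.27"
)

# Hypothetical helper: split each line into metric / project_type / value
parse_cocomo_lines <- function(lines) {
  pattern <- "^(Estimated [^(]+) \\(([^)]+)\\) (.+)$"
  hits <- regmatches(lines, regexec(pattern, lines))
  hits <- hits[lengths(hits) == 4L]  # keep only the lines that matched
  data.frame(
    metric       = vapply(hits, `[`, character(1), 2L),
    project_type = vapply(hits, `[`, character(1), 3L),
    value        = vapply(hits, `[`, character(1), 4L),
    stringsAsFactors = FALSE
  )
}

parsed <- parse_cocomo_lines(cocomo_lines)
parsed
```

The sketch uses base R only; tibble::as_tibble(parsed) would convert the result to a tibble.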

Opting in with cocomo = TRUE

By default scc() returns the per-language tibble and ignores all COCOMO-related arguments. Pass cocomo = TRUE to also compute and return the COCOMO summary:

rlang_cocomo <- scc(rlang_path, cocomo = TRUE)
#  # A tibble: 11 × 10
#     language   files lines  code comments blanks complexity weighted_complexity
#     <chr>      <int> <int> <int>    <int>  <int>      <int>               <dbl>
#   1 R            156 43171 27043    10926   5202       2239                8.28
#   2 Markdown      52 13437 11344        0   2093          0                0   
#   3 C             69 13173 10016      864   2293       1827               18.2 
#   4 C Header      70  8272  5322     1750   1200        677               12.7 
#   5 YAML           8   635   505       26    104          0                0   
#   6 C++            2    25    21        0      4          0                0   
#   7 C++ Header     1    26    21        0      5          0                0   
#   8 SVG           10    10    10        0      0          0                0   
#   9 Makefile       1    11     7        0      4          2               28.6 
#  10 License        1     2     2        0      0          0                0   
#  11 TOML           1     0     0        0      0          0                0   
#  # ℹ 2 more variables: bytes <int>, uloc <int>
#  # A tibble: 3 × 3
#    metric                    project_type value       
#    <chr>                     <chr>        <chr>       
#  1 Estimated Cost to Develop organic      $1,790,851  
#  2 Estimated Schedule Effort organic      17.16 months
#  3 Estimated People Required organic      9.27

With cocomo = TRUE, scc() invisibly returns a list of two tibbles, scc and cocomo:

names(rlang_cocomo)
#  [1] "scc"    "cocomo"
rlang_cocomo$cocomo |> 
  gt::gt() |> 
  gt::tab_header(
    title = "COCOMO Estimates", 
    subtitle = "rlang package", 
    preheader = "rlang package")
COCOMO Estimates
rlang package

metric                     project_type  value
Estimated Cost to Develop  organic       $1,790,851
Estimated Schedule Effort  organic       17.16 months
Estimated People Required  organic       9.27

Customize the COCOMO model

The customization arguments only take effect when cocomo = TRUE:

  • avg_wage: annual salary in local currency
  • cocomo_project_type: “organic”, “semi-detached”, or “embedded”
  • eaf: effort adjustment factor
  • overhead: corporate overhead multiplier
  • currency_symbol: symbol shown in the value column
  • auto_print_scc: set to FALSE to skip auto-printing the language tibble
embedded <- scc(rlang_path,
    cocomo              = TRUE,          # required to compute COCOMO
    avg_wage            = 120000L,        
    cocomo_project_type = "embedded",     
    eaf                 = 1.1,            
    overhead            = 2.0,            
    currency_symbol     = "$",            
    auto_print_scc      = FALSE)         
#  # A tibble: 3 × 3
#    metric                    project_type value       
#    <chr>                     <chr>        <chr>       
#  1 Estimated Cost to Develop embedded     $9,558,694  
#  2 Estimated Schedule Effort embedded     18.00 months
#  3 Estimated People Required embedded     26.55
embedded$cocomo |> 
  gt::gt() |> 
  gt::tab_header(
    title = "COCOMO", 
    subtitle = "embedded project", 
    preheader = "rlang package")
COCOMO
embedded project

metric                     project_type  value
Estimated Cost to Develop  embedded      $9,558,694
Estimated Schedule Effort  embedded      18.00 months
Estimated People Required  embedded      26.55
organic <- scc(rlang_path,
    cocomo              = TRUE,
    avg_wage            = 140000L,
    cocomo_project_type = "organic",
    eaf                 = 1.1,
    overhead            = 2.0,
    currency_symbol     = "$",
    auto_print_scc      = FALSE)
#  # A tibble: 3 × 3
#    metric                    project_type value       
#    <chr>                     <chr>        <chr>       
#  1 Estimated Cost to Develop organic      $4,083,383  
#  2 Estimated Schedule Effort organic      17.80 months
#  3 Estimated People Required organic      9.83
organic$cocomo |> 
  gt::gt() |> 
  gt::tab_header(
    title = "COCOMO", 
    subtitle = "organic project", 
    preheader = "rlang package")
COCOMO
organic project

metric                     project_type  value
Estimated Cost to Develop  organic       $4,083,383
Estimated Schedule Effort  organic       17.80 months
Estimated People Required  organic       9.83
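
The value column is character (it carries the currency symbol and units), so comparing two runs numerically needs a conversion step. The sketch below is one way to do that, assuming a "." decimal mark and "," thousands separators. value_to_number() is a hypothetical helper, not part of glockr, and the two vectors are stand-ins copied from the printed output above.

```r
# Stand-ins for embedded$cocomo$value and organic$cocomo$value, copied from above
embedded_value <- c("$9,558,694", "18.00 months", "26.55")
organic_value  <- c("$4,083,383", "17.80 months", "9.83")

# Hypothetical helper: keep digits and the decimal point, drop symbols and units
value_to_number <- function(x) {
  as.numeric(gsub("[^0-9.]", "", x))
}

comparison <- data.frame(
  metric   = c("cost", "schedule_months", "people"),
  embedded = value_to_number(embedded_value),
  organic  = value_to_number(organic_value)
)
comparison$ratio <- round(comparison$embedded / comparison$organic, 2)
comparison
```

The two runs above also differ in avg_wage, so the cost ratio reflects both the project type and the wage.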

Quieting the auto-print

When cocomo = TRUE, scc() prints both tibbles to the console as a side effect. Set auto_print_scc = FALSE, auto_print_cocomo = FALSE, or both to suppress the corresponding printout; the returned list is unchanged:

silent <- scc(rlang_path,
              cocomo            = TRUE,
              auto_print_scc    = FALSE,
              auto_print_cocomo = FALSE)

SLOCCount-compatible format

sloccount_format = TRUE reformats the upstream COCOMO block in SLOCCount-compatible form (useful for tooling that parses that output).

sloccount <- scc(rlang_path,
                 cocomo           = TRUE,
                 sloccount_format = TRUE,
                 auto_print_scc   = FALSE)
Total Physical Source Lines of Code (SLOC)                     = 54,291
Development Effort Estimate, Person-Years (Person-Months)      = 13.26 (159.10)
 (Basic COCOMO model, Person-Months = 2.40*(KSLOC**1.05)*1.00)
Schedule Estimate, Years (Months)                              = 1.43 (17.16)
 (Basic COCOMO model, Months = 2.50*(person-months**0.38))
Estimated Average Number of Developers (Effort/Schedule)       = 9.27
Total Estimated Cost to Develop                                = $1,790,851
 (average salary = $56,286/year, overhead = 2.40)

The block doesn’t fit the standard metric / project_type / value shape, so it’s printed verbatim and $cocomo is NULL:

is.null(sloccount$cocomo)
#  [1] TRUE
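
The formulas quoted in the SLOCCount-style block can be reproduced by hand, which also shows where eaf, avg_wage, and overhead enter the estimate. The sketch below is an illustration, not scc's or glockr's code: basic_cocomo() is a hypothetical function, and the coefficient table holds the standard Basic COCOMO 81 values (the organic row matches the formulas printed above). Truncating the monthly wage to a whole number is an inference from the printed costs, not something stated on this page.

```r
# Standard Basic COCOMO 81 coefficients: effort = a * KSLOC^b, schedule = c * effort^d
cocomo81_coef <- list(
  "organic"       = c(a = 2.4, b = 1.05, c = 2.5, d = 0.38),
  "semi-detached" = c(a = 3.0, b = 1.12, c = 2.5, d = 0.35),
  "embedded"      = c(a = 3.6, b = 1.20, c = 2.5, d = 0.32)
)

# Hypothetical helper; defaults mirror the salary and overhead printed above
basic_cocomo <- function(sloc, project_type = "organic", eaf = 1,
                         avg_wage = 56286, overhead = 2.4) {
  k <- cocomo81_coef[[project_type]]
  effort   <- k[["a"]] * (sloc / 1000)^k[["b"]] * eaf  # person-months
  schedule <- k[["c"]] * effort^k[["d"]]               # months
  list(
    effort_pm       = effort,
    schedule_months = schedule,
    people          = effort / schedule,
    # Assumption: monthly wage truncated to a whole number (inferred from the output)
    cost            = effort * (avg_wage %/% 12) * overhead
  )
}

default_est  <- basic_cocomo(54291)
embedded_est <- basic_cocomo(54291, project_type = "embedded", eaf = 1.1,
                             avg_wage = 120000, overhead = 2.0)
```

Under these assumptions, default_est should come out close to the figures in the block above (about 159.10 person-months, 17.16 months, and 9.27 developers), and embedded_est close to the embedded run shown earlier.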

Why not use COCOMO II?

scc() only gives us raw structural numbers (lines, code, complexity, bytes, etc.), but COCOMO II needs 22 judgement inputs that can't be derived from source code:

  • The COCOMO 81 project types ("organic" / "semi-detached" / "embedded") used today have no direct equivalent in COCOMO II
  • COCOMO II uses 5 scale factors (PREC, FLEX, RESL, TEAM, PMAT) for organizational/process attributes and 17 effort multipliers (RELY, DATA, CPLX, RUSE, DOCU, TIME, STOR, PVOL, ACAP, PCAP, PCON, APEX, PLEX, LTEX, TOOL, SITE, SCED) on product, platform, personnel, and project context
  • Every one of these is a human rating (VL / L / N / H / VH / XH) about the team, requirements stability, reuse strategy, tool maturity, schedule pressure, etc.

The only input that might map onto scc's output is CPLX (product complexity), but COCOMO II's CPLX rates architectural and algorithmic complexity, not McCabe's cyclomatic complexity.
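
To make the gap concrete, here is a rough sketch of the COCOMO II post-architecture effort equation. cocomo2_effort() is a hypothetical function written for this illustration; A = 2.94 and B = 0.91 are the published COCOMO II.2000 calibration constants, and the ratings passed in the example call are placeholders, not a real assessment.

```r
# Hypothetical sketch of the COCOMO II effort equation:
#   PM = A * KSLOC^E * prod(effort multipliers),  E = B + 0.01 * sum(scale factors)
cocomo2_effort <- function(ksloc, scale_factors, effort_multipliers,
                           A = 2.94, B = 0.91) {
  stopifnot(length(scale_factors) == 5L,        # PREC, FLEX, RESL, TEAM, PMAT
            length(effort_multipliers) == 17L)  # RELY, DATA, CPLX, ..., SCED
  E <- B + 0.01 * sum(scale_factors)
  A * ksloc^E * prod(effort_multipliers)
}

# Only ksloc can come from scc; the other 22 numbers are placeholder ratings
placeholder_pm <- cocomo2_effort(
  ksloc              = 54.291,
  scale_factors      = rep(3, 5),
  effort_multipliers = rep(1, 17)
)
```

Of the three arguments, only ksloc is something scc can supply; the other two vectors would each have to be filled in by a person rating the project.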