How to identify a God Class using NDepend

Most static analysis tools in the .NET world report isolated code metrics. While this is useful, I would like to be able to detect coarse-grained code smells. Correlating several metrics to identify design disharmonies lets you treat a problem holistically. In this blog post we’ll see how we can use NDepend to detect potential God Classes.

God Class Detection Strategy

A God Class is a class that centralizes the intelligence in the system. It is large, complex, has low cohesion and uses data from other classes. Object-Oriented Metrics in Practice, by Michele Lanza and Radu Marinescu, proposes the following detection strategy for a God Class:

(ATFD > Few) AND (WMC >= Very High) AND (TCC < One Third)

This detection strategy uses three metrics:

  • ATFD – Access To Foreign Data – to measure how many foreign attributes the class is using
  • TCC – Tight Class Cohesion – to measure class cohesion
  • WMC – Weighted Method Count – to measure class complexity

This detection strategy uses three types of thresholds:

  • ATFD uses a Generally-Accepted Meaning Threshold. FEW is defined between 2 and 5.
  • WMC uses a Statistics-Based Threshold. For these types of thresholds, a large number of projects needs to be analyzed. The authors of Object-Oriented Metrics in Practice analyzed 45 Java projects and extracted Low, Average, High and Very High thresholds for some basic metrics. The Very High threshold for WMC is 47.
  • TCC uses a Common Fraction Threshold. One Third is 0.33.
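
The strategy and its thresholds read naturally as a boolean predicate. Here is a minimal sketch in plain C#, outside NDepend. The function and constant names are illustrative, and it assumes Few is taken as 5, the upper end of the range quoted above.

```csharp
using System;

// Thresholds quoted in the text; the constant names are illustrative.
const int Few = 5;              // upper end of the "few" range (2 to 5), an assumption
const int WmcVeryHigh = 47;     // statistics-based Very High threshold for WMC
const double OneThird = 0.33;   // common-fraction threshold for TCC

// Hypothetical helper: true only when all three conditions of the strategy hold.
bool IsGodClassCandidate(int atfd, int wmc, double tcc) =>
    atfd > Few && wmc >= WmcVeryHigh && tcc < OneThird;

Console.WriteLine(IsGodClassCandidate(atfd: 8, wmc: 60, tcc: 0.10)); // True
Console.WriteLine(IsGodClassCandidate(atfd: 8, wmc: 60, tcc: 0.50)); // False: cohesion is not low
```

Because the three conditions are joined with AND, a class that is merely large or merely complex is not flagged on its own.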

Metrics Definitions

Let’s go over the definitions of the metrics used and how to implement them with NDepend. For more detailed definitions, be sure to check Appendix A.2 of Object-Oriented Metrics in Practice. If you’re not familiar with CQLinq, check out the NDepend documentation or my blog post on how to query your code base.

ATFD – Access To Foreign Data

This metric measures the number of attributes from unrelated classes that are accessed directly or through accessor methods.
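
To make the definition concrete, here is a toy model in plain C#, outside NDepend. All type and member names are hypothetical. It lists the distinct members used by one method of an InvoicePrinter class and counts only the fields and properties declared outside the class's own hierarchy.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy model (hypothetical names): the distinct members one method uses,
// each tagged with its declaring type and kind.
var membersUsed = new List<(string DeclaringType, string Name, string Kind)>
{
    ("Customer",       "Name",     "property"), // foreign data
    ("Customer",       "City",     "property"), // foreign data
    ("Order",          "total",    "field"),    // foreign data
    ("Order",          "Validate", "method"),   // behaviour, not data
    ("InvoicePrinter", "title",    "field"),    // the class's own field
    ("Printer",        "Margin",   "property"), // declared in a base class
};

// InvoicePrinter plus its base classes; members declared here are not foreign.
var ownHierarchy = new HashSet<string> { "InvoicePrinter", "Printer" };

int atfd = membersUsed.Count(m =>
    !ownHierarchy.Contains(m.DeclaringType) &&
    (m.Kind == "field" || m.Kind == "property"));

Console.WriteLine(atfd); // 3
```

The class-level value is the sum of this count over the class's methods.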

// <Name>ATFD</Name>
// ** Helper Functions **
let isProperty = new Func<ICodeElement, bool>(member => 
 member.IsMethod && 
 (member.AsMethod.IsPropertyGetter || member.AsMethod.IsPropertySetter)) 

let classHierarchyFor = new Func<IType, HashSet<IType>>(t => 
 t.BaseClasses.Append(t).ToHashSet())

// ** Metric Functions **
let atfdForMethod = new Func<IMethod, int>(m => 
 m.MembersUsed.Where(member=> 
   !classHierarchyFor(m.ParentType).Contains(member.ParentType) &&
   (isProperty(member) || member.IsField))
 .Count())

let atfdForClass = new Func<IType, int>(t => 
 t.Methods.Where(m => !m.IsAbstract)
  .Select(m => atfdForMethod(m))
  .Sum())

// ** Sample Usage **
from t in JustMyCode.Types
let atfd = atfdForClass(t)
orderby atfd descending 
select new { t, atfd }

WMC – Weighted Method Count

This metric measures the complexity of a class. This is done by summing the complexity of all methods of a class. McCabe’s Cyclomatic Complexity is used to measure the complexity of a method.
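
As a small worked example in plain C#, outside NDepend: the method below has one default path plus two decision points, so its cyclomatic complexity is 3. The other complexity values in the array are made up for illustration.

```csharp
using System;
using System.Linq;

// Hypothetical method: one path by default, plus one per decision point.
int CountPositives(int[] values)
{
    int count = 0;
    foreach (var v in values)   // +1
        if (v > 0)              // +1
            count++;
    return count;               // cyclomatic complexity = 1 + 2 = 3
}

Console.WriteLine(CountPositives(new[] { 4, -1, 7 })); // 2

// Cyclomatic complexities of a toy class's methods; 3 is CountPositives,
// the other two values are invented for the example.
int[] complexities = { 1, 3, 12 };
int wmc = complexities.Sum();

Console.WriteLine(wmc); // 16
```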

// <Name>WMC</Name>
let wmcFor = new Func<IType, int>(t => 
 t.MethodsAndContructors
  .Select(m => (int) m.CyclomaticComplexity.GetValueOrDefault())
  .Sum())

// ** Sample Usage **
from t in JustMyCode.Types
let wmc = wmcFor(t)
orderby wmc descending 
select new { t, wmc }

TCC – Tight Class Cohesion

This metric measures the cohesion of a class. It’s computed as the relative number of method pairs of a class that access in common at least one attribute of the measured class.
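
A worked example helps here. The sketch below is plain C#, outside NDepend, with hypothetical method and field names. It models a class as a map from each method to the fields of that class it touches. Treating a class with fewer than two methods as fully cohesive is an assumption of this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy model (hypothetical): each method maps to the fields of its own class it uses.
var fieldsUsedBy = new Dictionary<string, HashSet<string>>
{
    ["Deposit"]  = new HashSet<string> { "balance" },
    ["Withdraw"] = new HashSet<string> { "balance", "limit" },
    ["Rename"]   = new HashSet<string> { "owner" },
    ["Describe"] = new HashSet<string> { "owner", "limit" },
};

double Tcc(Dictionary<string, HashSet<string>> usage)
{
    var methods = usage.Keys.ToList();
    int totalPairs = methods.Count * (methods.Count - 1) / 2;
    if (totalPairs == 0) return 1.0; // assumption: no pairs means fully cohesive

    // Count the unordered pairs of methods that share at least one field.
    int cohesivePairs = 0;
    for (int i = 0; i < methods.Count; i++)
        for (int j = i + 1; j < methods.Count; j++)
            if (usage[methods[i]].Overlaps(usage[methods[j]]))
                cohesivePairs++;

    return (double)cohesivePairs / totalPairs;
}

Console.WriteLine(Tcc(fieldsUsedBy));
```

Four methods give six pairs. Three of them share a field (Deposit/Withdraw via balance, Withdraw/Describe via limit, Rename/Describe via owner), so TCC is 3 / 6 = 0.5.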

Writing an NDepend CQLinq query for TCC proved to be a little trickier. I wanted to extract the metric definition as a function (like I did for ATFD). Here is the query:

// <Name>TCC</Name>
// ** Helper Functions **
let fieldsUsedFromParentClass = new Func<IMethod, HashSet<IField>>(m => 
 m.FieldsUsed.Where(f => f.ParentType == m.ParentType)
 .ToHashSet())

let methodsToConsiderFor = new Func<IType, IEnumerable<IMethod>>(t =>
 t.Methods.Where(m => !m.IsAbstract))

let pairsFrom = new Func<IEnumerable<IMethod>, IEnumerable<HashSet<IMethod>>>(methods => 
 methods.Select((m1, i) => 
   methods.Where((m2, j) => j > i).Select(m2 => new HashSet<IMethod> {m1, m2}))
 .SelectMany(_ => _))

let cohesivePairsFrom = new Func<IEnumerable<HashSet<IMethod>>, IEnumerable<HashSet<IMethod>>>(pairs => 
 pairs.Where(p => fieldsUsedFromParentClass(p.First()).Overlaps(fieldsUsedFromParentClass(p.Last()))))

//number of combinations of all methods, taken 2 at a time, without repetition
let numberOfPairsFor = new Func<int, int>(n =>
 (n * (n-1)) / 2)

// ** Metric Functions **
let tccFor = new Func<IType, double>(t => 
 (double) cohesivePairsFrom(pairsFrom(methodsToConsiderFor(t))).Count()
 / numberOfPairsFor(methodsToConsiderFor(t).Count()))

// ** Sample Usage **
from t in JustMyCode.Types
let tcc = tccFor(t)
orderby tcc
select new { t, tcc }

This query has the following issues:

  • pairsFrom returns each method pair as a HashSet. I wanted to use an anonymous object (e.g. new {m1, m2} in the inner Select of pairsFrom), but couldn’t figure it out.
  • The methodsToConsiderFor(t) filter is evaluated twice (both times inside tccFor). This is because CQLinq supports only single-statement queries.

Another option would be to compute the pairs in the body of the main query, as follows:

// <Name>TCC</Name>
// ** Helper Functions **
let fieldsUsedFromParentClass = new Func<IMethod, HashSet<IField>>(m => 
 m.FieldsUsed.Where(f => f.ParentType == m.ParentType)
 .ToHashSet())

// ** Sample Usage **
from t in JustMyCode.Types
let methodsToConsider = t.Methods.Where(m => !m.IsAbstract)
let pairs = methodsToConsider.Select((m1, i) => 
 methodsToConsider.Where((m2, j) => j > i).Select(m2 => new {m1, m2}))
 .SelectMany(_ => _)

let cohesivePairs = pairs.Where(p => 
 fieldsUsedFromParentClass(p.m1).Overlaps(fieldsUsedFromParentClass(p.m2)))

let pairsCount = pairs.Count()
let tcc = pairsCount == 0 ? 1 : (float) cohesivePairs.Count() / pairsCount
orderby tcc
select new { t, tcc }

This version has the advantage that we can easily tackle the case when the class doesn’t have any methods (and pairsCount would be 0).
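
To see why the guard matters, here is what the unguarded division from the first version does in plain C# when there are no pairs. This sketch runs outside NDepend and assumes CQLinq follows the same floating-point rules, since its queries are written in C# syntax. The variable names are illustrative.

```csharp
using System;

int cohesivePairs = 0;
int pairsCount = 0;   // a class with fewer than two concrete methods has no pairs

// Floating-point 0 / 0 does not throw; it yields NaN.
double unguarded = (double)cohesivePairs / pairsCount;
Console.WriteLine(double.IsNaN(unguarded));   // True
Console.WriteLine(unguarded < 0.33);          // False: every comparison with NaN is false

// The guarded form reports such a class as fully cohesive instead.
double guarded = pairsCount == 0 ? 1 : (double)cohesivePairs / pairsCount;
Console.WriteLine(guarded);                   // 1
```

So, under that assumption, the first version never flags such a class as having low cohesion, but it reports a NaN value for it rather than a meaningful number.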

Putting it all together

Now that we know how to compute each of the required metrics, let’s see what the detection strategy looks like:

// <Name>God Class</Name>
warnif count > 0
// *** ATFD ***
// ** Helper Functions **
let isProperty = new Func<ICodeElement, bool>(member => 
 member.IsMethod && 
 (member.AsMethod.IsPropertyGetter || member.AsMethod.IsPropertySetter)) 

let classHierarchyFor = new Func<IType, HashSet<IType>>(t => 
 t.BaseClasses.Append(t).ToHashSet())

// ** Metric Functions **
let atfdForMethod = new Func<IMethod, int>(m => 
 m.MembersUsed.Where(member => 
    !classHierarchyFor(m.ParentType).Contains(member.ParentType) &&
    (isProperty(member) || member.IsField))
 .Count())

let atfdForClass = new Func<IType, int>(t => 
 t.Methods.Where(m => !m.IsAbstract)
  .Select(m => atfdForMethod(m))
  .Sum())

// *** WMC ***
let wmcFor = new Func<IType, int>(t => 
 t.MethodsAndContructors
  .Select(m => (int) m.CyclomaticComplexity.GetValueOrDefault())
  .Sum())

// *** TCC ***
// ** Helper Functions **
let fieldsUsedFromParentClass = new Func<IMethod, HashSet<IField>>(m => 
 m.FieldsUsed.Where(f => f.ParentType == m.ParentType)
  .ToHashSet())

let methodsToConsiderFor = new Func<IType, IEnumerable<IMethod>>(t =>
 t.Methods.Where(m => !m.IsAbstract))

let pairsFrom = new Func<IEnumerable<IMethod>, IEnumerable<HashSet<IMethod>>>(methods => 
 methods.Select((m1, i) => 
   methods.Where((m2, j) => j > i).Select(m2 => new HashSet<IMethod> {m1, m2}))
 .SelectMany(_ => _))

let cohesivePairsFrom = new Func<IEnumerable<HashSet<IMethod>>, IEnumerable<HashSet<IMethod>>>(pairs => 
 pairs.Where(p => fieldsUsedFromParentClass(p.First()).Overlaps(fieldsUsedFromParentClass(p.Last()))))

//number of combinations of all methods, taken 2 at a time, without repetition
let numberOfPairsFor = new Func<int, int>(n =>
 (n * (n-1)) / 2)

// ** Metric Functions **
let tccFor = new Func<IType, double>(t => 
 (double) cohesivePairsFrom(pairsFrom(methodsToConsiderFor(t))).Count()
 / numberOfPairsFor(methodsToConsiderFor(t).Count()))

// ** Thresholds **
let Few = 5
let OneThird = 0.33
let wmcVeryHigh = 47

// ** Detection Strategy **
from t in JustMyCode.Types
let atfd = atfdForClass(t) 
let wmc = wmcFor(t)
let tcc = tccFor(t)

where 
 // Class uses directly more than a few attributes of other classes
 atfd > Few &&
 // Functional complexity of the class is very high
 wmc >= wmcVeryHigh &&
 // Class cohesion is low
 tcc < OneThird

select new { t, atfd, wmc, tcc }

Conclusion

This detection strategy can pick up potential God Classes. I think this is a great example of how several combined metrics can help us identify design flaws. NDepend’s CQLinq makes such a strategy straightforward to implement. If you want to find out more about metrics and detection strategies, check out Object-Oriented Metrics in Practice.