Code analysis with SonarQube, JaCoCo and Gradle

When you work on a Java project, you want to get an idea of your code quality.

Of course, “good” code doesn’t mean the code is error-free, but on the other hand, if your code is seen as “bad”, you can be pretty sure that it will become unmaintainable very soon.

Because of this, tools like SonarQube can be helpful: they give an unbiased insight into how good your code actually is, according to established coding standards.

First, you have to set up a SonarQube server, which is a very easy task if you’re on an Ubuntu system:

Add the following line to your /etc/apt/sources.list:

deb http://downloads.sourceforge.net/project/sonar-pkg/deb binary/

and then run the well-known and to-be-expected

apt-get update
apt-get install sonar

commands.

Assuming that you already have a PostgreSQL server running, create a user “sonar” with password “sonar” plus a database “sonar”, and enable the few PostgreSQL-related entries in /opt/sonar/conf/sonar.properties.
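
In concrete terms, something along these lines should work (a sketch; the exact commands depend on your PostgreSQL setup, and the property names can be found, commented out, in the shipped sonar.properties):

sudo -u postgres psql -c "CREATE USER sonar WITH PASSWORD 'sonar';"
sudo -u postgres psql -c "CREATE DATABASE sonar OWNER sonar;"

and in /opt/sonar/conf/sonar.properties:

sonar.jdbc.username=sonar
sonar.jdbc.password=sonar
sonar.jdbc.url=jdbc:postgresql://localhost/sonar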

Finally, as root, start SonarQube with

/etc/init.d/sonar start

and maybe add that command to /etc/rc.local, so SonarQube is started at boot time.

The next step is to prepare your project’s build.gradle script, to ensure that SonarQube is not only filled with data, but that at least your test coverage is measured as well.

The relevant parts are:

apply plugin: "sonar-runner"
apply plugin: "jacoco"

sonarRunner {
        sonarProperties {
                property "sonar.host.url", "http://localhost:9000"
                property "sonar.jdbc.url", "jdbc:postgresql://localhost:5432/sonar"
                property "sonar.jdbc.driverClassName", "org.postgresql.Driver"
                property "sonar.username", "sonar"
                property "sonar.password", "sonar"
                property "sonar.projectName", "rmmusic"
                property "sonar.jacoco.reportPath", "build/jacoco/test.exec"
                property "sonar.java.source property", "1.8"
        }
}

jacoco {
    // Directory for JaCoCo report output. The coverage data file itself is
    // written by the test task to build/jacoco/test.exec by default, which
    // matches the sonar.jacoco.reportPath above.
    reportsDir = file("build/reports/jacoco")
}

Additionally, log in to your SonarQube instance as admin user and, in Settings->System->Update Center, add a few plugins:

  • Java
  • Checkstyle
  • Sonargraph
  • PMD
  • Timeline
  • Findbugs

and restart SonarQube.

As admin user, you should then set a default quality profile, e.g. the FindBugs profile.

Now, when you run the Gradle task sonarRunner, all those analyses will be executed automatically and you’ll get detailed insights into your code and its quality.
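
For example (run the tests first, so the JaCoCo exec file already exists when the analysis looks for it):

gradle test sonarRunner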

A short look at Java 8 streams

With Java 8, alongside the new language feature of lambdas, the new concept of streams was also introduced, and if you look at streams, you will certainly use lambdas, too.

The advantage of streams is that they increase the understandability and readability of your code. And in theory, if you use parallel streams the correct way, you can speed up your processing, but from my observations, that won’t happen if you only use small datasets and/or simple operations.

To show you how to use streams, let’s implement a small task:

Imagine you have got a record collection system and want to calculate the value of your collection and the average price of a record, restricted to those records where you still know how much you paid for them (which is not necessarily the case for all of your records!).
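
For illustration, assume a (purely hypothetical, reduced) Medium entity like the following; the important detail is that getBuyPrice() returns the wrapper type Double, which can be null for records where the price is unknown:

public class Medium {
    // Wrapper type on purpose: null means "price unknown".
    private Double buyPrice;

    public Double getBuyPrice() {
        return buyPrice;
    }

    public void setBuyPrice(Double buyPrice) {
        this.buyPrice = buyPrice;
    }
}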

In a traditional approach, you would implement it more or less like this:

List<Medium> media = mediumRepository.findAll();

double sumValue = 0;
long boughtMediaCount=0;
for (Medium medium : media) {
  if (medium.getBuyPrice() != null) {
    sumValue += medium.getBuyPrice();
    boughtMediaCount++;
  }
}

System.out.println("Total price="+String.format("EUR %.02f", sumValue));
System.out.println("Average price="+String.format("EUR %.02f", (sumValue / (double) boughtMediaCount)))

Now, let’s analyze what we do here:

After retrieving the whole dataset as a list, we iterate over each element. In each iteration, we check whether the property “buyPrice” is set, and if yes, we add that value to the total and increase the counter for records where we know the price. At the end, we want to get two values, the total price and the average price.

In other words:

We look at each element (“stream”), only process one single property (“map”), only use those properties with a certain value (“filter”) and calculate a result (“collect”).

That description can now be nicely transformed into Java 8 code, which is almost identical to the non-technical description above:

List<Medium> media = mediumRepository.findAll();

Averager averagePrice = media.stream().
    map(Medium::getBuyPrice).
    filter(v -> v != null).
    collect(Averager::new, Averager::accept, Averager::combine);

System.out.println("Total price="+String.format("EUR %.02f", averagePrice.getTotal()));
System.out.println("Average price="+String.format("EUR %.02f", averagePrice.getAverage()));

Isn’t that nice? No brace hell any more, no boring iterations.

Ok, you have to use an extra class, the Averager, which looks like this:

import java.util.function.DoubleConsumer;

public class Averager implements DoubleConsumer {
    private double total=0;
    private int count=0;

    public double getAverage() {
        return count>0? (total/(double)count) : 0;
    }

    public int getCount() {
        return count;
    }

    public void combine(Averager other) {
        total += other.total;
        count += other.count;
    }

    @Override
    public void accept(double value) {
        total += value;
        count++;
    }

    // No need to override andThen(): the default method inherited from
    // DoubleConsumer already does the right thing.

    public double getTotal() {
        return total;
    }
}

For one single occurrence, you will use a little more code here (a big “little more”), but even then, your readability and testability increase, and that’s what finally counts.
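
As a side note: if you only need sum and average, the JDK already ships a type for exactly this job, so you could drop the custom Averager entirely. A sketch of the same calculation with java.util.DoubleSummaryStatistics (and java.util.Objects for the null filter):

DoubleSummaryStatistics stats = media.stream().
    map(Medium::getBuyPrice).
    filter(Objects::nonNull).
    mapToDouble(Double::doubleValue).
    summaryStatistics();

System.out.println("Total price="+String.format("EUR %.02f", stats.getSum()));
System.out.println("Average price="+String.format("EUR %.02f", stats.getAverage()));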

A few final observations:

  • You can parallelize the work on your stream, if you use the parallel() method of the streaming API. But be warned that, like with every parallelization, there can be cases where you actually lose performance (see the sketch after this list).
  • The order of invoking the stream methods is important:
    In my example, using filter() before map() was faster on a sequentially executed stream, but equally fast or slower on a parallel stream.
  • On small datasets (in my benchmarks, I worked with roughly 1000 items), the traditional approach with the for-loop is much faster than working with streams. I don’t know how much that changes with larger datasets and/or more complex items.
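
To make the first point concrete, here is the pipeline from above as a parallel stream. The collect() variant with supplier, accumulator and combiner is safe to parallelize, because every thread fills its own Averager and the partial results are merged via combine():

Averager averagePrice = media.stream().
    parallel().
    map(Medium::getBuyPrice).
    filter(v -> v != null).
    collect(Averager::new, Averager::accept, Averager::combine);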