Debugging Maven Unit Test

I don’t trust debugging Maven unit tests directly with Eclipse’s JUnit plugin: it is sometimes buggy, and its classpath doesn’t always match Maven’s.

Here’s how you can attach the Eclipse debugger directly to the Maven process. First set a few breakpoints in the suspicious code as usual, then set up a Maven run configuration like this:

maven-debug-runconfig

When you run this configuration, Maven will halt right before unit tests are run:

maven-debug-console

Now create a Remote Java Application debug configuration pointing to localhost port 5050:

maven-debug-debugconfig

Happy debugging!

What’s The Deal With Half Up and Half Even Rounding?

The java.math package comes with several rounding modes, but two of them are particularly interesting: HALF_UP and HALF_EVEN.

HALF_UP

This is basically your elementary school rounding. If the number to be rounded is equidistant from its two neighbors, round it to the upper neighbor (for negative numbers, HALF_UP rounds away from zero). In other words, when rounding away one digit after the decimal point, a value ending in .5 goes up to the next whole number. For example:

Fractional Number    Rounded
0.1                  0
0.5                  1
1.3                  1
1.5                  2

HALF_EVEN

Similar to HALF_UP, except that if the number is equidistant from its two neighbors, it is rounded to the nearest even neighbor. For example:

Fractional Number    Rounded
0.1                  0
0.5                  0
1.3                  1
1.5                  2
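Here is a minimal Java sketch of both modes, using BigDecimal.setScale with java.math.RoundingMode. The class and method names (RoundingModes, round) are made up for this example, and 2.5 is an extra value I added to show the two modes diverging again.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class RoundingModes {
    // Rounds a decimal string to a whole number using the given rounding mode.
    static String round(String value, RoundingMode mode) {
        return new BigDecimal(value).setScale(0, mode).toPlainString();
    }

    public static void main(String[] args) {
        // Values from the tables above, plus 2.5 as an additional tie case.
        for (String v : new String[] {"0.1", "0.5", "1.3", "1.5", "2.5"}) {
            System.out.println(v
                    + " -> HALF_UP " + round(v, RoundingMode.HALF_UP)
                    + ", HALF_EVEN " + round(v, RoundingMode.HALF_EVEN));
        }
    }
}
```

Note the values are passed as strings so BigDecimal sees the exact decimal number rather than a binary approximation of it.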

Why Bother With HALF_EVEN?

Why don’t we just stick with what we learned in elementary school? Here’s one good reason: accumulated error. Error here means “how much did we gain or lose by rounding the number?”. Let’s look at both tables again, this time with the rounding error displayed:

Number   HALF_UP   HALF_UP error   HALF_EVEN   HALF_EVEN error
0.0      0         0.0             0           0.0
0.1      0         -0.1            0           -0.1
0.2      0         -0.2            0           -0.2
0.3      0         -0.3            0           -0.3
0.4      0         -0.4            0           -0.4
0.5      1         0.5             0           -0.5
0.6      1         0.4             1           0.4
0.7      1         0.3             1           0.3
0.8      1         0.2             1           0.2
0.9      1         0.1             1           0.1
1.0      1         0.0             1           0.0
1.1      1         -0.1            1           -0.1
1.2      1         -0.2            1           -0.2
1.3      1         -0.3            1           -0.3
1.4      1         -0.4            1           -0.4
1.5      2         0.5             2           0.5
1.6      2         0.4             2           0.4
1.7      2         0.3             2           0.3
1.8      2         0.2             2           0.2
1.9      2         0.1             2           0.1
2.0      2         0.0             2           0.0
Total              1.0                         0.0

As you can see, the accumulated error for HALF_UP keeps growing, whereas for HALF_EVEN it averages out. This is why HALF_EVEN is often called “banker’s rounding”: given a large amount of data, the bank should not gain or lose money because of rounding.
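If you want to reproduce the totals yourself, here is a rough Java sketch that sums the rounding error (rounded value minus original value) over 0.0, 0.1, ..., 2.0 for a given mode. The names RoundingError and totalError are hypothetical, chosen just for this illustration.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class RoundingError {
    // Sums (rounded - original) over 0.0, 0.1, ..., 2.0 for the given mode.
    static BigDecimal totalError(RoundingMode mode) {
        BigDecimal total = BigDecimal.ZERO;
        BigDecimal step = new BigDecimal("0.1");
        BigDecimal value = new BigDecimal("0.0");
        for (int i = 0; i <= 20; i++) {
            BigDecimal rounded = value.setScale(0, mode);
            total = total.add(rounded.subtract(value));
            value = value.add(step);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("HALF_UP total error:   " + totalError(RoundingMode.HALF_UP));
        System.out.println("HALF_EVEN total error: " + totalError(RoundingMode.HALF_EVEN));
    }
}
```

Extending the loop over a wider range should make the HALF_UP total keep climbing while the HALF_EVEN total stays near zero.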

More Surprises With printf

Don’t assume every programming language or tool rounds with HALF_EVEN, though. Try the printf examples below in your shell:

$ printf "%.5f" 1.000015
1.00002
$ printf "%.5f" 1.000025
1.00002
$ printf "%.5f" 1.000035
1.00004
$ printf "%.5f" 1.000045
1.00005

Wait, what? Isn’t 1.000045 supposed to be rounded to 1.00004? Well, in the floating point realm reality is more complicated than that, because a floating point value is often not exactly the number you typed in the first place.

Try printing 1.000045 with enough digits after the decimal point:

$ printf "%.30f" 1.000045
1.000045000000000072759576141834

Now you can see that computers can’t always store real numbers exactly in floating point types. (And you should now see why it’s rounded to 1.00005: the stored value is slightly above the halfway point.)
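The same effect can be seen in Java. The sketch below (class and method names are mine, purely for illustration) rounds 1.000045 with HALF_EVEN twice: once starting from the double literal, whose exact binary value BigDecimal exposes, and once starting from the decimal string.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class DoubleVsDecimal {
    // Rounds the exact binary value stored in the double.
    static String roundFromDouble(double value) {
        return new BigDecimal(value).setScale(5, RoundingMode.HALF_EVEN).toPlainString();
    }

    // Rounds the exact decimal value written in the string.
    static String roundFromString(String value) {
        return new BigDecimal(value).setScale(5, RoundingMode.HALF_EVEN).toPlainString();
    }

    public static void main(String[] args) {
        // Prints the full expansion of the double closest to 1.000045.
        System.out.println(new BigDecimal(1.000045));
        System.out.println("from double: " + roundFromDouble(1.000045));
        System.out.println("from string: " + roundFromString("1.000045"));
    }
}
```

Starting from the string, the value is a true tie and HALF_EVEN picks the even neighbor; starting from the double, it is no longer a tie at all.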

Here’s some reading if you’re interested in this problem.

WordPress Config For Nginx PHP-FPM

Here’s how you can set up nginx and php-fpm for use with WordPress. It is assumed you have nginx and php-fpm installed.

Goals / Environment

  • Domain mycooldomain1.com pointing to (http only)
  • wordpress installed on /usr/share/mycooldomain1.com_wordpress
  • php-fpm listening on localhost port 9000

In your nginx.conf (typically located at /etc/nginx/nginx.conf), add the following server block near the bottom of the http block:

http {
  ...
  server {
    listen      80;
    server_name mycooldomain1.com;
    access_log  /var/log/nginx/mycooldomain1.com.access.log main;
    error_log   /var/log/nginx/mycooldomain1.com.error.log;
    root        /usr/share/mycooldomain1.com_wordpress;
    index       index.php index.html index.htm;

    # This location block matches everything, but subsequently the first matching
    # regex location block (the ones starting with ~) will be used instead if any.
    # The try_files statement below will check if the request matches any file or
    # directory and serve it directly, otherwise it will redirect into /index.php
    # for nice permalink URLs processing
    location / {
      try_files $uri $uri/ /index.php?$args;
    }

    # Avoid logging these extensions and set maximum cache expiry. This is as
    # recommended by http://codex.wordpress.org/Nginx
    location ~* ^.+\.(ogg|ogv|svg|svgz|eot|otf|woff|mp4|ttf|rss|atom|jpg|jpeg|gif|png|ico|zip|tgz|gz|rar|bz2|doc|xls|exe|ppt|tar|mid|midi|wav|bmp|rtf)$ {
      access_log off; log_not_found off; expires max;
    }

    # Any requests ending with .php will be processed using this block instead of above,
    # including request to root (http://mycooldomain1.com/). This is true because we've set 
    # index.php to be one of the index file searched above
    location ~ \.php$ {
      try_files      $uri =404;
      fastcgi_pass   localhost:9000;
      fastcgi_index  index.php;
      fastcgi_param  SCRIPT_FILENAME  $document_root$fastcgi_script_name;
      include        fastcgi_params;
    }
  }
}

The include fastcgi_params directive imports directives from an external file. fastcgi_params is a default file that comes with the nginx / php-fpm installation; in case you don’t have it, here’s the default fastcgi_params I have on my box.

Restart both nginx and php-fpm once you’ve updated the config:

sudo service nginx restart
sudo service php-fpm restart

And yes, nginx config is pretty tedious to learn and debug. I spent a few hours reading the official docs as well as various posts to make it work. Probably the most important paragraph of the official docs is this one (from the http_core_module location directive):

A location can either be defined by a prefix string, or by a regular expression. Regular expressions are specified with the preceding “~*” modifier (for case-insensitive matching), or the “~” modifier (for case-sensitive matching). To find location matching a given request, nginx first checks locations defined using the prefix strings (prefix locations). Among them, the location with the longest matching prefix is selected and remembered. Then regular expressions are checked, in the order of their appearance in the configuration file. The search of regular expressions terminates on the first match, and the corresponding configuration is used. If no match with a regular expression is found then the configuration of the prefix location remembered earlier is used.
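To make that selection order concrete, here is a simplified Java model of it, applied to the three location blocks from the config above. This is only a sketch: the class name LocationMatcher and the labels "static" and "php" are made up, the extension list is shortened, and it ignores other modifiers such as = and ^~.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

class LocationMatcher {
    // Prefix locations from the config above (only "/" is defined there).
    static final List<String> PREFIXES = List.of("/");

    // Regex locations in configuration order (LinkedHashMap keeps that order).
    static final Map<String, Pattern> REGEXES = new LinkedHashMap<>();
    static {
        // "~*" means case-insensitive; extension list shortened for brevity.
        REGEXES.put("static", Pattern.compile("^.+\\.(jpg|jpeg|gif|png|ico)$",
                Pattern.CASE_INSENSITIVE));
        // "~" means case-sensitive.
        REGEXES.put("php", Pattern.compile("\\.php$"));
    }

    // Returns the label of the location that would handle the given URI path.
    static String match(String uri) {
        // Step 1: remember the longest matching prefix location.
        String longestPrefix = null;
        for (String prefix : PREFIXES) {
            if (uri.startsWith(prefix)
                    && (longestPrefix == null || prefix.length() > longestPrefix.length())) {
                longestPrefix = prefix;
            }
        }
        // Step 2: the first matching regex wins over the remembered prefix.
        for (Map.Entry<String, Pattern> regex : REGEXES.entrySet()) {
            if (regex.getValue().matcher(uri).find()) {
                return regex.getKey();
            }
        }
        // Step 3: no regex matched, fall back to the remembered prefix.
        return longestPrefix;
    }

    public static void main(String[] args) {
        for (String uri : new String[] {"/wp-content/uploads/logo.PNG", "/index.php", "/about/"}) {
            System.out.println(uri + " -> " + match(uri));
        }
    }
}
```

In the real config, a request like /about/ lands in the "/" block first, and try_files then hands it over to /index.php, which goes through the same selection again and ends up in the php block.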

Enjoy!

Nginx Virtual Host and Reverse Proxy

Firstly, there’s no such thing as a Virtual Host in nginx; Virtual Host is Apache terminology.

Scenario:

  • Domain mycooldomain1.com pointing to VPS server
  • Nginx running on port 80
  • Tomcat running on port 8080
  • Only inbound TCP traffic to port 80 is allowed through firewall

In your nginx.conf (mine is at /etc/nginx/nginx.conf), add the following inside the http block:

http {
  ...
  server {
    server_name mycooldomain1.com;
    access_log  /var/log/nginx/mycooldomain1.com.access.log main;
    error_log   /var/log/nginx/mycooldomain1.com.error.log;

    location / {
      proxy_pass          http://localhost:8080;
      proxy_redirect      default;
      proxy_cookie_domain localhost mycooldomain1.com;
    }
  }
  ...
}

The server_name and location / directives match requests to http://mycooldomain1.com, while proxy_pass sets the backend from which the response will be fetched.

proxy_redirect ensures the Location: header of any 3xx redirect response is rewritten to mycooldomain1.com.

If your backend has a different context root (e.g. http://mycooldomain1.com maps to http://localhost:8080/someapp), you will also need to adjust the cookie path:

proxy_cookie_path /someapp/ /;

DOS Script To Cap Log Files

I have this common problem where my servers generate about 5-6 MB of log files every day, filling up the disk. The following DOS batch script, which deletes old logs, has been quite handy:

:: Iterate files in the specified folder in reverse alphabetical order and keep
:: a maximum of N to avoid filling disk space. Called by run.bat
:: This script takes 3 arguments: folder, file name prefix, number of files to keep
@echo off
setlocal ENABLEDELAYEDEXPANSION

set FOLDER=%1
set FILEPREFIX=%2
set LIMIT=%3

echo Accessing %FOLDER%
:: /d also switches drive when the folder is on a different one
cd /d %FOLDER%

set /a COUNT=0
for /f "tokens=*" %%f in ('dir /b /a-d-h-s /o-n %FILEPREFIX%*') do (
    set /a COUNT=COUNT+1
    if !COUNT! GTR %LIMIT% (
        echo deleting %%f
        del /f /q "%%f"
    )
)
endlocal

I placed the script above in C:\filecleaner\deletefiles.bat. It iterates over a folder in alphabetical order and keeps only the specified number of files. I then created a second script to be called by Task Scheduler:

:: Called by task scheduler. This script calls deletefiles.bat over each file path prefix
::
:: Example:
:: deletefiles.bat c:\logrotate-test access.log 3
:: means keep maximum 3 files (sorted alphabetically) starting with access.log
::
:: "call" is needed so control returns to this script after each invocation
@echo off

call deletefiles.bat "C:\apache-tomcat-6.0.35" catalina 7
call deletefiles.bat "C:\apache-tomcat-6.0.35" commons-daemon 7
call deletefiles.bat "C:\apache-tomcat-6.0.35" host-manager 7
call deletefiles.bat "C:\apache-tomcat-6.0.35" localhost 7
call deletefiles.bat "C:\apache-tomcat-6.0.35" manager 7
call deletefiles.bat "C:\apache-tomcat-6.0.35" tomcat6-stderr 7
call deletefiles.bat "C:\apache-tomcat-6.0.35" tomcat6-stdout 7

I put this script in C:\filecleaner\run.bat. As you can see, it calls deletefiles.bat several times to clean my Tomcat log files. Each call passes the following arguments:

  1. The folder where the log files exist
  2. The prefix of the log files
  3. How many of the latest files should be kept

It’s important to note this only works if the alphabetical ordering of the file names reflects their age. This typically works best if the file names contain a date in yyyy-MM-dd format (or similar):

stdout.2013-12-04.log
stdout.2013-12-05.log
stdout.2013-12-06.log
stdout.2013-12-07.log
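The selection logic of deletefiles.bat can be sketched in Java as a pure function, which makes it easy to check which files would go before deleting anything. The names LogRetention and filesToDelete are hypothetical, and unlike dir /o-n this sketch compares names case-sensitively.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class LogRetention {
    // Returns the names that would be deleted: everything after the first
    // `limit` entries once the matching names are sorted in reverse
    // alphabetical order (newest first for date-stamped names).
    static List<String> filesToDelete(List<String> names, String prefix, int limit) {
        List<String> matching = new ArrayList<>();
        for (String name : names) {
            if (name.startsWith(prefix)) {
                matching.add(name);
            }
        }
        matching.sort(Comparator.reverseOrder());
        int keep = Math.min(limit, matching.size());
        return new ArrayList<>(matching.subList(keep, matching.size()));
    }

    public static void main(String[] args) {
        List<String> names = List.of(
                "stdout.2013-12-04.log", "stdout.2013-12-05.log",
                "stdout.2013-12-06.log", "stdout.2013-12-07.log");
        // Keep the 3 newest files; only the oldest one is listed for deletion.
        System.out.println(filesToDelete(names, "stdout", 3));
    }
}
```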

Finally, to run this automatically every day at 3:30am, I created a scheduled task with the following action:

filecleaner

Financial Time in Java

Yes, dealing with time is always a good source of confusion. Many financial applications use New York City time as a guideline. A trading platform I work with on a daily basis uses the NY+7 time zone, so that 5pm in New York coincides with midnight. This is how you can format the current NY+7 time (keep in mind the USA observes DST, and your local time zone may or may not):

TimeZone nyPlus7 = TimeZone.getTimeZone("America/New_York");
nyPlus7.setRawOffset(nyPlus7.getRawOffset() + 7*3600*1000);
DateFormat df = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
df.setTimeZone(nyPlus7);
String nyPlus7TimeNow = df.format(new Date());

And here’s a unit test to prove the method above indeed works:

String pattern = "yyyy-MM-dd HH:mm:ss";
Date expected, actual;

// Setup UTC and NY+7 time zones
TimeZone utcTz = TimeZone.getTimeZone("UTC");
TimeZone nyp7Tz = TimeZone.getTimeZone("America/New_York");
nyp7Tz.setRawOffset(nyp7Tz.getRawOffset() + 7*3600*1000);

// Setup DateFormat for parsing
DateFormat utc = new SimpleDateFormat(pattern);
utc.setTimeZone(utcTz);
DateFormat nyp7 = new SimpleDateFormat(pattern);
nyp7.setTimeZone(nyp7Tz);

// US DST off, NY is UTC-5, NY+7 is UTC+2
expected = utc.parse("2014-03-09 06:59:59");
actual = nyp7.parse("2014-03-09 08:59:59");
Assert.assertEquals(expected, actual);

// US DST on, NY is UTC-4, NY+7 is UTC+3
expected = utc.parse("2014-03-09 07:00:00");
actual = nyp7.parse("2014-03-09 10:00:00");
Assert.assertEquals(expected, actual);

It’s important to understand that the best practice is to never store time data as a string. Stick to the Java Date object or similar, and only format your time to a string when you need to present it visually.
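On Java 8 or later, the same NY+7 idea can also be sketched with java.time instead of TimeZone and SimpleDateFormat. This is an alternative I’m adding for illustration, not what the trading platform uses; the class name NyPlus7 and its format method are made up.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

class NyPlus7 {
    private static final ZoneId NEW_YORK = ZoneId.of("America/New_York");
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Formats an instant as NY+7: the New York wall clock time (DST-aware)
    // shifted forward by 7 hours.
    static String format(Instant instant) {
        LocalDateTime newYorkWallClock = LocalDateTime.ofInstant(instant, NEW_YORK);
        return newYorkWallClock.plusHours(7).format(FORMAT);
    }

    public static void main(String[] args) {
        System.out.println(format(Instant.now()));
    }
}
```

The time is kept as an Instant all the way through and only turned into a string at the very end, for display.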

hMailServer for Outbound Only SMTP Server

If you ever need to write a program that sends email, you’ll most likely need an SMTP server. Here’s how you can configure one on a Windows box using hMailServer.

New Domain

After downloading and installing, you need to add a new domain to hMailServer. In my case I will not be using hMailServer to accept incoming email, hence I did not use the company’s email domain. Doing so would cause email to your colleagues to be routed locally and likely fail.

So go ahead and add a new domain, and just give it the local machine name (e.g. devbox01.local). You have to pick a name that resembles an actual domain (with a dot and suffix), otherwise hMailServer will reject it.

New Account

Once you’ve set up the domain, create a new account:

hmail

Set a password, and that’s it, you’re done. You can now use the SMTP server for outbound email:

  • Username:
  • Password: whatever password you put in
  • SMTP host: devbox01
  • SMTP port: 25

Important

Now what’s left to do is configuring the firewall. If your program runs on the same box you might not need to do anything. However, it’s good to check that no outside traffic from the internet can connect to port 25, so no one can abuse your SMTP server.

And as a last word of warning, do not assume all mail will be delivered. This SMTP setup is very basic. Depending on the content you send, SPF, the reverse DNS entry, the recipient’s spam filtering, and a gazillion other things, your email might not go through.

About Apache Compression and Content-Length Header

Just resolved an interesting problem today: some of our code broke because the response headers set by the web server did not include Content-Length.

I spent quite a while investigating, and it turns out this is due to gzip compression. As seen below, Content-Encoding is gzip, and I think this causes Content-Length to be omitted.

apache-resp-headers

Gzip compression can be disabled on Apache via .htaccess config. In my case I disabled all compression for swf files by adding the following configuration:


  SetEnv no-gzip 1
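If changing the server config is not an option, the client can be made tolerant of a missing header instead. Below is a rough Java sketch, not the code that actually broke for us: it reads the body until end of stream and treats the length only as a sizing hint. BodyReader and readBody are made-up names. With HttpURLConnection, getContentLengthLong() returns -1 when the length is unknown, which is the value assumed here for "header absent".

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class BodyReader {
    // Reads the whole body until end of stream. contentLength is only a sizing
    // hint, so -1 (header absent) works just as well as a real value.
    static byte[] readBody(InputStream in, long contentLength) {
        int hint = (contentLength > 0 && contentLength <= (1 << 20))
                ? (int) contentLength : 8192;
        ByteArrayOutputStream out = new ByteArrayOutputStream(hint);
        byte[] buffer = new byte[8192];
        try {
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Simulates a response body whose Content-Length header was omitted.
        byte[] body = readBody(new ByteArrayInputStream("hello".getBytes()), -1);
        System.out.println(body.length + " bytes read");
    }
}
```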

Using File Protocol to Deploy Maven Project To Windows Shared Folder

Our development team is fairly small and we’re not at the point where we need Nexus yet, so I decided we can try using a simple Windows server share as our internal Maven repository.

In the pom.xml of each of our shared Maven projects we added the following distributionManagement configuration:

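<distributionManagement>
  <repository>
    <id>enfinium</id>
    <name>Enfinium Internal Repository</name>
    <url>file://mavenrepo/maven_repository</url>
  </repository>
</distributionManagement>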

Where //mavenrepo/maven_repository is the Windows server share we’ve set up. We’ve made sure each team member has the correct read/write permissions.

However, every time we deployed, Maven would say everything was successful, but the file was nowhere to be found in the remote repository. (We’re using Maven 3.1.0.)

It turns out that with the file protocol and a Windows server share, this is the syntax that works for us (yes, Maven just failed silently and said everything was SUCCESSFUL):

file:////mavenrepo/maven_repository
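I haven’t dug into how Maven resolves these URLs internally, but one way to see that the two forms are not equivalent is to run them through java.net.URI. In the sketch below (FileUrlParts and describe are hypothetical names), the two-slash form treats mavenrepo as a host, while the four-slash form has no host and keeps //mavenrepo/maven_repository as its path.

```java
import java.net.URI;

class FileUrlParts {
    // Shows which part of the URL is parsed as host and which as path.
    static String describe(String url) {
        URI uri = URI.create(url);
        return "host=" + uri.getHost() + " path=" + uri.getPath();
    }

    public static void main(String[] args) {
        System.out.println(describe("file://mavenrepo/maven_repository"));
        System.out.println(describe("file:////mavenrepo/maven_repository"));
    }
}
```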

Generating Visual C++ Crash Dump For Debugging

Detecting the cause of a problem that occurs only in a production environment is hard. For native Windows applications built with Visual C++, this can be done using the DebugDiag tool.

Download and install the tool on the production server, run it, and create a Crash rule type:

vcppdebug1

Select a specific process target:

vcppdebug2

And search for your process:

vcppdebug3

Tick This process instance only if multiple processes with the same name are running but you’re only interested in one.

Click next and accept all defaults. Then, when the crash occurs, you will get a .dmp file in the C:\Program Files\DebugDiag\Logs\Crash rule for process id NNNN folder.

This .dmp file can be copied to your PC and opened in Visual Studio for stack trace analysis. Don’t forget you need to tell Visual Studio where to find the symbol (.pdb) file.

Select Set symbol paths on the Visual Studio Action bar and add the folder containing your .pdb file.

vcppdebug4

Then start the debugging session by selecting Debug with Native Only. Visual Studio will show you the stack trace at the moment the crash happened.

Thanks to Ganesh R for providing this solution on Stack Overflow.

Also check out Microsoft’s documentation on DebugDiag.

Gerry's software development journey of trial, errors and re-trials