October 27, 2009

Java Mystery: Directory Exists but Does Not?

Well, yesterday was another one of those days when you think your computer must be fooling you... We have a DirectoryCleaner that works when a subproject is built on its own, but does nothing when the same subproject is built from its parent. But first things first.

The openArchitectureWare Part

We are using oAW (yep, the new version which is part of the Eclipse Modeling Project in Eclipse Galileo). Before generating the artifacts from the model, we run a directory cleaner that simply wipes the output folders, for instance src/generated/java. The workflow file generate.mwe looks like this:

<property file="generateAll.properties"/>
<component class='org.eclipse.emf.mwe.utils.DirectoryCleaner' directory='${srcGenDir}'/>
<component file='.../generateAll.mwe' inheritAll="true">
<modelFile value='...' />

srcGenDir is a property that is defined in the included properties file, but that's irrelevant here. The second line (the DirectoryCleaner component) configures the directory cleaner mentioned above. The relevant code is the invokeInternal() method of org.eclipse.emf.mwe.utils.DirectoryCleaner, which is part of the Eclipse EMF frameworks:

protected void invokeInternal(final WorkflowContext model,
        final ProgressMonitor monitor, final Issues issues) {
    if (directory != null) {
        final StringTokenizer st = new StringTokenizer(directory, ",");
        while (st.hasMoreElements()) {
            final String dir = st.nextToken().trim();
            final File f = new File(dir);
            if (f.exists() && f.isDirectory()) {
                // ... do the cleanup ...
            }
        }
    }
}

Pretty simple, right? The directory attribute contains one or more directories (comma separated), so a tokenizer is used to extract each of them and to create a File object for it. If that file exists and is a directory, it will be cleaned up.

The Maven Part

Of course we are using Maven to build all of this. The Fornax Maven plugin is configured to call the workflow during the Maven build. Moreover, the workflow is part of a subproject B which belongs to an outer multi-module project A.

Now, this is where the mystery begins... When I build B (the submodule) directly, everything is fine and works as expected. However, when I build A (the parent project), B is built in turn and its workflow is executed, but the directory is not cleaned up! You wouldn't expect this, right?

The Strange Part

In an attempt to find out what's going on, we put some debugging code into DirectoryCleaner. It turns out that getAbsolutePath() returns the same value for the file f in both cases, but f.exists() returns false when the build is started from the parent project. Hence the condition is not met and nothing gets cleaned up.
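For illustration, here is a minimal sketch of the kind of debugging output that helps in such a situation. It is not the exact code we added to DirectoryCleaner; the class name ExistsDiagnostics and the method describe() are made up for this example. Printing the user.dir system property is an extra that seemed worth including, because the File Javadoc says relative pathnames are resolved against it.

```java
import java.io.File;

// Hypothetical helper, not part of oAW or EMF: reports how the JVM sees a path.
final class ExistsDiagnostics {

    /** Builds a one-line report for the given (possibly relative) path. */
    static String describe(String dir) {
        File f = new File(dir);
        // Second File object built from the absolute path of the first one.
        File viaAbsolute = new File(f.getAbsolutePath());
        return "path=" + f.getPath()
                + " | absolute=" + f.getAbsolutePath()
                + " | isAbsolute=" + f.isAbsolute()
                + " | exists=" + f.exists()
                + " | existsViaAbsolute=" + viaAbsolute.exists()
                + " | user.dir=" + System.getProperty("user.dir");
    }

    public static void main(String[] args) {
        // The default path is just the example folder mentioned in this post.
        String dir = args.length > 0 ? args[0] : "src/generated/java";
        System.out.println(describe(dir));
    }
}
```

Running this once from the submodule and once from the parent build makes it easy to compare the two situations line by line.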

Unfortunately, you can't look any deeper into the native code that is called when determining if a file exists. So, in a kind of trial and error approach, we found out that using the following code fixes the issue:

protected void invokeInternal(final WorkflowContext model,
        final ProgressMonitor monitor, final Issues issues) {
    if (directory != null) {
        final StringTokenizer st = new StringTokenizer(directory, ",");
        while (st.hasMoreElements()) {
            final String dir = st.nextToken().trim();
            final File f1 = new File(dir);
            final File f = new File(f1.getAbsolutePath());
            if (f.exists() && f.isDirectory()) {
                // ... do the cleanup ...
            }
        }
    }
}

That is: by creating a second File object from the absolute path of the first one and using that one from then on, everything is fine in both scenarios, whether called from the parent project or from the submodule.
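The same idea can be tried outside of the workflow engine. The following is a minimal, self-contained sketch under a few assumptions: the class DirectoryResolver and the method existingDirectories() are hypothetical names, and the sketch only collects the directories that would be cleaned, it does not delete anything. The check for empty entries is my own addition, since an empty path would otherwise resolve to the current directory once it is made absolute.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper illustrating the workaround; not the actual DirectoryCleaner.
final class DirectoryResolver {

    /**
     * Splits a comma-separated list of paths and returns those entries that
     * exist and are directories, each as a File built from its absolute path.
     */
    static List<File> existingDirectories(String directory) {
        List<File> result = new ArrayList<>();
        if (directory == null) {
            return result;
        }
        StringTokenizer st = new StringTokenizer(directory, ",");
        while (st.hasMoreTokens()) {
            String dir = st.nextToken().trim();
            if (dir.isEmpty()) {
                // Skip blank entries: "" made absolute would be the current directory.
                continue;
            }
            // The workaround: re-create the File from its absolute path.
            File f = new File(new File(dir).getAbsolutePath());
            if (f.exists() && f.isDirectory()) {
                result.add(f);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Example folder taken from this post; adjust as needed.
        String dirs = args.length > 0 ? args[0] : "src/generated/java";
        for (File f : existingDirectories(dirs)) {
            System.out.println("would clean: " + f);
        }
    }
}
```

Keeping the lookup separate from the actual cleanup makes it easy to check which folders the cleaner would touch before anything is deleted.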

To be honest: I have no explanation for these findings. Why is File behaving differently? When determining whether the file exists, the code should take the file's absolute path into account, right? So why does creating another File based on the absolute path fix the issue? Any insights are deeply appreciated...!

October 16, 2009

Speeding Up Your System

We did a lot of interesting stuff lately, including upgrading Eclipse to the new Galileo release, and in turn upgrading oAW (openArchitectureWare) to the new versions of Xtext, Xpand, Xcheck etc. which are now part of the Eclipse Modeling Project. I will blog about all this later...

But, what really started to hurt us was the performance of our Windows XP based development laptops. They aren't really brand new ones, but not that old either. Nevertheless, they seem to have an issue with all those java, class and jar files involved when starting Eclipse, doing a "Clean Project", when using Maven to build the software etc.

The IT department was not willing to provide us with new hardware (disks especially) at this time, and no, using Linux is not an option either. Hence, we had to find other ways to improve things by tuning our system.

Here is what we did to speed things up for Windows XP. Note that things might or might not be different for Windows Vista or Windows 7.

1. Disable Indexing Service

By default, there is a Microsoft "Indexing Service" running on your Windows system. According to MSDN, this is "a base service for Microsoft Windows 2000 or later that extracts content from files and constructs an indexed catalog to facilitate efficient and rapid searching."

Well, to be honest, I had never heard of that service before (and I rarely use the search function of Windows), but it turned out to cause lots of hard disk traffic. So we decided to disable this service, which is what some people recommend.

Actually, there are several ways to do so (without using the Microsoft Management Console (MMC) with an appropriate snap-in):

  • Disable the Indexing Service in your list of local services.
  • In the Properties window of your local disk, remove option "Allow Indexing Service to index this disk for fast file searching".
  • Remove the function via the Control Panel > Add or Remove Programs > Add/Remove Windows Components > Uncheck "Indexing Service".

2. Don't Virus-Scan Java Stuff

Our anti-virus tool was configured to scan literally everything, including java, class and jar files. This seems excessive from a security point of view, but the IT folks said they could not configure the anti-virus tool to ignore a particular set of file extensions (like *.jar).

Hence, what we did instead was to exclude two folders from being scanned on the fly: javatools (where everything Java related is put, including Eclipse releases) and projects (where all our projects reside). These folders are now scanned once a week instead, which is acceptable to the IT guys.

Know what? That cut the startup time of Eclipse by a factor of 5!

Surprisingly, excluding the Maven local repository (located in your personal settings folder) from being scanned on the fly did not make much difference, so we did not treat this folder specially.

3. Optimize Subversion

Are you using Subversion? If so, are you using TortoiseSVN, the Windows Shell Extension for Subversion? Well, in that case you will know and probably like the little overlay icons that are used to indicate the state of files and folders. This feature is recursive, whereby overlay changes in lower level folders are propagated up through the folder hierarchy so that you don’t forget about changes you made deep in the tree.

Starting with release 1.2, a new TSVNCache program is used to maintain a cache of your working copy status, providing much faster access to this information. Not only does this prevent explorer from blocking while acquiring status, but it also makes recursive overlays workable.

This is all nice, but there is a major drawback: TSVNCache by default looks for changes on all drives and in all folders, killing disk performance with all the I/O it's doing.

You can enable a TSVNCacheWindow showing all the folders being crawled by TSVNCache. To do so, open the Registry Editor and create a new DWORD at HKEY_CURRENT_USER\Software\TortoiseSVN\CacheTrayIcon with a value of 1. After that you have to restart TSVNCache. The easiest way is to just kill the process; it will be restarted automatically when you do any TortoiseSVN operation. Now there should be a small tortoise icon in the Windows tray area which opens the TSVNCacheWindow. Watch how TortoiseSVN scans files and folders whenever you write to a file...

Time to fix that! That should be quite easy if you're keeping all of your working copies below one specific folder (or a small set of folders), like we do. All you have to do is set up TortoiseSVN to scan only your source folder paths:

  1. Right-click in Explorer on any folder and select "TortoiseSVN > Settings...".
  2. In the Settings window's tree, click on the "Icon Overlays" entry.
  3. In the "Exclude Paths" input field, put C:\* to exclude the entire C drive. If you have more drives, exclude them all at the top level. Use newlines to separate the values.
  4. In the "Include Paths" input field, list all of the locations where your working copies are stored, again separated by newlines.
  5. Switch off "Network drives" option in "Drive Types" area.

All in all, your settings should now look like the ones in the following screenshot. Thanks to Paraesthesia for this nice tip!


You won't believe what a difference these three little tunings made to our system performance. The hard disk is no longer busy all the time, applications (like Eclipse) start much faster, and build time has decreased drastically. Not bad for not spending anything on new hardware! Well, the next thing we will check is what effect a new (big, fast) hard disk will have... ;o)