D’oh! My Vagrant NFS is slow…

Vagrant is great for various reasons. One of them is that you can describe, version and share your virtual development environments as code. By doing this, you can make sure your setup and that of your colleagues are identical, which avoids “works on my machine” bugs and discussions. It also lets you develop two projects side by side without having to fiddle with configs: if you want to use https/443 in both projects, for example, you don’t have to reconfigure your web server every time you switch from project A to project B.

Unfortunately, the performance within the box is often unsatisfactory, especially when it comes to file systems shared between the host and the guest system. This blog post discusses options that help to optimize the performance.

Most likely you have encountered “<foo> is slow” and “bad performance” all too often in bug reports and asked yourself: “What do they consider ‘too slow’? How do I reproduce this? wtf?”. Therefore this blog post will also provide some facts and figures to demystify the vague description “slow”.

What’s up?

At inovex we use vagrant for several purposes, for example

  • creating infrastructure as code dev and test environments,
  • setting up clustered dev boxes,
  • (pre-)sales demos and showcases,
  • dev environments for our projects, which can’t be disclosed for obvious reasons.

Especially when one of us works on several projects at a time or when a project sleeps for a while and needs to be woken up again later, it is much more efficient and convenient just to do a “vagrant up”, compared to setting up the dev environment manually on localhost.

In this case we also want to have the build & deploy chain in vagrant. The “vagrant up” would not be complete if you still had to manually configure your build tools (e.g. maven, gradle, grunt, gulp) on your host. Given that most of us have highly customized dev environments with respect to the IDE (Sublime, IntelliJ, Eclipse), editor configuration (.vimrc, .emacs) or VCS clients (git cli, TortoiseGit or TortoiseSVN, etc.), editing source code is generally done on the host. Therefore files need to be exchanged between host and guest. As far as I know, shared folders are the most widely used approach (ssh, sftp or even ftp (sigh) seem to be legitimate alternatives that are in use out there, too).

Unfortunately you might encounter performance problems very soon, and Google will come up with NFS, NFS, NFS and rsync and approx. 446,000 more hits. Even @mitchellh himself wrote a very good article on filesystem performance, in which he concludes that you should use VMware if you can (instead of the default VirtualBox provider) and that “if you have the option to use NFS, you should use it”. If you dig a little deeper you might even find valuable NFS mount options like 'vers=3' and 'actimeo=2' and the nice cachefilesd.

There was still one common case where NFS did not help us at all: When “grunt watch” was running in vagrant and a [CMD]+S had been fired in the IDE on the host, the resulting change in the webapp was not visible in the browser for ages. In this case 3-5 or sometimes even 20+ seconds are ages, I guess you agree.

Measuring Vagrant NFS

As you might have guessed from the introduction, stating “it is slow” and measuring performance with the second hand of your wrist watch is something I dislike. That is why I set up a small test environment. The setup is as follows:

  • the files are mounted into the vagrant box with vboxfs (VirtualBox shared folders, mount type vboxsf; the vagrant+virtualbox default setup) and with nfs (needs to be configured explicitly)
    • config.vm.synced_folder 'grunt-project', '/home/vagrant/grunt-project-vboxfs'
    • config.vm.synced_folder 'grunt-project', '/home/vagrant/grunt-project-nfs', type: 'nfs', nfs_udp: false, nfs_version: 3
  • a developer is working on a piece of code (“src/helloworld.html”)
  • the editing is done on the host in the IDE (“touch src/helloworld.html”)
  • in vagrant the build & deploy chain has been set up. “grunt watch” notices the change on “src/helloworld.html” and builds the webapp
  • with some timestamps (grunt task “logmillisecs”) and proper NTP configuration the reaction time between [CMD]+S and “grunt notices the change” is measured.
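
Put together, the two synced folders from the list above could sit in a Vagrantfile roughly like this. This is a minimal sketch, not the Vagrantfile from the repo: the box name and the IP address are assumptions, and the private network is there because NFS synced folders need one with the VirtualBox provider.

```ruby
# Vagrantfile (sketch). Box name and IP address are assumptions for illustration.
Vagrant.configure('2') do |config|
  config.vm.box = 'ubuntu/trusty64'

  # NFS synced folders on VirtualBox require a private (host-only) network.
  config.vm.network 'private_network', ip: '192.168.33.10'

  # Default VirtualBox shared folder.
  config.vm.synced_folder 'grunt-project', '/home/vagrant/grunt-project-vboxfs'

  # The same host directory again, this time exported via NFS (v3 over TCP).
  config.vm.synced_folder 'grunt-project', '/home/vagrant/grunt-project-nfs',
                          type: 'nfs', nfs_udp: false, nfs_version: 3
end
```

Mounting the same host directory twice keeps the comparison fair: both mount points see exactly the same files, only the transport differs.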

This is what it looks like on the host:

mwippert@MaxBookPro$ perl -e 'use Time::HiRes qw(time); print time . "\n"'; touch src/helloworld.html
1443876162.24607

Meanwhile within vagrant:

vagrant@vagrant-ubuntu-trusty-64:~/grunt-project-nfs$ grunt watch
Running "watch" task
Waiting...
>> File "src/helloworld.html" changed.
Running "logmillisecs" task
1443876169.982
Done, without errors.
Completed in 0.491s at Sat Oct 03 2015 14:42:49 GMT+0200 (CEST) - Waiting...

Let’s do the math: 1443876169.982 – 1443876162.24607 = approximately 7.7 seconds?!? Having to wait for nearly 8 seconds until my changed code in the IDE has been noticed by grunt is definitely not fast enough for serious, efficient and joyful web development.
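
The difference between the two timestamps can be double-checked with a few lines of Ruby. This is only a sketch: the constant and method names are made up for illustration, the two values are the ones printed in the terminal sessions above.

```ruby
# Timestamps copied from the two terminal sessions (seconds since the epoch).
HOST_TOUCH_AT   = 1443876162.24607 # printed on the host right before `touch`
GRUNT_LOGGED_AT = 1443876169.982   # printed by the "logmillisecs" task in the box

# Reaction time: the moment grunt logs the change minus the moment of the touch.
def reaction_time(touched_at, noticed_at)
  noticed_at - touched_at
end

puts format('reaction time: %.3f s', reaction_time(HOST_TOUCH_AT, GRUNT_LOGGED_AT))
# prints "reaction time: 7.736 s"
```

Note that the result is only meaningful if the clocks of host and guest are in sync, which is why the NTP configuration matters for this measurement.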

I did several measurements; the latest, with 10 repetitions, is available in measurements.xls. On average the reaction time with NFS was 9.7050 seconds. Compared to the benchmark (everything set up on the host, no vagrant, no shared filesystems) of 0.6106 seconds this is nearly 16 times slower. By the way, when working only in vagrant on a local (not mounted) vboxfs filesystem, there was no significant difference to the benchmark: 0.6552 seconds.

Tweaking NFS with these settings did improve the situation significantly:

config.vm.synced_folder 'grunt-project', '/home/vagrant/grunt-project-nfs-tuned', type: 'nfs', mount_options: ['rw', 'vers=3', 'tcp', 'fsc', 'actimeo=1']

Now the average reaction time was 1.5090 seconds. But still, this is more than twice as slow as the benchmark.

vagrant-benchmark-1

After digging into the details of NFS, grunt, gaze, kqueue and inotify my current understanding is that NFS does not support inotify(), hence there are no events and hence some kind of polling must be used. But I am getting off-track, feel free to drop a comment if you are interested in these things.
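
To illustrate what “some kind of polling” means for the reaction time, here is a minimal Ruby sketch of a poll-based watcher. It is not how grunt or gaze are implemented; the method name poll_for_change and its parameters are hypothetical. Without kernel events, a watcher has to re-read the modification time of a file at some interval, and on NFS the value it reads may additionally come from the client's attribute cache (the cache whose lifetime 'actimeo' sets).

```ruby
# Poll a file's modification time until it differs from the last known value.
# Returns the new mtime, or nil if nothing changed before the timeout.
def poll_for_change(path, last_mtime, interval: 0.1, timeout: 5.0)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    # On an NFS mount this stat may be answered from the attribute cache.
    current = File.mtime(path)
    return current if current != last_mtime
    return nil if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline

    sleep interval
  end
end
```

In this model a change can only be noticed after the polling interval has elapsed and after the cached attributes have expired, so both delays add to the reaction time. An event-based watcher on a local filesystem has neither.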

Off-track, because even knowing the exact reason would not provide a solution, unless you are a virtualbox/vboxfs or nfs maintainer, which I am not. What we did come up with was samba. By mounting the other way round (i.e. vagrant exports, the host mounts) we were also able to circumvent the hassle of every user of the vagrant boxes having to configure their workstation as a samba server. This way round, the samba config is very straightforward: infrastructure as code to the rescue.
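
A reversed setup along these lines could be provisioned roughly as follows. This is a sketch under assumptions, not the configuration from the repo: the box name, IP address, share name and paths are made up, and guest access relies on the distribution's smb.conf mapping unknown users to the guest account (which, as far as I know, the Ubuntu default does).

```ruby
# Vagrantfile (sketch): the box exports the project via Samba, the host mounts it.
# Box name, IP address, share name and paths are assumptions for illustration.
Vagrant.configure('2') do |config|
  config.vm.box = 'ubuntu/trusty64'
  config.vm.network 'private_network', ip: '192.168.33.10'

  config.vm.provision 'shell', inline: <<-'SHELL'
    apt-get update
    apt-get install -y samba
    mkdir -p /home/vagrant/grunt-project
    chown vagrant:vagrant /home/vagrant/grunt-project

    # Append the share definition only once, so re-provisioning stays idempotent.
    if ! grep -q '^\[grunt-project\]' /etc/samba/smb.conf; then
      printf '%s\n' \
        '[grunt-project]' \
        '  path = /home/vagrant/grunt-project' \
        '  read only = no' \
        '  guest ok = yes' \
        '  force user = vagrant' >> /etc/samba/smb.conf
    fi
    service smbd restart
  SHELL
end
```

The host then mounts the share with its native SMB client, so the files live on the local disk of the box and “grunt watch” can rely on regular filesystem events. A guest-writable share like this is only acceptable on a host-only network of a development box.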

The resulting average reaction time with samba was 0.6519 seconds. Only 0.04 seconds slower than the benchmark. Bam!

vagrant-benchmark-2
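
To put the averages quoted in this post side by side, here is a small Ruby sketch that relates each setup to the benchmark. The numbers are the ones stated in the text; the constant and method names are made up for illustration.

```ruby
# Average reaction times in seconds, as reported in this post.
BENCHMARK = 0.6106 # everything on the host, no vagrant, no shared filesystem
AVERAGES = {
  'NFS (default options)'     => 9.7050,
  'NFS (tuned options)'       => 1.5090,
  'Samba (host mounts guest)' => 0.6519
}.freeze

# Returns [factor relative to the benchmark, absolute difference in seconds].
def compare_to_benchmark(average, benchmark = BENCHMARK)
  [average / benchmark, average - benchmark]
end

AVERAGES.each do |setup, average|
  factor, delta = compare_to_benchmark(average)
  puts format('%-26s %7.4f s  x%5.2f  (+%.4f s)', setup, average, factor, delta)
end
```

Looking at both the factor and the absolute difference is deliberate: for an edit-save-reload cycle, the absolute seconds you wait are what you actually feel.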

Conclusion

Whilst there are many hints on the internet that NFS is the lever to improve Vagrant’s shared file system performance, not many people point out that NFS might have significant drawbacks as well. There are also not many hints that samba is worth a try. Only after having finished my setup did I stumble across this blog post. And even there, the eye-catching chart in the middle of the post might mislead superficial readers, as the chart suggests that samba is a bad choice.

I don’t think either samba or NFS is a bad choice. I think that there are situations where samba has advantages over NFS (for example when reaction time matters), where NFS outperforms vboxfs (for example when I/O writes (source) matter) or where NFS is more suited than samba (for example when the grunt run and build times are your main concern). I did not focus on the latter in this article, but in the spreadsheet measurements.xlsx you can see that there are runtime differences, even though the grunt task is only very rudimentary.

What I advise is that you look into your situation at hand and that you try out what is best for you. And also consider mounting the other way round. As so often when performance questions are raised, I guess there is no easy answer. You know… it depends 😉

Full disclosure

The github repo should enable you to reproduce, test and tweak everything on your own workstation. For transparency reasons, and because I am happy to discuss and merge pull requests, here is my setup.

Hardware

  • MacBook Pro (Retina, 15-inch, Late 2013)
  • OS X Yosemite, 10.10.5
  • 16 GB 1600 MHz DDR3
  • 2.6 GHz Intel Core i7 (1 Processor, 4 Cores)
  • 500GB SSD (Apple SSD SM0512F)
  • OS X Firewall disabled

Software

  • node 0.10.35_2 (installed via homebrew)
  • npm 3.3.5
  • see package.json

Acknowledgements

Kudos to iigorr for coming up with the Samba idea. Working on a Windows workstation does have benefits from time to time 😉

By the way: Maybe a reversed NFS mount would have shown the same performance improvements. But as we have a heterogeneous mix of developer workstations at inovex running OS X, Linux and Windows, the latter – using NFS clients – would have required some extra effort… afaik the NFS integration got better with Windows 8 but I never bothered to try it out. I would be glad if you were to try it and submit a github pull request.

