Integration Testing with Jimfs Virtual Filesystems

In various test cases, primarily those covering components that interact with the filesystem and use filesystem entities for convention-based operations, it makes perfect sense to provide a filesystem resource in an expected state to assess the component-under-test’s behavior.

To provide a resource in an expected state, the classic approaches are:

  • Have a handcrafted directory with test data on the developer machine
  • Couple the test data with the source in the repository

Both methods are quite poor: the first will definitely lead to trouble as soon as the application is built on a CI server or any other machine, or when the fact that the local filesystem is anything but immutable shows its evil face. The second relies on hand-crafted static data that has to be kept in sync with the application’s contracts and logic. While this might have been an acceptable approach in the 90s, please do not do this today.

More contemporary approaches are:

  • Have a configurable location on a ramfs/tmpfs with test data freshly prepared in each @Before* phase of the test.
  • Use JUnit’s TemporaryFolder rule, which promises to clean up the resources after testing.

The above are quite decent, but they still depend on local, platform-specific and non-reproducible resources, a fact that may (and will) corrupt the actual test. So, why not use a layer which emulates a filesystem at the java.nio level and provides a throwaway, in-memory filesystem that is assembled into an expected state on each test run?

Jimfs

Jimfs provides a virtual in-memory filesystem which behaves almost exactly like the default filesystem. Developed by Google and quite feature-complete, it drives all my tests which require File or Path dependencies.

Example code

First, Jimfs needs to be imported. For Maven, the dependency is:
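(The version shown is illustrative; check for the current release.)

    <dependency>
        <groupId>com.google.jimfs</groupId>
        <artifactId>jimfs</artifactId>
        <version>1.3.0</version>
        <scope>test</scope>
    </dependency>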

A minimal test with Jimfs in JUnit4:
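(A sketch of what this can look like; class and path names are illustrative.)

    import java.nio.file.FileSystem;
    import java.nio.file.Files;
    import java.nio.file.Path;

    import org.junit.Test;

    import com.google.common.jimfs.Configuration;
    import com.google.common.jimfs.Jimfs;

    import static org.junit.Assert.assertTrue;

    public class JimfsSmokeTest {

        @Test
        public void virtualFilesystemIsUsable() throws Exception {
            // the filesystem lives only in memory and is discarded after the test
            try (FileSystem fs = Jimfs.newFileSystem(Configuration.unix())) {
                Path home = fs.getPath("/home/test");
                Files.createDirectories(home);
                assertTrue(Files.isDirectory(home));
            }
        }
    }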

From here on, the Jimfs filesystem behaves like the normal filesystem, so let’s populate it with some test data.
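Continuing inside the try-with-resources block from the sketch above, populating could look like this (the layout and file names are illustrative; StandardCharsets and Arrays need to be imported):

    Path incoming = fs.getPath("/data/incoming");
    Files.createDirectories(incoming);
    Files.write(incoming.resolve("customers.csv"),
            Arrays.asList("id;name", "1;Picard"), StandardCharsets.UTF_8);
    // hand fs.getPath("/data") to the component under test and assert on its behaviour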

There is one single trap: according to the Jimfs repository and the Java documentation, a Path implementation does not necessarily have to provide a working toFile() method – so any code which ends up consuming a Path hosted by Jimfs should create an InputStream from the Path instead, which comes almost for free if the filesystem-centric code relies on Java 8’s Files.walk() functionality.
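For example (a sketch, where path is any java.nio.file.Path and /data refers to the test data created above):

    // works for any Path, including Jimfs-backed ones where path.toFile() would fail
    try (InputStream in = Files.newInputStream(path)) {
        // parse the stream ...
    }

    // Java 8: traverse the tree without ever leaving the Path/Stream world
    try (Stream<Path> files = Files.walk(fs.getPath("/data"))) {
        files.filter(Files::isRegularFile).forEach(System.out::println);
    }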

Memory filesystems in tests may be a bad smell

In Java 8, there are a lot of ways to abstract filesystem-driven components from the actual storage. When the testing strategy requires the use of an in-memory filesystem, it might be a sign that a component is coupled to the filesystem too tightly.

My usual way to abstract the filesystem from the rest of my code is a Supplier<ContractType>, which can easily be mocked or stubbed for the components that depend on it, while the supplier itself, as the last outpost before the filesystem, is the right component to be tested against a virtual filesystem.
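A minimal sketch of that pattern (Catalog and its parse method are hypothetical domain assumptions):

    import java.io.IOException;
    import java.io.InputStream;
    import java.io.UncheckedIOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.function.Supplier;

    public class FilesystemCatalogSupplier implements Supplier<Catalog> {

        private final Path catalogFile;

        public FilesystemCatalogSupplier(Path catalogFile) {
            this.catalogFile = catalogFile;
        }

        @Override
        public Catalog get() {
            try (InputStream in = Files.newInputStream(catalogFile)) {
                return Catalog.parse(in);   // Catalog is a hypothetical domain type
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

Components depending on Supplier<Catalog> get a stub in their tests; only this class needs the Jimfs-backed setup described above.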

So, memory filesystems may be a bad smell used in the wrong place, but really useful when used right.

Norderney

Heligoland Revisited

Word Frequency in 5 Programming Languages (Java, Scala, Go, C++, R)

Java 8

 

Golang

Scala

R

C++

Alexa Skill development and testing with Java

This article is supposed to give a brief overview of Amazon Alexa Skills development with the Alexa Java API. The majority of tutorials on Alexa skills appear to be targeted at node.js developers, so I would like to highlight the Java way and point out some things that I missed in the official trainings and some things that I would have solved differently.

For demonstration purposes, I wrote a simple fun application which finds crew members on the starship Enterprise. A user could ask Alexa questions like “Where is Captain Picard” or “Ask Enterprise where Captain Picard is” – so this application makes no sense at all, but it demonstrates everything a developer has to know to implement their own basic skills.

The Speechlet interface

Providing an Alexa-enabled application requires the developer to provide an implementation of the Speechlet interface, which is:
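(As defined in the 1.x Java SDK; check the exact signatures against the SDK version in use.)

    public interface Speechlet {

        void onSessionStarted(SessionStartedRequest request, Session session) throws SpeechletException;

        SpeechletResponse onLaunch(LaunchRequest request, Session session) throws SpeechletException;

        SpeechletResponse onIntent(IntentRequest request, Session session) throws SpeechletException;

        void onSessionEnded(SessionEndedRequest request, Session session) throws SpeechletException;
    }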

The functions are quite straightforward – the session-related functions handle init and cleanup work for instantiating or terminating a session, which in the Alexa domain is the lifetime of a conversation with the user. onIntent gets invoked on any voice interaction the Alexa backend is able to map to an intent based on the predefined utterances schema.

Let’s take our nonsense Enterprise crew resolver:
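A sketch of the onIntent part might look like this (intent name, slot name and the injected crewResolver are illustrative):

    @Override
    public SpeechletResponse onIntent(IntentRequest request, Session session) throws SpeechletException {
        Intent intent = request.getIntent();
        if ("FindCrewMemberIntent".equals(intent.getName())) {
            String crewMember = intent.getSlot("crewMember").getValue();
            String location = crewResolver.locate(crewMember);   // injected resolver, mocked in tests

            PlainTextOutputSpeech speech = new PlainTextOutputSpeech();
            speech.setText(crewMember + " is currently on the " + location + ".");
            return SpeechletResponse.newTellResponse(speech);     // no card, see below
        }
        throw new SpeechletException("Unknown intent: " + intent.getName());
    }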

We deliberately do not deliver cards to the customer’s companion app; if a card is required, there is another overload of the newTellResponse method. As we do not have access to the board computer, and there is no Amazon Alexa service for spaceships or even a region outside the earth’s atmosphere yet, we inject a mock resolver for testing purposes.

Mocking an Alexa call for JUnit

Testing the data providers behind the Alexa API works just like everyday testing, but mocking an Alexa request does not appear to be part of the primary feature set of the API, which means that we have to mock the request completely before passing it to our handler.

Fortunately, Amazon used a library related to immutables.net for their API, so it is possible to handcraft an IntentRequest which closely resembles an actual search request for Captain Picard, as follows:
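(A sketch of such a handcrafted request; the builder and method names follow the 1.x SDK and may differ slightly between versions, and all ids and values are illustrative.)

    Slot crewMember = Slot.builder()
            .withName("crewMember")
            .withValue("Picard")
            .build();

    Intent intent = Intent.builder()
            .withName("FindCrewMemberIntent")
            .withSlots(Collections.singletonMap("crewMember", crewMember))
            .build();

    IntentRequest request = IntentRequest.builder()
            .withRequestId("test-request-1")
            .withTimestamp(new Date())
            .withIntent(intent)
            .build();

    Session session = Session.builder()
            .withSessionId("test-session-1")
            .build();

    // speechlet is the implementation under test
    SpeechletResponse response = speechlet.onIntent(request, session);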

What I would like to be changed in the Java API

Builders all the way

Builders are good. Please use them on the response types as well. For example, the code below feels very, very 90s:
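(A sketch in the spirit of the original example; the text values are illustrative.)

    PlainTextOutputSpeech speech = new PlainTextOutputSpeech();
    speech.setText("Captain Picard is on the bridge.");

    PlainTextOutputSpeech repromptSpeech = new PlainTextOutputSpeech();
    repromptSpeech.setText("Whom are you looking for?");
    Reprompt reprompt = new Reprompt();
    reprompt.setOutputSpeech(repromptSpeech);

    SpeechletResponse response = SpeechletResponse.newAskResponse(speech, reprompt);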

Way too much ceremony. What I would like to have written, without providing my own facades, is:
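Something along these lines – purely hypothetical, this builder API does not exist in the SDK:

    SpeechletResponse response = SpeechletResponse.builder()
            .ask()
            .withSsml("<speak>Whom are you looking for?</speak>")
            .withReprompt("For example, say: where is Captain Picard?")
            .build();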

Easy testing

Mocking a request for semi end-to-end testing like in the example above works, but it is not really comfortable. I would appreciate a function which exports the request to a JSON file, together with a corresponding input function. This would make it easy to mock the request without using the builders directly.

Besides, once a speech has been created, it is not possible to extract the speech text out of it without applying dirty reflection to break the private property barrier. Why not just provide a getSsml() member to make e2e testers happy?

Plain text or SSML?

Honestly, I do not want to use the plain text response at all. SSML is a superset of plain text and allows more detailed control over the way Alexa’s text-to-speech works, for instance when it is a requirement to spell out a word instead of speaking it. So, why not just use SSML all the way and improve the speech renderer so it does not crash if no <speak></speak> tags are present?
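For example, spelling out the registry of the ship would look like this in SSML (a standard say-as construct):

    <speak>
        The registry of the Enterprise is
        <say-as interpret-as="spell-out">NCC1701</say-as>.
    </speak>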

Performance Testing with Apache jMeter

When designing and implementing distributed systems, whether customer-facing or pure data-crunching farms, one is soon required to determine performance impacts and possible bottlenecks.

In this specific case, I wanted to gain information regarding the limits and scalability aspects of a customer-facing web application which also manages a high number of connected devices.

Why Apache jMeter

The performance testing case I was working on made me opt for jMeter in the end, for the following reasons:

  • Developed in Java, supporting plugins in Java or Beanshell. It is unlikely to have a metering case which cannot be met with Java. In this project, Java became a killer feature, as most modules were implemented in Java, so it was possible to integrate jMeter into the given scenario without writing glue code.
  • Distributed by design. It is unlikely for a single machine to stress the system under test (SUT) enough to gain any usable information. Testing and stressing a distributed system requires controlled, distributed load generators.
  • Easy to get load generators on demand. jMeter is used by many engineers, and there are a lot of services that accept jMeter recipes and play them on their machines for cash.
  • jMeter is able to take JUnit tests and their performance metrics into account, too. This makes it possible to share tests between jMeter and the main testbench.
  • jMeter brings a sophisticated GUI for test plan generation and debugging and also supports headless operation.
  • Very flexible to configure.
  • It is easy to package a jMeter node inside a Docker image, so it can also run on a cloud computing provider which allows the execution of Docker containers.
  • Many built-in graph and reporting components; data export to 3rd-party analysis tools is available.
  • Open source, an Apache project.
  • Finally: a wise man once said to me: “Long live the standards!”. jMeter can be considered a de-facto standard Swiss army knife for performance testing.

Functions and Vocabulary

jMeter uses a set of functional units for performing tests. After learning the vocabulary, the system is quite straightforward. The table below gives an overview of the jMeter modules and their purpose.

Aspect                                                         | jMeter Components
Control structures, grouping and behavior                      | Threads, Logic Controllers
Controlling iteration speeds and timing                        | Timers -> Constant Timer, Random Timer, Constant Throughput Timer, …
Storing configuration and state data                           | Variables
Creating load and performing actions on the system under test  | Samplers -> HTTP, WebSocket, FTP, Beanshell, Java Code, JDBC, WS, SMTP, …
Altering or extracting sampled data before sampler execution   | Pre-Processors -> Regex, XPath, JSON Path, …
Altering or extracting sampled data after sampler execution    | Post-Processors -> Regex, XPath, JSON Path, …
Verifying and asserting sampled data                           | Assertions -> Regex, Compare, XPath, …
Obtaining information and reporting                            | Listeners -> Graph, File, Table, …

For more details on the elements of a test plan, jMeter provides concise documentation.

Designing a Test Plan

jMeter manages its test plan in a tree structure, which suits the XML data format used on the filesystem. So, the whole test plan is a structure of ordered nodes, each with zero or more children. For example, parallel execution of a certain set of nodes would be represented by a parent node with the functionality of a thread group; the same applies to loop controllers or conditional controllers.

As an example, a login on a standard user/password form would be represented in jMeter as follows:
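A sketch of the resulting tree (the component names are real jMeter elements; the numbers and paths are illustrative):

    Test Plan
    └── Thread Group (50 users, ramp-up 60 s, 10 iterations)
        ├── HTTP Request – GET /login        (fetch the form)
        ├── HTTP Request – POST /login       (parameters: username, password)
        │   └── Response Assertion – response contains "Welcome"
        └── Aggregate Report                 (listener)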

Analysis and Reporting

After running the test plan and listening for the metrics delivered by the samplers, jMeter compiles a set of prebuilt reports which gather a lot of information – in most cases, all the information required to derive the next actions. For instance, it is possible to graph the response times, the error ratio and the response times in relation to the number of parallel accesses.
It is also possible to export the data into CSV/XML files or use the generated report files for further analysis. An interesting approach is to pass the data into R and use R’s graphing and reporting tools.

Automation

Even though jMeter brings a really impressive GUI, it can be fully operated from the command line. So, it is no problem to script it and, for example, integrate it into a CI/CD pipeline and let a build fail if it does not meet the performance expectations.
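A typical headless invocation looks like this (file names are illustrative):

    jmeter -n -t loadtest.jmx -l results.jtl -e -o report/

-n runs non-GUI, -t selects the test plan, -l writes the raw results, and -e/-o generate the HTML report dashboard into the given folder.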

Distribution

In a distributed jMeter installation, workers are called “servers” and masters “clients”. Both are connected via old-fashioned Java RMI, so, after setting up an appropriate communication foundation between servers and client(s), triggering a load/performance testing job on the master suffices to start the slaves and collect their metrics.

Test plan creation

The jMeter files (.jmx) are pure XML, so it is theoretically possible to write them manually or, more probably, to generate them programmatically. In most cases, one would use the GUI to click a test together and customize it with config files, environment variables or some small scripting of one’s own, depending on the system under test.

Plugin development

If jMeter does not deliver a functionality out of the box, it is possible to add it via scripting or plugins. This means any required implementation of a Sampler, Pre-/Post-Processor, Assertion or Listener can be divided into three classes:

Class A: Available out of the box
Class B: Possible by jMeter/Beanshell scripting
Class C: Only possible by developing an own plugin

Developing a Java Sampler

A test case classified as C needs to be implemented as a plugin. Basically, every aspect of jMeter can be delegated to a Java plugin, so it would also be possible to use a Java class to implement a custom assertion. Nevertheless, I think the most common case is implementing a custom sampler to wrap a test around a functionality which either is not available through a public API or has asynchronicity/concurrency requirements jMeter itself cannot meet.

An easy way of implementing a custom sampler is fetching the dependencies via Maven and providing an implementation of the jMeter sampler API.

A very minimal Maven POM just includes a dependency on the ApacheJMeter_java artifact; in a real use case one might want to add the maven-assembly plugin to create a fat bundle including all further dependencies, so a downloadable package can be built on a build server.
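The core of such a POM is a single dependency (the version is illustrative; the provided scope reflects that jMeter supplies the classes at runtime):

    <dependency>
        <groupId>org.apache.jmeter</groupId>
        <artifactId>ApacheJMeter_java</artifactId>
        <version>5.6.3</version>
        <scope>provided</scope>
    </dependency>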

 

The main class of the Java Request sampler needs to extend the AbstractJavaSamplerClient class and provide a tree of one or more SampleResults:
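A minimal sketch of such a sampler (the parameter names and the sampled action are illustrative):

    import org.apache.jmeter.config.Arguments;
    import org.apache.jmeter.protocol.java.sampler.AbstractJavaSamplerClient;
    import org.apache.jmeter.protocol.java.sampler.JavaSamplerContext;
    import org.apache.jmeter.samplers.SampleResult;

    public class CrewResolverSampler extends AbstractJavaSamplerClient {

        @Override
        public Arguments getDefaultParameters() {
            // shown as editable parameters in the jMeter GUI
            Arguments args = new Arguments();
            args.addArgument("endpoint", "http://localhost:8080");
            return args;
        }

        @Override
        public SampleResult runTest(JavaSamplerContext context) {
            SampleResult result = new SampleResult();
            result.sampleStart();                         // start timing
            try {
                // call the system under test here, e.g. via its Java API
                result.setSuccessful(true);
                result.setResponseMessage("OK");
            } catch (Exception e) {
                result.setSuccessful(false);
                result.setResponseMessage(e.getMessage());
            } finally {
                result.sampleEnd();                       // stop timing
            }
            return result;
        }
    }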

After deploying the build artifact to $JMETER/lib/ext and restarting jMeter, the sampler is available and can be integrated into a test plan using the GUI.

Dockerizing jMeter

An easy way to deploy a distributed jMeter installation is providing a Docker set which consists of a master and n slaves. In this setup, it is advisable to create a base image and derive both the master and the server image from it. If the setup is supposed to run in distributed environments which do not necessarily provide a registry, one can use a lightweight discovery mechanism of one’s own, such as a list in Redis which is populated by the slaves as soon as they start up.

The base image:

 

The master:

The server:

Performance Testing (or: Turrican II vs Github)

This is an introductory post in a series of articles on application performance.

In the not-so-old times of computing, dealing with performance was, in most cases, trivial in terms of the primary approach. A developer was required to optimize an application for acceptable or competitive performance on the given hardware, or on a small local network, for one or a small number of users. This era gave us a lot of digital art that pushed popular home computers to limits nobody believed possible with such constrained hardware power.

Regardless of the glory, the scenario above is unlikely to apply to today’s mainstream application scenarios and could even be considered less challenging than today’s hardware-abstracted systems. Nowadays, hardcore hardware programming is limited to embedded or other special fields, and performance tweaking primarily focuses on getting the most out of limited hardware, building a solid foundation for distributed systems. In almost no case, however, are performance observations limited to one single machine. Google or GitHub run on large arrays of machines and services, with a lot of parameters and their permutations to take into account – we are talking about distributed systems.

Looking back at the 90s, there was also another concept of scaling – one additional player of Turrican II meant buying another C64 or Amiga and a new copy of the game. The relation between users and “servers” was literally 1:1, which stands in direct opposition to the “cloud” and virtualization virtue we are pursuing today.

Another relevant parameter to take into consideration is agile development. Back in 1992, every piece of software was built as a single artifact and shipped on one floppy disc which remained unchanged throughout the complete shipment (not counting various unofficial “forks” which were much more liberal in terms of copy protection) and the whole lifespan of the product – which, thanks to emulators and lovers of computer game history, still continues. Again, this bears little relation to today’s work, which is built as multiple small artifacts, distributed across multiple machines, and likely to change and possibly evolve many times a day.

So, comparing a computer gaming masterpiece of the 90s with today’s distributed high-scale applications reveals that both of them are still masterpieces, but in very different ways.

Performance matters

As described above, today’s business is seldom centered around developing games for hardware replicated a million times. Consumer systems are heterogeneous, and so are servers and applications. This raises the following questions in terms of performance:

  • Does the update we just promoted to the beta/acceptance/productive system impact the overall performance in any way?
  • Does the new version cause side-effects on other components of the system?
  • I know Amdahl and Gustafson take no prisoners, but how far can I scale with my current architecture? Can I meet my peak performance goals by showing my credit card to the hoster of my choice, or is my architecture likely to get stuck in the tar at a certain point?
  • Will I be able to react to load changes, such as a Black Friday sale or my hobby project being mentioned on a major news site?
  • Can I fulfill a business case on final approach or have I got to take additional measures to deliver a working system?
  • Anyway, what is performance? Which metric determines whether my application is acceptable and delivers value or not?

In further articles of the performance testing series, I will deal with various aspects of performance and related tools.

Copenhagen

Turning a spare Dreamplug mini-PC into a remote-controllable audio player

My company gave away a couple of abandoned spare devices previously used for evaluation purposes. So, I got my hands on a Dreamplug, a wall-plug sized mini computer (ARM-based) with decent power, low energy consumption, integrated dual Ethernet, WiFi, Bluetooth and, most importantly, an optical audio output – in other words: a Raspberry Pi with a housing and the I/O I was missing. The digital output was the primary reason for using it as an add-on to my stereo.

My stereo was lacking the capability to play mp3/mp4/aac/lossless files without hassle, meaning playing music without having to perform any manual action after importing a newly bought CD. I was not satisfied with my previous solution, which consisted of a TerraTec NOXON2audio with an attached USB pen drive, as unfortunately it never ran perfectly stably with USB drives larger than 64 GB. When we developed the device at TerraTec, even 8 GB USB drives were beyond “enough for everybody / anybody wants to pay for it” capacity.

My other option was using an empty C.A.R. 4000 housing from the 2000s, fitting a Raspberry Pi and a USB DAC (or a Pi audio board) inside, hooking up the buttons and the display to the GPIOs and building a beautiful HiFi-sized player. Even though this approach was very tempting, a couple of circumstances made the expected bill of materials too high to take the retrofit project into consideration. First, the 3″ display: the original CAR4000 uses a parallel Powertip display, which takes a lot of GPIO pins and would have required me to develop a new driver from scratch. New 3″ displays are too expensive or take up too many GPIOs as well, so I would have had to use shift registers for connecting the control buttons, which are originally designed to be connected to the controller via an ADC – the Raspberry Pi does not have an ADC, so I would have had to add one, modify the board or use fancy timing code. None of these was a satisfying option, so I suspended the project.

Simply replacing any of the gear in my living room was the least acceptable option. My old-school AVR, built in France in 1999, may not have any resale value anymore, but it performs so well that it simply would not make sense to replace it.

So, using the Dreamplug as a hidden add-on was by far the most viable and promising option in the situation described above.

Use case

The amplifier offers both switched and unswitched power outlets on the rear. I plugged the Dreamplug into the unswitched one and connected the digital output to the digital input of the amp; the 3.5 mm jack was to go to my multiroom speaker system. This makes the entire solution feel like an upgrade to the unit rather than an additional device, and I came to the opinion that reducing the number of units in my stereo is not a bad thing.

Software stack

Originally, I wanted to build a new HiFi-sized unit with a display, buttons, incremental encoders and more fancy things. As I have a lot of old tablet computers and mobile phones, it became more attractive to relocate the user interface to an external unit.

Regarding the use case, Music Player Daemon (MPD) was the fitting solution: it offers a network interface for remote control, a media library, and various input and output plugins. Unlike clones of platforms such as Google Music – for example Ampache or Subsonic – MPD’s primary concern is using the audio output facilities of the machine it is installed on. This results in the following software stack on the device:

  • Debian Jessie Linux
  • Libertas WiFi drivers for the Marvell WiFi chipset
  • Pulseaudio
  • MPD
  • Alsatools for mixer controls
  • sshd for maintenance and file transfer
  • rsync for music library synchronization
  • udev for automounting
  • Golang runtime for a web service that switches the 3.5mm jack on and off to control multiroom

The Dreamplug

Unlike the Raspberry Pi, the unit does not provide an HDMI output; the only way to get a console is via a UART port. The device has a 4-pin mini connector with the following pinout:

1     GND
2     RX
3     TX
4     3.3V

I used a fan connector from an old VGA card, connected it to a 5V-to-3.3V serial level converter and used a terminal with the settings 115200/8N1. From there, it was possible to interact with the U-Boot bootloader and write a Debian Jessie image to the flash.

Music Player Daemon (MPD)

Just a few changes to the configuration are necessary:
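An excerpt of what this might look like in /etc/mpd.conf (paths, names and the password are illustrative):

    music_directory     "/media/usb/music"
    playlist_directory  "/var/lib/mpd/playlists"

    # allow remote clients such as M.A.L.P to connect
    bind_to_address     "0.0.0.0"
    password            "secret@read,add,control,admin"

    # digital out to the amplifier via PulseAudio
    audio_output {
        type  "pulse"
        name  "Digital out"
    }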

To mount the pen drive automatically, udev comes into play:
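A rule along these lines can do the job (the match values are illustrative; on systemd-based setups the actual mount is often better delegated to a unit or fstab entry triggered by the rule):

    # /etc/udev/rules.d/99-usb-music.rules
    ACTION=="add", SUBSYSTEM=="block", KERNEL=="sd?1", ENV{ID_FS_LABEL}=="MUSIC", RUN+="/bin/mount -o ro /dev/%k /media/usb/music"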

Mobile Phone control

Among many options, I chose M.A.L.P (for Android) because of its feature set, the decent UI and the fact that it is open source and free of ads. M.A.L.P requires an initial configuration consisting of the server address and the password. Afterwards, an old spare tablet computer became the control unit for the media player.

Multiroom

My home has ceiling-mounted speakers in all rooms, with a central terminal to which I can connect an external audio source. My amplifier offers Zone 2 functionality; unfortunately it does not work when the main speakers are connected via bi-wiring/bi-amping, which is the case in my setup. So, I needed another solution. In the case of the Dreamplug, two facts made the decision very easy:

  • The analogue 3.5mm jack is capable of driving passive speakers with an acceptable performance
  • The analogue output uses a separate Pulseaudio mixer control, so I can use the digital out to the amp and the analogue multiroom out independently

The only challenge left was turning the analogue output on and off. Simply pulling the cable was not acceptable. So I decided to write a slim Golang webservice providing an endpoint for setting the analogue output volume, and an Angular2 app to provide a simple user interface.

The golang webservice
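A minimal sketch of what such a service could look like (the sink name, port and URL paths are illustrative; the original service sets the output volume, here reduced to muting and unmuting):

    package main

    import (
        "log"
        "net/http"
        "os/exec"
    )

    // sink name is illustrative – `pactl list short sinks` shows the real one
    const analogueSink = "alsa_output.analog-stereo"

    func setMultiroom(on bool) error {
        mute := "1"
        if on {
            mute = "0"
        }
        // un-/mute the analogue output via PulseAudio
        return exec.Command("pactl", "set-sink-mute", analogueSink, mute).Run()
    }

    func handler(on bool) http.HandlerFunc {
        return func(w http.ResponseWriter, r *http.Request) {
            if err := setMultiroom(on); err != nil {
                http.Error(w, err.Error(), http.StatusInternalServerError)
                return
            }
            w.WriteHeader(http.StatusNoContent)
        }
    }

    func main() {
        http.HandleFunc("/api/v1/multiroom/on", handler(true))
        http.HandleFunc("/api/v1/multiroom/off", handler(false))
        log.Fatal(http.ListenAndServe(":8080", nil))
    }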

Thanks to systemd, the inclusion of the service into the system start process is easy:
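A unit file along these lines would do (the binary path and names are illustrative):

    # /etc/systemd/system/multiroom.service
    [Unit]
    Description=Multiroom switch web service
    After=network.target sound.target

    [Service]
    ExecStart=/usr/local/bin/multiroomd
    Restart=on-failure

    [Install]
    WantedBy=multi-user.target

Enable it once with systemctl enable multiroom.service and it starts on every boot.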

The Angular2 application

The Angular2 application just provides a frontend for the webservice, which consists of two simple on/off buttons.

 

 

 

After installing the Angular2 application, serving it with Nginx and proxy_passing the /api/v1 URL segments to the Golang service, the device presented a welcome page with the option to turn multiroom on and off.

 

Podcasting with MPD

Although MPD can achieve much, it has no native support for podcasts, but this functionality can easily be added by taking advantage of the playlist feature. Here, I created a simple Golang task that retrieves a podcast (using the awesome Podfeed Go library) and converts it into an m3u playlist. This task is triggered by cron.hourly and its output piped to a file named after the podcast title inside the MPD playlist folder.
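As a rough illustration of the idea – using only the standard library instead of Podfeed – fetching a feed and printing an m3u playlist could look like this (the feed URL is illustrative):

    package main

    import (
        "encoding/xml"
        "fmt"
        "log"
        "net/http"
    )

    type rss struct {
        Channel struct {
            Title string `xml:"title"`
            Items []struct {
                Title     string `xml:"title"`
                Enclosure struct {
                    URL string `xml:"url,attr"`
                } `xml:"enclosure"`
            } `xml:"item"`
        } `xml:"channel"`
    }

    func main() {
        resp, err := http.Get("https://example.org/podcast/feed.xml")
        if err != nil {
            log.Fatal(err)
        }
        defer resp.Body.Close()

        var feed rss
        if err := xml.NewDecoder(resp.Body).Decode(&feed); err != nil {
            log.Fatal(err)
        }

        // m3u output, redirected into the MPD playlist folder by the cron job
        fmt.Println("#EXTM3U")
        for _, item := range feed.Channel.Items {
            fmt.Printf("#EXTINF:-1,%s\n%s\n", item.Title, item.Enclosure.URL)
        }
    }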