Finnish Forest Center Open Data Services

The robustness and quality of spatial web services have never been more crucial, and Finnish Forest Center (Metsäkeskus) understands this well. Metsäkeskus published open data services in 2018 using Geoserver. Before releasing the new services, they wanted to ensure the services would be reliable and fast, so that users could make the most of the data. Metsäkeskus partnered with us to verify performance and reliability. We have extensive experience in capacity testing of spatial web services, and we offer a cloud-based tool as well as expert consultancy to help customers in cases such as this.

Preparing realistic test scenarios

Testing service capacity and reliability requires realistic estimates of usage: how many users there will be, and how many and what kind of requests they are expected to make. In the case of Metsäkeskus’ open data services, the service was brand new and there was little baseline to compare against. Spatineo and Metsäkeskus made the estimates together, based on usage data from other services analysed in Spatineo Monitor as well as traffic analysis of Metsäkeskus’ website and existing applications.

Once we had an estimate of normal usage, we also took into account the possibility that news of the published data might create some buzz in online media. It is critical for a new service to operate reliably right from the start; otherwise, initial disappointment may damage how reliable people perceive the service to be. These estimates were made using news website readership statistics, with usage estimated both for news articles with embedded maps and for articles linking to Metsäkeskus’ own web map.
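The kind of arithmetic behind such an estimate can be sketched as follows. This is only an illustration: the function name, the parameters and every number below are hypothetical placeholders, not the figures used in the Metsäkeskus project.

```python
def peak_requests_per_second(readers, click_through, requests_per_view, window_seconds):
    """Estimate the average request rate during a news-driven traffic spike.

    readers           -- people expected to open the article during the window
    click_through     -- fraction of readers who actually load a map (0..1)
    requests_per_view -- map requests (for example tiles) issued per map view
    window_seconds    -- length of the spike window in seconds
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    if not 0 <= click_through <= 1:
        raise ValueError("click_through must be between 0 and 1")
    return readers * click_through * requests_per_view / window_seconds


# Hypothetical placeholder inputs, chosen only to show the calculation.
# Embedded map: every reader of the article loads the map.
embedded = peak_requests_per_second(100_000, 1.0, 12, 3600)
# Linked map: only a small share of readers follow the link to the web map.
linked = peak_requests_per_second(100_000, 0.05, 40, 3600)

print(f"embedded map: {embedded:.1f} requests/s")
print(f"linked map:   {linked:.1f} requests/s")
```

Keeping the two scenarios separate makes it easy to see how strongly an embedded map, which every reader loads, can dominate the load compared with a plain link.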

These scenarios were then translated into tests in Spatineo Performance. Once the estimates were ready, creating the actual test plans was a matter of minutes: selecting the correct services and layers to test, and assigning the amount of load applied to each part of the service.

This video shows how to set up a test in Spatineo Performance. First you choose what kind of service you want to test, then the service. After choosing a test profile as a template, you can select which layer to test. Once you give a name for the test and save, you can either start the test immediately or do further configuration.

For a more complete presentation on how Spatineo Performance works, you can watch this video.

Running the tests and analysing the results

The services were tested with Spatineo’s cloud-based application, Spatineo Performance. During the tests, our experts and Metsäkeskus’ personnel analysed the behaviour of the services and servers to identify bottlenecks affecting the reliability and performance of the service. To ensure bottlenecks could be identified quickly, Metsäkeskus had remote access to the servers running Geoserver to monitor their resource usage. Performance testing was conducted in two rounds, followed by an additional final test to confirm that the changes made based on the analysis of the results had the expected effect.

During the first two rounds of testing, some possible optimisations in the server configuration were identified and tested. Java virtual machine parameters can make a big difference to Geoserver performance, and we were able to improve the reliability of Geoserver by tweaking them: we set up the virtual machine to reserve all of its allocated memory immediately (the same amount for both -Xms and -Xmx), which ensures that Geoserver has all the allocated memory available when it needs it. It is, however, important to test each parameter change separately to ensure that it has the desired effect.
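As a minimal sketch of the heap setting described above, the helper below builds a matching pair of -Xms and -Xmx flags. The helper name and the 4096 MB value are hypothetical placeholders, and how the resulting string is passed to the JVM (for example through an environment variable such as JAVA_OPTS) depends on the servlet container and is an assumption here, not a detail from the project.

```python
def jvm_heap_options(heap_mb):
    """Return JVM flags that set the initial and maximum heap to the same size."""
    if heap_mb <= 0:
        raise ValueError("heap_mb must be positive")
    # -Xms is the initial heap size, -Xmx the maximum heap size.
    # Using the same value for both means the heap never has to grow at runtime.
    return [f"-Xms{heap_mb}m", f"-Xmx{heap_mb}m"]


# 4096 MB is a placeholder; the right value depends on the server's memory.
java_opts = " ".join(jvm_heap_options(4096))
print(java_opts)
```

Generating both flags from a single value removes the risk of the two settings drifting apart when the heap size is changed later.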

Bottlenecks where you least expect them

The main observation during these tests was a bit of a surprise: in the first round of tests, the servers were unable to handle the number of requests in the planned test, yet neither the server CPU load nor memory usage was reaching critical levels. No exceptional resource usage was detected on the database server either. Something was limiting the performance, but what?

After analysing access logs produced by Geoserver during the tests, we realized that not all of the requests from Spatineo Performance were actually reaching the servers. Furthermore, the response times in the log files did not match the response times measured by the testing tool.
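A comparison of this kind can be sketched as below: count the test requests that appear in the access log and compare that with the number the testing tool sent. The log format (a Common Log Format style line), the sample lines, the path prefix and the request counts are all made-up assumptions for illustration, not actual logs from the project.

```python
import re

# Assumes each log line contains a quoted request such as
# "GET /path HTTP/1.1" followed by a three-digit status code.
LOG_LINE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3})')


def count_logged_requests(lines, path_prefix):
    """Count log lines whose request path starts with path_prefix."""
    count = 0
    for line in lines:
        match = LOG_LINE.search(line)
        if match and match.group("path").startswith(path_prefix):
            count += 1
    return count


def missing_requests(sent, logged):
    """Number of requests the client sent that never show up in the server log."""
    return max(sent - logged, 0)


# Made-up sample lines, not real log data.
sample_log = [
    '192.0.2.1 - - [01/Jan/2018:12:00:00 +0200] "GET /geoserver/wms?SERVICE=WMS HTTP/1.1" 200 5120',
    '192.0.2.1 - - [01/Jan/2018:12:00:01 +0200] "GET /geoserver/wms?SERVICE=WMS HTTP/1.1" 200 4096',
    '192.0.2.2 - - [01/Jan/2018:12:00:01 +0200] "GET /favicon.ico HTTP/1.1" 404 0',
]

logged = count_logged_requests(sample_log, "/geoserver/")
lost = missing_requests(sent=5, logged=logged)
print(f"logged: {logged}, never reached the server: {lost}")
```

A non-zero difference points to something between the client and the server dropping traffic, which is what directs attention away from the server itself.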

Our suspicions turned towards the network capacity of the server. We tested this theory by placing a large file on the same server as Geoserver and measuring download speeds when accessing it via the public internet and from a neighbouring server. This confirmed that the network was drastically limiting the bandwidth between the server and the public network. The limit had no discernible effect on performance when there were only a few requests to the server at a given time, but once the bandwidth required by the requests was high enough, the network throttled the requests and stopped some of them from reaching the server altogether.
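A download test like this can be approximated with a short script run once from the public internet and once from a neighbouring server. This is a sketch using only the Python standard library, not the tooling used in the project; the function names are hypothetical and the URL of the test file has to be supplied by the reader.

```python
import time
import urllib.request


def throughput_mbit_per_s(num_bytes, seconds):
    """Convert a byte count and a duration into megabits per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return num_bytes * 8 / seconds / 1_000_000


def measure_download(url, chunk_size=64 * 1024):
    """Download url in chunks, discard the data, and return throughput in Mbit/s."""
    start = time.monotonic()
    total = 0
    with urllib.request.urlopen(url) as response:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.monotonic() - start
    return throughput_mbit_per_s(total, elapsed)
```

Calling `measure_download` with the same file URL from both locations gives two figures to compare. A large gap between them suggests the limit sits in the network path to the public internet rather than in the server.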

Metsäkeskus contacted their service provider to fix the bandwidth issue. Once the bandwidth limit had been raised, we ran the final tests, which confirmed that the issue had been resolved. We could then also re-run the original tests to measure whether the servers could handle the estimated levels of usage.

Reliable services lead to trust

Making sure data delivery works smoothly ensures the data has the most impact. Users get the data and answers they want faster, and developers are more likely to build applications using the data when the services can be relied upon. Successes in data delivery also build the image of the organisation as a reliable partner. This builds trust between data providers and users, and in turn contributes to the growing data ecosystem that is crucial for the development of new applications for the forest sector in Finland.

Since its publication, the open forest data WMS service has been receiving 1-2 million hits each month, and its uptime was over 99% for the second half of 2018.