The OP5 team recently had the pleasure of attending this year’s OpenStack Summit in Boston. The OS Summit is truly a community event for all major vendors and users of the OpenStack platform and its supporting projects, where the overall objective is to share experiences, promote ideas and educate one another. At OP5 we have seen an increasing number of customers and vendors discussing, installing and pushing software-defined infrastructure, and more specifically OpenStack.
We thought it would be useful to share some key insights from the event, as seen by one of our founders and CTO, Jan Josephson.
My initial thoughts from the OS Summit concerned the overall growth in maturity of OpenStack and its core components, with many discussions and sessions focused on actual services, applications and orchestration. Over the four days I attended, there was a real sense that the discussions and sessions were split roughly 50:50 between real production implementations and pre-production/lab deployments.
This was interesting for multiple reasons, but for OP5 it was of particular importance as the need for metrics, events and status data is usually different when monitoring labs as opposed to production installations, with a key shift in what information needs to be produced and communicated to various users and stakeholders.
More of Kubernetes in the near future
Kubernetes is an open-source orchestration project for automating the deployment, scaling and management of containers such as Docker. As the microservice architecture gains ground, it automatically means more containers/pods are being deployed, with more direct or indirect dependencies that need to be managed. The most common use cases I encountered were around handling the potentially challenging workload of in-production upgrades when you have a hybrid containerized production environment spanning multiple clouds, such as in-house, Amazon and Google.
Even though it is still very early days, it was interesting to see some working production examples of this, and I am convinced we will see a lot more of Kubernetes in the next 1-2 years.
Speaking of automation and configuration, Ansible has clearly done a great job in the community: most discussions, demos and overall work in this area centered on Ansible. The more traditional Puppet and Chef were less present than they have been, which is confirmed by the OpenStack yearly user survey below.
The Nagios ecosystem is a very important and visible part of OpenStack. As software-defined infrastructure grows into both Enterprise and Telco, a huge challenge is that much of what needs to be collected in terms of events, status and metrics still comes from the existing infrastructure base. Many services in operation today, or being deployed in the future, will depend on both the existing and the software-defined infrastructure.
It’s clear that many of the in-production cases I witnessed at OS 2017 were using Nagios scripts and/or checks at the endpoints to grab the needed data. This was visible in the case of “The smart city of Messina”, where they wrote a script that took the output from a standard Nagios check and sent it to a collectd agent, which then fed it up to the OpenStack / Monasca core. This is a great example, as it shows how a sysadmin who has been around for some time solves the challenge in an efficient way.
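To make that pattern concrete, here is a minimal sketch of what such a glue script might look like. It is not the Messina team’s code. It assumes the check prints standard Nagios performance data after a `|` on its first line, and that the data is handed to collectd through the Exec plugin’s plain-text `PUTVAL` lines. The function names, the `node1` host and the `nagios_load` plugin name are hypothetical.

```python
import re
import shlex

# Leading numeric part of a perfdata value, e.g. "0.42" in "0.42;5.0;10.0;0;"
VALUE_RE = re.compile(r"^(-?\d+(?:\.\d+)?)")


def parse_perfdata(check_output):
    """Return {label: float} from the perfdata on the first line of a check's output."""
    first_line = check_output.splitlines()[0] if check_output.strip() else ""
    if "|" not in first_line:
        return {}
    try:
        # shlex handles quoted labels such as 'disk /'=5MB;1;2
        tokens = shlex.split(first_line.split("|", 1)[1])
    except ValueError:
        return {}  # unbalanced quotes: treat as "no usable perfdata"
    metrics = {}
    for token in tokens:
        label, sep, rest = token.rpartition("=")
        match = VALUE_RE.match(rest)
        if sep and label and match:
            metrics[label] = float(match.group(1))
    return metrics


def to_putval(host, plugin, metrics, interval=60):
    """Format metrics as collectd plain-text PUTVAL lines using the generic 'gauge' type."""
    lines = []
    for label, value in sorted(metrics.items()):
        # Keep the identifier simple: replace anything unusual with '_'
        type_instance = re.sub(r"[^A-Za-z0-9_.]", "_", label)
        lines.append(
            f'PUTVAL "{host}/{plugin}/gauge-{type_instance}" interval={interval} N:{value}'
        )
    return lines


if __name__ == "__main__":
    # Hypothetical output from a load check; a real script would run the plugin instead.
    sample = "OK - load average: 0.42, 0.38|load1=0.42;5.0;10.0;0; load5=0.38;4.0;6.0;0;"
    for line in to_putval("node1", "nagios_load", parse_perfdata(sample)):
        print(line)
```

Run under collectd’s Exec plugin, a script like this would print one `PUTVAL` line per metric to standard output at each interval. Forwarding from collectd to Monasca is a separate configuration step that is not shown here.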
I’m sure there are many discussions on “the right way” to do things, but when you add short timeframes, limited budgets, skillsets etc. to the overall equation, we will see lots of enterprises and customers reusing the Nagios ecosystem in the future.
Network Function Virtualization
Network Function Virtualization tends to be tightly connected to the telecom world and its use of OpenStack. One of the discussions regarding NFV at this year’s OpenStack event concerned the “noisy neighbor” problem: the challenge of having large environments with many instances and core network components being virtualized yet sharing the same hardware. Because resources are dynamically allocated, one virtualized service can consume ‘too much’ of a shared resource at the expense of its neighbors. Detecting what is abnormal, working out what is affected and handling it automatically is not as simple as it seems…
Another discussion concerned subsecond granularity in metric collection. This is a traditional scalability issue when you move from regular 5-minute intervals to 1 minute and then to subsecond intervals. As you get close to subsecond intervals, it translates into some seriously big hardware investment, and so becomes more a financial concern than a technical one. It then becomes important to ask yourself some direct questions: why do we need it, what is the value of this information over time, and where and how should we store it?
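A quick back-of-the-envelope calculation shows why this becomes a budget question. The sketch below is purely illustrative: the one million metrics and the 16 bytes per stored sample are assumptions chosen for the example, not figures from the summit.

```python
SECONDS_PER_DAY = 86_400


def samples_per_day(metric_count, interval_seconds):
    """Number of data points produced per day at a given collection interval."""
    return metric_count * SECONDS_PER_DAY / interval_seconds


def storage_gb_per_day(metric_count, interval_seconds, bytes_per_sample=16):
    """Raw storage per day in GB; bytes_per_sample is an assumed, uncompressed size."""
    return samples_per_day(metric_count, interval_seconds) * bytes_per_sample / 1e9


if __name__ == "__main__":
    metric_count = 1_000_000  # assumed environment size
    for interval in (300, 60, 0.5):  # 5 minutes, 1 minute, half a second
        print(
            f"interval={interval}s  "
            f"samples/day={samples_per_day(metric_count, interval):.3g}  "
            f"raw GB/day={storage_gb_per_day(metric_count, interval):.1f}"
        )
```

Going from a 5-minute interval to half a second multiplies the number of samples by 600, before retention, replication and query load are even considered.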
My final thoughts
Consultants will continue to have a great market in this space. The need for specialized skillsets is high, especially in the beginning, and not only on the configuration/architecture side of things but also on the output side, since we are getting into a world where millions of metrics are being materialized and can be visualized in many formats.
All this will require new skillsets on the receiving end, and that is a challenge I think is shared by many, including the community and potential customers and users in the future. The good news is the continued openness and strength of the OpenStack community, which provides transparency and promotes sharing.
Here’s looking forward to OpenStack Nordic Copenhagen in October. Hopefully we’ll see you there!
All the best