The Difference Between Elasticsearch, Open Distro, and OpenSearch

Here is exactly what OpenSearch is and how it relates to Elasticsearch and Open Distro.


Note: Open Distro is no longer releasing new versions. All development has moved to OpenSearch.

That’s right, Amazon is releasing an open-source fork of Elasticsearch/Kibana. This may be a bit confusing, as they have already been supporting Open Distro since February of 2019. We’ll take a side-by-side look at both to understand what is so different about OpenSearch.

Open Distro

We will start with Open Distro since it is the older of the two. Open Distro is still vanilla Elasticsearch at its core. What Amazon did was add functionality to both Elasticsearch and Kibana. The value came from the following additions:

  • Enhanced Security
    They added several authentication methods such as integrating with SAML, Kerberos, LDAP/AD, and Proxy Auth/SSO.
  • Alerting
    Alerts take a query as a parameter and can then fire based on a threshold or on the output of a custom script. The alert can be sent over SMS, email, or pretty much any other channel you can imagine.
  • K-NN (Nearest Neighbor)
    k-nearest neighbor search lets you quickly run k-NN calculations on billions of documents. This is a common similarity-search technique that has been optimized for speed.
  • Index Management
    This Kibana plugin helps you define index management policies based on the number of documents, index size, or index age. This allows you to manage things like TTL, backups, etc. within Kibana.
  • Performance Management
    When troubleshooting your application, you need something that works even when your cluster doesn’t. The Performance Analyzer does just that by working as an outside agent to give us the 411 on what our environment is doing.
  • and of course, a SQL interface…
    Do we really have to keep doing this? Why does every platform need to have a SQL interface? I digress. It’s in there, for better or worse.
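
To make a few of these additions more concrete, here is a rough sketch of the request bodies you would send for Index Management, k-NN, and SQL. It only builds and prints JSON, so nothing here talks to a cluster. The index names, field name, policy ID, and the 30-day age are all made up for illustration, and the `_opendistro/...` paths are the Open Distro plugin endpoints as I understand them, so check them against the docs for your version.

```python
import json


def ism_policy(min_index_age: str = "30d") -> dict:
    """Index Management: move an index to a 'delete' state once it is old enough."""
    return {
        "policy": {
            "description": "Hypothetical example: delete indices after a fixed age",
            "default_state": "hot",
            "states": [
                {
                    "name": "hot",
                    "actions": [],
                    # Conditions can be based on age, document count, or size
                    # (min_index_age, min_doc_count, min_size).
                    "transitions": [
                        {
                            "state_name": "delete",
                            "conditions": {"min_index_age": min_index_age},
                        }
                    ],
                },
                {"name": "delete", "actions": [{"delete": {}}], "transitions": []},
            ],
        }
    }


def knn_query(field: str, vector: list, k: int) -> dict:
    """k-NN: ask for the k nearest neighbors of `vector` in a knn_vector field."""
    return {"size": k, "query": {"knn": {field: {"vector": vector, "k": k}}}}


def sql_query(statement: str) -> dict:
    """SQL: the plugin takes the statement as a plain string."""
    return {"query": statement}


# (method, path, body) triples; every name in the paths is hypothetical.
REQUESTS = [
    ("PUT", "_opendistro/_ism/policies/delete_after_30d", ism_policy("30d")),
    ("POST", "my-vectors/_search", knn_query("my_vector", [2.0, 3.0, 5.0, 6.0], 2)),
    ("POST", "_opendistro/_sql", sql_query("SELECT firstname FROM accounts LIMIT 10")),
]

if __name__ == "__main__":
    for method, path, body in REQUESTS:
        print(method, path)
        print(json.dumps(body, indent=2))
```

You could paste any of the printed bodies into the Kibana dev console or send them with your HTTP client of choice. The k-NN search assumes the index was created with k-NN enabled and a `knn_vector` field of matching dimension.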

Aside from this, they have done well to ensure they could easily keep Open Distro up to date with the upstream Elasticsearch repos. Or at least that was the intent. Recently, Elastic has added additional checks in an effort to slow or stop users from being able to use Open Distro altogether. In the words of Kyle, the developer advocate for OpenSearch, Open Distro was the open-source community’s response to X-Pack.

OpenSearch

OpenSearch is a fork of Elasticsearch that picks up where open-source Elasticsearch left off. The team working on OpenSearch has forked version 7.10 of Elasticsearch and is in the process of gutting it. As you can see below, it’s been a bloody war.

Source: Preparing OpenSearch and OpenSearch Dashboards for Release

Gutting it means a few different things. The first and most obvious is the name. Everywhere in the code where there is an Elasticsearch or Kibana reference, it is changed to OpenSearch. Although it may sound simple, weeks of work went into making all the name changes so that they are consistent across the board.
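
One place the rename is visible from the outside is the JSON you get back from the cluster's root endpoint (`GET /`). Here is a small sketch that guesses which engine you are talking to from that response. The sample responses are trimmed down and illustrative rather than copied from a real cluster, and `identify_distribution` is a helper name I made up, so treat this as a rough heuristic and not an official check.

```python
def identify_distribution(root_info: dict) -> str:
    """Guess which engine produced a `GET /` response body."""
    version = root_info.get("version", {})

    # OpenSearch reports its distribution name inside the version block.
    if version.get("distribution") == "opensearch":
        return "opensearch"

    # Elasticsearch 7.x keeps its classic tagline and reports a build flavor:
    # "oss" for the Apache-licensed build, "default" for the one with X-Pack.
    if root_info.get("tagline") == "You Know, for Search":
        flavor = version.get("build_flavor", "unknown")
        return f"elasticsearch ({flavor})"

    return "unknown"


# Trimmed, illustrative samples of the root endpoint response.
OPENSEARCH_SAMPLE = {
    "version": {"distribution": "opensearch", "number": "1.0.0"},
    "tagline": "The OpenSearch Project: https://opensearch.org/",
}
ELASTICSEARCH_OSS_SAMPLE = {
    "version": {"number": "7.10.2", "build_flavor": "oss"},
    "tagline": "You Know, for Search",
}

if __name__ == "__main__":
    for sample in (OPENSEARCH_SAMPLE, ELASTICSEARCH_OSS_SAMPLE):
        print(identify_distribution(sample))
```

Open Distro runs on top of the OSS build of Elasticsearch, so a cluster like that would show up here as the "oss" flavor.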

The next and arguably most complicated change they made is removing many of the Elasticsearch-specific features such as X-Pack, license checks, and Elastic “phone home” code.

X-Pack is arguably the feature set that caused the most controversy. Its modules were “open source” but “Elastic licensed.” This meant they could be used by end users, but anyone who wanted to sell services with Elasticsearch needed to purchase licensing from Elastic. That didn’t sit well with Amazon, as they had contributed to Elastic and were selling hosted Elasticsearch services.

Because all X-Pack modules were removed from Elasticsearch, X-Pack-enabled Beats will not work either. Want to monitor NetFlow, F5, CoreDNS, or many other common log formats? You are straight outta luck. Their license check won't allow their Beats to work with any Elasticsearch that isn't X-Pack licensed. There are other log stream processors you can use, such as Fluentd.

fluent/fluentd on GitHub: Fluentd collects events from various data sources and writes them to files, RDBMS…

The phone home code I mentioned was used by Elastic to collect utilization metrics from end users. While this sounds sketchy, there is a way to disable it. Elastic uses this information to drive product decisions. Say, for example, many users needed to do range scans. Elastic could use this information to optimize those types of operations and improve the product for everyone.

Finally, they are adding all of the wonderful additions they made to Open Distro to OpenSearch.

It’s safe to say that while OpenSearch is very similar to Elasticsearch now, they are staring down very different paths. OpenSearch is committed to keeping its fork open source and has the backing of Amazon to do so. That’s why I believe that everyone will start to make their move over to OpenSearch.

Big thanks to Kyle for help with some of the technical details. Check him out on Twitter or GitHub.
