AWS OpenSearch Service

The AWS OpenSearch Service is the successor to the Amazon Elasticsearch Service. This service has also become a popular topic in several AWS certification exams. So what is OpenSearch and how can it be used? This article is a basic introduction to AWS OpenSearch and a cheat sheet for those taking AWS certification exams.

AWS OpenSearch Overview

AWS OpenSearch Service is a distributed search and analytics suite derived from the popular open-source Elasticsearch. First released in 2010 by Elasticsearch N.V., Elasticsearch is a search and analytics engine built on Apache Lucene that handles a wide variety of data, including structured, unstructured, geospatial, textual, and numerical data.

With OpenSearch you can run interactive log analytics, real-time application monitoring, website search, performance metric analysis, and more.

There are several open-source engine options you can select from when creating your OpenSearch cluster. Options include the latest version of OpenSearch and many versions of Elasticsearch released under the Apache License 2.0 (ALv2).

Deploying your AWS OpenSearch Cluster

An OpenSearch cluster can be created using the AWS Management Console, API, or AWS CLI. You must specify the number of instances to use and the instance type. In-place upgrades can be easily performed without incurring any downtime.
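
As a minimal sketch of the API route, the function below assembles the arguments that the `create_domain` call on boto3's `opensearch` client accepts. The domain name, engine version, and instance type are illustrative assumptions, and the call itself is left commented out because it needs AWS credentials.

```python
def build_domain_request(domain_name, instance_type, instance_count,
                         engine_version="OpenSearch_2.11"):
    """Assemble keyword arguments for boto3's opensearch create_domain call."""
    if instance_count < 1:
        raise ValueError("instance_count must be at least 1")
    return {
        "DomainName": domain_name,
        "EngineVersion": engine_version,  # assumed version string, adjust as needed
        "ClusterConfig": {
            "InstanceType": instance_type,
            "InstanceCount": instance_count,
        },
    }


# Hypothetical values for illustration only.
request = build_domain_request("example-logs", "r6g.large.search", 3)

# With credentials configured, the request could be submitted like this:
# import boto3
# boto3.client("opensearch").create_domain(**request)
```

Keeping the request as a plain dictionary makes it easy to review or store in version control before anything is created.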

Storage is organized into hot, UltraWarm, and cold tiers. Hot storage, provided by the standard data nodes, allows for very fast retrieval of frequently accessed data. UltraWarm is a lower-cost warm tier for read-only data that is queried less often. For infrequently accessed data you may want to choose cold storage instead, as this is the lowest-cost storage option for AWS OpenSearch.

Additional deployment options:

  • You can use AWS CloudFormation to deploy Amazon OpenSearch Service domains.
  • You can configure dedicated master nodes for your domains (specify the instance type and count).
  • You can create multiple Elasticsearch or OpenSearch indices within the same OpenSearch Service domain.

Ingesting data into AWS OpenSearch Service domains

There are three methods for ingesting data into an Amazon OpenSearch Service domain:

  • Amazon Kinesis Data Firehose – use this option for large data volumes.
  • Logstash – configure the Amazon OpenSearch Service domain as the data store for a Logstash deployment.
  • Elasticsearch or OpenSearch APIs – use the index and bulk APIs to load data into the domain.
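
To illustrate the bulk API option, here is a small sketch that builds the newline-delimited JSON body the `_bulk` endpoint expects: one action line followed by one document line per record. The index name and documents are made up for the example, and sending the body to a domain (with authentication) is left out.

```python
import json


def build_bulk_body(index_name, documents):
    """Return an NDJSON body for the _bulk API from (doc_id, source) pairs."""
    lines = []
    for doc_id, source in documents:
        # Action line: tells the bulk API to index the document that follows.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        # Source line: the document itself.
        lines.append(json.dumps(source))
    # Every line, including the last one, must end with a newline.
    return "".join(line + "\n" for line in lines)


# Hypothetical log records for illustration only.
docs = [
    ("1", {"level": "ERROR", "message": "disk full"}),
    ("2", {"level": "INFO", "message": "service started"}),
]
body = build_bulk_body("app-logs", docs)
print(body)
```

The resulting string would be sent as the body of a POST request to the domain's `_bulk` endpoint with the `Content-Type: application/x-ndjson` header.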

Monitoring AWS OpenSearch

AWS OpenSearch provides built-in monitoring and alerting with automatic notifications.

You can configure alerts using Kibana or OpenSearch Dashboards and the REST API. Notifications can be sent via custom webhooks, Slack, Amazon SNS, and Amazon Chime.

OpenSearch Service supports multiple query languages such as:

  • Domain-Specific Language (DSL).
  • SQL queries with OpenSearch SQL.
  • OpenSearch Piped Processing Language (PPL).
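
The sketch below shows roughly what the same search could look like in each of the three languages, expressed as request bodies. The index name `app_logs` and its fields are assumptions made for the example; the paths are the ones used by the search API and the SQL and PPL plugins.

```python
import json

# Query DSL: a JSON body sent to the index's _search endpoint.
dsl_query = {"query": {"match": {"level": "ERROR"}}}

# SQL: the statement is wrapped in a JSON body for the SQL plugin.
sql_query = {"query": "SELECT level, message FROM app_logs WHERE level = 'ERROR' LIMIT 10"}

# PPL: commands are chained with pipes, starting from a source index.
ppl_query = {"query": "source=app_logs | where level = 'ERROR' | fields level, message"}

# Map each language to the (method, path, body) it would be sent with.
requests_by_language = {
    "dsl": ("POST", "/app_logs/_search", dsl_query),
    "sql": ("POST", "/_plugins/_sql", sql_query),
    "ppl": ("POST", "/_plugins/_ppl", ppl_query),
}

for name, (method, path, body) in requests_by_language.items():
    print(name, method, path, json.dumps(body))
```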

OpenSearch also integrates with open-source tools including:

  • Logstash.
  • OpenTelemetry.
  • Elasticsearch APIs.

OpenSearch in an Amazon VPC

OpenSearch Service domains can be launched into an Amazon Virtual Private Cloud (Amazon VPC). Using a VPC enables secure communication between the OpenSearch Service and other services within the VPC. The EC2 instances used by the OpenSearch Service can be deployed across one, two, or three Availability Zones (AZs) within the Amazon VPC.

The following are some of the ways VPC domains differ from public domains.

  • Because of their logical isolation, domains that reside within a VPC have an extra layer of security compared to domains that use public endpoints.
  • While public domains are accessible from any internet-connected device, VPC domains require some form of VPN or proxy.
  • Compared to public domains, VPC domains display less information in the console. Specifically, the Cluster health tab does not include shard information, and the Indices tab isn’t present.
  • The domain endpoints take different forms (https://search-domain-name vs. https://vpc-domain-name).
  • You can’t apply IP-based access policies to domains that reside within a VPC because security groups already enforce IP-based access policies.
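
The difference in endpoint form can be checked programmatically. The helper below is a small sketch that relies only on the `search-` and `vpc-` hostname prefixes mentioned above; the example hostnames are hypothetical.

```python
from urllib.parse import urlparse


def endpoint_kind(endpoint_url):
    """Classify a domain endpoint as 'vpc' or 'public' from its hostname prefix."""
    host = urlparse(endpoint_url).hostname or ""
    if host.startswith("vpc-"):
        return "vpc"
    if host.startswith("search-"):
        return "public"
    return "unknown"


# Hypothetical endpoints for illustration only.
print(endpoint_kind("https://vpc-example-logs-abc123.us-east-1.es.amazonaws.com"))
print(endpoint_kind("https://search-example-logs-abc123.us-east-1.es.amazonaws.com"))
```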

Note the following limitations:

  • If you launch a new domain within a VPC, you can’t later switch it to use a public endpoint. The reverse is also true.
  • You can either launch your domain within a VPC or use a public endpoint, but you can’t do both.
  • You can’t launch your domain within a VPC that uses dedicated tenancy. You must use a VPC with tenancy set to Default.
  • After you place a domain within a VPC, you can’t move it to a different VPC, but you can change the subnets and security group settings.
  • To access the default installation of OpenSearch Dashboards for a domain that resides within a VPC, users must have access to the VPC.

The ELK Stack

ELK is an acronym that describes a popular combination of projects: Elasticsearch, Logstash, and Kibana. The ELK stack gives you the ability to aggregate logs from all your systems and applications, analyze these logs, and create visualizations. ELK is useful for visualizing application and infrastructure monitoring data, troubleshooting, security analytics and more.

Security

There are several options for securing your data when using AWS OpenSearch. This includes enabling encryption of data at rest for OpenSearch Service domains. This uses the AWS Key Management Service (AWS KMS) for storage and management of encryption keys.

Encryption at rest uses the AES-256 algorithm for a high level of security.

You can also enable encryption for node-to-node communications using TLS 1.2. Node-to-node encryption is optional and can be enabled at any time through the console, CLI, or API. Once enabled, node-to-node encryption cannot be disabled. To turn it off, you must create a new domain without the setting and restore your data from a snapshot.
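
As a sketch of how both encryption settings might be expressed through the API, the function below builds the `EncryptionAtRestOptions` and `NodeToNodeEncryptionOptions` arguments accepted by boto3's `create_domain` call. The KMS key ARN is a placeholder, not a real key.

```python
def build_encryption_options(kms_key_id=None):
    """Return the encryption-related arguments for boto3's opensearch create_domain call."""
    at_rest = {"Enabled": True}
    if kms_key_id is not None:
        # Optional: when omitted, the service is expected to use its default KMS key.
        at_rest["KmsKeyId"] = kms_key_id
    return {
        "EncryptionAtRestOptions": at_rest,
        "NodeToNodeEncryptionOptions": {"Enabled": True},
    }


# Placeholder key ARN for illustration only.
options = build_encryption_options(
    "arn:aws:kms:us-east-1:123456789012:key/00000000-0000-0000-0000-000000000000"
)
print(options)
```

These options would be merged into the rest of the domain request, for example `create_domain(DomainName=..., **options)`.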

You can create JSON-based access policies for your AWS OpenSearch Service cluster using either resource-based or identity-based policies.

You can also use IP-based policies to restrict access to a domain to one or more IP addresses or CIDR blocks.
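
For example, an IP-based policy is a resource-based policy with an `IpAddress` condition on `aws:SourceIp`. The sketch below builds one in Python; the domain ARN, account ID, and address ranges are placeholders taken from documentation-reserved values.

```python
import ipaddress
import json


def build_ip_policy(domain_arn, allowed_cidrs):
    """Build a resource-based policy that allows HTTP access only from the given addresses."""
    for cidr in allowed_cidrs:
        # Raises ValueError if an entry is not a valid IP address or CIDR block.
        ipaddress.ip_network(cidr)
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": "*"},
                "Action": "es:ESHttp*",
                "Resource": f"{domain_arn}/*",
                "Condition": {"IpAddress": {"aws:SourceIp": list(allowed_cidrs)}},
            }
        ],
    }


# Placeholder ARN and addresses for illustration only.
policy = build_ip_policy(
    "arn:aws:es:us-east-1:123456789012:domain/example-logs",
    ["192.0.2.0/24", "198.51.100.7"],
)
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```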

Fine-grained access control is another security feature that offers additional capabilities within AWS OpenSearch Service.

Fine-grained access control offers the following benefits:

  • Role-based access control.
  • Security at the index, document, and field level.
  • OpenSearch Dashboards multi-tenancy.
  • HTTP basic authentication for OpenSearch and OpenSearch Dashboards.

The AWS OpenSearch Service also supports authentication through SAML and Amazon Cognito so you can configure federation with your on-premises directories as well as social identity providers.

Pricing for AWS OpenSearch

The first element of OpenSearch pricing is the EC2 instance type you choose and the number of instances you run. This will have a large impact on the overall cost of the solution. You also pay for the storage you use and the tier of storage you select. Additionally, you are charged for data transferred out of the AWS OpenSearch service.
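
The three cost components can be combined into a rough monthly estimate. This is only a sketch of the arithmetic: every rate below is a made-up placeholder, not an actual AWS price, and 730 hours per month is an assumed average.

```python
HOURS_PER_MONTH = 730  # assumed average number of hours in a month


def estimate_monthly_cost(instance_count, hourly_rate, storage_gb,
                          storage_rate_per_gb, transfer_out_gb, transfer_rate_per_gb):
    """Estimate monthly cost as instance hours + storage + data transfer out."""
    compute = instance_count * hourly_rate * HOURS_PER_MONTH
    storage = storage_gb * storage_rate_per_gb
    transfer = transfer_out_gb * transfer_rate_per_gb
    return {
        "compute": round(compute, 2),
        "storage": round(storage, 2),
        "transfer": round(transfer, 2),
        "total": round(compute + storage + transfer, 2),
    }


# Placeholder rates for illustration only; check current pricing for real numbers.
estimate = estimate_monthly_cost(
    instance_count=3, hourly_rate=0.20,
    storage_gb=300, storage_rate_per_gb=0.10,
    transfer_out_gb=50, transfer_rate_per_gb=0.09,
)
print(estimate)
```

With these placeholder numbers the instance charge dominates the total, which matches the point that instance type and count drive most of the cost.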

AWS OpenSearch Best Practices

There are several best practices for deploying AWS OpenSearch Service domains. Here are some of the most important best practices:

  • Deploy OpenSearch data instances across three Availability Zones (AZs) for the best availability.
  • Provision instances in multiples of three for equal distribution across AZs.
  • If three AZs are not available, use two AZs with equal numbers of instances.
  • Use three dedicated master nodes.
  • Configure at least one replica for each index.
  • Apply restrictive resource-based access policies to the domain (or use fine-grained access control).
  • Create the domain within an Amazon VPC.
  • For sensitive data enable node-to-node encryption and encryption at rest.
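
The checklist above can be turned into a simple review function. The sketch below works on a simplified, hypothetical description of a domain (the dictionary keys are invented for this example and are not an AWS API format) and returns a warning for each practice that is not met.

```python
def check_best_practices(config):
    """Return a list of warnings for a simplified, hypothetical domain description."""
    warnings = []
    az_count = max(1, config.get("az_count", 1))
    if az_count < 3:
        warnings.append("use three Availability Zones where available")
    if config.get("data_instance_count", 0) % az_count != 0:
        warnings.append("data instances are not evenly distributed across AZs")
    if config.get("dedicated_master_count", 0) != 3:
        warnings.append("use three dedicated master nodes")
    if config.get("replicas_per_index", 0) < 1:
        warnings.append("configure at least one replica for each index")
    if not config.get("in_vpc", False):
        warnings.append("create the domain within an Amazon VPC")
    if not (config.get("node_to_node_encryption") and config.get("encryption_at_rest")):
        warnings.append("enable node-to-node encryption and encryption at rest")
    return warnings


# Hypothetical domain description for illustration only.
domain = {
    "az_count": 3,
    "data_instance_count": 6,
    "dedicated_master_count": 3,
    "replicas_per_index": 1,
    "in_vpc": True,
    "node_to_node_encryption": True,
    "encryption_at_rest": True,
}
print(check_best_practices(domain))
```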
