OpenSearch AWS kNN Search Observability

OpenSearch Engineering for AWS-Managed Search and Log Analytics

OpenSearch is the Apache-2.0-licensed fork of Elasticsearch 7.10, backed by AWS and a broad open-source community. We provision and operate AWS OpenSearch Service managed clusters, migrate production workloads from Elasticsearch 7.10 and above, build log analytics and SIEM pipelines with OpenSearch Dashboards, and deliver vector kNN semantic search for US and EU product teams that need open-source licensing, AWS-native integration and audit-ready security configuration.

Challenges

Industry challenges we solve

Shard and index over-provisioning

Teams often copy Elasticsearch shard counts without adapting them to OpenSearch data volumes, resulting in thousands of tiny shards that consume JVM heap and slow down cluster state operations. We right-size indices to the recommended 20-50 GB per shard range and consolidate rollover aliases to keep shard counts manageable as data grows.
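As a rough illustration of the right-sizing arithmetic, the sketch below derives a primary shard count from an index size and a target shard size. The function names and the 30 GB default are assumptions chosen for illustration, picked from inside the 20-50 GB range quoted above; real sizing also depends on growth rate, node count and query patterns.

```python
import math


def primary_shard_count(index_size_gb: float, target_shard_gb: float = 30.0) -> int:
    """Number of primary shards that keeps each shard close to target_shard_gb.

    Hypothetical helper: the 30 GB default is an assumption, not a fixed rule.
    """
    if index_size_gb < 0 or target_shard_gb <= 0:
        raise ValueError("index size must be >= 0 and target shard size must be > 0")
    # Always keep at least one primary shard, even for very small indices.
    return max(1, math.ceil(index_size_gb / target_shard_gb))


def average_shard_gb(index_size_gb: float, shards: int) -> float:
    """Average primary shard size that results from a given shard count."""
    if shards < 1:
        raise ValueError("shards must be >= 1")
    return index_size_gb / shards
```

For example, a 600 GB index at the 30 GB target yields 20 primary shards, while a 5 GB index stays on a single shard instead of inheriting a larger copied shard count.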

Cluster instability and JVM heap pressure

Large field-data caches, unbounded aggregations and hot-node imbalances cause repeated JVM GC pauses and node evictions on OpenSearch clusters. We audit field-data usage, replace eager field-data with doc-values where possible, distribute hot indices across nodes with shard allocation awareness, and configure circuit breakers appropriate to the instance class.
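A minimal sketch of two of the measures above, written as plain request bodies so it runs without a cluster. The helper names, the `ignore_above` value and the example limits are assumptions; on managed domains some cluster settings may be restricted or handled by the service, so treat the breaker body as illustrative only.

```python
from typing import Any, Dict


def keyword_subfield_mapping(field: str) -> Dict[str, Any]:
    """Mapping that keeps full-text search on `field` and adds a keyword
    sub-field, which is backed by doc values, for sorting and aggregations."""
    return {
        "properties": {
            field: {
                "type": "text",
                "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
            }
        }
    }


def terms_aggregation(field: str, size: int = 10) -> Dict[str, Any]:
    """Terms aggregation that targets the keyword sub-field, so no field data
    has to be loaded for the analysed text field."""
    return {
        "size": 0,  # return aggregation buckets only, no hits
        "aggs": {f"top_{field}": {"terms": {"field": f"{field}.keyword", "size": size}}},
    }


def breaker_settings(fielddata_limit: str, request_limit: str) -> Dict[str, Any]:
    """Body for PUT _cluster/settings that adjusts two circuit-breaker limits.
    The values are passed in by the caller; none are recommended here."""
    return {
        "persistent": {
            "indices.breaker.fielddata.limit": fielddata_limit,
            "indices.breaker.request.limit": request_limit,
        }
    }
```

The mapping applies to new indices; existing data would need a reindex before aggregations can switch to the keyword sub-field.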

Elasticsearch-to-OpenSearch migration compatibility

The Apache-2.0 fork diverged at Elasticsearch 7.10: Elastic's proprietary features (Elastic APM, EQL in later versions, Kibana-specific APIs) have no drop-in equivalents. Client libraries using transport protocol rather than REST also require updates. We audit the exact API surface your application uses, map it to OpenSearch equivalents or OpenSearch Dashboards, and run parallel validation before cutover.

Expensive queries and aggregation performance

Deep pagination, wildcard-prefix queries and high-cardinality nested aggregations are disproportionately costly on OpenSearch. We profile slow queries via the Profile API, replace unbounded scroll with search_after for pagination, use the preference search parameter to route repeated aggregations to the same shard copies so their caches stay warm, and introduce composite aggregations where cardinality allows.
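The search_after pattern mentioned above can be sketched as two small helpers that build each page's request body and extract the cursor from the previous response. The field names `@timestamp` and `event_id` are hypothetical; the only requirement is that the sort ends on a field that is unique per document so the cursor is unambiguous.

```python
from typing import Any, Dict, List, Optional

# Hypothetical sort: newest first, with a unique keyword field as tie-breaker.
DEFAULT_SORT: List[Dict[str, str]] = [{"@timestamp": "desc"}, {"event_id": "asc"}]


def page_body(
    query: Dict[str, Any],
    page_size: int,
    cursor: Optional[List[Any]] = None,
    sort: Optional[List[Dict[str, str]]] = None,
) -> Dict[str, Any]:
    """Build the search body for one page; `cursor` is the sort values of the
    last hit on the previous page, or None for the first page."""
    body: Dict[str, Any] = {
        "size": page_size,
        "query": query,
        "sort": sort if sort is not None else DEFAULT_SORT,
    }
    if cursor is not None:
        body["search_after"] = cursor
    return body


def next_cursor(response: Dict[str, Any]) -> Optional[List[Any]]:
    """Return the sort values of the last hit, or None when the page is empty."""
    hits = response["hits"]["hits"]
    return hits[-1]["sort"] if hits else None
```

A caller would loop: send `page_body(query, 500, cursor)`, process the hits, set `cursor = next_cursor(response)`, and stop when it returns None.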

Security and fine-grained access control configuration

The OpenSearch security plugin supports tenants, roles, field masking and document-level security, but misconfiguration leaves either over-privileged access or application breakage. We design role hierarchies aligned to application personas, test each role with a dedicated service account before go-live, and document the permission model for audit reviews.

Vector search scaling for kNN workloads

k-NN indices in OpenSearch load FAISS or NMSLIB graphs into native memory outside the JVM heap, where they compete with the operating system's file-system cache for the RAM left over on the same node. Naive kNN deployments exhaust memory on modest instance sizes. We separate kNN indices onto dedicated data nodes with memory-optimised instance types, tune ef_search and ef_construction per recall/latency target, and combine kNN with BM25 in a hybrid scoring pipeline for production-quality semantic search.
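To make the ef_search and ef_construction knobs concrete, here is a minimal sketch of an index body for an HNSW graph on the FAISS engine, plus a matching knn query. The field name `embedding`, the `l2` space type and all numeric defaults are assumptions for illustration; suitable values depend on the embedding model and the recall/latency target.

```python
from typing import Any, Dict, List


def knn_index_body(
    dimension: int,
    ef_construction: int = 128,  # assumed default: graph build quality vs. indexing time
    m: int = 16,                 # assumed default: links per node, drives graph memory
    ef_search: int = 100,        # assumed default: candidate list size at query time
) -> Dict[str, Any]:
    """Create-index body with a knn_vector field using HNSW on the FAISS engine."""
    return {
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                "embedding": {  # hypothetical field name
                    "type": "knn_vector",
                    "dimension": dimension,
                    "method": {
                        "name": "hnsw",
                        "engine": "faiss",
                        "space_type": "l2",
                        "parameters": {
                            "ef_construction": ef_construction,
                            "m": m,
                            "ef_search": ef_search,
                        },
                    },
                }
            }
        },
    }


def knn_query_body(vector: List[float], k: int = 10) -> Dict[str, Any]:
    """Approximate nearest-neighbour query against the `embedding` field."""
    return {"size": k, "query": {"knn": {"embedding": {"vector": vector, "k": k}}}}
```

Raising ef_search or ef_construction generally trades latency or indexing time for recall, so the values are best chosen by measuring recall against an exact-search baseline.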

Solutions

Solutions we build

AWS OpenSearch Service managed clusters

End-to-end provisioning of AWS OpenSearch Service domains: VPC placement, instance sizing, dedicated master nodes, UltraWarm and cold storage tiers, automated S3 snapshots and CloudWatch alerting. We handle day-two operations — rolling upgrades, shard rebalancing, index template governance — so your team does not carry on-call burden for the search layer.

Elasticsearch-to-OpenSearch migration

Structured migration from Elasticsearch 7.10 and above to OpenSearch: API surface audit, client library update (opensearch-py, opensearch-js), index template and mapping review, parallel shadow-indexing validation and blue/green cutover. We preserve existing ILM policies as ISM equivalents and document every behavioural difference discovered during the migration window.
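The ILM-to-ISM step can be illustrated with a deliberately small converter that handles only a hot-phase rollover and a delete phase. The function name and the policy description are hypothetical, and the mapping is a sketch: ILM measures a phase's min_age from rollover on rolled-over indices, whereas ISM's min_index_age counts from index creation, so ISM's min_rollover_age condition may be the closer match in some policies.

```python
from typing import Any, Dict, List

# ILM rollover conditions and their ISM rollover counterparts.
ROLLOVER_RENAMES = {
    "max_size": "min_size",
    "max_age": "min_index_age",
    "max_docs": "min_doc_count",
}


def ilm_to_ism(ilm_policy: Dict[str, Any], index_pattern: str) -> Dict[str, Any]:
    """Convert a simple ILM policy (hot rollover + delete) into an ISM policy body.
    Any other ILM phases or actions are ignored by this sketch."""
    phases = ilm_policy["policy"]["phases"]

    hot_actions: List[Dict[str, Any]] = []
    rollover = phases.get("hot", {}).get("actions", {}).get("rollover")
    if rollover:
        converted = {ROLLOVER_RENAMES[k]: v for k, v in rollover.items() if k in ROLLOVER_RENAMES}
        hot_actions.append({"rollover": converted})

    states: List[Dict[str, Any]] = [{"name": "hot", "actions": hot_actions, "transitions": []}]

    delete_phase = phases.get("delete")
    if delete_phase is not None:
        transition: Dict[str, Any] = {"state_name": "delete"}
        if "min_age" in delete_phase:
            transition["conditions"] = {"min_index_age": delete_phase["min_age"]}
        states[0]["transitions"].append(transition)
        states.append({"name": "delete", "actions": [{"delete": {}}], "transitions": []})

    return {
        "policy": {
            "description": "Converted from ILM (illustrative sketch)",
            "default_state": "hot",
            "states": states,
            "ism_template": {"index_patterns": [index_pattern], "priority": 100},
        }
    }
```

The output is meant as a starting point for review, not something to apply unchecked; each converted policy still needs validating against a test index.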

Log analytics and SIEM pipelines

Centralised log ingestion via OpenSearch Ingestion (Data Prepper) or Fluent Bit, structured into time-series indices with ISM lifecycle management. OpenSearch Dashboards visualisations, anomaly detection monitors and alert notifications to PagerDuty or Mattermost give security and operations teams a full observability layer built entirely on open-source components.

Vector kNN and semantic search

Production kNN search using the OpenSearch k-NN plugin with FAISS or NMSLIB engines: embedding pipeline design (Sentence Transformers, Amazon Titan, custom models), index configuration for recall/latency targets, and hybrid BM25 + kNN scoring for superior relevance over pure lexical or pure vector retrieval. Deployed for product discovery, document similarity and RAG retrieval augmentation.

Index lifecycle and cost optimisation

ISM policies that move indices through hot, warm (UltraWarm) and cold tiers automatically, triggered by index age, size or document count. Force-merge and stronger compression codecs on indices that are no longer written to. Rollover aliases that keep individual index sizes predictable. We audit existing clusters for shard waste and deliver a right-sizing plan with projected cost reduction before any change is applied.
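A lifecycle of this kind might look like the following ISM policy body, built here as a Python dict so it can be inspected without a cluster. The state names, thresholds and index pattern are assumptions; the sketch uses only generic ISM actions (rollover, read_only, force_merge, delete) and leaves out the AWS-specific tier-migration actions.

```python
from typing import Any, Dict


def lifecycle_policy(
    index_pattern: str,
    rollover_size: str = "50gb",  # assumed thresholds, tune per workload
    rollover_age: str = "1d",
    warm_after: str = "7d",
    delete_after: str = "90d",
) -> Dict[str, Any]:
    """ISM policy: roll over while hot, then make the index read-only and
    force-merge it, and finally delete it after the retention window."""
    return {
        "policy": {
            "description": "hot -> warm (read-only, merged) -> delete (illustrative)",
            "default_state": "hot",
            "states": [
                {
                    "name": "hot",
                    "actions": [
                        {"rollover": {"min_size": rollover_size, "min_index_age": rollover_age}}
                    ],
                    "transitions": [
                        {"state_name": "warm", "conditions": {"min_index_age": warm_after}}
                    ],
                },
                {
                    "name": "warm",
                    "actions": [
                        {"read_only": {}},  # block writes before merging
                        {"force_merge": {"max_num_segments": 1}},
                    ],
                    "transitions": [
                        {"state_name": "delete", "conditions": {"min_index_age": delete_after}}
                    ],
                },
                {"name": "delete", "actions": [{"delete": {}}], "transitions": []},
            ],
            "ism_template": {"index_patterns": [index_pattern], "priority": 100},
        }
    }
```

The body would be sent to the ISM policies endpoint of the Index Management plugin; the ism_template section attaches the policy to newly created indices that match the pattern.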

Fine-grained security and audit configuration

Security plugin configuration for field-level masking, document-level security filters, multi-tenant OpenSearch Dashboards and service-account role binding. Audit logging routed to a dedicated index, accessible to compliance teams without granting cluster-admin access. TLS certificates managed via AWS Certificate Manager or Let's Encrypt with automated rotation.

Stack

Technology stack

OpenSearch, OpenSearch Dashboards, AWS OpenSearch Service (managed clusters), index and shard design, ISM (Index State Management), k-NN vector search (FAISS/NMSLIB engines), OpenSearch Ingestion (Data Prepper), alerting plugin, security plugin (fine-grained RBAC, field- and document-level security), snapshots to S3, cross-cluster replication, Logstash, Fluent Bit.

Compliance

Compliance & regulations

Audit-ready search infra · Fine-grained field/document-level access control · Node-to-node TLS + encryption at rest · ISM retention policies for data governance

EU

  • GDPR — the OpenSearch security plugin enforces field-level and document-level access control so PII stored in indices is visible only to authorised roles; we design index mappings to isolate personal data and configure ISM retention policies that honour data-minimisation obligations.
  • EU AI Act — we design vector search indices on OpenSearch kNN so that each document retains its source reference and ingestion timestamp, giving traceable embedding lineage that supports provenance requirements for AI-assisted retrieval systems.
  • NIS2 — centralised log ingestion via OpenSearch Ingestion and Data Prepper feeds a SIEM pipeline in OpenSearch Dashboards; anomaly detection monitors and audit-log indices give security teams the continuous visibility NIS2 mandates.
  • eIDAS — node-to-node TLS enforced through the OpenSearch security plugin, combined with client certificate authentication on AWS OpenSearch Service VPC endpoints, satisfies the transport-layer integrity requirements applicable to electronic service infrastructure.

US

  • SOC 2 — audit-ready cluster configuration: fine-grained RBAC, immutable audit logging to a dedicated index, S3 snapshot history and ISM-enforced retention windows give compliance teams evidence for access and availability controls.
  • Encryption — encryption at rest using AWS KMS-managed keys on AWS OpenSearch Service domains, plus node-to-node TLS; no plaintext data on disk or in transit between cluster nodes.
  • HIPAA-eligible configuration — AWS OpenSearch Service is listed as a HIPAA-eligible service, which requires a Business Associate Agreement between you and AWS; we configure fine-grained access control, audit logging and VPC isolation. The configuration work is our scope, not a compliance certification.
  • Data governance — ISM policies automate index lifecycle (hot/warm/cold/delete transitions), enforce retention windows and trigger S3 snapshot archiving; index aliases enable zero-downtime rollovers without manual intervention.

Why YuSMP

Why teams choose YuSMP for OpenSearch engineering

Open-source licensing without vendor lock-in

OpenSearch is Apache-2.0 licensed — no Elastic SSPL restrictions, no licence audit risk as your cluster scales. We build on the managed AWS OpenSearch Service where that reduces operational burden, but the underlying technology is fully open-source and portable, protecting your investment in index design, mappings and application integrations.

Migration expertise from Elasticsearch 7.10

The 7.10 fork boundary introduces specific API and behavioural differences that are easy to overlook in a straight reindex. Our engineers have executed migrations across multiple client environments and maintain a documented compatibility matrix covering client libraries, index settings, aggregation behaviour and Dashboards equivalents for Kibana visualisations.

Search and observability in a single platform

OpenSearch handles both application search and log/SIEM analytics on the same cluster infrastructure. Teams that would otherwise run separate Elasticsearch and ELK stacks can consolidate onto AWS OpenSearch Service, reducing infrastructure cost and operational complexity while gaining fine-grained RBAC across both workloads.

FAQ

OpenSearch Engineering FAQ

OpenSearch vs Elasticsearch — which should we use?

OpenSearch (Apache-2.0) is the right choice when you are on AWS, want to avoid Elastic's SSPL licensing for self-managed deployments, or need tight integration with AWS services (S3, CloudWatch, IAM). Elasticsearch (Elastic licence or SSPL) is preferable when your team already uses Elastic Cloud, Elastic APM or Kibana features that have no OpenSearch equivalent. The REST API is compatible for most query and indexing operations through the 7.10 baseline, so application-level migration is usually straightforward for that subset.

How complex is migrating from Elasticsearch 7.10 or later to OpenSearch?

The 7.10 baseline means core search and aggregation APIs are compatible, but proprietary Elastic features added after 7.10 (EQL refinements, certain Kibana-specific APIs, Elastic APM wire format) have no direct OpenSearch equivalent. Vector search is a separate case: it uses the OpenSearch k-NN plugin's knn_vector field and knn query rather than Elasticsearch's later kNN search API. We start every migration with an API surface audit against your application code and current index templates, identify the gaps, and map each to an OpenSearch alternative before writing a line of migration script. Most REST-based applications migrate cleanly in one to three weeks of engineering time.

What does AWS OpenSearch Service manage vs what we handle ourselves?

AWS manages hardware provisioning, OS patching, OpenSearch version upgrades (with your approval), automated snapshots to S3, multi-AZ replication and basic CloudWatch metrics. You — and we, on your behalf — handle index design, shard sizing, mapping templates, ISM policies, security plugin configuration, fine-grained access control, custom dashboards and application-layer query optimisation. Choosing AWS OpenSearch Service eliminates the server operations burden but does not remove the need for search engineering expertise.

How does vector kNN search work in OpenSearch?

The OpenSearch k-NN plugin adds a knn_vector field type backed by FAISS or NMSLIB approximate nearest-neighbour graphs held in native memory outside the JVM heap. At query time, a knn query returns the k document vectors most similar to a query embedding. In production we combine kNN with BM25 using a hybrid query and a search pipeline with a normalisation processor, which typically outperforms either approach alone for semantic document retrieval. Instance sizing must account for the kNN graph memory alongside the JVM heap and the standard query cache allocation.
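The hybrid setup described above can be sketched as two request bodies: a search pipeline that normalises and combines sub-query scores, and a hybrid query that pairs a BM25 match with a knn clause. The field names `body` and `embedding`, the min_max/arithmetic_mean techniques and the 0.3/0.7 weights are assumptions for illustration, and the hybrid query type is only available in newer OpenSearch 2.x releases.

```python
import math
from typing import Any, Dict, List


def normalization_pipeline(lexical_weight: float = 0.3, vector_weight: float = 0.7) -> Dict[str, Any]:
    """Search pipeline body with a normalization-processor.
    Weights follow the order of the sub-queries and should sum to 1."""
    if not math.isclose(lexical_weight + vector_weight, 1.0):
        raise ValueError("weights must sum to 1.0")
    return {
        "description": "Normalise and combine BM25 and kNN scores (illustrative)",
        "phase_results_processors": [
            {
                "normalization-processor": {
                    "normalization": {"technique": "min_max"},
                    "combination": {
                        "technique": "arithmetic_mean",
                        "parameters": {"weights": [lexical_weight, vector_weight]},
                    },
                }
            }
        ],
    }


def hybrid_query(text: str, vector: List[float], k: int = 10) -> Dict[str, Any]:
    """Hybrid query: a BM25 match on `body` plus a knn clause on `embedding`."""
    return {
        "size": k,
        "query": {
            "hybrid": {
                "queries": [
                    {"match": {"body": {"query": text}}},
                    {"knn": {"embedding": {"vector": vector, "k": k}}},
                ]
            }
        },
    }
```

The pipeline body is registered once under a name of your choosing, and each hybrid search then references that name through the search_pipeline request parameter.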

Can OpenSearch replace a dedicated SIEM tool for log analytics?

For many teams, yes. OpenSearch Ingestion (Data Prepper) ingests logs from Fluent Bit, Logstash or direct HTTP sources, applies field extraction and enrichment, and writes to time-series indices. OpenSearch Dashboards provides visualisation, anomaly detection and alerting comparable to the ELK stack. The security analytics plugin adds threat-detection rules in Sigma format. For regulated environments it delivers the continuous monitoring NIS2 requires at a fraction of the cost of commercial SIEM platforms, with no per-event licensing.

How does fine-grained access control work in the OpenSearch security plugin?

The security plugin layers multiple access control mechanisms: cluster-level permissions, index-level permissions, field-level masking (hash or anonymise specific fields), and document-level security (filter clauses applied automatically per role). Multi-tenancy in OpenSearch Dashboards isolates visualisations and index patterns between teams. We configure dedicated service-account roles for each application, separate read and write principals, and route audit events to an index that operations or compliance staff can query without cluster-admin access.
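As an illustration of how these layers combine in one role, the sketch below builds a read-only role body with a document-level filter, excluded fields and masked fields. The function name and every field or index name in the usage are hypothetical; the body shape follows the security plugin's role definition format.

```python
import json
from typing import Any, Dict, List


def read_only_role(
    index_pattern: str,
    tenant_field: str,
    tenant_value: str,
    hidden_fields: List[str],
    masked_fields: List[str],
) -> Dict[str, Any]:
    """Role body: read access to one index pattern, restricted to documents
    of a single tenant, with some fields removed and others masked."""
    dls_filter = {"term": {tenant_field: tenant_value}}
    return {
        "cluster_permissions": [],
        "index_permissions": [
            {
                "index_patterns": [index_pattern],
                # Document-level security is passed as a JSON-encoded query string.
                "dls": json.dumps(dls_filter),
                # A leading "~" excludes the field from responses.
                "fls": [f"~{name}" for name in hidden_fields],
                "masked_fields": list(masked_fields),
                "allowed_actions": ["read"],
            }
        ],
        "tenant_permissions": [],
    }
```

For instance, `read_only_role("orders-*", "tenant_id", "acme", ["card_number"], ["email"])` describes a role that sees only the hypothetical tenant's orders, never receives `card_number`, and gets `email` back in masked form.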

How do you reduce OpenSearch cluster costs without hurting query performance?

Cost on AWS OpenSearch Service is dominated by instance count and EBS storage. The primary levers are: right-sizing shard count to 20-50 GB per shard (over-sharding wastes master-node heap), enabling UltraWarm for indices queried infrequently (S3-backed, up to ~90% cheaper per GB than hot EBS storage), moving cold data to the cold tier or deleting it via ISM after the retention window, and using force-merge on read-only historical indices to reduce segment count. We deliver a cost audit with specific ISM policies and instance recommendations before any change is applied to production.

Build audit-ready AWS OpenSearch clusters with senior search engineers

Response within 1 business day. NDA on request.

Get a proposal

Share a few details and a senior consultant will reply within one business day.