In the case of a failure, your ingress replicator(s) or search service nodes may be
unreachable. This topic describes what happens during an outage.
Note: To avoid non-recoverable disk failures, Jive Software recommends that you configure
the ingress replicator journals and search service indexes so that they are written to
durable storage. For each ingress replicator, allocate at least 20 GB for journal
storage. For each search service, allocate at least 50 GB for index storage. Monitor
the remaining capacity of these storage volumes, and keep at least 25% of each volume free.
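The capacity check described in the note could be scripted along the following lines. This is a minimal sketch, not a Jive tool: the function names, the warning messages, and the decimal definition of a gigabyte are assumptions made for illustration, and the path in the example is a stand-in for wherever your journal or index volume is mounted. It measures the size of the whole filesystem behind a path, which only approximates the space allocated to journals or indexes if they share a volume with other data.

```python
import shutil

GB = 10**9  # assumption: decimal gigabytes; use 2**30 if you size volumes in GiB


def storage_warnings(total_bytes, free_bytes, min_total_bytes, min_free_fraction=0.25):
    """Return a list of warnings for one storage volume (empty list means OK)."""
    warnings = []
    if total_bytes < min_total_bytes:
        warnings.append("volume smaller than recommended minimum")
    if free_bytes < min_free_fraction * total_bytes:
        warnings.append("free capacity below threshold")
    return warnings


def check_volume(path, min_total_bytes):
    """Check the filesystem that contains `path` against the recommendations."""
    usage = shutil.disk_usage(path)  # named tuple with total, used, free (bytes)
    return storage_warnings(usage.total, usage.free, min_total_bytes)


if __name__ == "__main__":
    # "." is a placeholder; point this at your journal or index volume instead.
    # 20 GB is the journal minimum from the note; use 50 * GB for index volumes.
    for warning in check_volume(".", 20 * GB):
        print("journal volume:", warning)
```

Running a check like this on a schedule and alerting on any non-empty result is one way to catch a volume that is filling up before it causes an outage.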
Here's what happens when a given node in your HA search configuration fails:
- Ingress replicator node fails
- The ingress replicator journals everything to disk to guarantee that all
ingressed activities are delivered at least once. If the service fails
or is stopped, it sends any remaining journaled events when it starts
back up. If the service cannot come back up because of a non-recoverable
disk failure, a full rebuild is required (see Rebuilding an
On-Premise HA Search Service). If both ingress replicators fail
(or you have only one and it fails), no new content is indexed for the
duration of the outage. When an ingress replicator comes back online,
the search service catches up on the content created during the outage,
thanks to local caching on the web application nodes. As a result, the
search service does not miss anything.
- Search service node fails
- If search service 1 or 2 is offline for any reason, the ingress replicator
retains the undelivered activities. When the search service is restored to
a healthy state, the ingress replicator sends it the undelivered
activities. While this backlog is being fed into the restored service, the
search indexes are out of sync. After the restored service has received
all undelivered activities, the indexes are back in sync. If the service
cannot be restored because of a non-recoverable disk failure, remove and
re-add the affected search service (see Adding an
On-Premise HA Search Service Node). If you leave a search service
down for a very long time (for example, many weeks), the ingress
replicators may run out of disk space, because they keep persisting
undelivered activities to disk until the configured search service is
restored. If you don't plan to restore the offline search service, remove
it from all ingress replicator configuration files and restart the ingress
replicators.
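The retain-and-replay behavior described in the list above can be illustrated with a small toy model. This is a sketch for building intuition only, not Jive's implementation: the `SearchNode` and `IngressReplicator` classes and their methods are hypothetical, the journal is held in memory rather than on disk, and `remove_node` only stands in for editing the configuration files and restarting the replicators.

```python
from collections import deque


class SearchNode:
    """Toy search service node: stores received activities in a list."""

    def __init__(self, name):
        self.name = name
        self.online = True
        self.index = []

    def receive(self, activity):
        if not self.online:
            raise ConnectionError(f"{self.name} is offline")
        self.index.append(activity)


class IngressReplicator:
    """Toy replicator: keeps one queue of undelivered activities per node."""

    def __init__(self, nodes):
        self.nodes = {node.name: node for node in nodes}
        self.pending = {node.name: deque() for node in nodes}

    def ingest(self, activity):
        # Journal the activity for every node first, then try to deliver.
        for queue in self.pending.values():
            queue.append(activity)
        self.flush()

    def flush(self):
        for name, queue in self.pending.items():
            node = self.nodes[name]
            while queue:
                try:
                    node.receive(queue[0])
                except ConnectionError:
                    break  # node is down: keep the backlog for later
                queue.popleft()  # drop an entry only after delivery succeeded

    def remove_node(self, name):
        # Stops retaining activities for a node that will not be restored.
        self.nodes.pop(name)
        self.pending.pop(name)


if __name__ == "__main__":
    s1, s2 = SearchNode("search1"), SearchNode("search2")
    replicator = IngressReplicator([s1, s2])
    replicator.ingest("activity-1")
    s2.online = False                 # search service 2 goes offline
    replicator.ingest("activity-2")   # retained for search2, delivered to search1
    print("backlog while offline:", list(replicator.pending["search2"]))
    s2.online = True                  # search service 2 is restored
    replicator.flush()                # backlog is replayed
    print("indexes in sync:", s1.index == s2.index)
```

Because an entry is removed from the queue only after the node accepts it, nothing is lost while a node is down. The same property explains the disk space warning: the backlog for an offline node grows until that node is restored or removed.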