Search relevancy in content and place search
Jive searches object fields. Stop words are excluded. The parameters, described in this article, impact the rank of a content item and can provide a boost to get it to the top of the search results.
Rank = (SimilarityScore + ProximityScore) * OutcomeType * ObjectType * Recency
The resulting rank determines what will be displayed in the user's search results and in what order, with the objective to surface the most relevant content first.
Fields that are searched
Cloud Search uses information from the following fields:
subject
: Title field of Jive objects, such as place or doc title.body
: Content of objects, such as blog post text.tag
: Tags added to objects.
Analyzed
and
Edgengram
fields. Stop words
The Cloud Search service uses a predefined set of stop words for the specified
languages. Stop words are common words often occurring in any text. For
example, the stop words for the English language include of
,
the
, this
, a
, and
and
.
When a search request is processed, these words are skipped from the searched object fields and the Analyzed and Edgengram sub-fields.
Similarity score
When searching for a phrase the system looks at each word in the phrase and checks the match type and place of match for this work. Each match type and place has its own boost score. The boost score is normalized with the number of times the searched term appears in the given content (the more it appears the better), as well as with the number of times this term appears in the search index (the more common the term is, the less impact it has on the rank). The default settings are listed in Table 1.
Match types reflect how well your search query matches the results:
- Raw: Exact matches of the search term.
- Analyzed: Matches that are created by language analyzer. In this case,
stemming is used, that is, looking for the root of the word. Stop
words are ignored. For example,
focusing
will also findfocus
,focused
, and other related words with the same stem. - Edgengram: Partial match, used for wildcard search matches and matches in search-as-you-type queries. Stop words are ignored.
Match place | Match type | ||
---|---|---|---|
Raw | Analyzed | Edgengram | |
subject |
1.0 | 1.0 | 1.0 |
body |
0.1 | 0.1 | 0.1 |
tag |
0.5 | 0.5 |
Analyzed
and
Edgengram
fields. Proximity score
The proximity score checks how close is the term the user searches for to what
appears in the content – to boost more relevant results. When a user searches for a
phrase built from several words, this phrase may appear exactly the same way in the
content or it may appear in the content in a slightly different way. For example,
content with the term product one-pager brochure
is an approximate
match when searching for product brochure
.
Types of proximity boosts:
- Exact match: When all the search terms appear in the content next to each other
- Proximity match: When all the search terms appear less than three words apart from each other
The default settings are listed in Table 2.
Place | Proximity boost | Exact match boost |
---|---|---|
Subject | 0.5 | 1.6 |
Body | 0.5 | 1.0 |
Tags** | 0.1 | 1.0 |
** Having proximity score on tags is unlikely to happen.
raw
sub-field
for tags and only analyzed
sub-field for other search fields. A higher the boost value, more matches in the given field, and the match method get ranked higher. Stop words are ignored.
Besides the matches, we also look for frequency. The score has a lot to do with how
many occurrences of the word user is searching for exists in the field. For example,
if a 20,000-word essay makes a single reference to the movie Finding
Nemo
somewhere in the document and another document in the system has
only 50 words and includes Finding Nemo
, the latter is counted more
relevant to a query for nemo
.
Outcome type
Content in Jive can be marked with structured outcomes. These outcomes impact the score of that content in the search results, results are boosted based on outcome type.
The boosts given to content according to outcome type are listed in Table 3.
Outcome | Boost | Outcome | Boost |
---|---|---|---|
Finalized | 1.4 | Official | 2.0 |
Outdated | 0.1 | Default | 1.0 |
This score is being multiplied by the boosts above. A higher boost results in that content being ranked higher in the search results, so the 0.1 score for outdated documents significantly reduces their rank.
Object type
Similarly to outcome boost, there is a boost for ranks based on the type of content used. Documents and blogs are ranked higher in the search results as these are usually used for more comprehensive content that may be more relevant for the searching user. The settings are listed in Table 4.
Object | Boost | Object | Boost |
---|---|---|---|
Document | 1.4 | Poll | 1.0 |
Blog | 1.4 | Idea | 1.0 |
Blog post | 1.4 | Video | 1.0 |
Discussion | 1.0 | Status Update | 1.0 |
Question | 1.0 |
Recency
Recency (or time decay) lowers the score for older content. The impact of content can be seen this way:
Figure: Default recency boost

The recency score calculation is based on the following parameters:
- Drop speed
- Determines how fast the algorithm reduces the content score by age. The default setting in 50.
- Max value
- Determines the latest period the content from which has the same score without decay. The default setting is 4 weeks.
- Minimum score
- Determines the score difference of a very old document and a just created one as 2 times as maximum. It is set so that even the oldest relevant content can be found but allows preference for fresh content. The default setting is 0.9.