Content popularity
ESB3024 Router can make routing decisions based on content popularity. All incoming content requests are tracked to continuously update a content popularity ranking list. The popularity ranking algorithm is designed to let popular content quickly rise to the top while unpopular content decays and sinks towards the bottom.
Routing
A routing rule based on content popularity can be created by running the following command:
$ confcli services.routing.rules -w
Running wizard for resource 'rules'
Hint: Hitting return will set a value to its default.
Enter '?' to receive the help string
rules : [
rule can be one of
1: allow
2: consistentHashing
3: contentPopularity
4: deny
5: firstMatch
6: random
7: rawGroup
8: rawHost
9: split
10: weighted
Choose element index or name: contentPopularity
Adding a 'contentPopularity' element
rule : {
name (default: ): content_popularity_rule
type (default: contentPopularity):
contentPopularityCutoff (default: 10): 5
onPopular (default: ): edge-streamer
onUnpopular (default: ): offload
}
Add another 'rule' element to array 'rules'? [y/N]: n
]
Generated config:
{
  "rules": [
    {
      "name": "content_popularity_rule",
      "type": "contentPopularity",
      "contentPopularityCutoff": 5.0,
      "onPopular": "edge-streamer",
      "onUnpopular": "offload"
    }
  ]
}
Merge and apply the config? [y/n]: y
This rule will route requests for the top 5 most popular content items to edge-streamer and all other requests to offload.
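The decision made by this rule can be pictured with a minimal Python sketch. It is not the router's implementation: the function name `route_by_popularity` and the representation of the ranking as a list of content IDs ordered from most to least popular are assumptions made for illustration only.

```python
def route_by_popularity(content_id, ranking, cutoff, on_popular, on_unpopular):
    """Pick a routing target based on the content's position in the ranking.

    ranking: content IDs ordered from most to least popular (assumed shape).
    cutoff:  corresponds to contentPopularityCutoff; the config stores it as
             a number such as 5.0, so it is converted to an int here.
    """
    top_items = ranking[:int(cutoff)]
    return on_popular if content_id in top_items else on_unpopular


# Hypothetical ranking with six tracked items.
ranking = ["a", "b", "c", "d", "e", "f"]
print(route_by_popularity("a", ranking, 5.0, "edge-streamer", "offload"))
print(route_by_popularity("f", ranking, 5.0, "edge-streamer", "offload"))
```

Content that is not in the ranking at all is treated as unpopular in this sketch.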
The following configuration settings related to content popularity are available:
$ confcli services.routing.settings.contentPopularity
{
  "contentPopularity": {
    "enabled": true,
    "algorithm": "score_based",
    "sessionGroupNames": [],
    "popularityListMaxSize": 100000,
    "scoreBased": {
      "popularityDecayFraction": 0.2,
      "popularityPredictionFactor": 2.5,
      "requestsBetweenPopularityDecay": 1000
    },
    "timeBased": {
      "intervalsPerHour": 10
    }
  }
}
enabled
: Whether or not to track content popularity. When enabled is set to false, content popularity is not tracked. Note that routing on content popularity is still possible when enabled is false, provided content popularity has been tracked previously.
algorithm
: Choice of content popularity tracking algorithm. There are two possible choices: score_based or time_based (detailed below).
sessionGroupNames
: Names of the session groups for which content popularity should be tracked. If left empty, content popularity is tracked for all sessions. Content popularity is tracked globally, not per session group, but the popularity metrics are only updated for sessions belonging to these groups.
popularityListMaxSize
: The maximum number of unique content items to track for popularity.
scoreBased
: Configuration parameters unique to the score-based algorithm.
timeBased
: Configuration parameters unique to the time-based algorithm.
Size of Popularity List
The size of the popularity list is limited to prevent it from growing without bound. A single entry in the popularity ranking list consumes at most 180 bytes of memory. For example, setting the maximum size to 1000 would consume at most 180 ⋅ 1,000 = 180,000 B = 0.18 MB. If the content popularity list is full, a request for a new item replaces the least popular item.
Setting a very high maximum size does not impact performance; it only consumes more memory.
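As a rough aid for sizing, the arithmetic and the replacement behaviour described above can be sketched in Python. The 180-byte bound comes from the text; the function names and the use of a plain dictionary of scores are illustrative assumptions, not the router's internals.

```python
BYTES_PER_ENTRY = 180  # upper bound per list entry, as stated above


def max_popularity_list_bytes(max_size: int) -> int:
    """Worst-case memory use for a given popularityListMaxSize."""
    return BYTES_PER_ENTRY * max_size


def track_new_item(scores: dict, content_id: str, max_size: int) -> None:
    """Add a new item, replacing the least popular one if the list is full.

    scores maps content ID to a popularity score (assumed representation).
    """
    if content_id in scores:
        return
    if len(scores) >= max_size:
        least_popular = min(scores, key=scores.get)
        del scores[least_popular]
    scores[content_id] = 0


print(max_popularity_list_bytes(1000))     # 180000 bytes, i.e. 0.18 MB
print(max_popularity_list_bytes(100_000))  # 18000000 bytes, i.e. 18 MB
```

With the default popularityListMaxSize of 100000, the bound works out to 18 MB.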
Score-Based Algorithm
The requestsBetweenPopularityDecay parameter defines the number of requests
between each popularity decay update, an integral component of this feature.
The popularityPredictionFactor and popularityDecayFraction settings tune
the behaviour of the content popularity ranking algorithm, explained further
below.
Decay Update
To allow popular content to rise quickly and unpopular content to sink, a dynamic popularity ranking algorithm is used. The goal of the algorithm is to track content popularity in real time, allowing routing decisions based on the requested content’s popularity. The algorithm is applied at every decay update.
The algorithm uses current trending content to predict content popularity. The
popularityPredictionFactor setting regulates how much the algorithm should rely
on predicted popularity. A high prediction factor allows rising content to quickly
rise to high popularity but can also cause unpopular content with a sudden burst
of requests to wrongfully rise to the top. A low prediction factor can cause
stagnation in the popularity ranking, not allowing new popular content to rise
to the top.
Unpopular content decays in popularity, the magnitude of which is regulated by
popularityDecayFraction. A high value will aggressively decay content
popularity on every decay update while a low value will bloat the ranking,
causing stagnation. Once content decays to a trivially low popularity score, it
is pruned from the content popularity list.
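The exact scoring formula is not given in this document. The Python sketch below is therefore only a toy model of how the three settings could interact: the update rule (old score scaled by one minus the decay fraction, plus the prediction factor times the requests seen since the last update), the class name, and the `prune_below` threshold are all assumptions made for illustration.

```python
class ScoreBasedPopularity:
    """Toy model of a decay-based popularity ranking (not the router's code)."""

    def __init__(self, decay_fraction=0.2, prediction_factor=2.5,
                 requests_between_decay=1000, prune_below=0.01):
        self.decay_fraction = decay_fraction
        self.prediction_factor = prediction_factor
        self.requests_between_decay = requests_between_decay
        self.prune_below = prune_below  # hypothetical "trivially low" score
        self.scores = {}   # content ID -> popularity score
        self.recent = {}   # content ID -> requests since the last decay update
        self.requests = 0

    def record(self, content_id):
        """Count one request and run a decay update when one is due."""
        self.recent[content_id] = self.recent.get(content_id, 0) + 1
        self.requests += 1
        if self.requests >= self.requests_between_decay:
            self.decay_update()

    def decay_update(self):
        """Decay old scores, boost trending content, prune negligible scores."""
        for content_id in set(self.scores) | set(self.recent):
            old = self.scores.get(content_id, 0.0)
            boost = self.prediction_factor * self.recent.get(content_id, 0)
            new = old * (1 - self.decay_fraction) + boost  # assumed formula
            if new < self.prune_below:
                self.scores.pop(content_id, None)
            else:
                self.scores[content_id] = new
        self.recent.clear()
        self.requests = 0

    def ranking(self):
        """Content IDs ordered from most to least popular."""
        return sorted(self.scores, key=self.scores.get, reverse=True)
```

In this model, a larger `prediction_factor` lets a burst of recent requests outweigh accumulated history, and a larger `decay_fraction` shrinks old scores faster, which mirrors the trade-offs described above.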
When configuring these tuning parameters, the most crucial data to consider is
the size of your asset catalog, i.e. the number of unique content items you offer.
The recommended values, obtained through testing, are presented in the table below.
Note that the popularityPredictionFactor setting is the principal factor in
controlling the algorithm’s behaviour.
Catalog size n | Popularity prediction factor | Popularity decay fraction |
---|---|---|
n < 1000 | 2.2 | 0.2 |
1000 < n < 5000 | 2.3 | 0.2 |
5000 < n < 10000 | 2.5 | 0.2 |
n > 10000 | 2.6 | 0.2 |
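The table can be turned into a small lookup helper, sketched here in Python. The function name is hypothetical. The table does not say which row applies to a catalog of exactly 1000, 5000, or 10000 items; this sketch assumes those boundary sizes fall into the larger range.

```python
def recommended_score_based_settings(catalog_size: int) -> dict:
    """Return the recommended scoreBased values for a given catalog size.

    Boundary sizes (1000, 5000, 10000) are assigned to the larger range,
    which is an assumption; the table leaves them unspecified.
    """
    if catalog_size < 1000:
        factor = 2.2
    elif catalog_size < 5000:
        factor = 2.3
    elif catalog_size < 10000:
        factor = 2.5
    else:
        factor = 2.6
    return {
        "popularityPredictionFactor": factor,
        "popularityDecayFraction": 0.2,
    }


print(recommended_score_based_settings(7500))
```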
Time-Based Algorithm
The time-based algorithm only requires the configuration parameter
intervalsPerHour. As an example, setting intervalsPerHour to 10
would give ten six-minute intervals per hour. Within each interval,
every unique content item has an associated counter that increases
by one for each incoming request for that item. After an hour, all
intervals have been cycled through. The counters in the first interval
are then reset, and incoming content requests increase the counters in
the first interval again. This cycle continues indefinitely.
A content item’s popularity is the sum of its counters across all intervals, and these sums determine the popularity ranking.
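The interval cycle described above behaves like a ring of counters. The following Python sketch illustrates that idea under stated assumptions: the class and method names are hypothetical, and timestamps are passed in explicitly as seconds so that the example is deterministic. It is not the router's implementation.

```python
from collections import Counter


class TimeBasedPopularity:
    """Ring of per-interval request counters covering one hour."""

    def __init__(self, intervals_per_hour=10):
        self.interval_seconds = 3600 / intervals_per_hour
        self.buckets = [Counter() for _ in range(intervals_per_hour)]
        self.current_slot = None  # absolute interval number of the latest request

    def _advance(self, now):
        """Move to the interval containing `now`, resetting reused intervals."""
        slot = int(now // self.interval_seconds)
        if self.current_slot is None:
            self.current_slot = slot
            return
        if slot > self.current_slot:
            # Reset every interval entered since the last request,
            # but never more than one full cycle of the ring.
            steps = min(slot - self.current_slot, len(self.buckets))
            for i in range(1, steps + 1):
                self.buckets[(self.current_slot + i) % len(self.buckets)].clear()
            self.current_slot = slot

    def record(self, content_id, now):
        """Count one request for content_id at time `now` (in seconds)."""
        self._advance(now)
        self.buckets[self.current_slot % len(self.buckets)][content_id] += 1

    def popularity(self, content_id):
        """Sum of the item's counters across all intervals."""
        return sum(bucket[content_id] for bucket in self.buckets)

    def ranking(self):
        """Content IDs ordered from most to least popular."""
        totals = Counter()
        for bucket in self.buckets:
            totals.update(bucket)
        return [content_id for content_id, _ in totals.most_common()]
```

With `intervals_per_hour=10`, a request recorded at 0 seconds lands in the first interval, and a request at 3600 seconds reuses that interval after its counters have been reset.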