Data is extracted from the Lund
University Publications database. For each time-period all
publications from EIT are selected. From each publication the fields
'Title', 'Keywords', and 'Abstract' are used. They are concatenated
(with weights 10 for 'Title', 5 for 'Keywords', and 1 for 'Abstract')
into a single document, which is added to a corpus. From the corpus the 50 most
important words and the 50 most important bigrams (2 consecutive
words) are extracted. Importance is measured by frequency of occurrence
adjusted for document size. For each group of terms (words, bigrams)
the importance measure is scaled independently to a fixed interval,
generating a score, and the two lists are merged based on the score.
These 100 terms form the base for generating a TagCloud where the size of
each term is proportional to its score.
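
The scoring steps above can be sketched roughly as follows. This is only an illustration under several assumptions that the description does not settle: "adjusted for document size" is read as dividing each weighted count by the total weighted count of its document, the fixed interval is taken to be [0, 1] with min-max scaling, and the field weights are applied by multiplying counts rather than by literally repeating text (which also avoids bigrams that straddle two fields). The names term_counts, importance, scaled_top and tag_scores are hypothetical, and no stop-word filtering is shown.

```python
import re
from collections import Counter

# Field weights as stated in the text.
WEIGHTS = {"title": 10, "keywords": 5, "abstract": 1}


def term_counts(pub):
    """Return weighted (word, bigram) counts for one publication dict."""
    words, bigrams = Counter(), Counter()
    for field, weight in WEIGHTS.items():
        tokens = re.findall(r"[a-z]+", pub.get(field, "").lower())
        for tok in tokens:
            words[tok] += weight
        # Bigrams are formed inside a field only (assumption).
        for first, second in zip(tokens, tokens[1:]):
            bigrams[first + " " + second] += weight
    return words, bigrams


def importance(corpus, which):
    """Sum size-normalised frequencies over the corpus.

    which = 0 selects words, which = 1 selects bigrams.
    """
    total = Counter()
    for pub in corpus:
        counts = term_counts(pub)[which]
        size = sum(counts.values())
        if size == 0:
            continue  # nothing to count in this document
        for term, n in counts.items():
            total[term] += n / size
    return total


def scaled_top(scores, n=50, lo=0.0, hi=1.0):
    """Keep the n highest-scoring terms and rescale them to [lo, hi]."""
    top = scores.most_common(n)
    if not top:
        return {}
    values = [v for _, v in top]
    vmin, vmax = min(values), max(values)
    span = (vmax - vmin) or 1.0  # avoid division by zero when all are equal
    return {t: lo + (hi - lo) * (v - vmin) / span for t, v in top}


def tag_scores(corpus, n=50):
    """Scale words and bigrams independently, then merge by score."""
    merged = {}
    merged.update(scaled_top(importance(corpus, 0), n))
    merged.update(scaled_top(importance(corpus, 1), n))
    return sorted(merged.items(), key=lambda kv: kv[1], reverse=True)
```

Calling tag_scores on the publications of one time-period yields up to 100 (term, score) pairs, from which term sizes can be derived.
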
In order to generate a time animation, a number of time-periods,
each spanning 2 years, are used. The layout is calculated from the
bounding rectangles of the terms, each rendered in the selected font at
a size based on its maximum score over all time-periods. These
rectangles are placed in fixed positions on the
screen with the largest in the middle. For each time-period a single
image is generated with each term in its fixed position but with the
font-size calculated from the actual score for that time-period.
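
A minimal sketch of how the two kinds of font size might be derived from the scores is given below. It assumes scores in [0, 1], a strictly proportional mapping to point size, a hypothetical maximum of 48 pt, and a score of 0 for terms absent from a time-period; the actual placement of rectangles on the screen is not shown. The function name layout_and_frame_sizes is illustrative.

```python
def layout_and_frame_sizes(period_scores, max_pt=48.0):
    """Derive layout sizes and per-period sizes from term scores.

    period_scores: list of {term: score} dicts, one per time-period.
    Returns (layout_pt, frame_pt, order) where
      layout_pt maps each term to the size used to reserve its rectangle,
      frame_pt  holds one {term: size} dict per time-period,
      order     lists terms from largest to smallest layout size.
    """
    terms = sorted({t for scores in period_scores for t in scores})

    # Layout uses the maximum score over all time-periods, so the
    # reserved rectangle is large enough for every frame.
    peak = {t: max(s.get(t, 0.0) for s in period_scores) for t in terms}
    layout_pt = {t: max_pt * peak[t] for t in terms}

    # Each time-period image uses the actual score for that period.
    frame_pt = [
        {t: max_pt * scores.get(t, 0.0) for t in terms}
        for scores in period_scores
    ]

    # Largest terms first, so they can be placed nearest the middle.
    order = sorted(terms, key=lambda t: layout_pt[t], reverse=True)
    return layout_pt, frame_pt, order
```

Because positions depend only on layout_pt, a term stays in the same place in every image while its rendered size follows frame_pt.
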
The animation is constructed by interpolating (25 frames, cosine
interpolation) between the time-period images, generating the
intermediate frames which together constitute the animation.
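
The interpolation step can be illustrated with the sketch below. It assumes each image is represented as a flat list of numeric values (for example pixel intensities) and that the 25 frames of a transition include the starting image but not the ending one, so that consecutive transitions do not repeat a time-period image; the description does not state which convention is used. The function names are hypothetical.

```python
import math


def cosine_weight(t):
    """Map t in [0, 1] to a smooth blend weight in [0, 1]."""
    return (1.0 - math.cos(math.pi * t)) / 2.0


def cosine_frames(start, end, n_frames=25):
    """Return n_frames blends going from start towards end (end excluded)."""
    frames = []
    for i in range(n_frames):
        w = cosine_weight(i / n_frames)
        frames.append([(1.0 - w) * a + w * b for a, b in zip(start, end)])
    return frames


def animate(keyframes, n_frames=25):
    """Chain the transitions between successive time-period images."""
    frames = []
    for start, end in zip(keyframes, keyframes[1:]):
        frames.extend(cosine_frames(start, end, n_frames))
    frames.append(list(keyframes[-1]))  # finish on the last image
    return frames
```

Compared with linear blending, the cosine weight changes slowly near each time-period image and faster in between, so the animation eases in and out of every period.
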