Elektro- och informationsteknik

Lunds Tekniska Högskola | Lunds universitet

TagCloud time animation of EIT publications


Technique behind TagCloud Animation generation

Data is extracted from the Lund University Publications database. For each time-period, all publications from EIT are selected. From each publication the fields 'Title', 'Keywords', and 'Abstract' are used. They are concatenated (with weights 10 for 'Title', 5 for 'Keywords', and 1 for 'Abstract') into a document, which is added to a corpus. From the corpus the 50 most important words and the 50 most important bigrams (2 consecutive words) are extracted. Importance is measured by frequency of occurrence adjusted for document size. For each group of terms (words, bigrams) the importance measure is scaled independently to a fixed interval, generating a score, and the two lists are merged based on the score. These 100 terms form the basis for generating a TagCloud in which the size of each term is proportional to its score.
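The extraction and scoring steps described above can be sketched roughly as follows. This is a minimal illustration and not the code used for the animation: the dictionary keys, the function names, the tokenizer, and the target interval [0.1, 1.0] are all assumptions. The field weights are applied by multiplying term counts, which for single words has the same effect as repeating the field in a concatenated document while avoiding bigrams that span field boundaries. "Adjusted for document size" is interpreted here as dividing by the total weighted count of the document, and stop-word filtering is left out.

```python
import re
from collections import Counter

# Field weights taken from the text; the dictionary keys are assumed names.
WEIGHTS = {"title": 10, "keywords": 5, "abstract": 1}

def tokenize(text):
    # Assumed tokenizer: lowercase alphabetic runs only.
    return re.findall(r"[a-z]+", text.lower())

def weighted_counts(pub, n):
    """Weighted n-gram counts for one publication (n=1 words, n=2 bigrams)."""
    counts = Counter()
    for field, weight in WEIGHTS.items():
        tokens = tokenize(pub.get(field, ""))
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += weight
    return counts

def importance(corpus, n):
    """Sum over documents of the weighted count divided by document size."""
    scores = Counter()
    for pub in corpus:
        counts = weighted_counts(pub, n)
        size = sum(counts.values())
        for term, count in counts.items():
            scores[term] += count / size
    return scores

def scaled_top(scores, k, lo=0.1, hi=1.0):
    """Keep the k best terms and min-max scale them to [lo, hi] (assumed interval)."""
    top = scores.most_common(k)
    if not top:
        return {}
    values = [value for _, value in top]
    vmin, vmax = min(values), max(values)
    span = (vmax - vmin) or 1.0
    return {term: lo + (hi - lo) * (value - vmin) / span for term, value in top}

def tag_terms(corpus, k=50):
    """Scale words and bigrams independently, then merge them by score."""
    merged = {}
    merged.update(scaled_top(importance(corpus, 1), k))
    merged.update(scaled_top(importance(corpus, 2), k))
    return sorted(merged.items(), key=lambda item: item[1], reverse=True)
```

Calling tag_terms on the list of publication dictionaries for one time-period returns up to 100 (term, score) pairs, highest score first.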

In order to generate a time animation, a number of time-periods spanning 2 years are used. The layout is calculated using rectangles for the rendered terms, with the font size of each term selected based on its maximum score over all time-periods. These rectangles are placed in fixed positions on the screen with the largest in the middle. For each time-period a single image is generated with each term in its fixed position but with the font size calculated from the actual score for that time-period.
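One possible way to obtain such a fixed layout is sketched below. The text only states that the rectangles are sized from the maximum score and that the largest sits in the middle; the spiral search, the crude text-size estimate, and every name and parameter here (max_scores, font_size, layout, max_pt, step, growth) are assumptions made for illustration.

```python
import math

def max_scores(period_scores):
    """period_scores: one {term: score} dict per time-period."""
    best = {}
    for scores in period_scores:
        for term, score in scores.items():
            best[term] = max(best.get(term, 0.0), score)
    return best

def font_size(score, max_pt=48.0):
    # Size proportional to the score; max_pt is an assumed upper bound.
    return max_pt * score

def estimate_box(term, size):
    # Crude assumption: glyphs are 0.6 em wide and lines 1.2 em high.
    return 0.6 * size * len(term), 1.2 * size

def overlaps(a, b):
    """Boxes are (center x, center y, width, height)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return abs(ax - bx) * 2 < aw + bw and abs(ay - by) * 2 < ah + bh

def layout(best, step=0.1, growth=0.5):
    """Place boxes largest first, walking outward on a spiral until free."""
    placed = {}
    for term in sorted(best, key=best.get, reverse=True):
        w, h = estimate_box(term, font_size(best[term]))
        angle = 0.0
        while True:
            r = growth * angle
            box = (r * math.cos(angle), r * math.sin(angle), w, h)
            if not any(overlaps(box, other) for other in placed.values()):
                placed[term] = box
                break
            angle += step
    return placed
```

Because each box is sized from the term's maximum score, a term drawn at its per-period size with font_size always fits inside its reserved rectangle, so the positions can stay fixed over the whole animation.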

The animation is constructed by interpolating (25 frames, cosine interpolation) between the time-period images, generating a large number of frames which together constitute the animation.
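Cosine interpolation replaces the linear blend factor t with (1 - cos(pi * t)) / 2, so that each transition starts and ends smoothly. The sketch below assumes that the quantity being interpolated is the per-term font size and that 25 frames are generated per transition; the text does not say whether the interpolation acts on font sizes or on the rendered images, and the function names are hypothetical.

```python
import math

def cosine_interpolate(a, b, t):
    """Blend from a (t=0) to b (t=1) with a cosine ease-in/ease-out."""
    w = (1.0 - math.cos(math.pi * t)) / 2.0
    return a * (1.0 - w) + b * w

def frames_between(sizes_a, sizes_b, n_frames=25):
    """Font sizes for the frames leading from one time-period to the next.

    The end point itself is omitted, since it is the first frame of the
    following transition. Terms missing from a period are given size 0.
    """
    terms = sizes_a.keys() | sizes_b.keys()
    frames = []
    for i in range(n_frames):
        t = i / n_frames
        frames.append({
            term: cosine_interpolate(sizes_a.get(term, 0.0),
                                     sizes_b.get(term, 0.0), t)
            for term in terms
        })
    return frames
```

Concatenating the frame lists of all consecutive time-period pairs, and appending the final period once, gives the full frame sequence.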