Some Restrictions Apply
Calliope is a text reader/synthesizer from the French software company Astefo. The Calliope process requires a certain amount of human interaction to determine the relevance and applicability of extracted word/phrase lists. Calliope is ideal for companies whose unstructured textual sources are better utilized when separate analytics are performed on each of multiple related bodies of work. Calliope departs somewhat from the usual idea of text analytics software: at the outset, Astefo declares that Calliope is not a crawler that seeks information from web documents, but rather is for use on a company’s internal repository of documents, no matter how massive. Repeated Calliope-processing of related but separate document collections is also encouraged to refine the effectiveness of the total analysis.
Calliope, The Verb
Calliope-processing of textual data is how Astefo describes the four-step procedure that is performed on a given body of documents. This text data mining process is used to reveal themes and concepts that may not have been readily evident previously. The analysis uses ranking algorithms that gauge the importance (precedence) of discovered ideas. Subject matter experts may then prioritize the results so that further processing can exploit initially unforeseen terms, which Astefo calls “weak signals,” that may hold potential as newly discovered facts or attributes.
Calliope-processing of Textual Data, Phase One
Calliope uses two phases to perform automated text mining of written matter and then gauge the dynamics of the trends discovered. The first phase is the analysis process performed on a set of analogous documents. The second phase quantifies and categorizes the results of multiple processing runs on related collections.
- The first Calliope step, called “Pre-formatting,” converts all sources (likely to be in various document formats) into XML, which allows standardized machine reading.
- Next, the XML documents undergo “Extraction,” which creates topical word lists. These lists are reviewed and validated for relevance by subject matter experts. Then, automatic indexing occurs on the lists. If Calliope isn’t able to come up with any topic lists, the user creates them within a software-guided sequence. Astefo concedes that this may be the most labor-intensive (i.e., lengthiest) step, sometimes requiring 4 to 5 hours of user attention for every 150 pages of text. However, once completed, every unstructured text document is linked to a list of validated terms describing the content of all analyzed text.
- The heart of the analysis process, called “Co-Word Identification,” applies specific algorithms to determine the relationship of words that are “thematically homogeneous.” Co-Words are analogous terms found within the body of analyzed text that both 1) are linked by the same topic thread and 2) frequently recur within it.
- And finally, the “Viewer” mode presents the user with various displays such as thematic maps and resultant trend curves. These show the related documents and how they are linked with other word lists and other topics within the data.
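To make the Co-Word Identification idea concrete, here is a minimal sketch of how co-occurring terms could be scored. Astefo does not publish its algorithms, so everything here is an assumption: the function name `co_word_strengths`, the `min_cooccurrence` threshold, and the use of the equivalence index (a measure commonly used in general co-word analysis) are illustrative choices, not Calliope's actual method. The input is assumed to be one set of validated terms per document, as produced by the Extraction step.

```python
from collections import Counter
from itertools import combinations


def co_word_strengths(doc_terms, min_cooccurrence=2):
    """Score pairs of terms that recur together across documents.

    doc_terms: iterable of term collections, one per document (hypothetical input).
    Returns {(term_a, term_b): strength} for pairs that co-occur at least
    min_cooccurrence times. Strength is the equivalence index
    c_ab**2 / (c_a * c_b), an assumed stand-in for Calliope's own measure.
    """
    term_counts = Counter()   # number of documents containing each term
    pair_counts = Counter()   # number of documents containing both terms

    for terms in doc_terms:
        unique = sorted(set(terms))  # sort so each pair has one canonical order
        term_counts.update(unique)
        pair_counts.update(combinations(unique, 2))

    strengths = {}
    for (a, b), c_ab in pair_counts.items():
        if c_ab >= min_cooccurrence:  # keep only pairs that recur
            strengths[(a, b)] = c_ab ** 2 / (term_counts[a] * term_counts[b])
    return strengths


if __name__ == "__main__":
    docs = [
        {"battery", "lithium", "cost"},
        {"battery", "lithium"},
        {"battery", "cost"},
        {"solar"},
    ]
    for pair, strength in sorted(co_word_strengths(docs).items()):
        print(pair, round(strength, 3))
```

The recurrence threshold mirrors the second Co-Word criterion (terms must recur frequently), while the shared-document count approximates the first (terms linked by the same topic thread).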
Calliope Repetition: Phase Two
This Calliope process is repeated on as many collections of documents as desired by the users. Then, once the processing has been completed, all results are automatically quantified into segments of relative importance within the entire corpus. This is done by computing a figure of merit that indicates the words’ “attraction power,” i.e., their frequency of “participation” within a network of terms built during the Co-Word Identification step. If this attraction power increases over the course of each repeated analysis process, the term gains importance and is designated as “emerging.” Non-analogous words (those that are not topically related or that are not periodically recurring) are quantified as words that are “aging” or “declining” in importance. And if thematic relationships neither advance nor decline in importance, remaining somewhat constant, they are identified as “stable.” The Calliope Viewer again displays the resulting trend curves of all analyzed terms, sorted into one of the three categories: Emerging, Stable, or Declining.
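The three-way sorting described above can be sketched in a few lines. This is an illustration under stated assumptions, not Astefo's formula: attraction power is approximated here as the number of co-word links a term participates in, the trend is judged only from the first and last runs, and the 10% `tolerance` band for "stable" is an arbitrary choice. The function names are hypothetical.

```python
from collections import Counter


def attraction_power(links):
    """Count how many co-word links each term participates in (assumed measure)."""
    power = Counter()
    for a, b in links:
        power[a] += 1
        power[b] += 1
    return power


def classify_trend(powers, tolerance=0.1):
    """Label a term from its attraction power over successive runs."""
    first, last = powers[0], powers[-1]
    baseline = max(first, 1e-9)  # avoid dividing by zero for newly seen terms
    change = (last - first) / baseline
    if change > tolerance:
        return "Emerging"
    if change < -tolerance:
        return "Declining"
    return "Stable"


def trends_across_runs(runs, tolerance=0.1):
    """runs: list of link lists, one per repeated analysis (hypothetical input)."""
    per_run = [attraction_power(links) for links in runs]
    terms = set().union(*per_run)
    # A Counter returns 0 for terms absent from a given run.
    return {t: classify_trend([p[t] for p in per_run], tolerance) for t in terms}


if __name__ == "__main__":
    runs = [
        [("battery", "lithium"), ("battery", "cost"), ("cost", "subsidy")],
        [("battery", "lithium"), ("battery", "cost"), ("battery", "recycling")],
    ]
    for term, label in sorted(trends_across_runs(runs).items()):
        print(term, label)
```

Comparing only the endpoints keeps the sketch short; a fuller version might fit a slope across all runs so that a single noisy run does not flip a term's category.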
Astefo describes its Calliope text mining application as “a tool for stimulating your intelligence, suggesting paths for imagination and correlations, enabling you to widen your information space whilst reducing the duration for reading and analyzing.” The company offers in-depth information and a downloadable demo on its website.


