Client: Confidential

In this project, our task was to extract every meaningful time reference from transcripts of radio programs and to link each one to a calendar control on the web page. When a user selects a date (or a week, a month, or a year) on the calendar, the matching time references are highlighted in the transcript. Conversely, when a user clicks a highlighted time reference in the text (for example, "next week"), the corresponding period on the calendar is highlighted, resolved relative to the date of the radio program transcript. The time references ranged from exact dates such as "March 17th 2008" to relative expressions such as "last month", "next year", "yesterday", and "in two days", along with many other combinations.
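The sketch below illustrates how relative time references can be resolved against the transcript date. It is a minimal Python example and not the project's actual implementation. The function names (resolve, week_range, month_range, overlaps), the small phrase set, and the Monday-to-Sunday week convention are all assumptions made for illustration.

```python
import re
from datetime import date, timedelta

# Hypothetical number-word table; a real system would cover far more phrasing.
NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
                "six": 6, "seven": 7}

def week_range(d):
    """Monday-to-Sunday week containing d (week convention is an assumption)."""
    start = d - timedelta(days=d.weekday())
    return start, start + timedelta(days=6)

def month_range(year, month):
    """First and last day of the given month."""
    start = date(year, month, 1)
    next_start = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    return start, next_start - timedelta(days=1)

def resolve(phrase, broadcast):
    """Map a relative phrase to an inclusive (start, end) date range,
    relative to the broadcast date. Returns None for unknown phrases."""
    p = phrase.lower().strip()
    if p == "yesterday":
        day = broadcast - timedelta(days=1)
        return day, day
    m = re.fullmatch(r"in (\w+) days?", p)
    if m:
        word = m.group(1)
        n = int(word) if word.isdigit() else NUMBER_WORDS.get(word)
        if n is None:
            return None
        day = broadcast + timedelta(days=n)
        return day, day
    if p == "next week":
        return week_range(broadcast + timedelta(days=7))
    if p == "last month":
        last_of_prev = broadcast.replace(day=1) - timedelta(days=1)
        return month_range(last_of_prev.year, last_of_prev.month)
    if p == "next year":
        return date(broadcast.year + 1, 1, 1), date(broadcast.year + 1, 12, 31)
    return None

def overlaps(reference, selection):
    """True if a resolved reference range intersects the calendar selection."""
    return reference[0] <= selection[1] and selection[0] <= reference[1]
```

Clicking a reference would call resolve to find the period to highlight on the calendar. Selecting a period on the calendar would run overlaps against every resolved reference to decide which words to highlight in the text.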

As a continuation of the Time Named Entity Extraction work, we have been developing a module that automatically extracts candidate keywords from the transcript of a radio program. We used a combination of several NLP techniques, including classical term-frequency calculation, together with proprietary algorithms that take the context of each keyword into account. In particular, they account for the fact that a keyword may be mentioned only once in the text while many of its related words are mentioned many times.
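The proprietary algorithms are not described here, so the following Python sketch only illustrates the general idea with a simple stand-in: plain term frequency plus a boost from related words. The related-word map, the weight parameter, the stopword list, and the function names are hypothetical and chosen purely for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "and", "is", "are", "of", "to", "in", "on", "made"}

def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]

def keyword_scores(text, related, weight=0.5):
    """Score each term as its own frequency plus a weighted sum of the
    frequencies of its related words (assumed scoring rule, not the
    project's proprietary one)."""
    tf = Counter(tokenize(text))
    scores = {}
    for term, count in tf.items():
        boost = sum(tf[r] for r in related.get(term, ()) if r != term)
        scores[term] = count + weight * boost
    return scores

def top_keywords(text, related, n=3, weight=0.5):
    """Return the n highest-scoring terms, ties broken alphabetically."""
    scores = keyword_scores(text, related, weight)
    return sorted(scores, key=lambda t: (-scores[t], t))[:n]
```

With a map such as {"election": {"voters", "candidates", "ballots"}}, a transcript that says "election" once but mentions voters and candidates repeatedly would still rank "election" highly, which is the effect the paragraph above describes.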

