Detecting new and emerging events in streaming news documents. (English) Zbl 1250.68272

Summary: Recognizing new and emerging events in a stream of news documents requires understanding the semantic structure of news reported in natural language. New event detection (NED) is the task of recognizing when a news document discusses a completely novel event. To be successful at this task, we believe a NED method must extract and represent four principal components of an event: its type, its participants, and its temporal and spatial properties. These components must then be compared in a semantically robust manner to detect novelty. We further propose event centrality, a method for determining the most important participants in an event. Our NED methods produce a 29% cost reduction over a bag-of-words baseline and a 17% cost reduction over an existing state-of-the-art approach. Additionally, we discuss our method for recognizing emerging events, that is, for tracking and categorizing unexpected or novel events.

MSC:

68T50 Natural language processing
68U15 Computing methodologies for text processing; mathematical typography
68P20 Information storage and retrieval of data

Software:

WordNet
