A New Approach to Summarization in the Kannada Language by Sentence Ranking

Authors

  • Jayashree R, Department of Computer Science and Engineering, PES Institute of Technology
  • Srikantamurthy K, Department of Computer Science and Engineering, PES Institute of Technology
  • Basavaraj S Anami, Department of Computer Science and Engineering, KLE Institute of Technology

Keywords:

Summary, Keywords, GSS coefficient, TF, IDF, Ranking, Word weight, Sentence.

Abstract

Text summarization aims to produce a quick, concise summary of a document and is considered central to Information Retrieval (IR) systems. In this paper, we present a sentence-ranking method for summarizing Kannada-language text. Each word in a Kannada document is assigned a weight, and the weight of a sentence is computed as the sum of the weights of all words in that sentence. The sentences are arranged in descending order of weight, and the first ‘m’ are chosen as the summary. The test data is drawn from documents available on the Kannada web portal Kannada webdunia.
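The ranking step described above can be illustrated with a minimal sketch. It assumes word weights are already available as a dictionary and that words are separated by whitespace. The names `rank_sentences`, `word_weight`, and `m` are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
def rank_sentences(sentences, word_weight, m):
    """Return the m sentences with the highest total word weight, heaviest first.

    sentences   -- list of sentence strings
    word_weight -- dict mapping a word to its weight (assumed precomputed)
    m           -- number of sentences to keep in the summary
    """
    def sentence_weight(sentence):
        # Assumption: whitespace tokenization; unknown words contribute 0.
        return sum(word_weight.get(word, 0.0) for word in sentence.split())

    # Sort in descending order of sentence weight and keep the first m.
    ranked = sorted(sentences, key=sentence_weight, reverse=True)
    return ranked[:m]
```

A summary that reads naturally may also need the selected sentences restored to their original document order, which the abstract does not specify.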

In this methodology, keywords are extracted from Kannada-language documents by combining two feature extraction techniques: Term Frequency (TF) and Inverse Document Frequency (IDF). Stop words are removed using a technique developed to find structurally similar words in a document. The methodology is compared with keyword-extraction-based summarization [18]. The results are satisfactory.
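As a rough illustration of combining TF and IDF into a word weight, the sketch below uses the common textbook form, relative term frequency multiplied by log(N / document frequency). The abstract does not state the exact formula used, so this form is an assumption, and the function name `tf_idf_weights` is hypothetical.

```python
import math
from collections import Counter

def tf_idf_weights(documents):
    """Compute a TF-IDF weight for every word in every document.

    documents -- list of documents, each a list of word tokens
    Returns one {word: weight} dict per document.
    """
    n_docs = len(documents)

    # Document frequency: in how many documents each word appears.
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))

    weights = []
    for tokens in documents:
        term_freq = Counter(tokens)
        total = len(tokens)
        # Assumed formula: (count / document length) * log(N / df).
        weights.append({
            word: (count / total) * math.log(n_docs / doc_freq[word])
            for word, count in term_freq.items()
        })
    return weights
```

Under this form, a word that occurs in every document receives a weight of zero, which is why TF-IDF tends to push very common words toward the bottom of a keyword ranking.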

Published

2013-04-01

How to Cite

Jayashree R, Srikantamurthy K, & Basavaraj S Anami. (2013). A New Approach to Summarization in the Kannada Language by Sentence Ranking. Journal of Network and Innovative Computing, 1, 15. Retrieved from https://cspub-jnic.org/index.php/jnic/article/view/24

Section

Original Article