Top Research Articles @ 2024

ASSAMESE-ENGLISH BILINGUAL MACHINE TRANSLATION

Kalyanee Kanchan Baruah¹, Pranjal Das², Abdul Hannan¹ and Shikhar Kr Sarma¹, ¹Department of Information Technology, Gauhati University, Guwahati, Assam

ABSTRACT

Machine translation is the process of translating text from one language to another. In this paper, Statistical Machine Translation is performed on Assamese and English using a parallel corpus of the two languages. Moses, a statistical phrase-based translation toolkit, is used. To build the language model and to align the words, we used two other tools, IRSTLM and GIZA, respectively. The BLEU score is used to evaluate the performance of our translation system. The BLEU scores differ between the Assamese-to-English and English-to-Assamese directions. Because Indian languages are morphologically very rich, translation from English to Assamese is relatively harder, resulting in a lower BLEU score. A statistical transliteration system is also integrated with our translation system, mainly to handle proper nouns and OOV (out-of-vocabulary) words that are not present in our corpus.
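To make the evaluation metric concrete, the following is a minimal sketch of a sentence-level BLEU computation in Python. It is an illustration only, not the scoring script used in the paper: it assumes a single reference, whitespace tokenization and no smoothing, and the function names `ngrams` and `bleu` are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Simplified single-reference BLEU (assumption: no smoothing)."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # without smoothing, any zero precision gives BLEU 0
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return bp * math.exp(sum(log_precisions) / max_n)
```

A candidate identical to its reference scores 1.0, and a candidate sharing no words with the reference scores 0.0. Corpus-level BLEU, as usually reported for translation systems, aggregates the n-gram counts over all sentences before taking the ratio.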

KEYWORDS

Assamese, Machine translation, Moses, Corpus, BLEU

For More Details :
https://airccse.org/journal/ijnlc/papers/3314ijnlc07.pdf

Volume Link :
https://airccse.org/journal/ijnlc/vol3.html

RESUME INFORMATION EXTRACTION WITH A NOVEL TEXT BLOCK SEGMENTATION ALGORITHM

Shicheng Zu and Xiulai Wang, Post-doctoral Scientific Research Station in East War District General Hospital, Nanjing, Jiangsu 210000, China

ABSTRACT

In recent years, we have witnessed the rapid development of deep neural networks and distributed representations in natural language processing. However, the application of neural networks to resume parsing lacks systematic investigation. In this study, we propose an end-to-end pipeline for resume parsing based on neural network-based classifiers and distributed embeddings. This pipeline leverages the position-wise line information and integrated meanings of each text block. The coordinated line classification by both a line type classifier and a line label classifier effectively segments a resume into predefined text blocks. Our proposed pipeline joins the text block segmentation with the identification of resume facts, in which various sequence labelling classifiers perform named entity recognition within labelled text blocks. Comparative evaluation of four sequence labelling classifiers confirmed BLSTM-CNNs-CRF’s superiority in the named entity recognition task. Further comparison among three publicized resume parsers also determined the effectiveness of our text block classification method.

KEYWORDS

Resume Parsing, Word Embeddings, Named Entity Recognition, Text Classifier, Neural Networks

For More Details :
https://aircconline.com/ijnlc/V8N5/8519ijnlc03.pdf

Volume Link :
https://airccse.org/journal/ijnlc/vol8.html

GRAMMAR CHECKERS FOR NATURAL LANGUAGES: A REVIEW

Nivedita S. Bhirud¹, R.P. Bhavsar², B.V. Pawar³, ¹Department of Computer Engineering, Vishwakarma Institute of Information Technology, Pune, India, ²,³School of Computer Sciences, North Maharashtra University, Jalgaon, India

ABSTRACT

Natural Language Processing is an interdisciplinary branch of linguistics and computer science, studied under Artificial Intelligence (AI), that gave birth to an allied area called ‘Computational Linguistics’, which focuses on processing natural languages on computational devices. A natural language consists of many sentences, which are meaningful linguistic units involving one or more words linked together in accordance with a set of predefined rules called ‘grammar’. Grammar checking is a fundamental task in the formal world that validates sentences syntactically as well as semantically. The grammar checker is a prominent tool within language engineering. Our review draws on the development to date of various natural language grammar checkers to look at the past, present and future in the present context. It covers common grammatical errors, an overview of the grammar checking process, and grammar checkers for various languages, with the aim of surveying their approaches, methodologies and performance evaluation, which would be of great help in developing new tools and systems as a whole. The survey concludes with a discussion of the different features included in existing grammar checkers for foreign languages as well as a few Indian languages.

KEYWORDS

Natural Language Processing, Computational Linguistics, Writing errors, Grammatical mistakes, Grammar Checker

For More Details :
https://aircconline.com/ijnlc/V6N4/6417ijnlc01.pdf

Volume Link :
https://airccse.org/journal/ijnlc/vol6.html

Named Entity Recognition using Hidden Markov Model (HMM)

Sudha Morwal¹, Nusrat Jahan² and Deepti Chopra³, ¹Associate Professor, Banasthali University, Jaipur, Rajasthan-302001, ²M.Tech (CS), Banasthali University, Jaipur, Rajasthan-302001, ³M. Tech (CS), Banasthali University, Jaipur, Rajasthan-302001

ABSTRACT

Named Entity Recognition (NER) is a subtask of Natural Language Processing (NLP), which is a branch of artificial intelligence. It has many applications, mainly in machine translation, text-to-speech synthesis, natural language understanding, information extraction, information retrieval, question answering, etc. The aim of NER is to classify words into predefined categories such as location name, person name, organization name, date, time, etc. In this paper we describe in detail the Hidden Markov Model (HMM) based machine learning approach to identifying named entities. The main idea behind using an HMM to build an NER system is that it is language independent, so the system can be applied to any language domain. In our NER system the states are not fixed; the system is dynamic in nature, and one can use it according to one’s interest. The corpus used by our NER system is also not domain specific.
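As an illustration of how an HMM assigns entity tags, here is a minimal Viterbi decoding sketch in Python. It is not the authors' system: the tag set, the probabilities and the example sentence are hypothetical toy values chosen by hand, and unseen words are given an assumed small emission probability `unk` rather than being estimated from a corpus.

```python
import math


def viterbi(words, states, start_p, trans_p, emit_p, unk=1e-6):
    """Most probable tag sequence for `words` under a first-order HMM."""
    if not words:
        return []
    # best[t][s] = (log-probability of the best path ending in s, previous state)
    best = [{
        s: (math.log(start_p[s]) + math.log(emit_p[s].get(words[0], unk)), None)
        for s in states
    }]
    for word in words[1:]:
        column = {}
        for s in states:
            # Pick the predecessor state that maximizes the path score into s.
            prev, score = max(
                ((p, best[-1][p][0] + math.log(trans_p[p][s])) for p in states),
                key=lambda pair: pair[1],
            )
            column[s] = (score + math.log(emit_p[s].get(word, unk)), prev)
        best.append(column)
    # Backtrack from the best final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    return path[::-1]


# Hypothetical toy model: PER = person, LOC = location, O = other.
STATES = ["PER", "LOC", "O"]
START_P = {"PER": 0.3, "LOC": 0.1, "O": 0.6}
TRANS_P = {
    "PER": {"PER": 0.3, "LOC": 0.1, "O": 0.6},
    "LOC": {"PER": 0.1, "LOC": 0.3, "O": 0.6},
    "O": {"PER": 0.2, "LOC": 0.2, "O": 0.6},
}
EMIT_P = {
    "PER": {"Ravi": 0.5, "Anita": 0.5},
    "LOC": {"Jaipur": 0.6, "Rajasthan": 0.4},
    "O": {"lives": 0.4, "in": 0.4, "visited": 0.2},
}

tags = viterbi("Ravi lives in Jaipur".split(), STATES, START_P, TRANS_P, EMIT_P)
# tags == ["PER", "O", "O", "LOC"]
```

Because the state list is passed in as a parameter, the same decoder works for any tag set, which matches the abstract's point that the states are not fixed. In a real system the three probability tables would be estimated from an annotated corpus.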

KEYWORDS

Named Entity Recognition (NER), Natural Language Processing (NLP), Hidden Markov Model (HMM)

For More Details :
https://airccse.org/journal/ijnlc/papers/1412ijnlc02.pdf

Volume Link :
https://airccse.org/journal/ijnlc/vol1.html

SURVEY OF MACHINE TRANSLATION SYSTEMS IN INDIA

G V Garje¹ and G K Kharate², ¹Department of Computer Engineering and Information Technology PVG’s College of Engineering and Technology, Pune, India, ²Principal, Matoshri College of Engineering and Research Centre, Nashik, India

ABSTRACT

Work in the area of machine translation has been going on for the last few decades, but promising translation work began in the early 1990s owing to advanced research in Artificial Intelligence and Computational Linguistics. India is a multilingual and multicultural country with a population of over 1.25 billion and 22 constitutionally recognized languages, which are written in 12 different scripts. This necessitates automated machine translation systems for English to Indian languages and among Indian languages, so that people can exchange information in their local languages. Many usable machine translation systems have been developed and are under development in India and around the world. The paper focuses on the different approaches used in the development of Machine Translation Systems and also briefly describes some of these systems along with their features, domains and limitations.

KEYWORDS

Machine Translation, Example-based MT, Transfer-based MT, Interlingua-based MT

For More Details :
https://airccse.org/journal/ijnlc/papers/2513ijnlc04.pdf

Volume Link :
https://airccse.org/journal/ijnlc/vol2.html

MACHINE TRANSLATION DEVELOPMENT FOR INDIAN LANGUAGES AND ITS APPROACHES

Amruta Godase¹ and Sharvari Govilkar², ¹Department of Information Technology (AI & Robotics), PIIT, Mumbai University, India, ²Department of Computer Engineering, PIIT, Mumbai University, India

ABSTRACT

This paper presents a survey of machine translation systems for Indian regional languages. Machine translation is one of the central areas of Natural Language Processing (NLP). Machine translation (henceforth referred to as MT) is important for breaking the language barrier and facilitating interlingual communication. For a multilingual country like India, which is the largest democratic country in the world, there is a great need for automatic machine translation systems. With the advent of Information Technology, many documents and web pages are appearing in local languages, so there is a strong need for good MT systems to address these issues, to establish proper communication between the state and union governments, and to exchange information among the people of different states. This paper focuses on different machine translation projects done in India along with their features and domains.

KEYWORDS

Machine translation, computational linguistics, Indian Languages, Rule-based, Statistical, Empirical MT, Principle-based, Knowledge-based, Hybrid

For More Details :
https://airccse.org/journal/ijnlc/papers/4215ijnlc05.pdf

Volume Link :
https://airccse.org/journal/ijnlc/vol4.html

SURVEY ON MACHINE TRANSLITERATION AND MACHINE LEARNING MODELS

M L Dhore¹, R M Dhore², P H Rathod³, ¹,³Vishwakarma Institute of Technology, Savitribai Phule Pune University, India, ²Pune Vidhyarthi Girha’s College of Engineering and Technology, SPPU, India

ABSTRACT

Globalization and the growth in Internet users demand that almost all Internet-based applications support local languages. Support for local languages can be provided in all Internet-based applications by means of Machine Transliteration and Machine Translation. This paper provides a thorough survey of machine transliteration models and the machine learning approaches used for machine transliteration over a period of more than two decades, for internationally used languages as well as Indian languages. The survey shows that the linguistic approach provides better results for closely related languages, and that probability-based statistical approaches work well when one of the languages is phonetic and the other is non-phonetic. Better accuracy can be achieved only by using hybrid and combined models.

KEYWORDS

CRF, Grapheme, HMM, Machine Transliteration, Machine Learning, NCM, Phoneme, SVM

For More Details :
https://airccse.org/journal/ijnlc/papers/4215ijnlc02.pdf

Volume Link :
https://airccse.org/journal/ijnlc/vol4.html

Algorithm For Text To Graph Conversion And Summarizing Using NLP: A New Approach For Business Solutions

Prajakta Yerpude and Rashmi Jakhotiya and Manoj Chandak, Department of Computer Science and Engineering, RCOEM, Nagpur

ABSTRACT

Text can be analysed by splitting it and extracting the keywords. These may be represented as summaries, tabular representations, graphical forms, and images. The large amount of information present in textual format has led to research on extracting the text and transforming it from an unstructured form into a structured format. The paper presents the importance of Natural Language Processing (NLP) and two interesting applications of it in the Python language: 1. Automatic text summarization [Domain: Newspaper Articles] 2. Text to Graph Conversion [Domain: Stock news]. The main challenge in NLP is natural language understanding, i.e. deriving meaning from human or natural language input, which is done using regular expressions, artificial intelligence and database concepts. The Automatic Summarization tool converts newspaper articles into summaries on the basis of the frequency of words in the text. The Text to Graph Converter takes a stock article as input, tokenizes it on various indices (points and percent) and time, and then maps the tokens to a graph. This paper proposes a business solution for users for effective time management.
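The frequency-based summarization idea can be sketched in a few lines of Python. This is an illustrative reading of the abstract, not the authors' tool: the stop-word list, the regular expressions and the function name `summarize` are assumptions made for the example.

```python
import re
from collections import Counter

# Assumed, deliberately tiny stop-word list for illustration.
STOPWORDS = {"the", "a", "an", "and", "as", "of", "in", "on", "to", "is", "was"}


def content_words(text):
    """Lowercase word tokens with stop words removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def summarize(text, max_sentences=2):
    """Keep the sentences whose words are most frequent in the whole text."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    freq = Counter(content_words(text))

    def score(sentence):
        # A sentence scores the summed document frequency of its content words.
        return sum(freq[w] for w in content_words(sentence))

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:max_sentences])  # restore original sentence order
    return " ".join(sentences[i] for i in chosen)


article = ("Stocks rose sharply today. The weather was mild. "
           "Stocks and bonds both rose as markets rallied.")
summary = summarize(article, max_sentences=1)
# summary == "Stocks and bonds both rose as markets rallied."
```

Summing raw frequencies favours long sentences; dividing the score by sentence length is a common variation when that bias is unwanted.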

KEYWORDS

NLP, Automatic Summarizer, Text to Graph Converter, Data Visualization, Regular Expression, Artificial Intelligence

For More Details :
https://airccse.org/journal/ijnlc/papers/4415ijnlc03.pdf

Volume Link :
https://airccse.org/journal/ijnlc/vol4.html

SENTIMENT ANALYSIS ON PRODUCT FEATURES BASED ON LEXICON APPROACH USING NATURAL LANGUAGE PROCESSING

Ameya Yerpude, Akshay Phirke, Ayush Agrawal and Atharva Deshmukh, Department of Computer Science and Engineering, RCOEM, Nagpur, India

ABSTRACT

Sentiment analysis has played an important role in identifying what other people think and how they behave. Text can be analyzed for sentiment and classified as positive, negative or neutral. Applying sentiment analysis to product reviews on an e-market helps not only customers but also people in the industry to make decisions. A method that provides sentiment analysis of an individual product’s features is discussed here. This paper presents the use of Natural Language Processing and SentiWordNet in two applications in Python: 1. Sentiment analysis on product reviews [Domain: Electronic] 2. Sentiment analysis regarding the product’s features present in the product review [Sub Domain: Mobile Phones]. It uses a lexicon-based approach in which text is tokenized to calculate the sentiment of the product reviews on an e-market. The first part of the paper includes a sentiment analyzer which classifies the sentiment present in product reviews as positive, negative or neutral depending on the polarity. The second part of the paper is an extension of the first, in which the customer reviews containing the product’s features are segregated and these separated reviews are then classified as positive, negative or neutral using sentiment analysis. Here, mobile phones are used as the product, with features such as screen, processor, etc. This gives a business solution for users and industries for effective product decisions.
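The two-part, lexicon-based procedure described above can be sketched as follows in Python. This is a simplified illustration rather than the paper's implementation: a small hand-made dictionary `LEXICON` stands in for SentiWordNet, and the feature list and function names are hypothetical.

```python
import re

# Assumed toy polarity lexicon standing in for SentiWordNet scores.
LEXICON = {
    "good": 1.0, "great": 1.0, "excellent": 1.0, "sharp": 0.5, "fast": 0.5,
    "bad": -1.0, "poor": -1.0, "terrible": -1.0, "slow": -0.5, "dull": -0.5,
}
# Assumed product features for the mobile phone sub-domain.
FEATURES = ("screen", "processor", "battery")


def tokenize(text):
    """Split a review into lowercase word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def polarity(text):
    """Part 1: classify a review by the sum of its word polarities."""
    score = sum(LEXICON.get(token, 0.0) for token in tokenize(text))
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def feature_sentiment(reviews):
    """Part 2: group reviews by the feature they mention, then classify each."""
    result = {}
    for review in reviews:
        tokens = set(tokenize(review))
        for feature in FEATURES:
            if feature in tokens:
                result.setdefault(feature, []).append(polarity(review))
    return result


reviews = [
    "The screen is sharp and great.",
    "Battery life is poor.",
    "The processor arrived on Monday.",
]
by_feature = feature_sentiment(reviews)
# by_feature == {"screen": ["positive"], "battery": ["negative"],
#                "processor": ["neutral"]}
```

This sketch ignores negation ("not good") and word sense, both of which a fuller lexicon-based system would need to handle.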

KEYWORDS

Sentiment Analysis, Natural Language Processing, SentiWordNet, lexicon based approach

For More Details :
https://aircconline.com/ijnlc/V8N3/8319ijnlc01.pdf

Volume Link :
https://airccse.org/journal/ijnlc/vol8.html


Top 10 read research articles in the field of Natural Language Computing @ 2024
