Indian Language Benchmark Portal

6 results


Bengali to Assamese Statistical Machine Translation using Moses (Corpus Based)
Nayan Jyoti Kalita, Baharul Islam

Machine translation plays an important role in facilitating human-machine as well as human-human communication in Natural Language Processing (NLP). Machine Translation (MT) refers to using a machine to convert text from one language to another. Statistical Machine Translation (SMT) is a type of MT consisting of a Language Model (LM), a Translation Model (TM) and a decoder. In this paper, a Bengali to Assamese Statistical Machine Translation model has been created using Moses. Other translation tools, IRSTLM for the Language Model and GIZA-PP-V1.0.7 for the Translation Model, are used within this framework, which runs in Linux environments. The purpose of the LM is to encourage fluent output, the purpose of the TM is to encourage similarity between input and output, and the decoder maximizes the probability of the translated text in the target language. A parallel corpus of 17100 sentences in Bengali and Assamese has been used for training within this framework. Statistical MT techniques have not so far been widely explored for Indian languages. It would be interesting to find out to what extent these models can contribute to the large ongoing MT efforts in the country.
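The division of labour between the LM, the TM and the decoder can be sketched with a toy example. This is a minimal illustration of the general noisy-channel idea, not the Moses pipeline described in the paper; the candidate strings, the probability tables and the function name `decode` are all hypothetical.

```python
import math

# Hypothetical toy probabilities, invented for illustration only.
# LM[e]: language-model probability of target sentence e (fluency).
# TM[(f, e)]: translation-model probability of source f given target e (adequacy).
LM = {"target A": 0.6, "target B": 0.4}
TM = {("source", "target A"): 0.2, ("source", "target B"): 0.5}


def decode(source, candidates, lm, tm):
    """Return the candidate e that maximizes log P(e) + log P(source | e)."""
    def score(e):
        return math.log(lm[e]) + math.log(tm[(source, e)])
    return max(candidates, key=score)


best = decode("source", ["target A", "target B"], LM, TM)
print(best)
```

Here "target A" is the more fluent candidate under the LM, but "target B" wins because the product of its LM and TM probabilities is higher. A real decoder searches a far larger hypothesis space than this exhaustive comparison.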

An Improved Feature Descriptor for Recognition of Handwritten Bangla Alphabet
Nibaran Das, Subhadip Basu, Ram Sarkar, Mahantapas Kundu, Mita Nasipuri, Dipak Kumar Basu

An appropriate feature set for representing pattern classes is one of the most important aspects of handwritten character recognition. The effectiveness of features depends on the discriminating power of the features chosen to represent patterns of different classes. However, discriminatory features are not easily measurable, so investigative experimentation is necessary to identify them. In the present work we have identified a new variation of the feature set which significantly outperforms the previously used feature set on the handwritten Bangla alphabet. In all, 132 features are used here, viz. modified shadow features, octant and centroid features, distance based features, and quad tree based longest run features. Using this feature set, the recognition performance increases sharply from the 75.05% observed in our previous work [7] to 85.40% on 50 character classes with an MLP based classifier on the same dataset.

Determination of Nonequilibrium Temperature and Pressure using Clausius Equality in a State with Memory: A Simple Model Calculation
P. D. Gujrati

Use of the extended definition of heat dQ=deQ+diQ converts the Clausius inequality dS greater than or equal to deQ/T0 into an equality dS=dQ/T involving the nonequilibrium temperature T of the system having the conventional interpretation that heat flows from hot to cold. The equality is applied to the exact quantum evolution of a 1-dimensional ideal gas free expansion. In a first ever calculation of its kind in an expansion which retains the memory of the initial state, we determine the nonequilibrium temperature T and pressure P, which are then compared with the ratio P/T obtained by an independent method to show the consistency of the nonequilibrium formulation. We find that the quantum evolution by itself cannot eliminate the memory effect; hence, it cannot thermalize the system.
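The link between the equality and the inequality can be made explicit with one line of algebra that uses only the quantities named in the abstract. This is a sketch, not the paper's derivation: reading deQ as the heat exchanged with the surroundings, diQ as the remaining (internal) part of dQ, and T0 as the temperature of the surroundings is an assumption about the notation.

```latex
\begin{align}
dQ &= d_eQ + d_iQ, \qquad dS = \frac{dQ}{T}, \\
dS - \frac{d_eQ}{T_0} &= d_eQ\left(\frac{1}{T} - \frac{1}{T_0}\right) + \frac{d_iQ}{T} \;\ge\; 0 .
\end{align}
```

The right-hand side of the second line is the amount by which dS exceeds deQ/T0, and the Clausius inequality is the statement that this amount is non-negative.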

Hindi to English Transfer Based Machine Translation System
Akanksha Gehlot, Vaishali Sharma, Shashi Pal Singh, Ajai Kumar

In large societies like India there is a huge demand for converting one human language into another, and a lot of work has been done in this area. Many transfer based MTS have been developed for English to other languages, such as MANTRA CDAC Pune, MATRA CDAC Pune, SHAKTI IISc Bangalore and IIIT Hyderabad. Still, little work has been done for Hindi to other languages, and we are currently working on it. In this paper we focus on designing a system that translates a document from Hindi to English using a transfer based approach. The system takes an input text and checks its structure through parsing. Reordering rules are then used to generate the text in the target language. This approach is better than Corpus Based MTS because Corpus Based MTS require a large amount of word aligned data for translation, which is not available for many languages, while Transfer Based MTS require only knowledge of both languages (source language and target language) to make transfer rules. We get correct translations for simple assertive sentences and almost correct translations for complex and compound sentences.
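The parse, reorder and generate steps can be sketched with a single transfer rule. This is a minimal sketch under stated assumptions and not the system described in the paper: the role tags, the romanized toy lexicon and the function names are hypothetical, and the input is assumed to be already parsed into subject (S), object (O) and verb (V) chunks.

```python
# Hypothetical toy lexicon (romanized Hindi -> English), for illustration only.
LEXICON = {"raam": "Ram", "seb": "an apple", "khaataa hai": "eats"}


def reorder_sov_to_svo(chunks):
    """Apply one transfer rule: Hindi S-O-V chunk order becomes English S-V-O."""
    by_role = {role: text for text, role in chunks}
    return [(by_role[role], role) for role in ("S", "V", "O")]


def translate(chunks, lexicon):
    """Reorder the parsed chunks, then substitute each chunk from the lexicon."""
    return " ".join(lexicon[text] for text, _ in reorder_sov_to_svo(chunks))


# Assumed parser output for a simple assertive sentence in S-O-V order.
parsed = [("raam", "S"), ("seb", "O"), ("khaataa hai", "V")]
result = translate(parsed, LEXICON)
print(result)
```

The rule needs no aligned corpus, only knowledge of the word order of both languages, which is the advantage the abstract claims for the transfer based approach. Complex and compound sentences would need many more rules than this one.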

Classifier-Based Text Simplification for Improved Machine Translation
Shruti Tyagi, Deepti Chopra, Iti Mathur, Nisheeth Joshi

Machine Translation is one of the research fields of Computational Linguistics. The objective of many MT researchers is to develop an MT system that produces good quality, high accuracy translations and also covers as many language pairs as possible. As internet use and globalization increase day by day, we need a way to improve the quality of translation. For this reason, we have developed a Classifier based Text Simplification Model for English-Hindi Machine Translation Systems. We have used support vector machines and a Naïve Bayes classifier to develop this model. We have also evaluated the performance of these classifiers.

Handwritten Malayalam Character Recognition using Curvelet Transform and ANN
Manju Manuel R SaidasS.

Malayalam, the official language of Kerala, a southern state of India, has been accorded the honour of language of eminence. Hence research in recognition and related work in the Malayalam language is gaining prominence. This paper proposes the use of the Curvelet transform and a neural network for the recognition of handwritten Malayalam characters. The Curvelet transform is used in the feature extraction stage and the neural network for classification. The Curvelet transform provides a compact representation for curved singularities and is well suited to the Malayalam script. Two different back propagation algorithms have been employed and their performance is compared on varying architectures. The promising feature of the work is the successful classification of 53 characters, which is an improvement over existing works. Applications of character recognition include sorting of bank cheques and postal letters, reading aids for the blind, data compression, etc. In addition, an automated tool with a graphical user interface in MATLAB has been developed for Malayalam character recognition. General Terms: Pattern Recognition, Artificial Neural Network (ANN), Curvelet Transform, Optical character recognition (OCR).
