Shahjalal Ahmed,
Md. Rafiqul Islam,
Jahid Hassan,
Minhaz Uddin Ahmed,
Bilkis Jamal Ferdosi,
Sanjay Saha,
Md. Shopon
Recent advancements in the field of computer vision with the help of deep
neural networks have led us to explore and address many existing challenges
that were once unattended due to the lack of necessary technologies. Hand
Sign/Gesture Recognition is one of the significant areas where the deep neural
network is making a substantial impact. In the last few years, a large number
of studies have been conducted to recognize hand signs and hand gestures,
which we aim to extend to our mother-tongue, Bangla (also known as Bengali).
The primary goal of our work is to make an automated tool to aid the people who
are unable to speak. We developed a system that automatically detects
hand-sign-based digits and speaks the result in Bangla. According to the
report of the World Health Organization (WHO), 15% of people in the world live
with some form of disability. Among them, individuals with communication
impairments such as speech disabilities experience substantial barriers in social
interaction. The proposed system can be invaluable in mitigating such barriers.
The core of the system is built with a deep learning model which is based on
convolutional neural networks (CNN). The model classifies hand-sign-based
digits with 92% accuracy on validation data, which makes it a highly
trustworthy system. Upon classification of the digits, the resulting output is
fed to the text-to-speech engine and the translator unit, which eventually
generate audio output in Bangla. A web application to demonstrate our
tool is available at http://bit.ly/signdigits2banglaspeech.
Md. Ataur Rahman,
Md. Hanif Seddiqui
Detecting emotions from text is an extension of simple sentiment polarity
detection. Instead of considering only positive or negative sentiments,
emotions are conveyed in a more tangible manner; thus, they can be expressed
as many shades of gray. This paper presents the results of our experimentation
for fine-grained emotion analysis on Bangla text. We gathered and annotated a
text corpus consisting of user comments from several Facebook groups regarding
socio-economic and political issues, and we made efforts to extract the basic
emotions (sadness, happiness, disgust, surprise, fear, anger) conveyed through
these comments. Finally, we compared the results of the five most popular
classical machine learning techniques, namely Naive Bayes, Decision Tree,
k-Nearest Neighbor (k-NN), Support Vector Machine (SVM), and K-Means Clustering
with several combinations of features. Our best model (SVM with a non-linear
radial-basis function (RBF) kernel) achieved an overall average accuracy score
of 52.98% and an F1 score (macro) of 0.3324.
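The gap between the reported accuracy and macro F1 is easier to read with a small worked example. The following is a minimal sketch, not the authors' evaluation code: the label lists are made-up toy data, and classes that never occur are assumed to contribute an F1 of 0 (one common convention). It shows how a classifier that favors a frequent emotion can score well on accuracy while macro F1 stays low.

```python
# Toy illustration of accuracy vs. macro-averaged F1 over six emotion classes.
# The data below is invented for illustration only.
EMOTIONS = ["sadness", "happiness", "disgust", "surprise", "fear", "anger"]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1; a class with no support and no
    predictions is assumed to score 0.0."""
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(labels)


# A classifier that mostly predicts the majority class.
y_true = ["happiness"] * 6 + ["sadness", "sadness", "anger", "fear"]
y_pred = ["happiness"] * 6 + ["happiness", "sadness", "happiness", "happiness"]

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred, EMOTIONS)
print(f"accuracy={acc:.3f}  macro-F1={f1:.3f}")
```

Because macro averaging weights every class equally, the rarely predicted emotions pull the score down even when most individual predictions are correct.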
Shantipriya Parida,
Ondřej Bojar,
Satya Ranjan Dash
Visual Genome is a dataset connecting structured image information with the
English language. We present "Hindi Visual Genome", a multimodal dataset
consisting of text and images suitable for the English-Hindi multimodal machine
translation task and multimodal research. We have selected short English
segments (captions) from Visual Genome along with associated images and
automatically translated them to Hindi with manual post-editing which took the
associated images into account. We prepared a set of 31525 segments,
accompanied by a challenge test set of 1400 segments. This challenge test set
was created by searching for (particularly) ambiguous English words based on
the embedding similarity and manually selecting those where the image helps to
resolve the ambiguity.
Our dataset is the first for multimodal English-Hindi machine translation and is
freely available for non-commercial research purposes. Our Hindi version of
Visual Genome also allows the creation of Hindi image labelers or other practical
tools.
Hindi Visual Genome also serves in the Workshop on Asian Translation (WAT) 2019
Multi-Modal Translation Task.
Mohd Zeeshan Ansari,
Lubna Khan
A word having multiple senses in a text introduces the lexical semantic task
of finding out which particular sense is appropriate for the given context. One
such task is word sense disambiguation, which refers to the identification of
the most appropriate meaning of a polysemous word in a given context using
computational algorithms. Language processing research in Hindi, the
official language of India, and in other Indian languages is restricted by the
unavailability of standard corpora. For Hindi word sense disambiguation,
too, no large corpus is available. In this work, we prepared text
containing new senses of certain words, enriching the
sense-tagged Hindi corpus of sixty polysemous words. Furthermore, we analyzed
two novel lexical associations for Hindi word sense disambiguation based on the
contextual features of the polysemous word. The evaluation of these methods is
carried out over learning algorithms and favorable results are achieved.
Vishwajeet Kumar,
Nitish Joshi,
Arijit Mukherjee,
Ganesh Ramakrishnan,
Preethi Jyothi
Automatic question generation (QG) is a challenging problem in natural
language understanding. QG systems are typically built assuming access to a
large number of training instances where each instance is a question and its
corresponding answer. For a new language, such training instances are hard to
obtain, making the QG problem even more challenging. Using this as our
motivation, we study the reuse of an available large QG dataset in a secondary
language (e.g. English) to learn a QG model for a primary language (e.g. Hindi)
of interest. For the primary language, we assume access to a large amount of
monolingual text but only a small QG dataset. We propose a cross-lingual QG
model which uses the following training regime: (i) unsupervised pretraining of
language models in both primary and secondary languages and (ii) joint
supervised training for QG in both languages. We demonstrate the efficacy of
our proposed approach using two different primary languages, Hindi and Chinese.
We also create and release a new question answering dataset for Hindi
consisting of 6555 sentences.
Anirudh Dahiya,
Neeraj Battan,
Manish Shrivastava,
Dipti Mishra Sharma
Sentiment Analysis and other semantic tasks are commonly used for social
media textual analysis to gauge public opinion and make sense of the noise on
social media. The language used on social media not only commonly diverges from
formal language, but is also compounded by code-mixing between languages,
especially in large multilingual societies like India.
Traditional methods for learning semantic NLP tasks have long relied on
end-to-end, task-specific training, which requires an expensive data creation
process, even more so for deep learning methods. This challenge is even more
severe for resource-scarce texts like code-mixed language pairs, which lack
well-learnt representations as model priors and for which task-specific datasets
are often too few and too small to efficiently exploit recent deep learning
approaches. To address the above challenges, we introduce curriculum learning strategies for
semantic tasks in code-mixed Hindi-English (Hi-En) texts, and investigate
various training strategies for enhancing model performance. Our method
outperforms the state-of-the-art methods for Hi-En code-mixed sentiment analysis
by 3.31% accuracy, and also shows better model robustness in terms of
convergence and variance in test performance.
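The abstract does not spell out which curriculum strategies are used, so the following is only a generic sketch of the underlying idea: present training examples from easy to hard in growing stages. The function name `curriculum_stages`, the token-count difficulty measure, and the example sentences are all hypothetical and are not taken from the paper.

```python
# Generic easy-to-hard curriculum: each stage is a growing prefix of the
# training set sorted by an (assumed) difficulty score.
def curriculum_stages(examples, difficulty, num_stages):
    """Return num_stages training subsets, each a superset of the previous."""
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")
    ordered = sorted(examples, key=difficulty)  # stable sort, easiest first
    stages = []
    for stage in range(1, num_stages + 1):
        # Ceiling division so the final stage always covers every example.
        cutoff = (len(ordered) * stage + num_stages - 1) // num_stages
        stages.append(ordered[:cutoff])
    return stages


# Hypothetical difficulty proxy: number of whitespace-separated tokens.
def token_count(text):
    return len(text.split())


# Made-up code-mixed style sentences, for illustration only.
sentences = [
    "bahut accha",
    "movie was ok",
    "yeh film bilkul bhi acchi nahi thi yaar",
    "not bad",
]

stages = curriculum_stages(sentences, token_count, num_stages=2)
for i, subset in enumerate(stages, start=1):
    print(f"stage {i}: {subset}")
```

In a real training loop, a model would be trained for some epochs on each stage in turn; any difficulty measure can be swapped in by passing a different function.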
Bidisha Samanta,
Niloy Ganguly,
Soumen Chakrabarti
Multilingual writers and speakers often alternate between two languages in a
single discourse, a practice called "code-switching". Existing sentiment
detection methods are usually trained on sentiment-labeled monolingual text.
Manually labeled code-switched text, especially involving minority languages,
is extremely rare. Consequently, the best monolingual methods perform
relatively poorly on code-switched text. We present an effective technique for
synthesizing labeled code-switched text from labeled monolingual text, which is
more readily available. The idea is to replace carefully selected subtrees of
constituency parses of sentences in the resource-rich language with suitable
token spans selected from automatic translations to the resource-poor language.
By augmenting scarce human-labeled code-switched text with plentiful synthetic
code-switched text, we achieve significant improvements in sentiment labeling
accuracy (1.5%, 5.11%, 7.20%) for three different language pairs
(English-Hindi, English-Spanish and English-Bengali). We also get significant
gains for hate speech detection: 4% improvement using only synthetic text and
6% if augmented with real text.
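The core substitution step described above can be sketched in a few lines. This is a simplified illustration under strong assumptions, not the authors' pipeline: the constituency parsing, machine translation, and span selection are assumed to have happened upstream, so the function simply receives a token span and its replacement. The function name and the English-Spanish example are hypothetical.

```python
# Replace one token span of a labeled monolingual sentence with a span taken
# from its translation, keeping the original sentiment label.
def synthesize_code_switched(tokens, span, translated_span, label):
    """tokens: source-language tokens; span: (start, end) indices of the
    constituent to replace (end exclusive); translated_span: replacement
    tokens from the other language; label: sentiment label to carry over."""
    start, end = span
    if not 0 <= start < end <= len(tokens):
        raise ValueError("span must lie inside the sentence")
    mixed = tokens[:start] + list(translated_span) + tokens[end:]
    return " ".join(mixed), label


# Illustrative example: swap the English phrase "very good" for a Spanish span.
english = ["the", "movie", "was", "very", "good"]
text, label = synthesize_code_switched(
    english, span=(3, 5), translated_span=["muy", "buena"], label="positive"
)
print(text, label)
```

The synthetic sentence inherits the label of the monolingual source, which is what makes the generated text usable as additional training data.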
Sandip Modha,
Prasenjit Majumder
This paper attempts to study the effectiveness of text representation schemes
on two tasks, namely user aggression detection and fact detection in social media
content. In user aggression detection, the aim is to identify the level of
aggression in content generated on social media and written in
English, Devanagari Hindi, and Romanized Hindi. Aggression levels are
categorized into three predefined classes, namely `Non-aggressive`, `Overtly
Aggressive`, and `Covertly Aggressive`. During disaster-related incidents,
social media such as Twitter are flooded with millions of posts. In such emergency
situations, identification of factual posts is important for organizations
involved in the relief operation. We treat this problem as a combination
of classification and ranking problems. This paper presents a comparison of
various text representation schemes based on BoW techniques, distributed
word/sentence representations, and transfer learning on classifiers. The weighted
$F_1$ score is used as the primary evaluation metric. Results show that text
representation using BoW performs better than word embeddings on machine
learning classifiers, while pre-trained word embedding techniques perform
better on classifiers based on deep neural networks. Recent transfer learning
models such as ELMo and ULMFiT are fine-tuned for the aggression classification task.
However, their results are not on par with the pre-trained word embedding models.
Overall, word embeddings from fastText produce a better weighted $F_1$-score than
Word2Vec and GloVe. Results are further improved using a pre-trained vector model.
Statistical significance tests are employed to ensure the significance of the
classification results. When the test dataset is lexically different from the
training dataset, deep neural models are more robust and perform
substantially better than machine learning classifiers.
Mimisha Nesan,
Amir Sadeghi,
John Everatt
Reading comprehension is a complex process that stems from the development of decoding and understanding the written form of a language. Reading development largely depends on the typological and orthographical features of a language. Hence, research investigating the impact of different writing systems on reading processes and acquisition is needed to inform reading models and teaching practices across language/learning contexts. Malayalam is a prominent Indic language, but has hardly been studied in reading research. Therefore, to stimulate such research, the present chapter explains the orthographical features of Malayalam, considering these in terms of cross-linguistic factors that are important for reading acquisition. The chapter then presents a review of the relevant studies in reading, focusing on akshara orthographies and those recognising metalinguistic awareness as an aspect of successful reading acquisition, particularly in multilingual contexts. The chapter ends by arguing that phoneme-based instructional strategies could usefully be applied to Malayalam, despite its akshara characteristics.
Partha Pratim Roy,
Akash Mohta,
Bidyut B. Chaudhuri
This paper presents a novel approach to generating synthetic datasets for
handwritten word recognition systems. It is difficult to recognize handwritten
scripts for which sufficient training data is not readily available or is
expensive to collect. Hence, it becomes hard to train recognition
systems owing to the lack of a proper dataset. To overcome such problems, synthetic
data could be used to create or expand the existing training dataset to improve
recognition performance. Any available digital data from online newspapers and
similar sources can be used to generate synthetic data. In this paper, we propose
to add distortion/deformation to digital data in such a way that the underlying
pattern is preserved, so that the image so produced bears a close similarity to
actual handwritten samples. The images thus produced can be used independently
to train the system or be combined with natural handwritten data to augment the
original dataset and improve the recognition system. We experimented using
synthetic data to improve the recognition accuracy of isolated characters and
words. The framework is tested on two Indic scripts, Devanagari (Hindi) and
Bengali (Bangla), for numeral, character, and word recognition. We have obtained
encouraging results from the experiment. Finally, the experiment with Latin
text verifies the utility of the approach.
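The abstract does not name the specific distortions applied, so the sketch below uses a horizontal shear purely as an assumed example of a deformation that keeps the underlying pattern intact. The function `shear_bitmap` and the tiny 3x3 glyph are hypothetical; a real pipeline would work on rendered font images and would likely combine several deformations.

```python
# Slant a binary glyph bitmap to the right, the way handwriting often slants,
# without adding or removing any ink pixels.
def shear_bitmap(bitmap, shear):
    """bitmap: list of equal-length rows of 0/1 pixels (top row first).
    shear: non-negative horizontal shift per row of height above the baseline.
    Returns a new, wider bitmap with each row shifted right."""
    if shear < 0:
        raise ValueError("this sketch only handles non-negative shear")
    height = len(bitmap)
    width = len(bitmap[0])
    max_shift = int(shear * (height - 1))
    sheared = []
    for y, row in enumerate(bitmap):
        # Rows further from the bottom (baseline) are shifted further right.
        shift = int(shear * (height - 1 - y))
        new_row = [0] * (width + max_shift)
        for x, pixel in enumerate(row):
            new_row[x + shift] = pixel
        sheared.append(new_row)
    return sheared


# A vertical stroke, standing in for a printed character.
stroke = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
]

slanted = shear_bitmap(stroke, shear=1.0)
for row in slanted:
    print("".join("#" if pixel else "." for pixel in row))
```

The stroke keeps the same number of ink pixels and the same connectivity, only its slant changes, which is the sense in which the underlying pattern is preserved.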