Sankar Mukherjee

Researcher

Biography

I have worked on speech synthesis, prosody modelling, speech/speaker recognition, speech imitation modelling, audio event detection, computational neuroscience, and dual interaction. I have used a range of technology stacks, including Python, PySpark, PyTorch, Java, JS, C#, SQL, and MATLAB. I occasionally participate in Kaggle competitions. In my spare time I like to travel and practice my Italian and French. I like to learn new things and always look for challenges.

Interests

  • Speech Synthesis
  • Audio Signal Processing
  • Conversational AI
  • Human Computer Interaction
  • Artificial General Intelligence

Education

  • PhD in Bioengineering and Robotics, 2019

    Istituto Italiano di Tecnologia

  • MS in Speech Technology, 2014

    Indian Institute of Technology Kharagpur

  • B.Tech in Electronics and Communication Engineering, 2009

    Jalpaiguri Govt. Engg. College

Skills

Python

5 years

PySpark

1 year

PyTorch

1 year

Java

3 years

MySQL

1 year

Statistics and Signal Processing

8 years

Language

Bengali, Hindi, Italian, French

Experience

Speech Scientist

DefinedCrowd

Sep 2019 – May 2020 Lisbon
  • Research and prototype new ML-based approaches and algorithms for removing noise from speech signals.
  • Work with other data scientists, engineers, and PMs to provide valuable insights from system telemetry.
  • Mine data sets with Spark, SQL, Cosmos from disparate sources in complex data pipelines.
  • Transform data into innovative features/signals that can improve a machine-learning task.
  • Communicate final recommendations and drive decision making.
  • Implement and deploy features into production.

Postdoc Researcher

Istituto Italiano di Tecnologia

Apr 2019 – Aug 2019 Italy
  • Write the grant proposal for a research project titled ‘Neuro-behavioural aware conversational agent’, which won the 10th Christian Benoît Award.
  • Collaborate with senior researchers and engineers to build a mobile app that detects coordination between speakers.
  • Design an experiment to explore the behavioral underpinnings of conversation between two speakers.
  • Research novel data fusion techniques that combine acoustic cues (e.g. back-channels) and visual cues (e.g. head nodding) during conversation.

Research Assistant

Istituto Italiano di Tecnologia

Jun 2015 – Oct 2015 Italy
  • Collaborated with senior researchers and professors at Aix-Marseille Université (AMU) and Istituto Italiano di Tecnologia (IIT) on a project entitled SPIC.
  • One aim of SPIC is to measure speaker similarity during a conversation, also known as phonetic convergence.
  • Created a GMM-UBM-based algorithm that captures speakers' phonetic convergence during conversation, based on bi-syllabic words.
  • Created a Skype plugin, triggered by speech onset, that was used as the communication software between AMU and IIT.

Research Assistant

Laboratoire Parole et Langage

Sep 2014 – May 2015 France
  • Analyzed and compared the performance of various ML pipelines (PCA, LDA, SVM, SGD, CART, RandomForest) for classifying conversation feedback (linguistic and acoustic) on two French dialog databases.
  • Worked with senior researchers and technicians to measure the trajectories of 3D sensor coils inside an electromagnetic field.

Software Engineer

Yantra Software

May 2014 – Jul 2014 India
  • Built a text-to-speech synthesis system (English, Hindi, Telugu) for the customer service modules of Indian banks.
  • The system was Speech Application Programming Interface (SAPI) compatible and was built using the HMM-based speech synthesis system (HTS), with Asterisk and UniMRCP integration.

Project Engineer

Indian Institute of Technology Kharagpur

Apr 2011 – Apr 2012 India
  • Created a web toolkit that generates VoiceXML-formatted Pronunciation Lexicon Specification (PLS) documents for Indian languages. PLS can be referenced from other markup languages, such as the Speech Recognition Grammar Specification (SRGS) and the Speech Synthesis Markup Language (SSML), which can be used by TTS or ASR systems.
  • Led a team of two students and one linguist and presented the results at the Ministry of Human Resource Development, New Delhi, India. This work won the best paper award at Oriental COCOSDA 2013.

Accomplishments

Winner of 10th Christian Benoît Award

Research project on Neuro-behavioral Aware Conversational Agent

Travel Grant

Best Student Paper

Paper: PL-ILT: A web tool for creation of pronunciation lexicon in Indian languages.

Project Description: A web tool for creating a comprehensive machine-readable pronunciation lexicon for Indian languages.

Top Scorer in +10 and +12

Geography talent

Project

Hyperscanning

Neural marker during conversation

Speech Convergence

Imitation in conversation

Resources

Regex Tutorial

Blog