Grapheme-to-Phoneme Conversion using Encoder-Decoder LSTMs

This project implements sequence-to-sequence encoder-decoder architectures for Grapheme-to-Phoneme (G2P) conversion using PyTorch.

The project compares three sequence modelling approaches:

Bottleneck Encoder-Decoder
Fixed Context Vector Decoder
Attention-Based Decoder

The models are trained on the CMU Pronouncing Dictionary (CMUdict).
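
To illustrate what a training pair looks like, here is a minimal sketch of turning one CMUdict line into a grapheme sequence and a phoneme sequence. It assumes the classic plain-text layout ("WORD  PH1 PH2 ...", ";;;" comment lines, alternate pronunciations marked "WORD(1)"). The function name parse_cmudict_line and the strip_stress option are hypothetical, and the notebook may preprocess the data differently, for example by keeping stress digits.

```python
import re

# Assumed CMUdict line format: "WORD  PH1 PH2 ...", with ";;;" comment lines
# and alternate pronunciations marked as "WORD(1)", "WORD(2)", ...
_VARIANT = re.compile(r"\(\d+\)$")


def parse_cmudict_line(line, strip_stress=True):
    """Return (graphemes, phonemes) for one entry, or None for comments/blanks."""
    line = line.strip()
    if not line or line.startswith(";;;"):
        return None
    word, *phones = line.split()
    # Drop the "(1)" variant marker and lowercase the spelling.
    word = _VARIANT.sub("", word).lower()
    if strip_stress:
        # Vowels carry a stress digit (0, 1 or 2), e.g. "AH0" -> "AH".
        phones = [p.rstrip("012") for p in phones]
    return list(word), phones


print(parse_cmudict_line("HELLO  HH AH0 L OW1"))
# (['h', 'e', 'l', 'l', 'o'], ['HH', 'AH', 'L', 'OW'])
```

The encoder would consume the character list and the decoder would be trained to emit the phoneme list, each mapped to integer ids through its own vocabulary.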

Key Concepts

Sequence-to-Sequence Learning
LSTMs from Scratch
Encoder-Decoder Architectures
Attention Mechanisms
Cross-Attention
Speech and Language Processing
Grapheme-to-Phoneme Conversion
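
As a rough illustration of "LSTMs from Scratch", the sketch below builds a single LSTM step from linear layers and the standard gate equations, without nn.LSTM or nn.LSTMCell. The class name LSTMCellScratch and the layer names x2h and h2h are assumptions made for this example; the notebook's own implementation may be organised differently.

```python
import torch
import torch.nn as nn


class LSTMCellScratch(nn.Module):
    """One LSTM step built from linear layers, without nn.LSTM or nn.LSTMCell."""

    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.hidden_size = hidden_size
        # One projection each for the input and the previous hidden state,
        # producing all four gate pre-activations at once.
        self.x2h = nn.Linear(input_size, 4 * hidden_size)
        self.h2h = nn.Linear(hidden_size, 4 * hidden_size, bias=False)

    def forward(self, x, state):
        h_prev, c_prev = state
        gates = self.x2h(x) + self.h2h(h_prev)
        i, f, g, o = gates.chunk(4, dim=-1)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        g = torch.tanh(g)
        c = f * c_prev + i * g   # forget part of the old memory, write the candidate
        h = o * torch.tanh(c)    # expose a gated view of the cell state
        return h, c


cell = LSTMCellScratch(input_size=8, hidden_size=16)
x = torch.randn(5, 3, 8)         # (time, batch, features)
h = c = torch.zeros(3, 16)
for x_t in x:                    # unroll the cell over the time steps
    h, c = cell(x_t, (h, c))
print(h.shape)                   # torch.Size([3, 16])
```

In an encoder, the final (h, c) pair after the last character is what a bottleneck-style decoder would receive as its starting state.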

Features

Custom LSTM implementation (without nn.LSTM)
Dot-product attention implementation
Greedy decoding
Hyperparameter tuning
Phoneme Error Rate (PER) evaluation
Attention heatmap visualisation
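
The greedy decoding listed above can be sketched as follows: at every step the decoder's highest-scoring phoneme is chosen and fed back in, until an end-of-sequence token appears or a length limit is reached. The step_fn interface, the token ids and the toy decoder are all hypothetical stand-ins for a real decoder step, used only to keep the example self-contained.

```python
def greedy_decode(step_fn, init_state, sos_id, eos_id, max_len=30):
    """Pick the highest-scoring phoneme at each step until EOS or max_len.

    step_fn(prev_token, state) -> (scores, new_state) is a hypothetical
    interface standing in for one decoder step; scores is a sequence of
    per-phoneme scores indexed by token id.
    """
    token, state, output = sos_id, init_state, []
    for _ in range(max_len):
        scores, state = step_fn(token, state)
        # Argmax over the vocabulary: the "greedy" choice.
        token = max(range(len(scores)), key=lambda k: scores[k])
        if token == eos_id:
            break
        output.append(token)
    return output


# Toy decoder: emits token 2, then 3, then EOS (id 1), ignoring its input.
def toy_step(prev_token, state):
    script = [2, 3, 1]
    scores = [0.0] * 4
    scores[script[state]] = 1.0
    return scores, min(state + 1, len(script) - 1)


print(greedy_decode(toy_step, init_state=0, sos_id=0, eos_id=1))  # [2, 3]
```

With PyTorch tensors the argmax line would typically be written with scores.argmax(). Greedy decoding commits to one choice per step, so it is fast but cannot recover from an early mistake the way beam search can.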

Technologies Used

Python
PyTorch
NumPy
Matplotlib
Pandas

Results

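Of the three models, the attention-based encoder-decoder achieved the best performance. Attention removes the hidden-state bottleneck: at each decoding step the decoder can look at every encoder hidden state rather than relying on a single fixed-size vector that summarises the whole word.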
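
To make the bottleneck point concrete, here is a minimal sketch of dot-product attention over encoder states. The function name, tensor shapes and optional padding mask are assumptions for illustration and are not taken from the notebook.

```python
import torch


def dot_product_attention(decoder_state, encoder_states, mask=None):
    """decoder_state: (batch, hidden); encoder_states: (batch, src_len, hidden).

    Returns the context vector (batch, hidden) and weights (batch, src_len).
    """
    # Score each encoder position by its dot product with the decoder state.
    scores = torch.bmm(encoder_states, decoder_state.unsqueeze(2)).squeeze(2)
    if mask is not None:
        # mask is True at real characters and False at padding.
        scores = scores.masked_fill(~mask, float("-inf"))
    weights = torch.softmax(scores, dim=1)
    # The context is the attention-weighted sum of the encoder states.
    context = torch.bmm(weights.unsqueeze(1), encoder_states).squeeze(1)
    return context, weights


enc = torch.randn(2, 7, 16)   # 2 words, 7 characters each, hidden size 16
dec = torch.randn(2, 16)
context, weights = dot_product_attention(dec, enc)
print(context.shape, weights.shape)  # torch.Size([2, 16]) torch.Size([2, 7])
```

Each row of weights sums to 1 and says how much each input character contributes to the current output phoneme. Stacking these rows over all decoding steps gives the kind of matrix that an attention heatmap displays.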

Key findings for the attention-based model, compared with the two models without attention:

Faster convergence during training
Lower Phoneme Error Rate (PER)
Better handling of longer words
Improved alignment between graphemes and phonemes
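
For reference, PER is commonly computed as the edit distance between predicted and reference phoneme sequences, divided by the number of reference phonemes. The sketch below assumes a corpus-level version (total errors over total reference length); the notebook may instead average per word, and the function names here are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution, or a free match
            ))
        prev = curr
    return prev[-1]


def phoneme_error_rate(references, hypotheses):
    """Total edit distance divided by total reference length."""
    errors = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total = sum(len(r) for r in references)
    return errors / total


refs = [["HH", "AH", "L", "OW"], ["K", "AE", "T"]]
hyps = [["HH", "EH", "L", "OW"], ["K", "AE", "T"]]
print(phoneme_error_rate(refs, hyps))  # 1 substitution over 7 reference phonemes
```

Because insertions count as errors, PER can exceed 1.0 when a model produces outputs much longer than the references.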

Repository Structure

notebooks/ → Main notebook
report/ → Final report

Author

Kamva
