The Singlish to Sinhala text conversion component of සිංLingua offers multiple approaches to translate Singlish text to Sinhala. These approaches are designed to suit different requirements and accuracy levels.
With usual transliteration tools, `"oyata den kohomada"` becomes ඔයට ඩෙන් කොහොමඩ.
The **Rule-Based Translation** approach involves converting Singlish text to Sinhala using a predefined set of rules. This method is efficient and provides quick translations based on specific patterns. To use this approach, the library provides a class `RuleBasedTransliterator`, which can be used as follows:
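To illustrate the general idea of rule-based transliteration, here is a minimal, library-independent sketch. It does not call `RuleBasedTransliterator` and makes no claim about its internals; the tiny rule table and the `transliterate` function are assumptions made for this example only. Because it maps `d` mechanically to ඩ and ignores vowel length, it reproduces the kind of naive output that simple tools give.

```python
# Toy rule table (illustrative assumption, not the library's rules).
CONSONANTS = {
    "k": "\u0d9a",  # ක
    "h": "\u0dc4",  # හ
    "m": "\u0db8",  # ම
    "d": "\u0da9",  # ඩ
    "t": "\u0da7",  # ට
    "y": "\u0dba",  # ය
    "n": "\u0db1",  # න
}
INDEPENDENT_VOWELS = {
    "a": "\u0d85",  # අ
    "e": "\u0d91",  # එ
    "i": "\u0d89",  # ඉ
    "o": "\u0d94",  # ඔ
    "u": "\u0d8b",  # උ
}
VOWEL_SIGNS = {
    "a": "",        # inherent vowel, no sign needed
    "e": "\u0dd9",  # ෙ
    "i": "\u0dd2",  # ි
    "o": "\u0ddc",  # ො
    "u": "\u0dd4",  # ු
}
HAL = "\u0dca"  # ් marks a consonant that has no following vowel


def transliterate_word(word: str) -> str:
    """Transliterate one lowercase Singlish word with the toy rules."""
    out = []
    i = 0
    while i < len(word):
        ch = word[i]
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            nxt = word[i + 1] if i + 1 < len(word) else ""
            if nxt in VOWEL_SIGNS:
                out.append(VOWEL_SIGNS[nxt])
                i += 1  # the vowel was consumed together with the consonant
            else:
                out.append(HAL)
        elif ch in INDEPENDENT_VOWELS:
            out.append(INDEPENDENT_VOWELS[ch])
        else:
            out.append(ch)  # leave characters without a rule untouched
        i += 1
    return "".join(out)


def transliterate(text: str) -> str:
    """Transliterate whitespace-separated words one by one."""
    return " ".join(transliterate_word(w) for w in text.lower().split())


print(transliterate("oyata den kohomada"))  # → ඔයට ඩෙන් කොහොමඩ
```

A real rule set needs many more patterns (long vowels, digraphs such as `th` or `sh`, and so on), which is why purely rule-based output often needs the refinement steps described in the following approaches.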
The **Machine Translation** approach using the FastText model involves leveraging word embeddings to enhance translation accuracy. This method uses FastText's pre-trained Sinhala word vectors to identify similar words for each word in the translated text. The process is as follows:
1. Convert Singlish text to Sinhala using the Rule-Based Translation approach.
2. Utilize the FastText Sinhala word vectors to find similar words for each translated word.
3. Enhance the translated text by replacing each word with its corresponding similar word from the FastText model.
To use this approach, the library provides a class `MachineTransliterator`. Here's how to use the MachineTransliterator with the FastText model:
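The snippet below does not call `MachineTransliterator`; it only sketches the replacement idea from steps 2 and 3 using a hand-made embedding table. The vectors are made-up toy values, and the rule "replace only words that are missing from a known-word list" is an assumption added so that correct words are left alone.

```python
import math

NAIVE = "ඩෙන්"      # naive rule-based output for "den"
INTENDED = "දැන්"    # the word a Sinhala reader would expect
HOW = "කොහොමද"
TO_YOU = "ඔයාට"

# Made-up 3-dimensional vectors standing in for FastText embeddings.
TOY_VECTORS = {
    NAIVE: (0.9, 0.1, 0.0),
    INTENDED: (0.8, 0.2, 0.1),
    HOW: (0.1, 0.9, 0.2),
    TO_YOU: (0.0, 0.2, 0.9),
}
KNOWN_WORDS = {INTENDED, HOW, TO_YOU}  # toy list of valid words


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_known(word):
    """Return the most similar known word, or the word itself if no lookup applies."""
    if word in KNOWN_WORDS or word not in TOY_VECTORS:
        return word
    return max(KNOWN_WORDS, key=lambda cand: cosine(TOY_VECTORS[word], TOY_VECTORS[cand]))


def enhance(text):
    """Replace each unknown word with its nearest known neighbour."""
    return " ".join(nearest_known(w) for w in text.split())


print(enhance(f"{TO_YOU} {NAIVE} {HOW}"))
```

With the real `fasttext` Python package, a call such as `model.get_nearest_neighbors(word)` could supply the candidates in place of the toy table; whether the library does this internally is not stated here.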
The **Hybrid Translation** approach combines the rule-based and machine translation methods to provide accurate and context-aware translations. This approach offers a balance between efficiency and accuracy. The `HybridTransliterator` class is also used for this approach:
This function enhances machine-translated text by masking misspelled Sinhala words. It identifies misspelled Sinhala words and replaces them with a mask `"<mask>"` to improve text flow.
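As a rough picture of what masking does, the following sketch replaces every word that is missing from a word list with `"<mask>"`. How the library actually detects misspellings is not described here, so the dictionary lookup, the `mask_unknown_words` helper, and the toy word list are all assumptions for illustration.

```python
MASK = "<mask>"


def mask_unknown_words(text, dictionary):
    """Replace every whitespace-separated word not found in `dictionary` with MASK."""
    return " ".join(word if word in dictionary else MASK for word in text.split())


# Toy dictionary of correctly spelled words (illustrative only).
toy_dictionary = {"ඔයාට", "දැන්", "කොහොමද"}

# "ඩෙන්" is not in the toy dictionary, so it is masked.
print(mask_unknown_words("ඔයාට ඩෙන් කොහොමද", toy_dictionary))
```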
This function further refines machine-translated text by suggesting alternative words for masked words. It can be used in two different approaches through the library.
```python
# level can be 0 (masking prompt) or 1 (word-suggestion prompt)
# `hybrid` is assumed to be an existing HybridTransliterator instance
level = 0
hybrid.view_prompt(level=level)
```
## 4. Manual Translation
The **Manual Translation** approach allows you to manually manipulate and modify translations according to your preferences. This can be useful for refining translations and making context-specific adjustments. The `ManualTransliterator` class provides various methods to aid in this process.
### 1. Generate Coordinates
The first step in manual translation is generating coordinates for each word in the Sinhala text. Coordinates uniquely identify each word and its position in the text. This can be done using the `generate_coordinates` method.
To visualize the coordinates, you can export them to a CSV file using the `to_csv` method.
```python
# Write the coordinate dataframe to a CSV file for inspection
manual.to_csv(dataframe=df, file="dataframe.csv")
```
### Optional parameters
1. `max_columns: int`: the maximum number of columns in the coordinate plane.
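
To make the idea of a coordinate plane concrete, here is a small sketch that lays words out row by row and wraps after `max_columns` words. The library works with a dataframe; the plain dictionary keyed by `(row, column)`, the row-by-row layout, and the name `toy_generate_coordinates` are assumptions chosen to keep the example self-contained.

```python
def toy_generate_coordinates(text, max_columns=3):
    """Map each word to a (row, column) cell, filling rows left to right."""
    if max_columns < 1:
        raise ValueError("max_columns must be at least 1")
    words = text.split()
    return {(i // max_columns, i % max_columns): word for i, word in enumerate(words)}


grid = toy_generate_coordinates("ඔයාට දැන් කොහොමද", max_columns=2)
for cell, word in sorted(grid.items()):
    print(cell, word)
```

With `max_columns=2`, the third word wraps to the start of the second row, so its cell is `(1, 0)`.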
### 2. Replace Cells
This method allows you to manually replace specific cells in the coordinate plane with desired words. You need to generate coordinates using the `generate_coordinates` method. Then, create a replacement dictionary where keys are the coordinates to be replaced and values are the replacement words. The `replace_cells` method returns a new dataframe with the changes applied.
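The replacement-dictionary idea can be sketched as follows. This is not the library's `replace_cells`; the grid is modelled as a plain dictionary keyed by `(row, column)` rather than a dataframe, and `toy_replace_cells` is a hypothetical helper. Like the method described above, it returns a new grid and leaves the input unchanged.

```python
def toy_replace_cells(grid, replacements):
    """Return a copy of `grid` with the given cells replaced."""
    unknown = set(replacements) - set(grid)
    if unknown:
        raise KeyError(f"coordinates not in grid: {sorted(unknown)}")
    updated = dict(grid)          # copy so the original grid is untouched
    updated.update(replacements)
    return updated


grid = {(0, 0): "ඔයාට", (0, 1): "ඩෙන්", (0, 2): "කොහොමද"}
fixed = toy_replace_cells(grid, {(0, 1): "දැන්"})
print(fixed[(0, 1)])
```

Rejecting unknown coordinates is a design choice of this sketch; it catches typos in the replacement dictionary early instead of silently adding new cells.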
### 3. Manual Masking
Manual masking marks specific words in the coordinate plane so that they can be replaced later. As in the previous steps, first generate coordinates using the `generate_coordinates` method. Then specify the coordinates you want to mask and call the `manual_mask` method. The resulting `reconstructed_text` can be passed to the `machine_suggest` method from the Hybrid Translation approach to find the best-matching words for the masked positions.
```python
# You can use machine_suggest to find matches for masked words
```
### 4. Reconstruct Text
Finally, the `reconstruct_text` method allows you to convert the modified dataframe back into a text. This step completes the manual translation process.
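Reconstruction amounts to reading the cells back in order. The sketch below assumes the same dictionary-based grid as a stand-in for the library's dataframe and reads it row by row, left to right; `toy_reconstruct_text` is a hypothetical helper, not the library's `reconstruct_text`.

```python
def toy_reconstruct_text(grid):
    """Join the words of a {(row, column): word} grid in reading order."""
    ordered = sorted(grid.items())  # tuples sort by row first, then column
    return " ".join(word for _, word in ordered if word)


grid = {(1, 0): "කොහොමද", (0, 0): "ඔයාට", (0, 1): "දැන්"}
print(toy_reconstruct_text(grid))
```

Empty cells are skipped, so a grid whose last row is only partly filled still yields clean text.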
These manual translation methods provide flexibility and control over the translation process, enabling you to refine translations based on specific requirements.
---
For usage examples and detailed documentation, visit the [සිංLingua GitHub repository](https://github.com/SupunGurusinghe/SinlinguaDocumentation/).
## Getting Started
To use the Singlish to Sinhala text conversion component of සිංLingua, follow these steps:
1. Install the සිංLingua library:
```bash
pip install sinlingua
```
2. Import the required classes for the chosen translation approach: