Training a machine learning model requires a substantial dataset of annotated data. This data acts as the foundation upon which the model learns to make predictions. The quality and quantity of sample data directly influence the effectiveness of the trained model. A well-curated dataset should be representative, encompassing a wide range of example