Annotation means any extra information that is attached to the data.
In the machine learning domain, it refers to assigning predefined categories and tags/labels to documents and images. This data-label pair can be used for training classification-related problems by supervised learning where finding hidden patterns is easy.
Dataset is a collection of meaningful data which the machine sees and learns.
Dataset may contain raw information in the form of images, tabular data, signals, videos etc that helps derive inferences. An example is a tabular set of data where each column defines attributes/characteristics of the data and each row is a tuple/record in the dataset.
Image Preprocessing are the steps we take to convert a raw image into an enhanced form that the model is ready to use for training and inference.
The images collected for any computer vision tasks may be of different sizes, contrast, orientation. Image Preprocessing involves all the deterministic steps that we take to make the images all formatted correctly.