Breast cancer is the most common cancer in the world and the second most common cause of cancer death in women. Timely and accurate diagnosis of breast cancer from histopathological images is crucial for patient care and treatment. To help pathologists make more accurate diagnoses, we propose a novel computer vision approach: an ensemble of two pretrained vision transformer models, the Vision Transformer (ViT) and the Data-Efficient Image Transformer (DeiT). The ViT-DeiT ensemble combines the predictions of the two models by soft voting. The proposed model classifies breast cancer histopathology images into eight classes, four benign and four malignant. The model is evaluated on the public BreakHis dataset. The experimental results show 98.17% accuracy, 98.18% precision, 98.08% recall, and a 98.12% F1 score, outperforming existing classification models.
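
To illustrate what soft voting means here, the following minimal sketch averages the class probabilities of two classifiers and picks the class with the highest averaged probability. It is not the authors' implementation: the function names (`softmax`, `soft_vote`, `predict`), the equal weighting of the two models, and the example logits are assumptions made for illustration, and the real models would produce their logits from an input image.

```python
import math

NUM_CLASSES = 8  # four benign and four malignant classes, as stated in the abstract


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_vote(probs_a, probs_b, weight_a=0.5):
    """Weighted average of two probability vectors.

    weight_a = 0.5 (equal weighting) is an assumption; the abstract does not
    state how the two models are weighted.
    """
    if len(probs_a) != len(probs_b):
        raise ValueError("probability vectors must have the same length")
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(probs_a, probs_b)]


def predict(logits_vit, logits_deit):
    """Return (predicted class index, ensemble probabilities)."""
    probs = soft_vote(softmax(logits_vit), softmax(logits_deit))
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs


# Hypothetical logits standing in for the outputs of the two models.
example_vit = [0.0, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0, 0.0]
example_deit = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
label, ensemble_probs = predict(example_vit, example_deit)
```

Unlike hard (majority) voting, soft voting keeps each model's confidence, so a confident prediction from one model can outweigh a weak, conflicting prediction from the other.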