Study of historical Byzantine seal images: the BHAI project for computer-based sigillography

BHAI (Byzantine Hybrid Artificial Intelligence) is the first project based on artificial intelligence dedicated to Byzantine seals. The scientific consortium comprises a multidisciplinary team involving historians specialized in the Byzantine period, specialists in sigillography, and computer science experts. This article describes the main objectives of this project: data acquisition of seal images, text and iconography recognition, seal dating, as well as our current achievements and first results on character recognition and spatial analysis of personages.


INTRODUCTION
The successful development of artificial intelligence (AI) approaches to image understanding has promoted their applications to the humanities. For example, several projects involving experts and computer scientists are devoted to deciphering and dating ancient written artifacts from their images, e.g., ILAC for reading dates or Roman imperial names on coin images [15], ARCADIA for recognizing drawing patterns on pottery sherds [3]. Other projects aim at emulating paleographers by classifying scripts from sample images only, without the aid of codicological data (material), e.g., ancient Hebrew scripts [7] or medieval Latin scripts [5]. Other initiatives seek to date handwriting [6], reconstruct papyri from fragments [16], or search for meaningful objects (heads, letters) [12].
Most of the systems mentioned above now use AI for tasks such as script classification, character recognition, named entity recognition, or document reconstruction. AI systems rely on a training step that requires annotated data, but collections may be hard to interpret even for human beings, mainly due to their damaged state and the lack of ancient language knowledge and historical context [18].
The ongoing BHAI project, devoted to digital sigillography based on AI approaches, is described in Section 2. Section 3 examines the correlation between seal diameters and their hierarchical significance from a historical perspective. Section 4 details the collection of images and their annotation by skilled experts. Section 5 presents one of our first achievements: a character recognition system that provides a plain transcription of the seal's text. Section 6 describes our second accomplishment: preliminary interpretations of Byzantine seals using spatial relations.

BHAI PROJECT
BHAI (Byzantine Hybrid Artificial Intelligence) is the first project dedicated to computer-based sigillography. The French Research Agency (ANR) funded this project, which promotes the study of historical seal images of the Byzantine period. Byzantine seals are small circular objects used in the Middle Ages to identify the sender of documents. They carry much of the knowledge we have gained about the Byzantine administration, aristocracy, and the cult of saints.
Since most preserved Byzantine seals are made of lead, they suffer from corrosion and are often damaged. Interpretation by historians is hard when seals have been crushed or broken, making their inscriptions difficult or impossible to read. However, given their intrinsic properties, such as the consistency between an image and its associated text, as well as the similarities between different seals belonging to the same person, historians can make assumptions about the missing parts.
Epigraphy, numismatics, and sigillography are disciplines working on inscribed objects. The writing is different from the one used in manuscripts. Therefore, a font with variants for each Byzantine character was necessary. At Dumbarton Oaks, J. Kalvesmaki [10] created the OpenType and Unicode Athena Ruby font. Figure 1 shows the obverse and reverse sides of a sample seal image. The obverse side (Figure 1-a) includes iconography, while the reverse side (Figure 1-b) includes a text written in Byzantine Greek in capital letters. Figure 1-c shows the transcription using the Athena Ruby font. However, the text is abbreviated: some words and characters are missing. Moreover, words are sometimes not separated. Only historians who are experts in sigillography can derive the complete text (Figure 1-e) from the abbreviated one (Figure 1-c). The English translation "Lord help thy servant Paul protospatharios (a dignity, the first to be a member of the Senate) and taxiarch (military officer)" is shown in Figure 1-f.
BHAI proposes to combine different AI approaches to automatically extract the content of Byzantine seals and restore the damaged or missing text (see Section 5). A second objective consists in dating seal images according to their content, i.e., the restored text and interpreted iconography (see Section 6).
To interpret iconography, specialists can use icons, wall paintings, and other artifacts bearing iconography. Seals often reproduce a well-established iconography, making the figure easily recognizable. After the Second Council of Nicaea (787 A.D.), identification accompanies figures and scenes.
Although the object's surface is small on seals, abbreviated or complete names are provided to identify the image. These letters are not placed identically on each seal but above, on the side, or on both sides of the figure. Some saints are more frequent than others. Mary, the mother of God (called the Theotokos), is a very frequent figure. Saint Nicolas, Saint Theodore, and Saint Demetrius are all frequently chosen saints. The seal owner can decide what saint to place on his or her seal. Women chose the Theotokos. For men, numerous factors come into play when choosing a saint: profession, baptismal name, and location. In the army, officers often choose a military saint. In the administration, Saint Nicolas is a first choice. Clerics, especially bishops, have to choose the saint of the cathedral church. The location also plays a role. In Antioch, the figures of Peter and Paul (apostles of Christ) are favorites, as is a local saint called Saint Symeon Stylites. Saint Paul is chosen in Tarsus, while in Thessaloniki, it is Saint Demetrius. Some seal owners change the iconography on their seals. Michael Cerularios, patriarch of Constantinople, chose the Theotokos, who protects the city of Constantinople, but then he chose the archangel Michael because he was called Michael. During the iconoclastic period (8th-9th c.), the emperors and the Church forbade images of saints but allowed crosses. Numerous crosses were created on seals with different shapes or ornaments. Finally, the Byzantines love monograms; they create a shape with the combined letters of their baptismal or family name. All letters of a name must be present; some can be used twice, and others are combined to produce two letters. Monograms are like puzzles.

SEALS DIAMETERS AND HIERARCHICAL SIGNIFICANCE
Byzantine seals range in diameter from 8-9 mm to over 70 mm. While most of these measure between 20 and 30 mm, the seals belonging to the category of ekklesiekdikoi stand out for their remarkable dimensions, ranging between 42 and more than 70 mm. Therefore, a question arises: is there a relation between the seals' diameters and their hierarchical significance?
To answer this question, we examined a sample of 1500 ecclesiastical seals from a historical perspective [14]. The study revealed that there is no correlation between the diameter of ecclesiastical seals and their hierarchical significance. The ekklesiekdikoi were priests, members of the tribunal founded by Justinian I and attached to the Church of Saint Sophia in Constantinople. The ecclesiastical tribunal sent sealed official documents concerning their deliberations, which may explain a particular concern with being identifiable and representing authority and institutional weight. These aspects could also be conveyed by the size and appearance of the bullae that accompanied the documents [1].

DATA COLLECTION AND ANNOTATION
The first stage of a computer-based system dedicated to sigillography is devoted to creating a cleaned and annotated corpus of seal images. The images were taken some years ago only to illustrate the books presenting the collections. Therefore, they were not captured with a professional setup (high resolution, staged illumination, HDR, etc.), and the acquisition protocol is neither fixed nor well known, which induces a lot of variability in the images. Note that the background and the characters have the same color. In addition, shadows may be present depending on the position of the light source during digitization.
We started by using the Tatiş image collection [4]. Then, after having established the annotation protocol (labels, choice of software, characters, and objects to be annotated), the manual segmentation began with the participation of domain experts. They worked first on the transcription of texts at line level (but without providing line positions) and at character level (with character position) using the Supervisely platform (https://supervisely.com/). Characters have been annotated by manually setting points on their contours, both the outside contour and the inside one (see Figure 2). From these contour points, a pixel-level annotation could be derived, as well as the bounding box of each character (see Figure 3).
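The derivation of a bounding box and a pixel-level mask from contour points can be sketched as follows. This is only an illustrative sketch, not the project's actual tooling: the function names, the even-odd ray-casting test, and the convention of sampling each pixel at its center are all assumptions introduced here.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def bounding_box(contour: Sequence[Point]) -> Tuple[float, float, float, float]:
    """Axis-aligned box (x_min, y_min, x_max, y_max) enclosing the contour points."""
    xs = [x for x, _ in contour]
    ys = [y for _, y in contour]
    return min(xs), min(ys), max(xs), max(ys)


def point_in_polygon(px: float, py: float, polygon: Sequence[Point]) -> bool:
    """Even-odd ray casting: toggle on each edge crossed by a ray going right."""
    hit = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > py) != (y2 > py):  # the edge spans the ray's height
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                hit = not hit
    return hit


def character_mask(outer: Sequence[Point],
                   inner: Sequence[Sequence[Point]],
                   width: int, height: int) -> List[List[int]]:
    """Binary mask: 1 inside the outer contour and outside every inner contour."""
    mask = []
    for row in range(height):
        line = []
        for col in range(width):
            cx, cy = col + 0.5, row + 0.5  # assumed convention: pixel centers
            filled = point_in_polygon(cx, cy, outer) and not any(
                point_in_polygon(cx, cy, hole) for hole in inner
            )
            line.append(1 if filled else 0)
        mask.append(line)
    return mask


# Hypothetical square "character" with a square hole (e.g. the counter of an O).
outer_contour = [(0, 0), (6, 0), (6, 6), (0, 6)]
inner_contours = [[(2, 2), (4, 2), (4, 4), (2, 4)]]
demo_mask = character_mask(outer_contour, inner_contours, width=6, height=6)
```

The inner contour is what allows closed letter shapes to keep their holes in the pixel-level annotation, whereas the bounding box depends on the outer contour only.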
Finally, the annotation process was repeated for the iconography to isolate objects (such as crosses, clothes, or body parts) and scenes (such as the annunciation to the Blessed Virgin Mary). Then, jointly with Byzantine sigillographers, we decided to annotate: a) personages (such as Christ, the Virgin Mary, or the Archangel Michael), b) objects (such as globes, swords, books), c) body parts (head, hands, wings), d) crosses (including fleurons, steps, and ornaments), e) clothes (such as veils or loros), and f) elements around the head (such as nimbi and crowns).
We did not fully annotate the Tatiş images, only a subset of reverse images. In contrast to pixel-based annotation, only character bounding boxes were considered to expedite the annotation process. In summary, a total of 102 annotated seal images were collected as well as 2313 character images, along with their annotations (pixel and/or character level).

SEAL CHARACTER RECOGNITION
Transcribing the text of seal images is one of our main objectives. In Byzantine seals, the characters are made by an engraver who creates reliefs on a boulloterion, the tool used to strike lead, silver, and gold bullae. Therefore, traditional OCR approaches cannot be applied since they require more contrast between foreground (characters) and background than lead seals have. To address these difficulties, we proposed an approach based on deep learning to read seal characters and provide a transcript.
There are several transcription levels: plain text and restored text (including text that is hidden for lack of room or damaged). However, the text is difficult to read because of absent words or characters. Thus, the transcription task is decomposed into: (1) locating and recognizing characters, (2) providing a plain transcript, (3) recovering words, (4) recovering missing text.
Steps 1 and 2, which are our current main focus, have to face the lack of contrast between characters and background. We actually perceive characters through shading. Moreover, characters can be damaged. To address these issues, deep neural networks have been built and trained to localize and recognize characters. We use transfer learning and data augmentation.
Figure 4 shows our learnable approach for obtaining plain transcriptions of seals. Since we have few annotated images, we opted for a two-step approach: i) localizing character bounding boxes in the image, and ii) reading the previously localized characters, treated as a simple image classification task. This approach splits a larger problem into two inherently simpler sub-problems, each of which can be solved by learnable models trained over far fewer annotated samples. Note that the chosen models perform well, and comparison with other potential architectures and models is left for future work.
For obtaining plain transcriptions, we use the outputs of both networks and apply a robust Hough-based approach that groups character bounding boxes into text lines [11]. To detect and extract character crops from seal images, we rely on YOLO v5 [9], a deep convolutional architecture trainable end-to-end in a single shot.
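To illustrate what grouping character boxes into text lines involves, here is a deliberately simplified stand-in. It is not the Hough-based method of [11]: it assumes roughly horizontal lines and merges boxes whose vertical centers are close. The function name, the box format, and the tolerance value are hypothetical choices made for this sketch.

```python
from typing import List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max, predicted_label)
Box = Tuple[float, float, float, float, str]


def group_into_lines(boxes: List[Box], tolerance: float = 0.5) -> List[str]:
    """Greedy grouping of character boxes into lines, read top to bottom.

    A box joins the current line when its vertical center lies within
    `tolerance` times its own height of the line's mean vertical center.
    """
    lines: List[List[Box]] = []
    for box in sorted(boxes, key=lambda b: (b[1] + b[3]) / 2):
        y_center = (box[1] + box[3]) / 2
        height = box[3] - box[1]
        if lines:
            current = lines[-1]
            line_center = sum((b[1] + b[3]) / 2 for b in current) / len(current)
            if abs(y_center - line_center) <= tolerance * height:
                current.append(box)
                continue
        lines.append([box])
    # Within each line, read the characters from left to right.
    return ["".join(b[4] for b in sorted(line, key=lambda b: b[0])) for line in lines]


# Hypothetical detections for two short lines, given in arbitrary order.
detections = [
    (30, 10, 50, 30, "E"),
    (10, 52, 30, 72, "R"),
    (10, 12, 30, 32, "K"),
    (32, 50, 52, 70, "O"),
]
plain_lines = group_into_lines(detections)
```

Unlike a Hough-based grouping, this heuristic fails on slanted or curved lines, which are common on circular seals.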
Here we use the small version of YOLO v5. Its learnable parameters are initialized from a network pre-trained on the COCO dataset (328,000 images) and then fine-tuned for 300 epochs.
Due to the limited amount of data, we rely heavily on data augmentation, consisting of geometric transformations (image shifts, scale variations), to increase the diversity of the training set. We optimize the network parameters with SGD (Stochastic Gradient Descent) for 300 epochs (the apparently large epoch count is due to the limited size of our actual training set) and a linear learning rate scheduler from 0.01 to 0.00001.
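As a small illustration of the schedule described above, the following sketch interpolates the learning rate linearly from 0.01 to 0.00001 over 300 epochs. The function name is hypothetical, and the choice to reach the final value exactly at the last epoch is an assumption; the training framework may step the schedule slightly differently.

```python
def linear_learning_rate(epoch: int,
                         total_epochs: int = 300,
                         lr_start: float = 0.01,
                         lr_end: float = 0.00001) -> float:
    """Learning rate at a given epoch (0-based) under linear decay.

    Assumption: the rate equals lr_start at epoch 0 and lr_end at the last epoch.
    """
    if not 0 <= epoch < total_epochs:
        raise ValueError("epoch must be in [0, total_epochs)")
    fraction = epoch / (total_epochs - 1)
    return lr_start + fraction * (lr_end - lr_start)


# Full schedule for the 300 epochs mentioned in the text.
schedule = [linear_learning_rate(e) for e in range(300)]
```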
For character classification, we rely on a ResNet18 architecture [8] pre-trained on the ImageNet dataset [17]. The network is trained over isolated character crops extracted from the training images and resized to 256×256 pixels. To cope with data scarcity, we train the classifier on both obverse and reverse character images, and we rely on augmentation strategies to increase the apparent size of the training set.
Figure 5 shows the 29 character classes used in this work with their glyphs in the Athena Ruby font. Classes Xi, Psi, Zeta and Closed Beta are quite infrequent. The Upsilon and Nu classes include two different glyphs each. Character S is the abbreviated form of the word . There are also two ligatures: CT for  (infrequent) and OU. When transcribed in standard Greek, the number of character classes drops to 24, the number of letters in the Greek alphabet. The croisette special symbol was also included since its shape and size are quite similar to those of characters.
While the ResNet18 is trained over character-exact crops from the annotated training images, when deployed in the complete localize-and-classify pipeline it is expected to operate on crops extracted by the YOLO-based localization pipeline. We also added an extra non-character class corresponding to bounding boxes containing no character or damaged characters. 150 non-character images have been cropped and added to the character image set. We follow a K-fold evaluation framework by dividing seal images and character samples into K folds, training on K − 1 folds, testing on the remaining fold and cycling over the folds. In addition, we constrain training and testing characters to belong to distinct seal images.
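The constraint that training and testing characters come from distinct seals amounts to splitting at the seal level rather than at the character level. A minimal sketch, assuming each character sample carries the identifier of its seal (the function names, the sample format, and the round-robin fold assignment are all hypothetical):

```python
import random
from typing import Iterator, List, Tuple

# Assumed sample format: (seal_id, character_crop_id)
Sample = Tuple[str, str]


def seal_level_folds(seal_ids: List[str], k: int = 10, seed: int = 0) -> List[List[str]]:
    """Shuffle the distinct seals and deal them round-robin into k folds."""
    unique = sorted(set(seal_ids))
    random.Random(seed).shuffle(unique)
    return [unique[i::k] for i in range(k)]


def cross_validation_splits(samples: List[Sample], k: int = 10,
                            seed: int = 0) -> Iterator[Tuple[List[Sample], List[Sample]]]:
    """Yield (train, test) pairs; a seal never appears on both sides of a split."""
    folds = seal_level_folds([seal for seal, _ in samples], k, seed)
    for held_out in folds:
        held = set(held_out)
        train = [s for s in samples if s[0] not in held]
        test = [s for s in samples if s[0] in held]
        yield train, test


# Hypothetical data: 20 seals with 3 character crops each, split into 5 folds.
demo_samples = [(f"seal{i}", f"char{i}_{j}") for i in range(20) for j in range(3)]
demo_splits = list(cross_validation_splits(demo_samples, k=5))
```

Splitting on seals prevents characters struck by the same boulloterion from appearing in both training and test sets, which would otherwise inflate the measured accuracy.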
Results in Table 1 are relative to the YOLO-v5 character localization task. The recall value means that about 90% of ground truth boxes are matched by a predicted bounding box. There is a match if the two boxes overlap enough, i.e., if their so-called IOU (Intersection Over Union) is greater than a typical value of 0.5. The precision is high but differs from the ideal value 1. This means that a few predicted bounding boxes do not include a character but an ornament, an incomplete character, or background. In order to evaluate the whole pipeline, we adopt the CER metric (Character Error Rate) [13]. The CER compares the predicted character sequence with the ground truth sequence. Following the K-fold cross-validation framework, we compute the CER of each seal of the testing fold, and average over the seals to obtain the CER associated with that fold. The cross-validation CER, obtained by averaging over the folds, is equal to 0.31. This highlights the challenge of reading characters in the difficult context of ancient seals, but also the potential of this plain transcription as a source of the underlying text once it is processed using dictionaries and character/word embeddings.
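The two measures used above can be sketched in a few lines. The IOU follows its standard definition; for the CER we assume the usual formulation, namely the edit (Levenshtein) distance between predicted and ground truth sequences divided by the ground truth length, which may differ in detail from the variant in [13]. Function names are hypothetical, and the strings are Latin placeholders rather than real seal transcriptions.

```python
from typing import List, Sequence, Tuple

BoxXYXY = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(box_a: BoxXYXY, box_b: BoxXYXY) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0


def edit_distance(predicted: Sequence[str], reference: Sequence[str]) -> int:
    """Minimum number of insertions, deletions and substitutions."""
    previous = list(range(len(reference) + 1))
    for i, p in enumerate(predicted, start=1):
        current = [i]
        for j, r in enumerate(reference, start=1):
            cost = 0 if p == r else 1
            current.append(min(previous[j] + 1,          # delete p
                               current[j - 1] + 1,       # insert r
                               previous[j - 1] + cost))  # substitute or keep
        previous = current
    return previous[-1]


def cer(predicted: str, reference: str) -> float:
    """Character error rate, normalized by the reference length (assumed)."""
    if not reference:
        raise ValueError("reference transcription must be non-empty")
    return edit_distance(predicted, reference) / len(reference)


def fold_cer(pairs: List[Tuple[str, str]]) -> float:
    """Average of per-seal CER values over the seals of one test fold."""
    return sum(cer(pred, ref) for pred, ref in pairs) / len(pairs)
```

With this normalization, a CER of 0.31 corresponds to roughly three character edits for every ten ground truth characters.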

REASONING WITH SPATIAL RELATIONS FOR INTERPRETING BYZANTINE SEALS
This section presents preliminary results for interpreting Byzantine seals using spatial relations. The hypothesis is that the spatial organization of personages and objects provides useful information for their interpretation. Initially, we selected the seals with a personage (or object) in the center of the seal. The central personage (or object) usually has the largest area coverage. Therefore, the first step of our pipeline is to calculate the area coverage of the different personages (or objects), then sort and compare the values to determine if there is a dominant personage (or object) in the seal.
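The first step can be sketched as follows, assuming each annotated personage or object is available as a binary mask. The decision rule used here (the largest area must exceed the second largest by a fixed ratio) and its threshold are hypothetical; the text does not specify how the sorted values are compared.

```python
from typing import Dict, List, Optional


def area_coverage(mask: List[List[int]]) -> int:
    """Number of pixels set to 1 in a binary mask."""
    return sum(sum(row) for row in mask)


def dominant_object(areas: Dict[str, int], ratio: float = 1.5) -> Optional[str]:
    """Label of the dominant object, or None if no object clearly dominates.

    Assumed rule: the largest area must be at least `ratio` times the second largest.
    """
    if not areas:
        return None
    ranked = sorted(areas.items(), key=lambda item: item[1], reverse=True)
    if len(ranked) == 1:
        return ranked[0][0]
    (best_label, best_area), (_, second_area) = ranked[0], ranked[1]
    return best_label if best_area >= ratio * second_area else None


# Hypothetical pixel counts for the annotated regions of two seals.
clear_case = dominant_object({"empress": 900, "labarum": 200, "cross": 50})
ambiguous_case = dominant_object({"saint_a": 100, "saint_b": 90})
```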
Once the central personage (or object) is determined, we calculate the directional relations between the other objects and the central personage. In other words, we want to know if a particular object is on the left, on the right, above, or below the central personage (or object). We applied the fuzzy landscape method proposed in [2], where the degree of satisfaction of the relation to the reference object at any point in space is computed using a morphological dilation. Once the fuzzy landscape of the central personage (or object) is built, the analysis of the directional relations between any other object A and the central personage (or object) is completed in O(N), with N the number of pixels in A. Figure 6 presents an example analysis of the four basic directional relations between the Empress Théodora and a labarum. The high degree of satisfaction of the relation "the labarum is on the left of the Empress" is characteristic of this type of seal and helps interpret them. Similar results have been obtained on other seals and other spatial configurations.
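The idea behind the fuzzy landscape can be illustrated with a brute-force sketch. We assume the angular membership function commonly used with this method, max(0, 1 − 2θ/π), where θ is the smallest angle between the chosen direction and any vector going from a reference pixel to the evaluated point; the exact structuring element used in the project is not stated in the text. Function names and the toy pixel sets are hypothetical. Image coordinates are assumed, with x pointing right and y pointing down.

```python
import math
from typing import Dict, List, Tuple

Pixel = Tuple[int, int]  # (x, y) in image coordinates, y pointing down

DIRECTIONS: Dict[str, Tuple[int, int]] = {
    "left": (-1, 0), "right": (1, 0), "above": (0, -1), "below": (0, 1),
}


def landscape_value(point: Pixel, reference: List[Pixel],
                    direction: Tuple[int, int]) -> float:
    """Degree to which `point` lies in `direction` relative to the reference object."""
    px, py = point
    ux, uy = direction
    best = 0.0
    for qx, qy in reference:
        dx, dy = px - qx, py - qy
        norm = math.hypot(dx, dy)
        if norm == 0:
            return 1.0  # the point belongs to the reference object
        cos_angle = (dx * ux + dy * uy) / (norm * math.hypot(ux, uy))
        angle = math.acos(max(-1.0, min(1.0, cos_angle)))
        best = max(best, max(0.0, 1.0 - 2.0 * angle / math.pi))
    return best


def relation_degrees(target: List[Pixel], reference: List[Pixel]) -> Dict[str, float]:
    """Average satisfaction degree over the target pixels, for each direction."""
    return {
        name: sum(landscape_value(p, reference, u) for p in target) / len(target)
        for name, u in DIRECTIONS.items()
    }


# Toy configuration: a small object placed to the left of a larger central one.
central = [(x, y) for x in range(10, 15) for y in range(5, 15)]
small_object = [(x, y) for x in range(2, 5) for y in range(8, 12)]
degrees = relation_degrees(small_object, central)
```

This sketch recomputes the landscape value for every target pixel, so it costs time proportional to the product of the two object sizes. Computing the landscape once over the whole image, for instance through a morphological dilation, is what reduces the evaluation of each additional object to a single pass over its pixels.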

CONCLUSIONS AND FUTURE RESEARCH
This article presented the BHAI project, an innovative approach in Artificial Intelligence applied to Byzantine sigillography. Our proposal combines computer vision, knowledge engineering, and mathematical modeling of spatial relationships to help interpret Byzantine seals. Up to now, the proposed methods have provided encouraging results, and are currently being further developed.
To the best of our knowledge, no other AI-based project has been proposed for Byzantine seal datasets. Currently, no software helps sigillography students in the reading of seals. We are convinced that such a tool may be a great support to introduce beginners to the challenging field of sigillography and to support experts by corroborating their theories with mathematical evidence. While the proposed approaches have been applied so far to Greek characters and Byzantine iconography, they could be extended to other alphabets and historical periods.

Figure 1: Sample seal images. In this case, the obverse side (a) includes iconography, while the reverse side (b) includes an abbreviated text in Greek. We show plain transcriptions (c) and (d), the text to recover (e), and the English translation (f).

Figure 2: Annotated contours of characters and abbreviation marks.

Figure 4: Pipeline of the proposed two-stage character recognition approach. CNN1 is an object detector, while CNN2 is a deep classifier. The output is the plain transcription of the input reverse seal image.

Figure 5: Byzantine Greek characters with their corresponding glyphs represented with font Athena Ruby.

Figure 6: Example of assessment of the spatial relation between the Labarum and the Empress Théodora represented on the seal in (a) and segmented in (b). Four basic relations (c,d,e,f) to Empress Théodora, represented as maps where high gray values represent high degrees of satisfaction of the relation, and corresponding satisfaction degrees on the Labarum. The highest degree (averaged over the points of the Labarum) is obtained for the relation "left of the Empress", which is the expected result.

Table 1: Character localization evaluation. Recall and precision metrics (in %) obtained by cross-validating over K = 10 folds. The IOU threshold is equal to 0.5.

Results in Table 2 are relative to the ResNet18 network (CNN2) evaluated on ground truth character crops for the 20 most represented classes (each composed of at least 50 samples) plus the non-character class. Top 1 to Top 3 accuracies are provided.