ICDAR 2026

Leveraging Morphology for Historical Script Metrological Analysis

1 Institut de Recherche et d'Histoire des Textes, Paris   2 LIGM, CNRS, Univ Gustave Eiffel, ENPC, IP Paris

Paleographic analysis, the study of historical handwriting, still relies heavily on the expert eye and subjective comparison. Computational approaches have made progress in recognizing text and extracting character shapes (morphology), but they rarely produce the quantitative measurements (metrology) needed for reproducible, large-scale study.


Detection-based approach to prototypical text line modeling

Our architecture predicts character bounding boxes and class labels directly from a line image, using only line-level transcriptions as supervision. A reconstruction module learns character prototypes and uses them, together with the predicted boxes, to reconstruct the text line image.

Overview of the proposed approach
Figure 1 — Overview. The backbone predicts bounding boxes, class labels, and color vectors from the text line image. These outputs, together with learnable prototypes and a predicted background, are used to render a reconstructed image. Bounding box dimensions control the aspect ratio of each rendered character.
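
The rendering step described in the overview can be illustrated with a small sketch. This is not the authors' implementation: it assumes grayscale images stored as nested lists, a single scalar ink value in place of the predicted color vector, integer pixel boxes, and nearest-neighbour stretching. The names `resize_nearest` and `render_line` are hypothetical.

```python
# Illustrative sketch only: data layout, names and the grayscale
# simplification are assumptions, not the paper's implementation.

def resize_nearest(mask, out_h, out_w):
    """Nearest-neighbour resize of a 2D list to the size of the target box."""
    in_h, in_w = len(mask), len(mask[0])
    return [
        [mask[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]

def render_line(background, prototypes, detections):
    """Composite stretched prototypes onto a copy of the background.

    background: 2D list of gray levels in [0, 1]
    prototypes: dict mapping a class label to a 2D list of opacities in [0, 1]
    detections: list of (label, x, y, w, h, ink) with integer pixel boxes
    """
    canvas = [row[:] for row in background]
    height, width = len(canvas), len(canvas[0])
    for label, x, y, w, h, ink in detections:
        # The box dimensions set the aspect ratio of the rendered character.
        stamp = resize_nearest(prototypes[label], h, w)
        for r in range(h):
            for c in range(w):
                yy, xx = y + r, x + c
                if 0 <= yy < height and 0 <= xx < width:
                    alpha = stamp[r][c]
                    canvas[yy][xx] = alpha * ink + (1 - alpha) * canvas[yy][xx]
    return canvas

# Toy usage: a 2x2 prototype for a hypothetical class "i",
# stretched into a box 2 pixels wide and 4 pixels tall.
prototypes = {"i": [[1.0, 0.0], [1.0, 0.0]]}
background = [[1.0] * 6 for _ in range(4)]
recon = render_line(background, prototypes, [("i", 1, 0, 2, 4, 0.0)])
```

In a trainable model the resize and compositing would have to be differentiable so that a reconstruction loss can update the boxes and prototypes; the sketch only shows the geometry of the compositing.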

Prototype quality

Compared with the Learnable Handwriter (LHW), our approach yields higher-quality prototypes and also makes the prototypes deformable.

Prototype comparison LHW vs Ours
Figure 2 — Prototype quality. The 15 most frequent letters extracted by Learnable Handwriter (LHW, top) and by our method (bottom) for the first page of the dataset.

From bounding boxes to paleographic measurements

Because bounding boxes are defined by the optimal alignment between a prototype and a specific character instance, they are consistent across the whole dataset. From character bounding boxes we derive the following measures: character aspect ratio w/h; inter-character distance within bigrams d_b and the bigram aspect ratio w_b/h_b; and inter-word distance d_w. We normalize by a unit of space to keep measurements independent of image resolution.

Definition of measurements
Figure 3 — Measurements.
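
As a rough illustration of how the character aspect ratio, inter-character distance, bigram aspect ratio and inter-word distance could be derived from boxes, consider the sketch below. Several points are assumptions made for the example: the box format (dicts with `x`, `y`, `w`, `h`, a `label` and a `word` index), the use of the median box height as the normalization unit, and the definition of the bigram box as the union of its two character boxes. The source does not specify any of these.

```python
from statistics import median

def line_measurements(boxes):
    """Derive measures from the character boxes of one line, in reading order.

    boxes: list of dicts with keys label, x, y, w, h, word (hypothetical format).
    Returns (aspect, bigrams, word_gaps).
    """
    # ASSUMPTION: the unit of space is the median box height of the line.
    unit = median(b["h"] for b in boxes)
    aspect = [(b["label"], b["w"] / b["h"]) for b in boxes]
    bigrams, word_gaps = [], []
    for left, right in zip(boxes, boxes[1:]):
        gap = (right["x"] - (left["x"] + left["w"])) / unit
        if left["word"] == right["word"]:
            # ASSUMPTION: the bigram box is the union of the two character boxes.
            top = min(left["y"], right["y"])
            bottom = max(left["y"] + left["h"], right["y"] + right["h"])
            width = right["x"] + right["w"] - left["x"]
            bigrams.append((left["label"] + right["label"], gap, width / (bottom - top)))
        else:
            word_gaps.append(gap)
    return aspect, bigrams, word_gaps

# Toy boxes (made-up values): the word "le" followed by the first letter of a new word.
boxes = [
    {"label": "l", "x": 0, "y": 0, "w": 4, "h": 10, "word": 0},
    {"label": "e", "x": 6, "y": 4, "w": 6, "h": 6, "word": 0},
    {"label": "r", "x": 22, "y": 4, "w": 5, "h": 6, "word": 1},
]
aspect, bigrams, word_gaps = line_measurements(boxes)
```

Aspect ratios are already dimensionless, so only the distances are divided by the unit.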

We compute the mean μ and coefficient of variation CV = σ/μ of each measure across all character instances within a unit of analysis (one page). These two statistics together capture both the central tendency and the executional consistency of a scribal hand. We visualize results using two types of graphs, linear and crossed.
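
A minimal sketch of the two statistics, using only the Python standard library. The function name `mean_and_cv` is hypothetical, and the choice of the population standard deviation (rather than the sample one) is an assumption, since the text does not say which is used.

```python
from statistics import fmean, pstdev

def mean_and_cv(values):
    """Return (mean, coefficient of variation) for one measure on one page."""
    mu = fmean(values)
    # ASSUMPTION: population standard deviation; swap in statistics.stdev
    # if the sample estimate is preferred.
    sigma = pstdev(values)
    return mu, sigma / mu

# Made-up aspect ratios of one letter on one page.
mu, cv = mean_and_cv([0.8, 1.0, 1.2])
```

Because CV divides by the mean, it is only meaningful for measures whose mean is clearly non-zero, which holds for aspect ratios.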

Visualization types
Figure 4 — Visualization types. Linear graphs show how a single measure evolves across ordered units of analysis (e.g., pages). Crossed graphs plot two measures against each other, enabling hand separation and correlation analysis. Marker size encodes the number of character instances.
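
The data behind both graph types can be prepared along the following lines. This is a sketch under assumptions: the input is a flat list of hypothetical `(page, value)` records for one measure, the standard deviation is the population one, and marker size scales linearly with the instance count up to an arbitrary `max_size`. The resulting tuples can be passed to any plotting library.

```python
from collections import defaultdict
from statistics import fmean, pstdev

def page_points(records):
    """Group (page, value) records and summarize each page, in page order."""
    by_page = defaultdict(list)
    for page, value in records:
        by_page[page].append(value)
    points = []
    for page in sorted(by_page):
        vals = by_page[page]
        mu = fmean(vals)
        points.append({"page": page, "mean": mu, "cv": pstdev(vals) / mu, "n": len(vals)})
    return points

def linear_series(points):
    """Linear graph: one measure followed across ordered pages."""
    return [(p["page"], p["mean"]) for p in points]

def crossed_series(points, max_size=200.0):
    """Crossed graph: mean against CV, with a marker size per point."""
    # ASSUMPTION: marker size is proportional to the number of instances.
    n_max = max(p["n"] for p in points)
    return [(p["mean"], p["cv"], max_size * p["n"] / n_max) for p in points]

# Made-up records for two pages.
records = [(2, 1.0), (1, 0.5), (1, 1.5), (2, 1.0), (2, 1.0), (2, 1.0)]
points = page_points(records)
linear = linear_series(points)
crossed = crossed_series(points)
```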

Dataset


Text-line annotations of Paris, BnF, fr. 2813

160 annotated pages (6,808 lines, 279,780 characters) from Les Grandes Chroniques de France (Paris, BnF, fr. 2813), covering four scribal hands (graphic profiles GP1–GP4).

Available as a Zenodo dataset.


Results — Case study on Paris, BnF, fr. 2813

We apply our method to the dataset and report results on letter proportionality, bigram separation, and word separation. Below are examples of a linear and a crossed graph.


Letter proportionality

Character aspect ratios alone are sufficient to distinguish the four hands across the manuscript.

Character aspect ratio across dataset
Figure 5 — Linear graph: character aspect ratio evolution. Each point is the mean aspect ratio of a given letter over one page. Highlighted bands mark consecutive pages.

Executional consistency of ‹t›

Crossed graph for letter t
Figure 6 — Crossed graph: ‹t› aspect ratio vs. coefficient of variation.

Plotting the aspect ratio of ‹t› against its coefficient of variation shows how consistently each scribal hand executes the letter. GP1 forms a compact cluster with low variability. GP4 clusters similarly but with slightly higher CV. GP2 and GP3 are more dispersed. For GP2, this reflects a habit of elongating the horizontal bar of ‹t› specifically in line-final positions.



Cite

@inproceedings{vlachou2026metrology,
  title     = {Leveraging Morphology for Historical Script Metrological Analysis},
  author    = {Vlachou-Efstathiou, Malamatenia and Baena, Raphael
               and Stutzmann, Dominique and Aubry, Mathieu},
  booktitle = {Document Analysis and Recognition -- ICDAR 2026},
  publisher = {Springer},
  year      = {2026}
}

Acknowledgements

This study was supported by the CNRS through MITI and the 80|Prime program (CrEMe: Caractérisation des écritures médiévales), and by the European Research Council (ERC project DISCOVER, no. 101076028). This work was granted access to the HPC resources of IDRIS under the allocations AD010614956R1 and AD011015222 made by GENCI.