The ChartEx Gateway currently has one application-specific portlet. The machine learning portlet allows end users to upload batches of annotated charters as a single .zip file; a batch typically consists of charters from the same source, language, and period. The user can choose to receive the results either as annotations in BioMine graph format (bgm), which can be used to visualize the graph and explore the data visually, or as SQL statements, which are more useful for importing the data into a database and querying the results. The portlet is developed using the ASM API and uses two workflows.
The first workflow has a single input: a .zip file containing batches of annotated charters. The job runs a Java process, uploaded as a .jar file, that reads the annotations, represents them as a graph, and then looks for similarities between the persons and places occurring in different charters. Finally, it outputs three files: a ranking of the most striking similarities, a BioMine graph file containing the whole graph, and the same graph with 'witnesses' omitted. Witnesses are people who are mentioned as witnesses to a transaction but are of lesser interest for most purposes. The second workflow includes an extra step that outputs the extracted and learned annotations as SQL queries.
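The witness-omitting step can be illustrated with a minimal sketch. The actual data model inside the .jar is not published, so the `Node`/`Edge` records, the `role` field, and the `withoutWitnesses` method below are all hypothetical; the sketch only shows the general idea of dropping 'witness' persons and their incident edges from the graph.

```java
import java.util.*;
import java.util.stream.*;

// Hypothetical sketch: drop 'witness' nodes and their edges from a charter graph.
public class WitnessFilter {
    // A node is a person or place; 'role' marks how a person appears in a charter.
    record Node(String id, String type, String role) {}
    record Edge(String from, String to, String label) {}

    // Keep only edges whose endpoints are not witnesses.
    static List<Edge> withoutWitnesses(Collection<Node> nodes, List<Edge> edges) {
        Set<String> witnesses = nodes.stream()
                .filter(n -> "witness".equals(n.role()))
                .map(Node::id)
                .collect(Collectors.toSet());
        return edges.stream()
                .filter(e -> !witnesses.contains(e.from()) && !witnesses.contains(e.to()))
                .toList();
    }

    public static void main(String[] args) {
        List<Node> nodes = List.of(
                new Node("p1", "person", "grantor"),
                new Node("p2", "person", "witness"),
                new Node("pl1", "place", ""));
        List<Edge> edges = List.of(
                new Edge("p1", "pl1", "resides_in"),
                new Edge("p2", "pl1", "witnessed_at"));
        // Only the p1 -> pl1 edge survives; the witness p2 is filtered out.
        System.out.println(withoutWitnesses(nodes, edges).size());
    }
}
```

In practice the full graph and the witness-free graph would both be written out as BioMine graph files, as described above.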
Both workflows are typically executed over many batches of charters at once, using a 'parameter sweep': the input .zip file is decompressed, and a new workflow is started for each batch of charters. As such, many instances of these workflows are created and run in parallel.
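The decompress-and-fan-out pattern behind the parameter sweep can be sketched as follows. This is a self-contained illustration, not the gateway's actual code: the real system launches ASM workflow instances, whereas here `runWorkflow` is a hypothetical stand-in, and the demo .zip with two batch directories is created inline so the sketch runs on its own.

```java
import java.io.*;
import java.nio.file.*;
import java.util.*;
import java.util.concurrent.*;
import java.util.zip.*;

// Hypothetical sketch of a parameter sweep: unpack the input .zip and
// submit one workflow run per batch directory, executed in parallel.
public class ParameterSweep {
    // Stand-in for launching one workflow instance on a single batch.
    static String runWorkflow(Path batch) {
        return "processed " + batch.getFileName();
    }

    public static void main(String[] args) throws Exception {
        // Create a demo zip with two batch entries so the sketch is runnable.
        Path zip = Files.createTempFile("charters", ".zip");
        try (ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(zip))) {
            for (String batch : List.of("batch-york/", "batch-cluny/")) {
                out.putNextEntry(new ZipEntry(batch));
                out.closeEntry();
            }
        }
        // Decompress: collect the top-level batch directories.
        Path work = Files.createTempDirectory("sweep");
        List<Path> batches = new ArrayList<>();
        try (ZipInputStream in = new ZipInputStream(Files.newInputStream(zip))) {
            for (ZipEntry e; (e = in.getNextEntry()) != null; ) {
                Path dir = work.resolve(e.getName());
                Files.createDirectories(dir);
                if (e.isDirectory()) batches.add(dir);
            }
        }
        // One workflow instance per batch, run in parallel.
        ExecutorService pool = Executors.newFixedThreadPool(batches.size());
        List<Future<String>> results = new ArrayList<>();
        for (Path b : batches) results.add(pool.submit(() -> runWorkflow(b)));
        for (Future<String> r : results) System.out.println(r.get());
        pool.shutdown();
    }
}
```

The thread pool here stands in for the gateway's parallel execution of workflow instances; in the real deployment, each submitted task would correspond to an independent workflow run rather than a local thread.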