| MathML in the MOWGLI Project
Andrea Asperti and Michael Kohlhase
MOWGLI (Mathematics On the Web: Get it by Logic and Interfaces) is a European Project founded by the European Community in the ``Information Society Technologies'' (IST) Program. The project partners are the University of Bologna (coordinator), the German Research Center for Artificial Intelligence (DFKI Saarbrücken), Katholieke Universiteit Nijmegen, Max Planck Institute for Gravitational Physics (Albert Einstein Institute, TU Berlin), and Trusted Logic (France). The aim of the project is the study and the development of a technological infrastructure for the the creation and maintenance of a virtual, distributed, hypertextual library of mathematical knowledge based on a content description of the information. Currently, almost all mathematical documents available on the Web are marked up only for presentation, severely crippling the potential for added-value services like concept-oriented navigation and information retrieval, data mining or document personalization. The goal of MOWGLI is to overcome these limitations, passing from a machine-readable to a machine-understandable representation of the information and developing the technological infrastructure for its exploitation. Essentially, MOWGLI aspires to become an example of ``best practice'' in its use, and a leading project in the new area of the Semantic Web. In particular, we shall deeply explore the potentialities of XML in the following directions:
Mathematics and the Semantic WebThe goals of the MOWGL I project are shared on a larger scale by the so-called ``Semantic Web'' (see [12]), which is the new frontier of Content Based Information Systems: Associating meaning with content or establishing a layer of machine understandable data will allow will enable a higher degree of automation and more intelligent applications than are possible in the presentation-markup dominated Web today. The ultimate goal of the Semantic Web is to allow machines to share and exploit knowledge in the Web way, i.e. without central authority, with few basic rules, in a scalable, adaptable, extensible manner. However, the actual development of the Semantic Web and its technologies has been hindered so far by the lack of large scale, distributed repositories of structured, content oriented information. The case of Mathematical knowledge, the most rigorous and condensed form of knowledge, is paradigmatic. The World Wide Web is already now the largest single resource of mathematical knowledge, and its importance will be exponentiated by the emerging display technologies like MATH ML, OPEN MATH, and OMDOC. Due to its rich notational, logical and semantical structure, mathematical knowledge is thus a main case study for the development of the new generation of semantic Web systems: A content-oriented ``semantic Web'' for mathematics would make it possible to offer added-value services like
Content Markup based on MATHML, OPENMATH, and OMDOCTo represent the content of the mathematical libraries the MOWGLI project will develop a representation format either by layering on existing techologies, or by influencing them to incorporate the necessary functionality.
Authoring, transformation and renderingThe fact of encoding also the microscopic, logical level of mathematics opens the possibility to have completely formalized subsystems of the library [8], which could be checked automatically by standard tools for the automation of formal reasoning and the mechanization of mathematics (proof assistants and logical frameworks [14]). At the same time, any of these tools could be used as an authoring system for documents of the library, by simply exporting their internal libraries into XML, and using stylesheets to transform the output into a standard, machine-understandable representation, such as MATHML content markup or OPEN MATH. Thus a rapid way to get meaningful repositories of fully structured mathematical knowledge is by exporting them from the available libraries of Logical Frameworks and Proof Assistants, along the lines of the HELM Project [10] or the work of Oostdijk et al. [17]. In MOWGL I we shall primarily rely on the Coq Proof Assistant of INRIA [11], but we will also address the question of transforming controlled subsets of LaTeX input into mathematical libraries. Once we have the relevant information encoded in XML we are faced with the complex issue of its transformation into suitable rendering formats (like xhtml+MATHML for web presentation, and LaTeX for print media). In the rendering phase we must also heavily work on proofs in order to put them in a form more suitable to human reading. Typically, this requires the reorganization of the structure of the proof, where subproofs may be extracted from their position and brought to the top level (as a sort of let in). Finally, we must also automatically extract from the document all those informations which can be profitably used as metadata, i.e. data which provides some ``useful'' information on the document.References[1] Extensible Markup Language (XML) Specification. Version 1.0. W3C Recommendation, 1999, [2] Mathematical Markup Language (MATHML) 2.0 W3C Recommendation, 2001. [3] Language Design Task Group CoFI. Casl -- the CoFI algebraic specification language -- summary, version 1.0. Technical report, 1998. [4] Olga Caprotti and Arjeh M. Cohen. Draft of the Open Math standard. The Open Math Society, 1998. [5] Resource Description Framework (RDF) Model and Syntax Specification, W3C Recommendation 1999. [6] XSL Transformations (XSLT). Version 1.0, W3C Recommendation, 1999. [7] SOAP Version 1.2 Part 0: Primer. W3C Working Draft 2001. [8] Asperti, A., Padovani, L., Sacerdoti Coen, C., Schena, I., `` Formal Mathematics in MATH ML ''. Proceedings ``MATH ML and Math on the Web'', 2000. [9] Asperti, A., Padovani, L., Sacerdoti Coen, C., Schena, I., `` XML, Stylesheets and the re-mathematization of Formal Content ''. Proceedings of ``Extreme Markup Languages 2001 Conference'', 2001. [10] Asperti, A., Padovani, L., Sacerdoti Coen, C., Schena, I., `` HELM and the semantic Math-Web ''. Proc. TPHOLS. 2001, 01. [11] B. Barras et al., `` The Coq Proof Assistant Reference Manual, version 6.3.1 '', [12] Tim Berner's Lee. The Semantic Web. W3C Architecture Note, 1998. [13] The Dublin Core Metadata Initiative. [14] G. Huet, G. Plotkin (eds). Logical Environments. Cambridge University Press. 1993. [15] Kohlhase, M. ``OMDOC : An Open Markup Format for Mathematical Documents (Version 1.1) ''. Open Specification. [16] B. Naylor, S. Watt. ``Meta Style Sheets for the Conversion of Mathematical Documents into Multiple Forms''. On-line Proceedings of the First International Workshop on Mathematical Knowledge Management, 2001. [17] M. Oostdijk. Generation and Presentation of Formal Mathematical Documents, PhD. Thesis, University of Eindhoven. |