diff --git a/documents/bachelor-proposal-latex-writing-recognition/bachelor-proposal-latex-writing-recognition.pdf b/documents/bachelor-proposal-latex-writing-recognition/bachelor-proposal-latex-writing-recognition.pdf new file mode 100644 index 0000000..2f8283e Binary files /dev/null and b/documents/bachelor-proposal-latex-writing-recognition/bachelor-proposal-latex-writing-recognition.pdf differ diff --git a/documents/bachelor-proposal-latex-writing-recognition/bachelor-proposal-latex-writing-recognition.tex b/documents/bachelor-proposal-latex-writing-recognition/bachelor-proposal-latex-writing-recognition.tex index 3b501d4..24b4242 100644 --- a/documents/bachelor-proposal-latex-writing-recognition/bachelor-proposal-latex-writing-recognition.tex +++ b/documents/bachelor-proposal-latex-writing-recognition/bachelor-proposal-latex-writing-recognition.tex @@ -33,25 +33,25 @@ \title{Proposal for a Bachelor of Science Thesis:\\Interactive on-line handwriting recognition of mathematical formulae} \author{Martin Thoma} \maketitle -\section{The problem backgound} +\section{The problem background} There are people who don't know how to write even simple mathematical formulae with \LaTeX{} like \[\pi/\alpha=\sum_{n=-\infty}^\infty \frac{\sin^2 (c+n)\alpha}{(c+n)^2}=\int_{-\infty}^\infty \frac{\sin^2 (c+n)\alpha}{(c+n)^2}\, \text{d}n\] or who need much time to do so. Currently, there are several online - services, programms and apps that help to write mathematical - formulae, but all programms I know have serious disadvantages: + services, programs and apps that help to write mathematical + formulae, but all programs I know have serious disadvantages: \begin{itemize} \item \href{http://detexify.kirelabs.org/classify.html}{detexify.kirelabs.org} recognizes \textbf{only symbols}, - \item the formel editor of LibreOffice Writer 3.6 as showen + \item the formula editor of LibreOffice Writer 3.6 as shown in \Fref{fig:libre-office-3.6} offers some - guidiance by grouping common operations while showing + guidance by grouping common operations while showing a WYSIWYG editor, but it has \textbf{no handwriting recognition}. Another drawback is the fact that it is \textbf{not available as an online service}, so you have to install LibreOffice which might not be possible on all devices. \item The \enquote{Daum Equation Editor} (see \Fref{fig:daum-editor}) is available online - and offers guidiance through the creation of equations, + and offers guidance through the creation of equations, but does not offer handwriting recognition. Although it might be OpenSource, the \textbf{source code is difficult to find}. This means if you want to improve the recognition, @@ -66,7 +66,7 @@ single symbols. \item Wolfram Mathematica seems to be able to do complete formula recognition at least for simple formulae (\href{http://reference.wolfram.com/mathematica/tutorial/HandwrittenMathRecognition.html}{source}) - by using Microsofts \href{http://windows.microsoft.com/en-ph/windows7/use-math-input-panel-to-write-and-correct-math-equations}{Math Input Panel}, + by using Microsoft's \href{http://windows.microsoft.com/en-ph/windows7/use-math-input-panel-to-write-and-correct-math-equations}{Math Input Panel}, but this is neither OpenSource nor available as an online service. Additionally it is not available for Linux systems, so I can't test it. @@ -80,7 +80,7 @@ \begin{figure}[h] \centering \includegraphics*[width=5cm, keepaspectratio]{figures/libreoffice-writer.png} - \caption{LibreOffice Writer 3.6 - Formel Editor} + \caption{LibreOffice Writer 3.6 - Formula Editor} \label{fig:libre-office-3.6} \end{figure} @@ -106,7 +106,7 @@ starts recognizing while the user enters a formula. \item \textbf{Interactive}: The service offers symbols and constructs to the user before the user starts typing. These suggestions - might chage depending on what the user has typed before. + might change depending on what the user has typed before. \item \textbf{OpenSource}: Any license in this list: \href{http://opensource.org/licenses}{http://opensource.org/licenses} \item \textbf{Easy to find}: Ideally, the project should have an own domain that contains the source code, the service @@ -116,7 +116,7 @@ \end{itemize} This service should also encourage the users by techniques - of \enquote{gamification} to give as much + of \enquote{Gamification} to give as much meta information about their formulae as possible: \begin{itemize} \item Which problem domain does the formula belong to, e.~g. \enquote{Euclidean geometry}, \enquote{analysis} or \enquote{calculus}? @@ -131,8 +131,8 @@ offers on-line, interactive math handwriting recognition. But the need of such a software is there. But there are more reasons why this bachelor's thesis matters: -Projects like \LaTeX{}, Linux, Apache or FireFox have shown that -OpenSoure software can enrich the develpment in specific areas. The +Projects like \LaTeX{}, Linux, Apache or Firefox have shown that +OpenSource software can enrich the development in specific areas. The \enquote{Browser Wars} might be the most famous result of an active OpenSource community. Internet Explorer 6 had a market share of over 80\% in 2003. Prequels of Firefox and the Mozilla @@ -158,14 +158,14 @@ e.~g. a formula spotter for presentations or a math detector for speech. \section{Time schedule} \begin{itemize} \item[70h] Literature research about on-line handwriting recognition - techniques and gamification. + techniques and Gamification. \item[5h] Defining browsers and devices that should get supported and required client side software like HTML5, CSS 3 and ECMAScript (better known as JavaScript). Also, required input methods like touchscreens and stylus should be mentioned. \item[20h] Writing use cases. This is includes writing example - formula that the user shoud type and the system should + formula that the user should type and the system should be able to recognize; finding people with different knowledge of \LaTeX{} and from different fields who want to participate in user tests. @@ -182,8 +182,8 @@ e.~g. a formula spotter for presentations or a math detector for speech. in the thesis, but the improvements will get documented. \item[60h] Finding structures and ways how to enter them. Examples of structures that can be nested are sums: - \begin{verbatim}\sum_{}^{} \end{verbatim} - Implement the recognition of those strucutres. + \begin{verbatim}\sum_{}^{} \end{verbatim} + Implement the recognition of those structures. \item[30h] Observe \enquote{fresh} testers while they try to use the system. \item[70h] Improving the software to fix problems that were found @@ -242,8 +242,8 @@ A first draft of the outline could be like this: \bibliography{literatur} This literature list is only a list that seems to make sense to me -by now. As I proceed I might find more usefull sources for the different +by now. As I proceed I might find more useful sources for the different topics. So I might add, but also remove elements from this list. -Especially for gamification I might read documents from +Especially for Gamification I might read documents from \href{http://gamification-research.org/}{gamification-research.org}. \end{document} diff --git a/documents/bachelor-proposal-latex-writing-recognition/bachelor-proposal-latex-writing-recognition.tex.bak b/documents/bachelor-proposal-latex-writing-recognition/bachelor-proposal-latex-writing-recognition.tex.bak new file mode 100644 index 0000000..3b501d4 --- /dev/null +++ b/documents/bachelor-proposal-latex-writing-recognition/bachelor-proposal-latex-writing-recognition.tex.bak @@ -0,0 +1,249 @@ +\documentclass[a4paper]{scrartcl} +\usepackage{amssymb, amsmath} % needed for math +\usepackage[utf8]{inputenc} % this is needed for umlauts +\usepackage[english]{babel} % this is needed for umlauts +\usepackage[T1]{fontenc} % this is needed for correct output of umlauts in pdf +\usepackage[margin=2.5cm]{geometry} %layout +\usepackage{hyperref} % links im text +\usepackage{color} +\usepackage{framed} +\usepackage{enumerate} % for advanced numbering of lists +\usepackage{csquotes} +\usepackage{ifxetex,ifluatex} +\usepackage{etoolbox} +\usepackage[svgnames]{xcolor} +\usepackage{tikz} +\usepackage{framed} +\usepackage{parskip} +\usepackage{cite} +\usepackage{fancyref} +\usepackage{mystyle} +\clubpenalty = 10000 % Schusterjungen verhindern +\widowpenalty = 10000 % Hurenkinder verhindern + +\hypersetup{ + pdfauthor = {Martin Thoma}, + pdfkeywords = {Bachelor proposal, LaTeX, handwriting recognition}, + pdftitle = {Proposal for a Bachelor of Science Thesis:\\Interactive on-line handwriting recognition of mathematical formulae} +} + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + +\begin{document} + \title{Proposal for a Bachelor of Science Thesis:\\Interactive on-line handwriting recognition of mathematical formulae} + \author{Martin Thoma} + \maketitle +\section{The problem backgound} + There are people who don't know how to write even + simple mathematical formulae with \LaTeX{} like + \[\pi/\alpha=\sum_{n=-\infty}^\infty \frac{\sin^2 (c+n)\alpha}{(c+n)^2}=\int_{-\infty}^\infty \frac{\sin^2 (c+n)\alpha}{(c+n)^2}\, \text{d}n\] + or who need much time to do so. Currently, there are several online + services, programms and apps that help to write mathematical + formulae, but all programms I know have serious disadvantages: + \begin{itemize} + \item \href{http://detexify.kirelabs.org/classify.html}{detexify.kirelabs.org} + recognizes \textbf{only symbols}, + \item the formel editor of LibreOffice Writer 3.6 as showen + in \Fref{fig:libre-office-3.6} offers some + guidiance by grouping common operations while showing + a WYSIWYG editor, but it has \textbf{no handwriting recognition}. + Another drawback is the fact that it is \textbf{not available + as an online service}, so you have to install LibreOffice + which might not be possible on all devices. + \item The \enquote{Daum Equation Editor} (see \Fref{fig:daum-editor}) is available online + and offers guidiance through the creation of equations, + but does not offer handwriting recognition. Although + it might be OpenSource, the \textbf{source code is difficult to + find}. This means if you want to improve the recognition, + it is not possible. It also makes use of Adobe Flash + which is not available on many smartphones and tablet + computers. + \item Maple seems to offer handwritten symbol recognition (\href{http://www.maplesoft.com/products/maple/features/handwritten.aspx}{source}), + but on the one hand I was not able to test that, because + it is \textbf{not available for free}. On the other hand you + have to install additional software, it seems not to be + available for tablet computers and it does only recognize + single symbols. + \item Wolfram Mathematica seems to be able to do complete + formula recognition at least for simple formulae (\href{http://reference.wolfram.com/mathematica/tutorial/HandwrittenMathRecognition.html}{source}) + by using Microsofts \href{http://windows.microsoft.com/en-ph/windows7/use-math-input-panel-to-write-and-correct-math-equations}{Math Input Panel}, + but this is neither OpenSource nor available as an + online service. Additionally it is not + available for Linux systems, so I can't test it. + \end{itemize} + + A more comprehensive list can be found at \href{https://en.wikipedia.org/wiki/Formula_editor}{https://en.wikipedia.org/wiki/Formula\_editor}. + A problem of some of the projects presented there is that they + require the client to execute Java Applets which is a security + risk. + + \begin{figure}[h] + \centering + \includegraphics*[width=5cm, keepaspectratio]{figures/libreoffice-writer.png} + \caption{LibreOffice Writer 3.6 - Formel Editor} + \label{fig:libre-office-3.6} + \end{figure} + + \begin{figure}[h] + \centering + \includegraphics*[width=15cm, keepaspectratio]{figures/daum-editor.png} + \caption{Daum Equation editor} + \label{fig:daum-editor} + \end{figure} +\break +\section{The problem statement} + What I would like to have is an interactive on-line handwriting + recognition service, that is available as a web service which makes + use of touchscreens. Additionally, it should be for free and + OpenSource, the source code should be easy to find and documented. + This means: + \begin{itemize} + \item \textbf{Service}: The program can be accessed over the web, so + that the user does only have to have a modern browser. + As a consequence, the software could be used with any + device that has a touch screen. + \item \textbf{On-line handwriting recognition}: The service + starts recognizing while the user enters a formula. + \item \textbf{Interactive}: The service offers symbols and constructs + to the user before the user starts typing. These suggestions + might chage depending on what the user has typed before. + \item \textbf{OpenSource}: Any license in this list: \href{http://opensource.org/licenses}{http://opensource.org/licenses} + \item \textbf{Easy to find}: Ideally, the project should have + an own domain that contains the source code, the service + and documentation. But it might be enough to provide + an email address to a developer within the top of + of the source code of the delivered HTML document. + \end{itemize} + + This service should also encourage the users by techniques + of \enquote{gamification} to give as much + meta information about their formulae as possible: + \begin{itemize} + \item Which problem domain does the formula belong to, e.~g. \enquote{Euclidean geometry}, \enquote{analysis} or \enquote{calculus}? + \item Does the formula itself have a name, e.~g. \enquote{Pythagorean theorem}, \enquote{Fibonacci numbers} or \enquote{geometric series}? + \end{itemize} + + This information should be used to create a formula database. + +\section{Significance} +For me as a Linux user, there no software that I can test and which +offers on-line, interactive math handwriting recognition. But the +need of such a software is there. + +But there are more reasons why this bachelor's thesis matters: +Projects like \LaTeX{}, Linux, Apache or FireFox have shown that +OpenSoure software can enrich the develpment in specific areas. The +\enquote{Browser Wars} might be the most famous result of an active +OpenSource community. Internet Explorer 6 had +a market share of over 80\% in 2003. Prequels of Firefox and the Mozilla +foundation already existed, but Firefox 1.0 was released not until +November 2004. After that, Firefox and other open browsers added many +features that Internet Explorer had to compete with, like tabbed browsing, +HTML4 standard conformance, support of the \texttt{} tag and +speed of HTML rendering and JavaScript execution.\footnote{\href{http://www.evolutionoftheweb.com/}{www.evolutionoftheweb.com} offers a graphical overview. Although supporting standards like HTML4 or CSS~2 is not done with one version, but rather an incremental process.} Some of these +questions are interesting for science such as many problems related +to layouts and just-in-time compilation (JIT). With OpenSource software +that makes it easy to find its source and offers good documentation, +researchers can simply try their ideas without being blocked by +having to try to access the source code. + +Additionally, such a project might give researchers more time to +concentrate on the tasks they really want to do rather than spending +hours by learning \LaTeX{}. + +One last reason why this thesis matters is the formula database that +gets created by users. This database might be used in follow-up work, +e.~g. a formula spotter for presentations or a math detector for speech. + +\section{Time schedule} +\begin{itemize} + \item[70h] Literature research about on-line handwriting recognition + techniques and gamification. + \item[5h] Defining browsers and devices that should get supported + and required client side software like HTML5, CSS 3 + and ECMAScript (better known as JavaScript). Also, + required input methods like touchscreens and stylus + should be mentioned. + \item[20h] Writing use cases. This is includes writing example + formula that the user shoud type and the system should + be able to recognize; finding people with different + knowledge of \LaTeX{} and from different fields who + want to participate in user tests. + \item[60h] Implementing the core of the application: Handwriting + recognition of digits and symbols by using only + HTML, CSS and on the client side. This includes implementing + a way for the user to enter new symbols and to correct the + symbol that was suggested by the recognition system. + \item[20h] Introduce testers that already know \LaTeX{} to the + current system. At this point, the system does only do + symbol recognition. The testers should train it, + insert symbols like $a-z, A-Z, 0-9, \alpha-\omega, A-\Omega, \cdot, \circ, \dots$ + \item[10h] Get feedback by the users. This feedback will not be included + in the thesis, but the improvements will get documented. + \item[60h] Finding structures and ways how to enter them. Examples + of structures that can be nested are sums: + \begin{verbatim}\sum_{}^{} \end{verbatim} + Implement the recognition of those strucutres. + \item[30h] Observe \enquote{fresh} testers while they try to use + the system. + \item[70h] Improving the software to fix problems that were found + with user tests + \item[50h] Fix bugs, improve code quality and readability as well + as documentation. + \item[45h] Usability testing: Try Hallway testing. The results + of these tests get documented and will be part of the + bachelor's thesis. If possible, I would like + to let the testers use their own devices. + \item[10h] Mentioning open questions and ideas how they could be + analyzed with the service that was created. +\end{itemize} + +\section{Outline} +I have described in which steps I would like to write the software, +but almost all points include writing the bachelor's thesis document. +A first draft of the outline could be like this: + +\begin{enumerate} + \item Introduction + \item Definitions + \begin{enumerate} + \item Hardware: What is available and what is the distribution? + \item Software: What is available and what is the distribution? + \item Support of standards like HTML, CSS, ECMA-Script, Flash, Cookies, ... + \item Choice of hardware, software and standards that should get supported as well as the choice of Libraries and the required server-side software + \item Application to the domain of math recognition + \end{enumerate} + \item On-line handwriting techniques + \begin{enumerate} + \item Description of techniques in general + \item Application to the domain of math recognition + \end{enumerate} + \item Gamification techniques + \begin{enumerate} + \item Description of techniques in general + \item Application to the domain of math recognition in the web + \end{enumerate} + \item Software Project + \begin{enumerate} + \item Structure of the code + \item Availability of documentation + \item Availability of the service + \end{enumerate} + \item Summary + \begin{enumerate} + \item Future Work + \end{enumerate} +\end{enumerate} +\break + +\renewcommand\refname{Related Literature} +\nocite{*} +\bibliographystyle{itmalpha} +\bibliography{literatur} + +This literature list is only a list that seems to make sense to me +by now. As I proceed I might find more usefull sources for the different +topics. So I might add, but also remove elements from this list. +Especially for gamification I might read documents from +\href{http://gamification-research.org/}{gamification-research.org}. +\end{document}