
documents/write-math-ba-paper: Fixed some spelling mistakes

Martin Thoma 2014-12-25 18:47:07 +01:00
parent 2f03172b3a
commit c9def13de2
3 changed files with 30 additions and 26 deletions


@@ -1,3 +1,6 @@
[Download compiled PDF](https://github.com/MartinThoma/LaTeX-examples/blob/master/documents/write-math-ba-paper/write-math-ba-paper.pdf)
## Spell checking
* Spell checking `aspell --lang=en --mode=tex check write-math-ba-paper.tex`
* Spell checking with `http://www.reverso.net/spell-checker`
* https://github.com/devd/Academic-Writing-Check


@@ -75,22 +75,21 @@ Daniel Kirsch describes in~\cite{Kirsch} a system called Detexify which uses
time warping to classify on-line handwritten symbols and claims to achieve a
TOP-3 error of less than $\SI{10}{\percent}$ for a set of $\num{100}$~symbols.
He also published his data on \url{https://github.com/kirel/detexify-data},
-which was collected by a crowd-sourcing approach via
+which was collected by a crowdsourcing approach via
\url{http://detexify.kirelabs.org}. Those recordings as well as some recordings
which were collected by a similar approach via \url{http://write-math.com} were
used to train and evaluate different classifiers. A complete description of
all involved software, data and experiments is given in~\cite{Thoma:2014}.
\section{Steps in Handwriting Recognition}
-The following steps are used in all classifiers which are described in the
-following:
+The following steps are used in many classifiers:
\begin{enumerate}
\item \textbf{Preprocessing}: Recorded data is never perfect. Devices have
-errors and people make mistakes while using devices. To tackle these
-problems there are preprocessing algorithms to clean the data. The
-preprocessing algorithms can also remove unnecessary variations of
-the data that do not help in the classification process, but hide
+errors and people make mistakes while using the devices. To tackle
+these problems there are preprocessing algorithms to clean the data.
+The preprocessing algorithms can also remove unnecessary variations
+of the data that do not help in the classification process, but hide
what is important. Having slightly different sizes of the same symbol
is an example of such a variation. Four preprocessing algorithms that
clean or normalize recordings are explained in
@@ -117,15 +116,16 @@ following:
improve the performance of learning algorithms.
\end{enumerate}
-After these steps, we are faced with a classification learning task which consists of
-two parts:
+After these steps, we are faced with a classification learning task which
+consists of two parts:
\begin{enumerate}
\item \textbf{Learning} parameters for a given classifier. This process is
also called \textit{training}.
\item \textbf{Classifying} new recordings, sometimes called
\textit{evaluation}. This should not be confused with the evaluation
of the classification performance which is done for multiple
-topologies, preprocessing queues, and features in \Cref{ch:Evaluation}.
+topologies, preprocessing queues, and features in
+\Cref{ch:Evaluation}.
\end{enumerate}
The classification learning task can be solved with \glspl{MLP} if the number
@@ -141,7 +141,7 @@ and feature extraction easier, more effective or faster. It does so by resolving
errors in the input data, reducing duplicate information and removing irrelevant
information.
-Preprocessing algorithms fall in two groups: Normalization and noise
+Preprocessing algorithms fall into two groups: Normalization and noise
reduction algorithms.
A very important normalization algorithm in single-symbol recognition is
@@ -157,12 +157,12 @@ Another normalization preprocessing algorithm is resampling. As the data points
on the pen trajectory are generated asynchronously and with different
time-resolutions depending on the used hardware and software, it is desirable
to resample the recordings to have points spread equally in time for every
-recording. This was done with linear interpolation of the $(x,t)$ and $(y,t)$
+recording. This was done by linear interpolation of the $(x,t)$ and $(y,t)$
sequences and getting a fixed number of equally spaced points per stroke.
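
To make the resampling step concrete, here is a minimal sketch in Python. It assumes a stroke is a list of points with `x`, `y`, and `time` entries and linearly interpolates the $(x,t)$ and $(y,t)$ sequences to a fixed number of time-equidistant points; the point format, the function name `resample_stroke`, and the default of 20 points per stroke are illustrative assumptions, not taken from the paper or its software.

```python
import numpy as np

def resample_stroke(stroke, num_points=20):
    """Return num_points points spread equally in time over one stroke.

    Sketch only: the stroke format ({"x", "y", "time"} dicts) and the
    default number of points are assumptions for illustration.
    """
    t = np.array([p["time"] for p in stroke], dtype=float)
    x = np.array([p["x"] for p in stroke], dtype=float)
    y = np.array([p["y"] for p in stroke], dtype=float)
    # Equally spaced time stamps between the first and the last point
    # (assumes the time stamps are already sorted in increasing order).
    t_new = np.linspace(t[0], t[-1], num_points)
    # Linear interpolation of the (x, t) and (y, t) sequences.
    x_new = np.interp(t_new, t, x)
    y_new = np.interp(t_new, t, y)
    return [{"x": float(xi), "y": float(yi), "time": float(ti)}
            for xi, yi, ti in zip(x_new, y_new, t_new)]
```
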
\textit{Connect strokes} is a noise reduction algorithm. It happens sometimes
that the hardware detects that the user lifted the pen where the user certainly
-didn't do so. This can be detected by measuring the euclidean distance between
+didn't do so. This can be detected by measuring the Euclidean distance between
the end of one stroke and the beginning of the next stroke. If this distance is
below a threshold, then the strokes are connected.
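
The connect-strokes step can be sketched the same way. The snippet below merges consecutive strokes whose end-to-start Euclidean distance is below a threshold; the stroke representation, the helper name `connect_strokes`, and the threshold value are assumptions made for illustration, not the paper's implementation.

```python
import math

def connect_strokes(strokes, threshold=10.0):
    """Merge consecutive strokes whose end/start gap is below threshold.

    Sketch only: strokes are assumed to be lists of {"x", "y"} points and
    the threshold is an arbitrary example value.
    """
    if not strokes:
        return []
    connected = [list(strokes[0])]
    for stroke in strokes[1:]:
        last_point = connected[-1][-1]   # last point of the previous stroke
        first_point = stroke[0]          # first point of the current stroke
        gap = math.hypot(first_point["x"] - last_point["x"],
                         first_point["y"] - last_point["y"])
        if gap < threshold:
            connected[-1].extend(stroke)  # pen lift was probably spurious
        else:
            connected.append(list(stroke))
    return connected
```
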
@@ -207,19 +207,20 @@ activation functions can be varied. The learning algorithm is parameterized by
the learning rate $\eta \in (0, \infty)$, the momentum $\alpha \in [0, \infty)$
and the number of epochs.
-The topology of \glspl{MLP} will be denoted in the following by separating
-the number of neurons per layer with colons. For example, the notation $160{:}500{:}500{:}500{:}369$
-means that the input layer gets 160~features, there are three hidden layers
-with 500~neurons per layer and one output layer with 369~neurons.
+The topology of \glspl{MLP} will be denoted in the following by separating the
+number of neurons per layer with colons. For example, the notation
+$160{:}500{:}500{:}500{:}369$ means that the input layer gets 160~features,
+there are three hidden layers with 500~neurons per layer and one output layer
+with 369~neurons.
-\glspl{MLP} training can be executed in
-various different ways, for example with \gls{SLP}.
-In case of a \gls{MLP} with the topology $160{:}500{:}500{:}500{:}369$,
-\gls{SLP} works as follows: At first a \gls{MLP} with one hidden layer ($160{:}500{:}369$)
-is trained. Then the output layer is discarded, a new hidden layer and a new
-output layer is added and it is trained again, resulting in a $160{:}500{:}500{:}369$
-\gls{MLP}. The output layer is discarded again, a new hidden layer is added and
-a new output layer is added and the training is executed again.
+\glspl{MLP} training can be executed in various different ways, for example
+with \gls{SLP}. In case of a \gls{MLP} with the topology
+$160{:}500{:}500{:}500{:}369$, \gls{SLP} works as follows: At first a \gls{MLP}
+with one hidden layer ($160{:}500{:}369$) is trained. Then the output layer is
+discarded, a new hidden layer and a new output layer are added and it is trained
+again, resulting in a $160{:}500{:}500{:}369$ \gls{MLP}. The output layer is
+discarded again, a new hidden layer is added and a new output layer is added
+and the training is executed again.
Denoising auto-encoders are another way of pretraining. An
\textit{auto-encoder} is a neural network that is trained to restore its input.
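
As a rough illustration of the SLP scheme described above, the following sketch grows a $160{:}500{:}500{:}500{:}369$ network layer by layer, training the partial network with a fresh output layer at every step and keeping only the hidden layers. PyTorch is used purely for illustration; the layer sizes come from the example in the text, while the activation function, optimizer settings, and training loop are assumptions, not the configuration used in the paper.

```python
import torch
import torch.nn as nn

def train(model, features, labels, epochs=10, lr=0.1):
    """Plain supervised training of the current (partial) network.

    Sketch only: optimizer, learning rate, momentum and epoch count are
    placeholder values, not the paper's settings.
    """
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.5)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(model(features), labels)
        loss.backward()
        optimizer.step()

def slp_pretrain(features, labels, layers=(160, 500, 500, 500, 369)):
    """Supervised layer-wise pretraining: 160:500:369 -> ... -> full net."""
    n_inputs, *hidden_sizes, n_outputs = layers
    hidden_layers = []        # trained hidden layers are kept between rounds
    prev_size = n_inputs
    model = None
    for size in hidden_sizes:
        # Add one new hidden layer on top of the already trained ones.
        hidden_layers += [nn.Linear(prev_size, size), nn.Sigmoid()]
        prev_size = size
        # Put a fresh output layer on top and train the whole stack; this
        # output layer is discarded again in the next iteration.
        model = nn.Sequential(*hidden_layers, nn.Linear(prev_size, n_outputs))
        train(model, features, labels)
    return model              # final 160:500:500:500:369 network
```

Calling `slp_pretrain(x, y)` with a float tensor `x` of shape `(n, 160)` and a long tensor `y` of 369-class labels would return the fully grown network, which could then be trained further as a whole.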