\documentclass[conference]{IEEEtran}
\usepackage[english]{babel} % To obtain English text with the blindtext package
\usepackage{blindtext}
\begin{document}
\title{Analysis of bot detection and spam prevention systems}
% TODO get author names from Prof. Sibi
\author {
\IEEEauthorblockN{Aravinth Manivannan, Sibi Chakkaravarth S, TODO}
\IEEEauthorblockA{SENSE, VIT AP, AP, India - pincode\\
foo@example.com
}
}
\maketitle

\begin{abstract}
CAPTCHA systems were originally designed to protect against automated
bot-based Denial of Service (DoS) attacks and spam. Over time, these
systems have become ineffective because they focus on distinguishing humans
from bots rather than on combating DoS attacks and spam. As a result, they have
become privacy-invasive systems that pose accessibility challenges while
offering reduced effectiveness and accuracy. mCaptcha is a proof-of-work-based,
non-interactive DoS protection system designed to overcome the limitations
of traditional CAPTCHA systems while offering superior
protection. The mechanism is stateless, so it is able to accurately
defend against attacks over anonymous networks like Tor, and its
non-interactive nature makes it ideal for users with auditory, cognitive and
visual disabilities.
\end{abstract}

\section{Introduction}
\label{sec:intro}

Denial of Service (DoS) attacks and spam campaigns reduce the quality of service
for internet services. Different types of rate-limiters have been employed to
combat such attacks. Today, rate-limiters on the web are synonymous with CAPTCHAs.
CAPTCHA systems work on the premise that an automated bot user can inflict more
damage than a human user and that attacks can be contained if the system can
accurately differentiate a human from a bot. The rise of CAPTCHA farms powered
by cheap human labor in third-world countries has given attackers a way to
bypass CAPTCHA systems. To combat this new threat, CAPTCHA implementers are
constantly raising the difficulty of the challenges. This universal rise in
difficulty impacts bots and unsuspecting human users alike. The web is becoming
increasingly less accessible to users with disabilities and to
non-English-speaking users. Some CAPTCHA systems employ multiple methods in
their decision process. Privacy-invasive mechanisms like cookies
and IP tracking are popular methods used in conjunction with
traditional CAPTCHA mechanisms; both are ineffective against
anonymous networks like Tor and pose serious privacy risks to users.

The rest of this paper rates different CAPTCHA mechanisms and systems based on
the parameters described below and describes how mCaptcha overcomes some of
their limitations.

\subsection{CAPTCHA rating parameters}
CAPTCHA systems use a variety of methods in their decision process. Every method
has its own strengths and limitations, but the following parameters have been
chosen to rate CAPTCHA methods and systems uniformly so that they can be
compared.
\begin{description}[\IEEEsetlabelwidth{Effectiveness}]
    \item[Privacy]
        \begin{itemize}
            \item Does the method use trackers or any other identifying method?
            \item Does the method work in anonymous networks like Tor?
        \end{itemize}
    \item[Effectiveness]
        \begin{itemize}
            \item Is the method/system effective in containing DoS attacks?
            \item Can the method be circumvented? If yes, how practical/feasible
                is the attack?
        \end{itemize}
    \item[Accessibility]
        \begin{itemize}
            \item Does the method pose any challenges to users
                with auditory, cognitive or visual disabilities?
            \item How easy is it to use?
            \item Does the method have a language dependency which poses a challenge to
                non-English speakers?
        \end{itemize}
    \item[Accuracy]
        \begin{itemize}
            \item How accurate is the method in detecting potentially malicious
                users?
            \item Are there any factors that impact the method's accuracy?
        \end{itemize}
\end{description}

\subsection{CAPTCHA methods analysed}
We analysed the following CAPTCHA methods using the parameters described
above. These are popular methods that are currently in deployment.
%TODO add images

\subsubsection{Align object}
Objects in various degrees of misalignment are displayed to the user, who is
asked to choose the one that is perfectly aligned.
% Example GitHub/Kik inverted Hipop

\subsubsection{Blurred text}
A sequence of randomly generated letters and digits is
presented to the user with added noise, scattered distribution and
rotations. Sometimes, the sequence is also presented in 3D form.

\subsubsection{Context based}
This method is personalised to the platform on which it is displayed. It usually
poses challenges that can only be solved if the user is familiar with the
platform. Some examples are:
\begin{itemize}
    \item What is the name of the website's mascot?
    \item Who owns this website?
    \item What are our members collectively called? (Example: Reddit users are
        called Redditors.)
\end{itemize}

\subsubsection{Audio based}
An audio recording with added noise is presented to the user, who is asked to
transcribe the content of the recording.

\subsubsection{IP tracking}
The IP address is used to blacklist misbehaving users. Strictly speaking, this
is not a CAPTCHA method, but it is frequently used in conjunction with other
methods.

\subsubsection{Image identification}
A blurred image with added noise or unusual cropping is presented to the user
who is requested to identify the object in it. Sometimes, the users are also
asked to pick images that match a certain description from a collection of
images.

\subsubsection{Proof-of-work based}
This is an alternative to CAPTCHA methods that has been used for rate-limiting.
The user agent is presented with a challenge and is tasked with generating a
cryptographic proof, which is computationally expensive to produce.
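
As an illustration only, and assuming a hashcash-style construction that the
description above does not specify, let $H$ be a cryptographic hash function
whose output is interpreted as a $k$-bit unsigned integer, $c$ the challenge
string issued by the server, and $d$ a difficulty parameter. All of these
symbols are hypothetical and are introduced here only for the sketch. The user
agent searches for a nonce $n$ such that
\begin{equation}
    H(c \,\|\, n) < 2^{k-d}.
\end{equation}
If the output of $H$ is modelled as uniformly distributed, each attempt succeeds
with probability $2^{-d}$, so the expected number of hash evaluations $N$ needed
to find a valid nonce is
\begin{equation}
    \mathrm{E}[N] = \frac{1}{2^{-d}} = 2^{d},
\end{equation}
while the server can verify a submitted nonce with a single evaluation of $H$.
Under these assumptions, raising $d$ by one doubles the expected cost for the
user agent and leaves the verification cost unchanged.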

\end{document}