\documentclass[conference]{IEEEtran}
\usepackage[english]{babel}
% To obtain English text with the blindtext package
\usepackage{blindtext}

\begin{document}

\title{Analysis of bot detection and spam prevention systems}

% TODO get author names from Prof. Sibi
\author{
  \IEEEauthorblockN{Aravinth Manivannan, Sibi Chakkaravarth S, TODO}
  \IEEEauthorblockA{SENSE, VIT AP, AP, India - pincode\\
    foo@example.com
  }
}

\maketitle

\begin{abstract}
CAPTCHA systems were originally designed to protect against automated,
bot-based Denial of Service (DoS) attacks and spam. Over time, however, these
systems have become ineffective because they focus on distinguishing humans
from bots rather than on combating DoS attacks and spam. As a result, they have
become privacy-invasive systems that pose accessibility challenges while
offering reduced effectiveness and accuracy. mCaptcha is a proof-of-work based,
non-interactive DoS protection system designed to overcome the limitations of
traditional CAPTCHA systems while offering superior protection. The mechanism
is stateless, so it is able to accurately defend against attacks over anonymous
networks like Tor, and its non-interactive nature makes it ideal for users with
auditory, cognitive and visual disabilities.
\end{abstract}

\section{Introduction}
\label{sec:intro}

Denial of Service (DoS) attacks and spam campaigns reduce the quality of
service of internet services. Different types of rate-limiters have been
employed to combat such attacks. Today, rate-limiters on the web are synonymous
with CAPTCHAs. CAPTCHA systems work on the premise that an automated bot user
can inflict more damage than a human user, and that attacks can be contained if
the system can accurately differentiate a human from a bot. The rise of CAPTCHA
farms powered by cheap human labor in third-world countries has given attackers
a way to bypass CAPTCHA systems. To combat this new threat, CAPTCHA
implementers are constantly raising the difficulty of the challenges.
This universal rise in difficulty impacts bots and unassuming users alike. The
web is becoming increasingly less accessible to users with disabilities and to
non-English-speaking users.

Some CAPTCHA systems employ multiple methods in their decision process.
Privacy-invasive mechanisms like cookies and IP tracking are popular methods
used in conjunction with traditional CAPTCHA mechanisms; both are ineffective
against anonymous networks like Tor and pose serious privacy risks to users.

The rest of this paper rates different CAPTCHA mechanisms and systems based on
the parameters listed below and describes how mCaptcha overcomes some of their
limitations.

\subsection{CAPTCHA rating parameters}

CAPTCHA systems use a variety of methods in their decision process. Every
method has its own strengths and limitations, but the following parameters have
been chosen to rate CAPTCHA methods and systems uniformly so that they can be
compared.

\begin{description}[\IEEEsetlabelwidth{Effectiveness}]
  \item[Privacy]
    \begin{itemize}
      \item Does the method use trackers or any other identifying method?
      \item Does the method work in anonymous networks like Tor?
    \end{itemize}
  \item[Effectiveness]
    \begin{itemize}
      \item Is the method/system effective in containing DoS attacks?
      \item Can the method be circumvented? If yes, how practical or feasible
        is the attack?
    \end{itemize}
  \item[Accessibility]
    \begin{itemize}
      \item Does the method pose any challenges to users with auditory,
        cognitive or visual disabilities?
      \item How easy is it to use?
      \item Does the method have a language dependency which poses a challenge
        to non-English speakers?
    \end{itemize}
  \item[Accuracy]
    \begin{itemize}
      \item How accurate is the method in detecting potentially malicious
        users?
      \item Are there any factors that impact the method's accuracy?
    \end{itemize}
\end{description}

\subsection{CAPTCHA methods analysed}

We analysed the following CAPTCHA methods using the above-mentioned parameters.
These are popular methods that are currently in deployment.
%TODO add images

\subsubsection{Align object}
Objects in various degrees of misalignment are displayed to the user, who is
asked to choose the one that is perfectly aligned.
% Example GitHub/Kik inverted Hipop

\subsubsection{Blurred Text}
A sequence of randomly generated letters and digits is presented to the user
with added noise, scattered distribution and rotations. Sometimes, the
characters are also presented in 3D form.

\subsubsection{Context based}
This method is personalised to the platform it is displayed on. It usually
poses challenges that can only be solved if the user is familiar with the
platform. Some examples are:
\begin{itemize}
  \item What is the name of the website's mascot?
  \item Who owns this website?
  \item What are our members collectively called? (Example: Reddit users are
    called Redditors.)
\end{itemize}

\subsubsection{Audio based}
An audio recording with added noise is presented to the user, who is asked to
transcribe the content of the recording.

\subsubsection{IP tracking}
The user's IP address is used to blacklist misbehaving users. Strictly
speaking, this isn't a CAPTCHA method, but it is frequently used in conjunction
with other methods.

\subsubsection{Image identification}
A blurred image with added noise or unusual cropping is presented to the user,
who is requested to identify the object in it. Sometimes, users are also asked
to pick images that match a certain description from a collection of images.

\subsubsection{Proof of Work based}
This is an alternative to CAPTCHA methods that has been used for rate-limiting.
The user agent is presented with a challenge and is tasked with generating a
cryptographic proof, which is computationally expensive.

\end{document}