How does CAPTCHA work and why do we need it

09.04.21 at 10:42

Many Internet users complain about annoying CAPTCHAs. You get kicked off a website for a moment, and suddenly the system asks you to type a set of symbols made of numbers and Russian or Latin letters, or to click all the traffic lights and crosswalks in the pictures. To someone unfamiliar with it, this might look silly and laughable. You seem to do everything correctly, but the CAPTCHA still doesn’t let you through, and you can’t get past authentication on the website no matter what. To understand why we need these seemingly pointless symbols to prove you are not a robot, let’s figure out what CAPTCHA actually is.


CAPTCHA is an acronym for Completely Automated Public Turing test to tell Computers and Humans Apart. In other words, it is a fully automated Turing test that helps a system tell whether a person or a bot is trying to open a website. For instance, online stores, such as sites where you can buy a proxy server, may show you a CAPTCHA when you check the price of software. Services that require authorization can’t do without this test either.

The Turing test was proposed by the mathematician Alan Turing in 1950. Initially, it was meant to show whether a machine could imitate human thinking well enough to be mistaken for a person. In this experiment, an evaluator asks two participants, one human and one machine, several questions without knowing which is which. The evaluator has to work out who is who from the answers. If the evaluator makes a mistake, the machine passes the test.

How does CAPTCHA work

The assumption is that a bot cannot pass a test that requires typing in digits and letters or determining what is shown in a picture. On top of that, the letters and digits in a CAPTCHA are placed over a distorted background and can be so stretched or flattened that even a human can struggle to read them. In that case, you have to request another code. The pool of letter and digit combinations is so large that a unique variation can be shown each time. Otherwise, bots would have learned the access code for a resource long ago.
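To give a feel for why a unique code can be shown each time, here is a minimal Python sketch of the text side of such a test. It is only an illustration: the alphabet, the code length and the function names are assumptions made for this example, and the image distortion step is left out entirely.

```python
import secrets
import string

# Assumed alphabet: uppercase Latin letters and digits, minus look-alike symbols.
ALPHABET = "".join(
    c for c in string.ascii_uppercase + string.digits if c not in "O0I1"
)


def new_challenge(length: int = 6) -> str:
    """Return a random code that a server could render as a distorted image."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def check_answer(expected: str, typed: str) -> bool:
    """Compare the typed answer with the expected code, ignoring case and outer spaces."""
    return expected.upper() == typed.strip().upper()


def code_space(length: int = 6) -> int:
    """Number of distinct codes of the given length."""
    return len(ALPHABET) ** length
```

With these assumed settings there are 32 symbols to choose from, so a six-symbol code has 32 ** 6 (over a billion) possible values, and guessing blindly is hopeless.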

Although a CAPTCHA seems simple at first glance, not every person passes it on the first try. Statistics show that a human can solve only 80% of existing codes, whereas a bot’s success rate is 0.01%.

Visual perception plays a huge role in entering the data correctly. The human brain can recognize patterns by assembling the parts of a picture into a single whole. Simple bots struggle to connect objects together and recognize patterns in this way.

CAPTCHA developers have also considered people with poor eyesight. They are offered an audio version of the code. To prevent a bot from understanding it, the audio is multi-layered, with background noise added on top.

Why were CAPTCHAs introduced

The Internet is an endless source of information for regular people, students, scientists, teachers, and others, often accessed through a forward proxy server. Information sources are used to gather statistical data for further research. Having to enter codes can slow research down considerably, but without them web resources would be exposed to many dangers.

CAPTCHA slows research down, but it protects resources from bots built for automatic registration and mass account creation. It can also block bots that perform automatic parsing: collecting texts, pictures, and prices for certain items.

Unfortunately, bot activity on the web does not stop there. Some ambitious people want to make a huge profit on the Internet, which is possible with an audience of millions. Parsers help them reach this goal: they constantly scan the network and react instantly to the information they need.

For example, resellers use bots to buy up tickets to popular events in bulk. A reseller can hold a large number of tickets just seconds after sales start and then sell them at double or triple the original price. This is why most ticket-selling companies use CAPTCHAs on their websites.

Some attackers target web services in order to disrupt their work and take them offline. To mitigate DDoS attacks, resource owners install CAPTCHAs and use reliable proxy servers for additional protection from parsers.

What is reCAPTCHA and how is it different from CAPTCHA

ReCAPTCHA is essentially a CAPTCHA that is much simpler for a human to deal with. It was originally developed at Carnegie Mellon University and has been owned by Google since 2009. If you surf the net often, you might have noticed that it has recently become more common to be asked to tick a box labeled “I am not a robot”. This is reCAPTCHA. If the system still doubts that you are a real human after that, it will offer you another challenge. There are several kinds of reCAPTCHA that differ in difficulty.

Keyword recognition

In 2007, a reCAPTCHA was designed that used words and expressions from old books and newspapers. These texts had been scanned for digitization, and the words chosen were ones that text-recognition software could not read. The user’s main task was to recognize the word shown.

Image recognition

In 2012, reCAPTCHAs started using images, among which you have to choose the ones with a common element. Correct choices prove to the system that you are not a robot. An answer is considered correct if it matches the answers of most users. In 2014, the system was modernized and simplified: the user passes the test in one click by ticking a box. If the system suspects that you are a robot, it offers you an image recognition test.
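The rule that an answer counts as correct when it matches what most users chose can be sketched in a few lines of Python. This is a simplified illustration, not how reCAPTCHA is actually implemented: the function names and the 50% majority threshold are assumptions, and each answer is modeled as the set of tile numbers a user clicked.

```python
from collections import Counter


def consensus_answer(previous_answers, min_share=0.5):
    """Return the selection most users gave, or None if there is no clear majority."""
    if not previous_answers:
        return None
    # frozenset makes the order in which tiles were clicked irrelevant.
    counts = Counter(frozenset(answer) for answer in previous_answers)
    answer, votes = counts.most_common(1)[0]
    return answer if votes / len(previous_answers) > min_share else None


def is_correct(selection, previous_answers):
    """Accept a selection only if it equals the majority selection."""
    expected = consensus_answer(previous_answers)
    return expected is not None and frozenset(selection) == expected
```

For example, if three out of four earlier users picked tiles 1, 4 and 7, a new user who picks the same three tiles passes, and one who picks only tiles 1 and 4 does not.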

noCAPTCHA

Many users do not realize how much is going on behind noCAPTCHA. It looks as if you only have to tick “I am not a robot” to access a web resource, but it is not that simple. During the test, the system analyzes the trajectory of the mouse as it moves towards the checkbox. Even the shortest human movement is never perfectly straight and is unique each time, which indicates that you are not a robot.
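One simple way to picture the trajectory check is to compare the distance the cursor actually travelled with the straight-line distance between the start and end points. The Python sketch below does exactly that. The metric, the threshold and the function names are assumptions chosen for illustration; the real system's signals are not public.

```python
import math


def path_straightness(points):
    """Ratio of straight-line distance to travelled distance (1.0 = perfectly straight)."""
    if len(points) < 2:
        return 1.0
    travelled = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    if travelled == 0:
        return 1.0
    return math.dist(points[0], points[-1]) / travelled


def looks_scripted(points, threshold=0.999):
    """Flag paths that are suspiciously close to a perfect straight line."""
    return path_straightness(points) >= threshold
```

A path such as (0, 0), (50, 50), (100, 100) gets a ratio of about 1.0 and is flagged, while a slightly wobbly human-like path scores lower and is not.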

Apart from analyzing mouse behavior, the system also checks the cookies on the computer. If cookies are deleted automatically after each session, noCAPTCHA is not available.

User behavior analysis

In 2018, a new reCAPTCHA was developed that many users don’t even know about. The system automatically analyzes user behavior and browsing history and decides whether you are a human or a bot. If it detects suspicious behavior, it shows you reCAPTCHA challenges.

How can you log in to a website without a CAPTCHA?

If you want to reach a resource without having to prove that you are not a robot, you need to understand how CAPTCHA and reCAPTCHA work.

CAPTCHA appears only if the system sees a large number of logins to a resource from a single IP address. This behavior is typical of parsers and looks suspicious.
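The idea of counting logins per IP address can be sketched as a sliding-window counter. The Python example below is a rough illustration under assumed settings: the class name, the limit of five logins and the 60-second window are all hypothetical, and a real service would choose its own values.

```python
from collections import defaultdict, deque


class LoginMonitor:
    """Track recent logins per IP and flag addresses that exceed a limit."""

    def __init__(self, max_logins=5, window_seconds=60.0):
        self.max_logins = max_logins
        self.window_seconds = window_seconds
        self._events = defaultdict(deque)  # IP address -> timestamps of recent logins

    def needs_captcha(self, ip, now):
        """Record a login at time `now` (in seconds) and say whether to show a CAPTCHA."""
        events = self._events[ip]
        events.append(now)
        # Forget logins that have fallen out of the time window.
        while events and now - events[0] > self.window_seconds:
            events.popleft()
        return len(events) > self.max_logins
```

With a limit of three logins per ten seconds, the fourth login from the same address within that window triggers the CAPTCHA, while other addresses are unaffected.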

Before showing a reCAPTCHA, the system analyzes three main indicators of the user’s activity: login history, cookies, and mouse movement trajectory. If it doesn’t notice any suspicious activity, you will not have to solve a reCAPTCHA.
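How the three indicators might be combined into a single decision can be shown with a toy scoring function in Python. Everything specific here is an assumption for the sake of the example: the field names, the weights, the cut-off values and the threshold are invented, and the actual scoring used by reCAPTCHA is not public.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    logins_last_hour: int      # taken from the login history
    has_prior_cookies: bool    # cookies left by earlier sessions
    path_straightness: float   # 0..1, where 1.0 is a perfectly straight mouse path


def risk_score(s: Signals) -> int:
    """Return a score from 0 (looks human) to 10 (looks automated)."""
    score = 0
    if s.logins_last_hour > 20:       # hypothetical cut-off
        score += 4
    if not s.has_prior_cookies:
        score += 3
    if s.path_straightness > 0.999:   # hypothetical cut-off
        score += 3
    return score


def should_challenge(s: Signals, threshold: int = 5) -> bool:
    """Show a reCAPTCHA challenge only when the score reaches the threshold."""
    return risk_score(s) >= threshold
```

A visitor with a couple of logins, existing cookies and a wobbly mouse path scores 0 and is let through, while a client with hundreds of logins, no cookies and a perfectly straight path scores 10 and gets a challenge.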

If you’ve come up with your own way of avoiding CAPTCHAs, bot creators are most likely working on it already, too. Once a way of bypassing the test becomes widespread, developers start working on a new, improved version. The most recent reCAPTCHA version is optimized for users: the test runs automatically in the background.

Let’s draw some conclusions

So, now we know that CAPTCHA is a test that proves to the system that you are a human, not a robot. It is based on the test proposed by Alan Turing and is used to prevent attacks by intruders and spammers. It is constantly being developed so that it stays one step ahead of ambitious parser developers.

ReCAPTCHA is like CAPTCHA: simple for a human mind but more complicated for a machine. The answers of many users are key to this kind of testing. Many people are not even aware of the most recent version of reCAPTCHA, since it does not require any action from the person and instead judges them by their web activity.

Some computer geniuses can get around CAPTCHA by artfully manipulating the system, but it takes a lot of time and resources. A regular person will hardly ever attempt it. Besides, such a workaround only works for one device; another computer needs a new version of the method.

We are all used to this annoying testing, and from time to time we all complete these tasks. Meanwhile, restless hackers never sleep and keep looking for ways around it by developing new software that can handle web tests as well as the human intellect can. But network scanners are constantly on the lookout for hacker bots. Any suspicious behavior is noticed, and the system sounds the alarm. Then an improved model with new codes appears and makes the attackers’ software fail.

Author: vorobevaes


