It attempts to look at the problem of privacy through the prism of the leakage and consolidation of information by advertising service providers, analytics service providers, and social networks in particular. The paper concludes that the violation of privacy on the web is an irreversible practice of websites and social networks, and that the only way to reduce it is to increase users’ awareness of privacy policies and to use strict account settings and tools.
New computer technologies developed for the collection, storage, processing, and transmission of information have defined the emergence of new approaches to its use. Furthermore, they have also given rise to new problems, including ethical ones. The speed and efficiency of information systems, which include technical devices, databases, information processing programs, local and global networks, led to the emergence of entirely new rights and responsibilities of public authorities, businesses, and citizens in their use of the information.
The 21st century is not only the age of new technologies and progress, but also the age of information warfare. Society has gradually joined the virtual world, succumbed to the temptation of the Internet, become computerized, and does not oppose the growing role of computers in people’s lives. Modern computer technology makes it possible to solve an incredibly large number of problems and to process huge volumes of information in a few minutes. The rapid growth of the Internet, its availability, and the enormous opportunities it offers to users have led to the emergence of entire industries in business, science, and education. Today, one can easily process video and audio, hold Internet conferences, and promptly transmit any information anywhere in the world. However, the rapid growth of the capabilities and services of the Internet brings a number of new issues, the most serious of which is the problem of privacy. This paper discusses the problem of privacy on the web and its different aspects.
Description of the Problem
A decade ago, making an online payment was considered risky and was done with great care. The use of real names in online social communication was unthinkable, and the anonymity of Internet activity was almost absolute. Social networks have radically changed the situation. In social networks, people want to talk not to strangers or virtual characters, but to people they know or want to know. As a result, people’s image on the web has become quite truthful. People share with their friends, acquaintances, and often with the entire world quite accurate information about their age, interests, location, current concerns, and so on.
For example, there is a social media service called Blippy, which allows people to exchange information about their purchases of goods and services. Imagine people informing the whole world that they bought an iPad case for $41, spent $24 at an Applebee’s restaurant, and paid $6450 to a plastic surgery clinic in Florida to change the shape of their nose. Is this a unique case? It seems that it is just the beginning. For example, Foursquare is a mobile social network that allows individuals to announce their current location. Another instance is SpurtUp, which shows the world how many push-ups one has done per day (Trepte & Reinecke, 2011). All these websites and social networks for the exchange of a wide range of everyday information are growing on a massive scale and are very popular.
At first glance, publishing personal but rather innocuous information on the Internet, such as hobbies, photos, or the schools one has attended, does not present any danger. However, the website called PleaseRobMe clearly demonstrated how information in social networks can be used for far from innocuous purposes. This website used the information available on Twitter and Foursquare to identify empty homes whose owners were away.
Another aspect is that the area of distribution of one’s personal information may be much wider than one thinks. It is rarely limited to a narrow circle of friends and acquaintances. Moreover, it almost always goes beyond the social network. In turn, this fact can severely limit people’s ability to remain anonymous when they want it.
In 1993, the New Yorker magazine published a cartoon depicting two dogs sitting in front of a computer screen, one of which tells the other: “On the Internet, nobody knows you’re a dog” (Steiner, 2015). Today’s reality is substantially different from the one depicted in this picture, and the issue requires more detailed analysis.
Leak and Consolidation of Information
It would be right to start with the question of the nature of confidentiality on the Internet. At first glance, the use of the Internet is quite anonymous. Of course, the website one visits knows the IP address of the computer, but it is just a single website and an impersonal address. Therefore, there seems to be no reason for concern.
Another category of third parties present on a website includes companies offering analytical services (traffic, customers, trends), such as Google Analytics, Omniture, aQuantive, and Quantcast (Subramanian, 2008). Finally, content distribution networks such as akamai.net and yimg.com host images and videos. All of these are channels through which information about one’s visit extends much further than the website one visited.
At the plenary session of IETF 78, an interesting report was presented on the theme of personal information leaks on the web. The presentation was based on many years of research. Scientists have studied the so-called “spot of confidentiality”, which describes how far a user’s information spreads among seemingly unrelated websites. The study consisted of nine experiments covering a period of 5 years, the 1200 most popular sites in various categories, 68 countries, and 19 languages (Trepte & Reinecke, 2011).
One of the investigated parameters was the so-called degree of association of the visited websites with one another through third-party sites. For example, if two independent websites both use Google Analytics, they are associated. The results were quite interesting: 70% of all surveyed websites had an association with more than 400 unrelated websites. Even more interesting was the degree of concentration of third-party websites. Just the 10 largest third-party websites (including the above-mentioned Google Analytics, Omniture, aQuantive, and Quantcast) turned out to be present on 78.5% of all the visited websites.
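The notion of association described above can be illustrated with a minimal sketch. It assumes a small, invented mapping from visited websites to the third-party hosts they embed; the site names, the function names, and the data are hypothetical and are not taken from the study. Two websites count as associated when they share at least one third party, and coverage is the share of websites that embed at least one member of a given set of third parties.

```python
from collections import defaultdict


def association_counts(site_to_third_parties):
    """For each visited site, count the other sites that share a third party with it."""
    # Invert the mapping: third party -> set of sites that embed it.
    third_party_to_sites = defaultdict(set)
    for site, parties in site_to_third_parties.items():
        for party in parties:
            third_party_to_sites[party].add(site)

    counts = {}
    for site, parties in site_to_third_parties.items():
        linked = set()
        for party in parties:
            linked |= third_party_to_sites[party]
        linked.discard(site)  # a site is not associated with itself
        counts[site] = len(linked)
    return counts


def third_party_coverage(site_to_third_parties, parties):
    """Fraction of visited sites that embed at least one of the given third parties."""
    if not site_to_third_parties:
        return 0.0
    wanted = set(parties)
    hits = sum(1 for embedded in site_to_third_parties.values() if wanted & set(embedded))
    return hits / len(site_to_third_parties)


# Hypothetical observations: which third-party hosts each visited site loads.
observed = {
    "news.example": {"google-analytics.com", "quantcast.com"},
    "shop.example": {"google-analytics.com"},
    "blog.example": {"quantcast.com"},
    "forum.example": set(),
}

print(association_counts(observed))
print(third_party_coverage(observed, {"google-analytics.com"}))
```

In this toy data set, news.example is associated with two other sites, while forum.example, which embeds no third parties, is associated with none. A single third party appears on half of the sites.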
However, there was more to come. An analysis of acquisitions and mergers in this sector of the market allowed the researchers to group most of the third-party websites into three major families. They include Google (DoubleClick, Google Analytics, and Google Syndication), Adobe (Omniture, Offermatica, and Hitbox), and Microsoft (aQuantive). The families of Yahoo and AOL were less significant but also notable. Monitoring the presence of these families on the visited websites has shown constant growth, from 40% in 2005 to 84% in March 2010 (Trepte & Reinecke, 2011). The depth of penetration also increased, meaning that more than one family of third-party websites tracks the visits of a user.
Thus, there is a huge amount of information about the users of the Internet at the disposal of the companies that provide the services described above. The ability to correlate data on visits to independent websites allows them to create a profile of users, determine their tastes, habits, attitudes, location, and other individual characteristics.
Confidential Information Leak
When people say “user”, in most cases they mean an anonymous IP address or the computer on which the user is working. In other words, one can get fairly reliable information about users, but without identifying them.
The companies providing third-party services often use this argument. However, it is possible to identify a particular user through social networks. It is no secret that the majority of social networking sites use the third-party services just described: the placement of advertisements and content, and a variety of counters and trackers that follow the movement of the user across the web. It is worse when the request to a third-party website contains the user’s ID in a particular social network. For example, the home page of vk.com, one of the largest social networks in Europe, includes a reference to the unknown website counter.yadro.ru (Figure 1).
Figure 1. Reference of the social network website to the unknown website (Trepte & Reinecke, 2011).
As can be seen, the user ID (id12XXXXX) is contained in this request. Of course, the amount of information that can be obtained about users from their ID depends on the specific network and the user’s settings. However, an analysis of the most popular networks has shown that the name, personal photo, gender, hobbies, friends list, and education are in most cases public by default. Now, a third party can associate various attributes of users, such as the IP address of their computer, with a specific profile in a social network. Such an association may also be established by means of a tracking cookie, which is also frequently transmitted in the request. Since tracking cookies are usually long-lived (e.g. the cookie in the example above expires in April of the following year), a user’s past and future surfing on the web may be associated with this profile and, thus, identified.
Taking into account the degree of association of websites and the concentration of third-party sites, third parties have a great capacity to monitor and profile user activities on the Internet. Certainly, there is no guarantee that third parties produce or use such associations, but they have the potential to do so.
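The linking step described above can be sketched as follows. This is an illustration only: the log format, the profile URL pattern (`/id` followed by digits on a hypothetical `social.example` host), and the function names are assumptions, not the actual behaviour of any tracker. The sketch shows how one request that leaks a profile ID in the Referer header is enough to attribute every request carrying the same tracking cookie, earlier or later, to that profile.

```python
import re
from urllib.parse import urlparse

# Hypothetical shape of a profile URL on a social network: /id<digits>
PROFILE_PATH = re.compile(r"^/id(\d+)$")


def profile_id_from_referer(referer):
    """Return the profile ID embedded in a Referer URL, or None if there is none."""
    match = PROFILE_PATH.match(urlparse(referer).path)
    return match.group(1) if match else None


def link_history_to_profiles(request_log):
    """Group logged page visits by the social network profile they can be tied to.

    request_log is a list of dicts with the keys 'cookie' and 'referer', as a
    third-party tracker might record them (an assumed, simplified format).
    """
    # First pass: learn which tracking cookie belongs to which profile.
    cookie_to_profile = {}
    for request in request_log:
        profile_id = profile_id_from_referer(request["referer"])
        if profile_id is not None:
            cookie_to_profile[request["cookie"]] = profile_id

    # Second pass: attribute every request with a known cookie to that profile.
    history = {}
    for request in request_log:
        profile_id = cookie_to_profile.get(request["cookie"])
        if profile_id is not None:
            history.setdefault(profile_id, []).append(request["referer"])
    return history


# Hypothetical tracker log; only the second entry reveals who the visitor is.
request_log = [
    {"cookie": "c1", "referer": "https://news.example/article"},
    {"cookie": "c1", "referer": "https://social.example/id1234567"},
    {"cookie": "c1", "referer": "https://shop.example/cart"},
    {"cookie": "c2", "referer": "https://blog.example/"},
]

print(link_history_to_profiles(request_log))
```

The visit to news.example happened before the profile ID leaked, yet it ends up in the same history, because the long-lived cookie ties the requests together. The visitor with cookie c2 stays unidentified in this log.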
Why People Need Privacy
Even setting aside the extreme cases, most people do not see a great danger in the publication of their photos, friends list, age, the name of the city where they live, or their favorite authors and films. Likewise, they do not see a big secret in information about the popular websites they visit. In other words, they have nothing special to hide. People can assess the damage from the publication of a separate piece of data about themselves, such as a name or photo. Nevertheless, a large-scale aggregation and correlation of multiple fragments, including preferences, movements, communication, and so on, can add up to information that they would prefer to keep to themselves.
Going back to social networks, users’ ignorance about the extent to which certain personal data are public has been subject to serious criticism. Under the pressure of this criticism, Facebook, the largest of the social networking sites, announced in May 2010 major changes in its system of personal data controls. In particular, users received full information on how, to whom, and which data are available. Moreover, users gained the ability to set a master privacy setting applicable to all present and future applications on Facebook.
However, another problem is that the user has no idea about the composition, scope, and utilization of the information collected by third parties and, accordingly, is unable to assess its effects. The user also cannot correct errors in the data. These facts represent a serious threat to the user’s privacy.
Ways to Deal with the Problem
Many people provide access to a larger amount of information than they originally intended. Unfortunately, configuring the privacy options provided by a website solves the problem only partially. Nevertheless, the use of strict account settings and tools can help users protect their own privacy.
Third-party applications can also operate within social networks. They often request access to users’ information in order to personalize their service. If one believes that an application requests too much data, it is better not to install it.
Evaluation of the Problem
Since the beginning of the use of computer technology in all spheres of human activity, there have been numerous problems associated with the protection of privacy. This issue arose mainly due to the processing of documents using computer technology. Various administrative measures to protect the confidentiality of individuals and organizations have lost their power with the transition of workflow and social communication to a totally new environment (Rao & Upadhyaya, 2009).
The Internet has become a common means of social communication. Relationships in the network often give rise to situations that constitute a violation of human and civil rights from a legal point of view. Here, the legal regulation of the virtual space is quite complicated because this space lies beyond states, with their material boundaries and territories. Therefore, it cannot be fully subjected to regulation by national legal acts. The right to privacy, like any subjective right, is a social good. Individuals may resort to the coercive power of the state in order to protect their rights. However, the difficulty is that the category of private life has no clear legal boundaries. Thus, legal regulation establishes only the limits of its integrity and the limits of admissible interference (Thuraisingham, 2007).
Therefore, it is crucial to maintain the delicate balance between privacy and openness, and between ease of use and the leakage of personal information. To some extent, users themselves can take care of it through the settings of the browser or social network. For example, they can block the Referer header in requests or refuse to accept cookies from third-party websites. However, these measures are often ineffective. For example, it is almost impossible to eliminate the use of cookies on the visited website itself without a significant loss of functionality.
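The two user-side measures mentioned above can be sketched as a simple filter over outgoing request headers. This is a simplified illustration under stated assumptions, not how any particular browser implements it: the function names are hypothetical, header names are assumed to be spelled exactly "Referer" and "Cookie", and a request counts as third-party whenever its host name differs from that of the page, whereas real browsers compare registrable domains.

```python
from urllib.parse import urlparse


def host_of(url):
    """Return the host name of a URL, or an empty string if it has none."""
    return urlparse(url).hostname or ""


def is_third_party(page_url, request_url):
    """Treat a request as third-party when its host differs from the page's host.

    Simplification: real browsers compare registrable domains, not full host names.
    """
    return host_of(page_url) != host_of(request_url)


def sanitize_headers(page_url, request_url, headers):
    """Return a copy of the headers with Referer and Cookie removed for third parties."""
    cleaned = dict(headers)
    if is_third_party(page_url, request_url):
        cleaned.pop("Referer", None)  # do not reveal which page the user is on
        cleaned.pop("Cookie", None)   # do not send a tracking cookie
    return cleaned


# Hypothetical requests issued while viewing a profile page.
page = "https://social.example/id1234567"
headers = {"Referer": page, "Cookie": "track=abc123", "Accept": "image/gif"}

print(sanitize_headers(page, "https://counter.example/hit", headers))
print(sanitize_headers(page, "https://social.example/photo.jpg", headers))
```

The request to the third-party counter keeps only the Accept header, while the request to the social network itself is left untouched. The second case reflects the limitation noted above: cookies sent to the visited website itself are kept, since removing them would break functions such as staying logged in.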
Moreover, the creators of websites and social networks themselves can make a significant contribution to the solution of this problem. For example, they can minimize the amount of information transferred to third parties or notify users of how open their data are in social networks. These measures should be implemented without waiting for legal proceedings, as happened with the Facebook service called Beacon, which automatically enrolled all users of the network in November 2007. Through Beacon, a user’s friends received notifications about the user’s activity on other sites, such as the purchase of movie tickets. The case ended two years later with the complete shutdown of the service and the creation of a Facebook fund of $9.5 million to support work in the field of online privacy.
Knowledge and open discussion of these issues are a step towards addressing them. The problems are indeed being solved: users are getting more control in social networks, and browser developers are paying attention to the security and privacy of users. After all, in spite of everything, people want to share information with their friends, acquaintances, and, sometimes, with the whole world.
Nowadays, opportunities to exchange information on the Internet are truly endless and continue to evolve rapidly. Social networks have created an unprecedented degree of information connectivity among people on a global scale. The Internet has evolved into a dynamic social environment that brings together hundreds of millions of people. In the words of the founder of Facebook, Mark Zuckerberg, “our mission is to make the world more open and connected.”
However, privacy is an important element of the open information space of the Internet. Neglect of the human right to privacy and, more specifically, of the right to control the composition and extent of one’s personal information results in a loss of user confidence, an increase in control, and, consequently, a reduction in the innovativeness, intensity, and scope of information exchange.
It is not possible to solve the problems associated with the protection of confidentiality quickly and efficiently. An integrated approach is needed. This approach should involve the use of organizational and legal measures, as well as software and hardware, to protect the confidentiality, integrity, and availability of information.