Effective Techniques for Protecting the Privacy of Web Users
Abstract
1. Introduction
2. Related Study
3. Background
3.1. Data Collection Practice by Websites
- navigator.languages, which could disclose the nationality and the native language of the visitor.
- navigator.userAgent, which reveals whether the visitor is on a desktop or mobile device.
- screen, which records information about the visitor’s screen settings such as height, width, pixel depth, resolution, default orientation, etc.
- document.location, which describes the visited webpage: its host, protocol, port, origin, and full URL, among other details.
- document.cookie, one of the most discussed methods for tracking website visitors, which exposes the cookies accessible to the page, including those the website uses to track its users.
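As a rough illustration of how accesses to the properties listed above might be spotted in a script, the following Python sketch searches JavaScript source text for them. It is not the method used in this study: the function name `find_property_accesses` and the sample script are hypothetical, and plain text matching misses obfuscated code and also matches names inside comments or strings.

```python
import re

# Browser properties from the list above that a script can read to profile a visitor.
TRACKING_PROPERTIES = (
    "navigator.languages",
    "navigator.userAgent",
    "screen",
    "document.location",
    "document.cookie",
)


def find_property_accesses(js_source: str) -> list[str]:
    """Return the listed properties that appear in the given JavaScript text."""
    found = []
    for prop in TRACKING_PROPERTIES:
        # Accept an optional "window." prefix, but reject names that are part of
        # a longer identifier or a member of some other object (e.g. "a.screen").
        pattern = r"(?<![\w$.])(?:window\.)?" + re.escape(prop) + r"(?![\w$])"
        if re.search(pattern, js_source):
            found.append(prop)
    return found


# Hypothetical script text, used only to exercise the function.
sample = (
    "var ua = navigator.userAgent;"
    "var w = window.screen.width;"
    "send(document.cookie);"
)
print(find_property_accesses(sample))
```

The result keeps the order of `TRACKING_PROPERTIES`, so the same script always produces the same list regardless of where each access appears in the source.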
3.2. Privacy Protection Tools (PP-Tools)
- Disconnect (DC) [17]: A blacklist-based tool used primarily to block third-party tracking cookies and JavaScript programs from social networks such as YouTube, Facebook, Instagram, and Twitter. A Premium version also offers VPN-based protection and detects malicious software.
- Ghostery (GT) [18]: A privacy extension launched in 2009 that detects and blocks tracking scripts and cookies on webpages, improving privacy and letting users focus on the content that matters. It scans the DOM tree with regular expressions for advertisements, trackers, and other entities stored in a predefined blacklist. Users can also enable or disable tracking domains manually, activating or deactivating individual third-party services by category, such as social media widgets, analytics services, and ads.
- Adblock Plus (ABP) [19]: Focuses on blocking unwanted online content, including advertisements, banners, pop-ups, videos, and other disruptive forms of advertising, while also preventing malware and tracking. It relies on blacklists made up of a large number of community-maintained rules. Users can choose the lists that suit their needs, such as EasyList (the most popular list, which blocks ads on English-language sites), Fanboy’s list (the second most popular Adblock Plus filter list), the FR list (which blocks ads on French sites), or the AR list (which blocks ads on Arabic sites). It searches rendered HTML pages (DOM trees) with regular expressions and blocks the download of web resources that refer to blacklisted advertisements and trackers.
- uBlock (UB) [20]: An open-source browser extension for blocking ads and filtering content. It works with most web browsers, including Firefox, Chrome, Chromium, Opera, and some versions of Safari. According to some reports [21], uBlock consumes less memory than other extensions that offer similar functionality. It uses the same kinds of lists as the extensions above, including EasyList, Peter Lowe’s list, Malware Domains, and EasyPrivacy, and users can also set their own filtering preferences.
- Privacy Badger (PB) [22]: Uses an internal blacklist and a heuristic algorithm to block different types of third-party tracking, including canvas fingerprinting, local-storage supercookies, and identifying cookies. Unlike the other PP-Tools, PB requires no prior knowledge, configuration, or setup on the user’s part. It sends the Do Not Track header with every page request and assesses whether the user is still being tracked. When it determines that the probability of tracking is too high, the algorithm automatically blocks requests to the third-party domain.
- NoScript (NS) [23]: A whitelist-based tool that allows only content the user has explicitly authorized, so its default behavior is to block website content until it is authorized. Its rules are expressed as regular expressions, similar to Adblock Plus [19], and it blocks all executable content on a webpage that may threaten security, including Java programs, Flash, Silverlight, and JavaScript programs. However, this may cause usability problems and requires considerable user interaction.
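The three filtering styles described above (blacklist, whitelist, and heuristic) can be contrasted with a minimal Python sketch. Everything in it is illustrative: the domain names, the `HeuristicBlocker` class, and its threshold of three sites are assumptions made for the example, not the actual rules or code of any of these tools.

```python
from urllib.parse import urlparse

# Hypothetical example lists; real tools ship far larger, curated lists.
BLACKLIST = {"tracker.example", "ads.example"}
WHITELIST = {"news.example"}


def host_matches(host: str, domains: set[str]) -> bool:
    """True if host equals a listed domain or is a subdomain of one."""
    return any(host == d or host.endswith("." + d) for d in domains)


def blacklist_allows(url: str) -> bool:
    """Blacklist policy: allow every request except those to listed domains."""
    host = urlparse(url).hostname or ""
    return not host_matches(host, BLACKLIST)


def whitelist_allows(url: str) -> bool:
    """Whitelist policy: block every request except those to listed domains."""
    host = urlparse(url).hostname or ""
    return host_matches(host, WHITELIST)


class HeuristicBlocker:
    """Block a third-party domain once it has been observed on enough sites."""

    def __init__(self, threshold: int = 3):  # threshold value is an assumption
        self.threshold = threshold
        self.sites_seen: dict[str, set[str]] = {}

    def observe(self, first_party: str, third_party: str) -> None:
        """Record that third_party appeared to track the user on first_party."""
        self.sites_seen.setdefault(third_party, set()).add(first_party)

    def allows(self, third_party: str) -> bool:
        """Allow the domain until it has been seen on `threshold` distinct sites."""
        return len(self.sites_seen.get(third_party, set())) < self.threshold
```

The sketch makes the trade-off visible: the blacklist fails open for unknown domains, the whitelist fails closed, and the heuristic needs no list at all but only reacts after a domain has been observed several times.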
4. Methodology
4.1. Analysis of Websites
4.2. Experimental Setup
- A set of the 50 most popular websites, from which unique JavaScript programs were extracted.
- Six of the most common and well-known free browser plug-ins (Disconnect, Ghostery, Adblock Plus, uBlock, NoScript, and Privacy Badger).
- A 64-bit Ubuntu Linux system, with Jupyter Notebook used to run the Python code.
- An open-source web browser with an associated WebDriver for automated testing of websites. In our experiment, we used the Firefox browser with geckodriver.
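A setup of this kind could drive the browser from Python roughly as sketched below. This is an assumption-laden illustration rather than the authors' code: `collect_scripts` and `COLLECT_SCRIPTS_JS` are hypothetical names, and the function accepts any object that offers Selenium-style `get()` and `execute_script()` methods, so it can be exercised without a browser.

```python
import hashlib

# JavaScript executed inside the page: for each <script> element, return its
# external URL (src) or, for inline scripts, its source text.
COLLECT_SCRIPTS_JS = """
return Array.from(document.scripts).map(function (s) {
    return {src: s.src || "", text: s.src ? "" : s.text};
});
"""


def collect_scripts(driver, url):
    """Load url and split its scripts into in-page and external sets.

    `driver` is any object with Selenium-style get() and execute_script().
    Returns (in_page_hashes, external_urls).
    """
    driver.get(url)
    in_page, external = set(), set()
    for script in driver.execute_script(COLLECT_SCRIPTS_JS):
        if script["src"]:
            external.add(script["src"])
        elif script["text"].strip():
            # Hash inline code so that identical programs are counted once.
            digest = hashlib.sha256(script["text"].encode("utf-8")).hexdigest()
            in_page.add(digest)
    return in_page, external
```

With Selenium and geckodriver installed, a real session created with `selenium.webdriver.Firefox()` could be passed as `driver`; repeating the run with each plug-in enabled would then give per-tool script counts to compare against the run with all plug-ins off.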
4.3. Data Collection
5. Experimental Results
6. Discussion
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Ikram, M.; Asghar, H.J.; Kaafar, M.A.; Krishnamurthy, B.; Mahanti, A. Towards Seamless Tracking-Free Web: Improved Detection of Trackers via One-class Learning. arXiv 2017, arXiv:1603.06289.
- Xu, K.; Li, X.; Bose, S.K.; Shen, G. Joint Replica Server Placement, Content Caching, and Request Load Assignment in Content Delivery Networks. IEEE Access 2018, 6, 17968–17981.
- Ermakova, T.; Fabian, B.; Bender, B.; Klimek, K. Web tracking-A Literature Review on the State of Research. In Proceedings of the 51st Hawaii International Conference on System Sciences, HICSS 2018, Hilton Waikoloa Village, HI, USA, 3–6 January 2018.
- Bubukayr, M.A.S.; Frikha, M. Web Tracking Domain and Possible Privacy Defending Tools: A Literature Review. J. Cyber Secur. 2022, 4, 79–94.
- Englehardt, S.; Reisman, D.; Eubank, C.; Zimmerman, P.; Mayer, J.; Narayanan, A.; Felten, E.W. Cookies That Give You Away: The Surveillance Implications of Web Tracking. In Proceedings of the 24th International Conference on World Wide Web (WWW’15), Florence, Italy, 18–22 May 2015; International World Wide Web Conferences Steering Committee: Geneva, Switzerland, 2015; pp. 289–299.
- Kalavri, V.; Blackburn, J.; Varvello, M.; Papagiannaki, K. Like a Pack of Wolves: Community Structure of Web Trackers. In Proceedings of the International Conference on Passive and Active Network Measurement (PAM), Heraklion, Greece, 31 March–1 April 2016; Springer: Berlin/Heidelberg, Germany, 2016; pp. 42–54.
- Schelter, S.; Kunegis, J. Tracking the Trackers: A Large-Scale Analysis of Embedded Web Trackers. In Proceedings of the 10th International AAAI Conference on Web and Social Media (ICWSM 2016), Cologne, Germany, 17–20 May 2016; pp. 679–682.
- Muzamil, M.; Khan, A.; Hussain, S.; Jhandir, M.Z.; Kazmi, R.; Bajwa, I.S. Analysis of Tracker-Blockers Performance. Pak. J. Eng. Technol. 2021, 4, 184–190.
- Cozza, F.; Guarino, A.; Isernia, F.; Malandrino, D.; Rapuano, A.; Schiavone, R.; Zaccagnino, R. Hybrid and Lightweight Detection of Third Party Tracking: Design, Implementation, and Evaluation. Comput. Netw. 2020, 167, 106993.
- Garimella, K.; Kostakis, O.; Mathioudakis, M. Ad-blocking: A study on performance, privacy and counter-measures. In Proceedings of the ACM on Web Science Conference, WebSci’17, New York, NY, USA, 25–28 June 2017; pp. 259–262.
- Bouhnik, D.; Carmi, G. Interface Application Comprehensive Analysis of Ghostery. Int. J. Comput. Syst. 2018, 5, 4–10.
- Oulasvirta, A.; De Pascale, S.; Koch, J.; Langerak, T.; Jokinen, J.; Todi, K.; Laine, M.; Kristhombuge, M.; Zhu, Y.; Miniukovich, A.; et al. Aalto Interface Metrics (AIM): A Service and Codebase for Computational GUI Evaluation. In Proceedings of the 31st Annual ACM Symposium on User Interface Software and Technology Adjunct Proceedings, Berlin, Germany, 14–17 October 2018; pp. 16–19.
- Malandrino, D.; Petta, A.; Scarano, V.; Serra, L.; Spinelli, R.; Krishnamurthy, B. Privacy awareness about information leakage: Who knows what about me? In Proceedings of the 12th ACM Workshop on Privacy in the Electronic Society, WPES13, Berlin, Germany, 4 November 2013; pp. 279–284.
- Wang, Y.; Cai, W.d.; Wei, P.C. A deep learning approach for detecting malicious JavaScript code. Secur. Commun. Netw. 2016, 9, 1520–1534.
- Pujol, E.; Hohlfeld, O.; Feldmann, A. Annoyed users: Ads and ad-block usage in the wild. In Proceedings of the ACM SIGCOMM Internet Measurement Conference, IMC, Tokyo, Japan, 28–30 October 2015; pp. 93–106.
- Mathur, A.; Vitak, J.; Narayanan, A.; Chetty, M. Characterizing the Use of Browser-Based Blocking Extensions To Prevent Online Tracking. In Proceedings of the 14th USENIX Conference on Usable Privacy and Security, Baltimore, MD, USA, 12–14 August 2018; pp. 103–116.
- Disconnect. Available online: https://disconnect.me (accessed on 20 March 2022).
- Ghostery. Available online: https://www.ghostery.com (accessed on 20 March 2022).
- Adblock Plus|The world’s #1 Free Ad Blocker. Available online: https://adblockplus.org/ (accessed on 20 March 2022).
- uBlock Origin—Free, Open-Source ad Content Blocker. Available online: https://ublockorigin.com/ (accessed on 20 March 2022).
- uBlock, the Memory-Friendly Ad-Blocker, Is Now Available for Firefox. Available online: https://lifehacker.com/ublock-the-memory-friendly-ad-blocker-is-now-availabl-1681818949 (accessed on 20 March 2022).
- Privacy Badger. Available online: https://privacybadger.org/ (accessed on 20 March 2022).
- What is It?—NoScript: Block Scripts and Own Your Browser! Available online: https://www.noscript.net/ (accessed on 20 March 2022).
- What is Selenium? Available online: http://www.seleniumhq.org/ (accessed on 20 March 2022).
- Chrome DevTools Protocol. Available online: https://chromedevtools.github.io/devtools-protocol/ (accessed on 20 March 2022).
- Alazab, A.; Khraisat, A.; Alazab, M.; Singh, S. Detection of Obfuscated Malicious JavaScript Code. Future Internet 2022, 14, 217.
- Masood, R.; Vatsalan, D.; Ikram, M.; Kaafar, M.A. Incognito: A Method for Obfuscating Web Data. In Proceedings of the 2018 World Wide Web Conference, WWW’18, Lyon, France, 23–27 April 2018; International World Wide Web Conferences Steering Committee: Geneva, Switzerland, 2018; pp. 267–276.
Add-On or Extension | User Base | Rule-Based Filtering |
---|---|---|
Disconnect | 600,000+ users | Blacklist |
Ghostery | 2,000,000+ users | Blacklist |
Adblock Plus | 10,000,000+ users | EasyList |
uBlock | 700,000+ users | Blacklist |
NoScript | 100,000+ users | Whitelist |
Privacy Badger | 1,000,000+ users | Heuristic algorithm |
JSec | PPTs Off | DC | GT | ABP | UB | PB | NS |
---|---|---|---|---|---|---|---|
In-page | 828 | 467 | 691 | 590 | 646 | 496 | 698 |
Blocked | - | 40% | 11% | 24% | 17% | 36% | 11% |
Allowed | - | 60% | 89% | 76% | 83% | 64% | 89% |
JSec | PPTs Off | DC | GT | ABP | UB | PB | NS |
---|---|---|---|---|---|---|---|
External | 750 | 385 | 372 | 421 | 403 | 373 | 278 |
Blocked | - | 47% | 48% | 42% | 44% | 48% | 61% |
Allowed | - | 53% | 52% | 58% | 56% | 52% | 39% |
PPTs | TPR (Positive Class: Functional JSec) | FPR (Positive Class: Functional JSec) | TNR (Negative Class: Tracking JSec) | FNR (Negative Class: Tracking JSec) |
---|---|---|---|---|
DC | 0.69 | 0.31 | 0.68 | 0.32 |
GT | 0.87 | 0.13 | 0.84 | 0.16 |
ABP | 0.82 | 0.18 | 0.73 | 0.27 |
UB | 0.86 | 0.14 | 0.82 | 0.18 |
PB | 0.71 | 0.29 | 0.69 | 0.31 |
NS | 0.80 | 0.20 | 0.87 | 0.13 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Bubukayr, M.; Frikha, M. Effective Techniques for Protecting the Privacy of Web Users. Appl. Sci. 2023, 13, 3191. https://doi.org/10.3390/app13053191