Internet censorship consists of restrictions on what information can be publicized or viewed on the Internet. According to Freedom House's annual Freedom on the Net report, more than half the world's Internet users now live in a place where the Internet is censored or restricted. However, members of the Internet Freedom community lack comprehensive real-time awareness of where and how censorship is being imposed. The challenges to achieving such awareness include, but are not limited to, coverage, scalability, adoption, and safety. The project explores a linguistically informed approach to measuring and circumventing Internet censorship. The research takes a new perspective on the problem by investigating a hybrid method for censorship detection and evasion through the lens of linguistic analysis. The team develops new models to measure Internet censorship, investigates mechanisms to circumvent censorship using linguistic techniques, and conducts communication and social network measurements of censored content. Active sensing and natural language processing techniques, in conjunction with machine learning and optimization, invigorate new research directions in Internet Freedom and produce new high-quality data and tools available for public use. This new cross-fertilization among computer science, information security, network analysis, and linguistics provides the foundation for the evolution of anti-censorship technologies. The research contributes to a number of fields, including Internet censorship, privacy, and online information retrieval, as well as to computational social science, by modeling and analyzing the phenomenon of censorship using the signal available in language. The broader contribution includes wide dissemination of the research results via peer-reviewed publications, special-topic courses, and workshops.
Additional benefits include giving graduate and undergraduate researchers significant experience with highly practical work on a difficult interdisciplinary problem. Significant gains are obtained in the recruitment of minority students through research training in computer science and linguistics.
Adoption; Awareness; censorship; Communication; Communities; computer science; Data Quality; Data Security; Detection; dissemination research; Educational workshop; Evolution; experience; Foundations; Freedom; Hybrids; Information Retrieval; Internet; Language; lens; Linguistics; Machine Learning; Measurement; Measures; member; Methods; Minority Recruitment; Modeling; Names; Natural Language Processing; Pathway Analysis; Peer Review; Privacy; Publications; Reporting; Research; research data dissemination; Research Personnel; Research Training; Safety; Signal Transduction; Social Network; Social Sciences; Student recruitment; Techniques; Technology; Time; tool; Work