Technical Description of the Scientific and Technological Methodology
For the needs of the project, an Agile methodology will be followed in two development cycles: a short one that will lead to a rapid prototype (Minimum Viable Product, MVP) and a longer one that will complete the final product and improve its training on real data. Each development cycle includes the phases of requirements gathering, prioritization, construction and integration into the final product. By the end of the first development cycle, an initial product with basic functionality (the MVP) is expected, which can be demonstrated to potential customers in order to evaluate the result and collect feedback on additional functionality.
The basis of the proposed solution is the modeling of the way news is disseminated on social media (either through sources or through user accounts that retransmit it) by training appropriate Graph Convolutional Networks (GCN) (Figure 1).
Figure 1. The GCN that models the source influence network and identifies news diffusion patterns.
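To make the idea of a GCN over an influence network concrete, the following is a minimal sketch of a single graph-convolution step, ReLU(Â·H·W), where Â is the symmetrically normalized adjacency matrix with self-loops. It is an illustration under stated assumptions and not the project's implementation: the four-node network, the two node features and the fixed weights are invented, and the retransmission graph is treated as undirected for simplicity.

```python
import math

def normalized_adjacency(n, edges):
    """Return D^-1/2 (A + I) D^-1/2 for an undirected graph as a dense list of lists."""
    # Start from the identity so that every node keeps its own features (self-loops).
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for u, v in edges:
        a[u][v] = 1.0
        a[v][u] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def matmul(x, y):
    """Plain dense matrix product of two lists of lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]

def gcn_layer(a_hat, h, w):
    """One propagation step: ReLU(a_hat @ h @ w)."""
    z = matmul(matmul(a_hat, h), w)
    return [[max(0.0, v) for v in row] for row in z]

# Hypothetical toy network: node 0 is a source, nodes 1-3 are accounts that retransmit it.
edges = [(0, 1), (0, 2), (0, 3), (2, 3)]
a_hat = normalized_adjacency(4, edges)

# Two made-up input features per node: [is_source, posts_per_day / 10].
h0 = [[1.0, 0.5], [0.0, 0.1], [0.0, 0.9], [0.0, 0.8]]

# Fixed illustrative weights; in a trained model these would be learned.
w0 = [[0.6, -0.2], [0.3, 0.8]]

h1 = gcn_layer(a_hat, h0, w0)  # new 2-dimensional embedding for each of the 4 nodes
```

After this step each account's embedding mixes its own features with those of the accounts it is connected to, which is what lets stacked layers capture diffusion patterns rather than properties of isolated accounts.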
Appropriate recurrent neural networks (RNN), well suited to the analysis of sequential data, will be layered on top of the GCNs. They will utilize past knowledge and adapt dynamically to the most recently received information (Azar et al., 2016; Oliveira et al., 2017), so that they respond better to the dynamic changes of social networks and the discussions that arise in them. Historical data will allow us to define and evaluate quantities such as the influence of the various sources and of the news they transmit, and to use this information in forecasting models (Piñeiro-Chousa, 2017). On top of this composite structure, which will be periodically retrained to reflect the current structure of influence on social media, a Reinforcement Learning mechanism will learn continually from astroturfing examples as they are identified and labeled, and will gradually become able to identify new cases (Figure 2).
Figure 2. The complete architecture of the RL-RNN-GCN, which combines the modeling of the influence of sources on social media over time and is continually trained to detect astroturfing campaigns.
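The continual-learning loop of the reinforcement learning layer can be sketched, under strong simplifying assumptions, as a REINFORCE update on a Bernoulli "flag / do not flag" policy. Everything specific here is hypothetical: the `FlaggingPolicy` class, the two-dimensional cascade embeddings (standing in for an RNN-GCN output), the labels, and the reward of +1 for a decision that matches the label later assigned by an analyst and -1 otherwise.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class FlaggingPolicy:
    """Bernoulli policy: P(flag | x) = sigmoid(w . x + b), updated with REINFORCE."""

    def __init__(self, n_features, lr=0.1, seed=0):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr
        self.rng = random.Random(seed)  # seeded so the sketch is reproducible

    def prob_flag(self, x):
        return sigmoid(sum(wi * xi for wi, xi in zip(self.w, x)) + self.b)

    def act(self, x):
        # Sample the decision from the current policy (1 = flag as astroturfing).
        return 1 if self.rng.random() < self.prob_flag(x) else 0

    def update(self, x, action, reward):
        # For a Bernoulli policy, d log pi(action | x) / d logit = action - p.
        p = self.prob_flag(x)
        g = reward * (action - p)
        self.w = [wi + self.lr * g * xi for wi, xi in zip(self.w, x)]
        self.b += self.lr * g

# Hypothetical cascade embeddings with analyst labels:
# 1 = confirmed astroturfing, 0 = organic diffusion.
examples = [([0.9, 0.8], 1), ([0.8, 0.9], 1), ([0.1, 0.2], 0), ([0.2, 0.1], 0)]

policy = FlaggingPolicy(n_features=2)
for step in range(2000):
    x, label = examples[step % len(examples)]
    action = policy.act(x)
    reward = 1.0 if action == label else -1.0  # feedback arrives once the case is labeled
    policy.update(x, action, reward)
```

The point of the sketch is the feedback loop: the policy acts on each new cascade, receives a reward when the case is later confirmed or rejected, and shifts its decision boundary without a full retraining pass.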
Current research level
The concept of “astroturfing” in social media and the wider digital environment began to be discussed at the beginning of the 2010s (Ratkiewicz et al., 2010), and the first systematic attempts to detect it, together with the first tools and algorithms, date from 2013 (Zhang et al., 2013). Digital “astroturfing” can be summarized as an organized communication campaign, backed by a sponsor, in which users publish targeted posts on the Internet (e.g. by posting “custom” product reviews). It clearly involves a group of users (web profiles and online resources) that acts systematically and in a more or less organized way (Alallaq et al., 2018). The majority of “astroturfing” incidents involve politics, with media such as Twitter (Ratkiewicz et al., 2011) and blogs leading the way in such campaigns. Recently, similar efforts have been made to create and detect “astroturfing” in various fields and in more dynamic environments (Shah, 2017).
The strategy behind astroturfing campaigns relies partly on formulating false news in a plausible manner and partly on disseminating it across the social media network in a way that maximizes influence (Aslay et al., 2018). In the first case, validating a news item requires a combined evaluation of the facts it lists (fact checking) and of its writing style (stylistic analysis), and is often influenced by the reliability of the source that publishes it. Existing techniques focus on one source at a time and do not look at how the news is transmitted. In the second case, that of maximizing influence, carefully selected sources and additional automatically created bots are used to retransmit a news item automatically, maximizing the exposure of users to the propaganda. The techniques identified so far in the literature focus on the automatic detection of such bots by analyzing the text they retransmit (Peng et al., 2017), but still do not fully examine the network of influence that is formed in each campaign.
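To illustrate what "selecting sources so as to maximize exposure" means from the campaign organizer's side, the sketch below uses a deliberately simplified proxy: greedy maximum coverage over one-hop audiences. It is not the method of any cited work, and the account names and follower sets are invented. Real influence maximization works with probabilistic diffusion models rather than direct followers only.

```python
def greedy_seed_selection(followers, k):
    """Pick k accounts that together reach the most distinct users (greedy max coverage)."""
    chosen, reached = [], set()
    candidates = set(followers)
    for _ in range(min(k, len(candidates))):
        # Choose the account with the largest number of not-yet-reached followers;
        # sorting first makes tie-breaking deterministic.
        best = max(sorted(candidates), key=lambda a: len(followers[a] - reached))
        chosen.append(best)
        reached |= followers[best]
        candidates.remove(best)
    return chosen, reached

# Hypothetical audiences: which users see a post from each account.
followers = {
    "source_a": {"u1", "u2", "u3", "u4"},
    "source_b": {"u3", "u4", "u5"},
    "bot_1": {"u5", "u6", "u7"},
    "bot_2": {"u1", "u2"},
}

seeds, exposed = greedy_seed_selection(followers, k=2)
```

In this toy setting the second seed is chosen for the new users it adds rather than for its raw follower count, which is why a detector that inspects accounts one at a time can miss a coordinated set that only makes sense as a group.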
Progress beyond the current research level
Trust and influence are two very important factors that affect the dissemination of information on social networks and contribute to the formation of public opinion and markets (Eirinaki et al., 2014; Lassen et al., 2014). In the case of pre-planned propaganda, the pattern through which a false news item spreads, especially in its early stages, is very specific, as it is expected to start from specific users and be relayed by their circles of influence. Despite advances in assessing the validity of news stories and news sources, no single predictive model has been presented to date that learns the social media network and uses it to detect false news in the first stages of its creation.
This project will build on existing research results and focus on the dissemination of news in bipartite and weighted graphs (Neal, 2014), as well as on the problems that can be caused by their high connectivity (Dianati, 2016), and will take the research a step further by utilizing the potential of Deep Learning and Reinforcement Learning. More specifically:
- It will design and develop solutions that use the existing PALO and Qix tools for collecting content from social media and extracting company mentions and polarity from texts (Tsirakis et al., 2017), and will model social media influence networks by analyzing mentions and retransmissions.
- It will design a unified forecasting model that focuses on the pattern of information dissemination on social networks through reliable, unreliable and unknown nodes. The model will incorporate features such as depth, size (number of participants), breadth, structural virality (Goel et al., 2015) and transmission speed, which appear to differentiate false from true news (Vosoughi et al., 2018).
- It will train a combination of Graph Convolutional and Recurrent Neural Networks (RNN-GCN) on news data retransmitted on social media in different subject areas, so that it learns news dissemination patterns.
- It will use Reinforcement Learning (RL) techniques on GCNs to train a predictor that is fed with data about the spread of a news story and assesses whether covert propaganda lies behind it.
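The cascade features named in the list above can be computed directly from a retransmission tree. The following is a minimal sketch, assuming each cascade is available as a mapping from every retransmitting account to the account it retransmitted from. The function name, the dictionary layout and the two toy cascades are illustrative assumptions. Structural virality is computed here as the mean shortest-path distance over all pairs of nodes in the tree, following the definition of Goel et al. (2015), and transmission speed is omitted because it requires timestamps.

```python
from collections import defaultdict, deque

def cascade_features(root, parent):
    """Compute depth, size, max breadth and structural virality of a retransmission tree.

    `parent` maps each retransmitting account to the account it retransmitted from;
    every account is assumed to be reachable from `root`.
    """
    children = defaultdict(list)
    neighbours = defaultdict(list)  # same tree, viewed as an undirected graph
    for child, par in parent.items():
        children[par].append(child)
        neighbours[child].append(par)
        neighbours[par].append(child)

    # Breadth-first traversal from the origin gives every node's depth.
    depth_of = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for c in children[node]:
            depth_of[c] = depth_of[node] + 1
            queue.append(c)

    # Breadth at a level is the number of accounts at that depth.
    per_level = defaultdict(int)
    for d in depth_of.values():
        per_level[d] += 1

    # Structural virality: average distance between all ordered pairs of nodes.
    n = len(depth_of)
    total = 0
    for src in depth_of:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            node = queue.popleft()
            for nb in neighbours[node]:
                if nb not in dist:
                    dist[nb] = dist[node] + 1
                    queue.append(nb)
        total += sum(dist.values())
    virality = total / (n * (n - 1)) if n > 1 else 0.0

    return {
        "depth": max(depth_of.values()),
        "size": n,
        "max_breadth": max(per_level.values()),
        "structural_virality": virality,
    }

# Two hypothetical cascades of equal size: a broadcast and a person-to-person chain.
broadcast = cascade_features("src", {"u1": "src", "u2": "src", "u3": "src", "u4": "src"})
chain = cascade_features("src", {"u1": "src", "u2": "u1", "u3": "u2", "u4": "u3"})
```

The two toy cascades have the same size but differ in depth, breadth and structural virality, which is the kind of contrast a forecasting model can exploit when size alone is uninformative.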
As explained in detail above, the proposed scientific and technological approach (a) is reliable, as it builds on cutting-edge technologies and solutions and uses data well suited to the problem we are trying to solve, and (b) adopts innovative principles and approaches and goes beyond the current state of the art, as it aspires to implement a complete solution with broad applicability that generalizes to other areas.
References
Alallaq, N., Dohan, M. I., & Han, X. (2018, November). Sentiment Analysis to Enhance Detection of Latent Astroturfing Groups in Online Social Networks. In International Conference on Applications and Techniques in Information Security (pp. 79-91). Springer, Singapore.
Aslay, C., Lakshmanan, L. V., Lu, W., & Xiao, X. (2018, February). Influence maximization in online social networks. In Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining (pp. 775-776). ACM.
Dianati, N. (2016). Unwinding the hairball graph: pruning algorithms for weighted complex networks. Physical Review E, 93(1), 012304.
Eirinaki, M., Louta, M. D., & Varlamis, I. (2014). A trust-aware system for personalized user recommendations in social networks. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 44(4), 409-421.
Goel, S., Anderson, A., Hofman, J., & Watts, D. J. (2015). The structural virality of online diffusion. Management Science, 62(1), 180-196.
Lassen, N. B., Madsen, R., & Vatrapu, R. (2014, September). Predicting iPhone sales from iPhone tweets. In Enterprise Distributed Object Computing Conference (EDOC), 2014 IEEE 18th International (pp. 81-90). IEEE.
Lazer, D. M., Baum, M. A., Benkler, Y., Berinsky, A. J., Greenhill, K. M., Menczer, F., … & Schudson, M. (2018). The science of fake news. Science, 359(6380), 1094-1096.
Neal, Z. (2014). The backbone of bipartite projections: Inferring relationships from co-authorship, co-sponsorship, co-attendance and other co-behaviors. Social Networks, 39, 84-97.
Peng, J., Detchon, S., Choo, K. K. R., & Ashman, H. (2017). Astroturfing detection in social media: a binary n‐gram–based approach. Concurrency and Computation: Practice and Experience, 29(17), e4013.
Ratkiewicz, J., Conover, M., Meiss, M., Gonçalves, B., Patil, S., Flammini, A., & Menczer, F. (2010). Detecting and tracking the spread of astroturf memes in microblog streams. arXiv preprint arXiv:1011.3768.
Ratkiewicz, J., Conover, M., Meiss, M., Gonçalves, B., Patil, S., Flammini, A., & Menczer, F. (2011, March). Truthy: mapping the spread of astroturf in microblog streams. In Proceedings of the 20th international conference companion on World wide web (pp. 249-252). ACM.
Shah, N. (2017, April). FLOCK: Combating astroturfing on live-streaming platforms. In Proceedings of the 26th International Conference on World Wide Web (pp. 1083-1091). International World Wide Web Conferences Steering Committee.
Stauber, J. C., & Rampton, S. (1995). Toxic sludge is good for you. Common Courage Press.
Tsirakis, N., Poulopoulos, V., Tsantilas, P., & Varlamis, I. (2017). Large scale opinion mining for social, news and blog data. Journal of Systems and Software, 127, 237-248.
Vosoughi, S., Roy, D., & Aral, S. (2018). The spread of true and false news online. Science, 359(6380), 1146-1151.
Zhang, J., Carpenter, D., & Ko, M. (2013). Online astroturfing: A theoretical perspective.