Digital technologies have opened up new data sources for measuring human behaviour. These range from tracking methods at an individual level to network analyses of social media data. At a LIfBi Lecture, Professor Jürgen Pfeffer, holder of the Chair of Computational Social Science at the Technical University of Munich, presented the methodological, conceptual, ethical and moral challenges that arise for researchers when working with these new data sources. He focussed on his key question: ‘Is this sample good enough?’
In his LIfBi Lecture, Jürgen Pfeffer first presented some of his own studies that utilise data from social networks and tracking tools, among other things. For example, he presented the development of a data set that for the first time maps the complete activity on Twitter/X over a 24-hour period and makes the data from more than 375 million tweets available to the scientific community. A successful attempt to manipulate the Twitter algorithm in a targeted manner also illustrated the potential and risks of these new data worlds.
Pfeffer pointed out that such data can open up new perspectives on human actions, interactions and collective attitudes. However, they also bring considerable methodological and conceptual challenges, as data creation and processing are often opaque. Unknown changes in the data and algorithms, distortion of the data by technological artefacts and changes in user behaviour through interaction with the technology can skew the results and findings and raise concerns about reliability.
Pfeffer used ChatGPT to demonstrate impressively how problematic it can be if users do not know how the technology works in the background. The example of assessing moral and ethical issues shows that different models of AI software generate different answers. Pfeffer sees a concrete danger here that opaquely programmed chatbots such as ChatGPT are more likely to harm the moral judgement of their users than to improve it.
With regard to his key question, Pfeffer came to a clear conclusion: as appealing as the seemingly endless amounts of data available to researchers are, their quality is almost impossible to judge. Their origin is often unclear, falsifications cannot be detected and there is always a risk of targeted manipulation. For Pfeffer, research with quality-assured samples and methods therefore remains irreplaceable, and the use of new types of data sources requires constant critical scrutiny of their ‘black boxes’.
During his visit, Jürgen Pfeffer was also available to the LIfBi researchers for individual discussions. Staff from various departments took up the offer and held discussions with the TUM researcher on topics such as the quality of online surveys, the possibilities of utilising data from career networks and methodological issues.