The mass digitization of sources initiated by the project on the History of the Max Planck Society (GMPG) presents a challenge for computational history. The quantity of content is far too large to be read by the researchers in the project. Digital methods therefore have to be developed and applied in order to structure the sources. This involves the development of a database that contains the relevant information about actors and organizations, the application of OCR (optical character recognition) to render the documents searchable, and the use of text-mining methods both to pinpoint the relevant sources and to guide the selection and prioritization of sources for digitization. Network analytical methods help to uncover the cooperation structures that are documented in the sources. A combination of topic modeling and network analysis makes it possible to relate the topics mentioned in the context of a commission to the persons also mentioned in the documents.
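To make the last step more concrete, the following is a minimal sketch of how persons and topics might be linked once the documents have been processed. It is not the project's implementation. The toy records, the field names `persons` and `topics`, and the helper functions `co_mention_edges` and `person_topic_links` are all hypothetical, and the topic labels are assumed to have been assigned beforehand by a topic model.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each record stands for one digitized commission document.
# "persons" are the names found in it; "topics" are labels assumed to come
# from a separate topic-modeling step.
documents = [
    {"persons": {"A", "B", "C"}, "topics": {"physics"}},
    {"persons": {"A", "B"}, "topics": {"physics", "chemistry"}},
    {"persons": {"C", "D"}, "topics": {"biology"}},
]


def co_mention_edges(docs):
    """Count how often each pair of persons is mentioned in the same document."""
    edges = Counter()
    for doc in docs:
        # Sorting gives every pair a stable (first, second) order.
        for pair in combinations(sorted(doc["persons"]), 2):
            edges[pair] += 1
    return edges


def person_topic_links(docs):
    """Count how often each person appears in a document carrying each topic."""
    links = Counter()
    for doc in docs:
        for person in doc["persons"]:
            for topic in doc["topics"]:
                links[(person, topic)] += 1
    return links


edges = co_mention_edges(documents)    # weighted person-person network
links = person_topic_links(documents)  # weighted person-topic associations
```

The first counter can be read as a weighted edge list for a cooperation network. The second relates each person to the topics of the documents in which that person is mentioned.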
The focus so far has been on gaining first insights into how interdisciplinary cooperation within the Max Planck Society is structured. This has been pursued by analyzing not only the background of those members of commissions who make decisions about new institutions, but also the role of persons in the network based on their function within the Society. At this stage of the project, the research question is directed primarily at finding significant differences between the data and accepted narratives. Rather than producing new explanations, this research aims to highlight the areas where more research and additional sources are needed. It is thus intended to help with the selection of case studies for deeper analysis and to place these case studies in a wider context.