The State of Open Data 2023: unparalleled insights

Digital Science, Figshare and Springer Nature are proud to publish The State of Open Data 2023. Now in its eighth year, the survey is the longest-running longitudinal study into researchers’ attitudes towards open data and data sharing. 

The 2023 survey received over 6,000 responses. The report takes an in-depth look at those responses and deliberately takes a much more analytical approach than in previous years, unveiling unprecedented insights.

Five key takeaways from The State of Open Data 2023

Support is not making its way to those who need it

Over three-quarters of respondents had never received any support with making their data openly available. 

One size does not fit all

Variations in responses from different subject expertise and geographies highlight a need for a more nuanced approach to research data management support globally. 

Challenging stereotypes

Are later career academics really opposed to progress? The results of the 2023 survey indicate that career stage is not a significant factor in open data awareness or support levels. 

Credit is an ongoing issue

For eight years running, our survey has revealed a recurring concern among researchers: the perception that they don’t receive sufficient recognition for openly sharing their data. 

AI awareness hasn’t translated to action

For the first time, this year we asked survey respondents to indicate if they were using ChatGPT or similar AI tools for data collection, processing and metadata creation. 

Diving deeper into the data than ever before 

This year, we dive deeper into the data than ever before and look at the differing opinions of our respondents when we compare their regions, career stages, job titles and subject areas of expertise. 

Figshare founder and CEO Mark Hahnel said of this approach, “It feels like the right time to do this. Whilst a global funder push towards FAIR data has researchers globally moving in the same direction, it is important to recognize the subtleties in researchers’ behaviors based on variables in who they are and where they are.”

This year’s report features extensive analysis of the survey results and provides an in-depth and unique view of attitudes towards open data.

This analysis provided some key insights; notably that researchers at all stages of their careers share similar enthusiasm for open data, are motivated by shared incentives and struggle to overcome the same obstacles. 

These results are encouraging and challenge the stereotype that more experienced academics are opposed to progress in the space and that those driving progress are primarily early career researchers. 

We were also able to look into the nuanced differences in responses from different regions and subject areas of expertise, illuminating areas for targeted outreach and support. These demographic variations also led us to issue a recommendation to the academic research community to look to understand the ‘state of open data’ in their specific setting.  

Benchmarking attitudes towards the application of AI 

In light of the intense focus on artificial intelligence (AI) and its application this year, we decided for the first time to ask our survey respondents if they were using any AI tools for data collection, processing or metadata creation.

The most common answer to all three questions was, “I’m aware of these tools but haven’t considered it.”

State of Open Data: AI awareness hasn't translated to action

Although the results don’t yet tell a story, we’ve taken an important step in benchmarking how researchers are currently using AI in the data-sharing process. Within our report, we hear from Niki Scaplehorn and Henning Schoenenberger from Springer Nature in their piece ‘AI and open science: the start of a beautiful relationship?’ as they share some thoughts on what the future could hold for research data and open science more generally in the age of AI. 

We are looking forward to evaluating the longitudinal response trends for this survey question in years to come, as the fast-moving field of AI and its applications to various aspects of the research lifecycle continues to accelerate.

Recommendations for the road ahead 

In our report, we have shared some recommendations that take the findings of our more analytical investigation and use them to inform action points for various stakeholders in the community. This is an exciting step for The State of Open Data, as we more explicitly encourage real-world action from the academic community when it comes to data-sharing and open data. 

Understanding the state of open data in our specific settings: Owing to the variations in responses from different geographies and areas of expertise, we’re encouraging the academic community to investigate the ‘state of open data’ in their specific research setting, to inform tailored and targeted support. 

Credit where credit’s due: For eight years running, our respondents have repeatedly reported that they don’t feel researchers get sufficient credit for sharing their data. Our recommendation asks stakeholders to consider innovative approaches that encourage data re-use and ultimately greater recognition. 

Help and guidance for the greater good: The same technical challenges and concerns that pose a barrier to data sharing transcend different software and disciplines. Our recommendation suggests that support should move beyond specific platform help and instead tackle the bigger questions of open data and open science practices. 

Making outreach inclusive: Through our investigation of the 2023 survey results, we saw that the stage of an academic’s career was not a significant factor in determining attitudes towards open data and we saw consensus between early career researchers and more established academics. Those looking to engage research communities should be inclusive and deliberate with their outreach, engaging those who have not yet published their first paper as well as those who first published over 30 years ago. 

What’s next for The State of Open Data?  

The State of Open Data 2023 report is a deliberate change from our usual format; usually, our report has contributed pieces authored by open data stakeholders around the globe. This year, we’ve changed our approach and we are beginning with the publication of this first report, which looks at the survey data through a closer lens than before. We’ve compared different subsets of the data in a way we haven’t before, in an effort to provide more insights and actionable data for the community.

In early 2024, we’ll be releasing a follow-up report, with a selection of contributed pieces from global stakeholders, reflecting on the survey results in their context. Using the results showcased in this first report as a basis, it’s our hope that this follow-up report will apply different contexts to these initial findings and bring new insights and ideas. 

In the meantime, we’re hosting two webinars to celebrate the launch of our first report and share the key takeaways. In our first session, The State of Open Data 2023: The Headlines, we’ll be sharing a TL;DR summary of the full report; our second session, The State of Open Data 2023: In Conversation, will convene a panel of global experts to discuss the survey results. 

You can sign up for both sessions here: 

The State of Open Data 2023: The Headlines

The State of Open Data 2023: In Conversation

Laura Day

About the Author

Laura Day, Marketing Director | Figshare

Laura is the Marketing Director at Figshare, part of Digital Science. Before joining Digital Science, Laura worked in scholarly publishing, focusing on open access journal marketing and transformative agreements. In her current role, Laura focuses on marketing campaigns and outreach for Figshare. She is passionate about open science and is excited by the potential it has to advance knowledge sharing by enabling academic research communities to reach new and diverse audiences.

Zooming in on zoonotic diseases – Digital Science

This blog addresses the impact of climate change on infectious diseases, in particular infectious diseases with the potential to transmit from animals to humans, also known as zoonotic diseases. To set the scene for this, we first consider the wider context of how global warming has far-reaching consequences for humans and the planet. The global changes that we are currently experiencing have never happened before, with climate change representing one of the principal environmental and health challenges. We use Dimensions to explore published research, research funding, policy documents and citation data. To help us perform a deeper analysis of the data, we access the Dimensions data through its Google BigQuery (GBQ) provision. This allows us to integrate data from Dimensions with one of the  publicly available World Bank datasets on GBQ.  

We also look at the research in conjunction with two United Nations (UN) Sustainable Development Goals (SDGs) – SDG3 Good Health and Well-being and SDG13 Climate Action – and assess how they add to the narrative. Many of the health impacts associated with climate change are a particular threat to the poorest people in low- and middle-income countries where the burden of climate sensitive diseases is the greatest. This also suggests that the impact in these regions, based on the UN SDGs, may reach beyond climate (SDG13) and health (SDG3) to affect those who live in extreme poverty (SDG1) and/or those who experience food insecurity (SDG2).

“The climate crisis is a health crisis”

Introduction

1. Climate change and zoonotic diseases

Climate change has far-reaching implications for human health in the 21st century, with significant increases in temperature extremes, heavy precipitation, and severe droughts.1 It directly impacts health through long-term changes in rainfall and temperature, climatic extremes (heatwaves, hurricanes, and flash floods), air quality, sea-level rise in low-land coastal regions, and many different influences on food production systems and water resources.2

In terms of human health, climate change has an important impact on the transmission of vector-borne diseases (human illnesses caused by parasites, viruses and bacteria that are transmitted by vectors), in particular zoonotic infectious diseases (infections transmitted from animals to humans, in many cases by the bite of infected arthropod species such as mosquitoes and ticks). This is particularly relevant given the recent COVID-19 and Zika virus outbreaks. Arthropods are of major significance due to their abundance, adaptability, and coevolution with different kinds of pathogens.3

Zoonotic infectious diseases are a global threat because they can become pandemics, as we have seen in the case of COVID-19, and are currently considered one of the most important threats for public health globally. The COVID pathogen spread worldwide, recording 255,324,963 cases with 5,127,696 deaths as of November 2021.4

One reason for this turnaround could be related to the widespread adoption of the United Nations Sustainable Development Goals (SDGs), and in particular SDG6, which sets out to “ensure availability and sustainable management of water and sanitation for all”.9 The achievement of this Goal, even if partially, would greatly benefit people and the planet, given the importance of clean water for socio-economic development and quality of life, including health and environmental protection. SDG6 considers improvement of water quality by reducing by half the amount of wastewater that is not treated by 2030.

The changes in climatic conditions have forced many pathogens and vectors to develop adaptation mechanisms. For example, in the case of African Ebola, climate change is a factor in the rise in cases over the past two decades, with bats and other animal hosts of the virus being driven into new areas when temperatures change, potentially bringing them into closer contact with humans.  

Examples of how the acceleration of zoonotic pathogens is attributable to human-driven changes in climate and ecology are common. According to the Centers for Disease Control and Prevention (CDC), almost six out of every 10 infectious diseases can be spread from animals to humans, and three out of every four emerging infectious diseases in humans originate from animals.5 Zoonotic diseases, such as those spread by mosquitoes and other related vectors, have increased in recent years. This is because the rise in global temperatures has created favourable conditions for breeding specific pathogens, especially in less developed countries, predominantly in the Global South.6 Further, climate change is causing people’s general health to deteriorate, making it easier for zoonotic infections to spread, as seen with the Zika and dengue viruses.7

The changes in climatic conditions have forced pathogens and vectors to develop adaptation mechanisms. Such development has resulted in these diseases becoming resistant to conventional treatments due to their augmented resilience and survival techniques, thus further favouring the spread of infection.

Figure 1: Effect of climatic changes on infectious diseases.8

2. Exploring links between climate change and zoonotic diseases as evidenced by mentions in policy documents

Developments in policy are generally rooted in academic research. Applying research to policy-relevant questions is increasingly important for addressing potential problems and can often identify what has or has not been successful elsewhere. Citations to the research that underpins policy documents are known to be an important (proxy) indicator of the quality of the research carried out. The awareness of governments, NGOs and other health-focused institutions, and the course of action they take, are evident from their activity in this area. For example, in the UK the government has recently allocated £200 million to fight zoonotic diseases.9 Such actions are communicated through, for example, policy documents that mention the research influencing public policy decision making in this area. Policy documents provide us with a different perspective for analysis, allowing a closer proximity to ‘real world’, society-facing issues.

3. The SDG3 and SDG13 crossover: research outputs associated with zoonotic diseases and climate change

The UN launched the 2030 Agenda for Sustainable Development to address an ongoing crisis: human pressure leading to unprecedented environmental degradation, climatic change, social inequality, and other negative planet-wide consequences.10 There is growing evidence that environmental change and infectious disease emergence are causally linked and there is an increased recognition that SDGs are linked to one another. Thus, understanding their dynamics is central to achieving the vision of the UN 2030 Agenda. But environmental change also has direct human health outcomes via infectious disease emergence, and this link is not customarily integrated into planning for sustainable development.11

Two of the 17 UN SDGs of most relevance to zoonotic diseases and climate change are SDG3 and SDG13.

Looking specifically at SDG3, reducing global infectious disease risk is one of the targets for the Goal (Target 3.3), alongside strengthening prevention strategies to identify early warning signals (Target 3.d).12 Given the direct connection between environmental change and infectious disease risk, actions taken to achieve other SDGs also have an impact on the achievement of SDG3. Moreover, strengthening resilience and adaptive capacity to climate-related hazards and natural disasters is one of the targets for SDG13 (Target 13.1).13 The two SDGs perhaps highlight two sides of the same coin – SDG3 focusing on preventing and reducing disease risks and SDG13 focusing on strengthening resilience to climate-related hazards (infectious disease being an obvious hazard).

Exploring the crossover between SDG3 and SDG13 using Dimensions reveals interlinkages with other SDGs – SDG1 No Poverty and SDG2 Zero Hunger. We know that living in poverty has negative impacts on health and, in respect of climate change, economic loss attributed to climate-related disasters is now a reality. Experiencing hunger can be a consequence of vulnerable agricultural practices that negatively impact food productivity and production. In 2020, between 720 and 811 million persons worldwide were suffering from hunger, as many as 161 million more than in 2019.14 Moreover, climate change, extreme weather, drought, flooding and other disasters progressively deteriorate land and soil quality, severely affecting the cost of food items.

4. Funding of research associated with SDG3 and SDG13 – increases in SDG research funding

Scientific advances reveal empirical observations of the association between climate change and shifts in infectious diseases. Using Dimensions we can examine the scientific evidence for this by looking at the impact of climate change on zoonotic diseases. We can also track the science, through the lens of research outputs associated with both SDG3 and SDG13.  

Being able to assess publishing and funding behaviours by comparing the Global North and Global South countries provides us with an insight into where research is both funded and ultimately published. Moreover, one question we might ask is, given that the Global South is currently hardest hit by the consequences of climate change from an infectious disease perspective, will we see changes in publishing and funding practices in the future?  Furthermore, climate change has exacerbated many influencing factors. It has generated habitat loss, pushed wild animals from hotter to cooler climates where they can mix with new animals and more people, and it has lengthened the breeding season and expanded the habitats of disease-spreading mosquitoes, ticks, etc.,15 and so we could potentially see more zoonotic infectious disease spreading to countries in the Global North. Given these factors, and the capability of Dimensions, we can make comparisons over time and geolocation to track where changes are occurring.

Dimensions search strategy and data investigation

i. Search strategies

Research data were retrieved using Digital Science’s Dimensions database and Google BigQuery (GBQ). For initial searches we created a specific search term to identify publications associated with zoonotic/infectious diseases and climate change. Two sets of terms were used to define the searching keywords. The first was made up of keywords associated with zoonotic and infectious diseases, and the second was simply one word, ‘Climate’, as follows:

(Zoonoses OR "zoonotic diseases" OR "parasitic diseases" OR "zoonotic pathogens" OR "vector borne diseases" OR "climate-sensitive infectious diseases" OR "infectious disease risk" OR "infectious diseases") AND Climate

Figure 2: Word cloud illustrating the strength of association of research that includes both climate change and zoonotic (infectious) diseases and their variants.
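
As an illustration of how the two sets of terms combine, the following is a minimal sketch that assembles the search string in Python. The function names and the explicit parentheses around the disease terms are assumptions made for clarity; they are not taken from Dimensions or from the authors' own tooling.

```python
# First term set: zoonotic/infectious disease keywords listed in the text.
DISEASE_TERMS = [
    "Zoonoses",
    "zoonotic diseases",
    "parasitic diseases",
    "zoonotic pathogens",
    "vector borne diseases",
    "climate-sensitive infectious diseases",
    "infectious disease risk",
    "infectious diseases",
]

# Second term set: the single word used for climate.
CLIMATE_TERM = "Climate"


def quote(term: str) -> str:
    """Wrap multi-word phrases in double quotes so they match as phrases."""
    return f'"{term}"' if " " in term else term


def build_query(disease_terms: list[str], climate_term: str) -> str:
    """OR the disease terms together, then AND the group with the climate term.

    The grouping parentheses are an assumption about the intended logic.
    """
    disease_clause = " OR ".join(quote(term) for term in disease_terms)
    return f"({disease_clause}) AND {quote(climate_term)}"


query = build_query(DISEASE_TERMS, CLIMATE_TERM)
print(query)
```

Grouping the disease terms explicitly means the climate term applies to every disease keyword rather than only to the last one.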

Dimensions’ inbuilt SDG classification system allowed for the linking of research outputs associated with SDGs both individually and in combination. On this basis we were able to add SDG3 Good Health and Well-being and SDG13 Climate Action to the search, restricting it to outputs associated with both Goals. The main focus of the search was on peer-reviewed articles and government policy documents published between 2010 and 2022. A set of 1,436 research publications was retrieved and analysed separately. The research outputs retrieved shared a focus on the impact of climate change on the pathogens, hosts and transmission of human zoonotic/infectious diseases.

A dataset based on the research outputs retrieved from Dimensions was created within GBQ. This allowed integration with publicly available datasets from the World Bank to ascertain low and high income countries and regions. The Dimensions GBQ provision also facilitates in-depth targeted analyses. This allowed us to look solely at the publications resulting from our search in order to identify trends in concepts, citations, policy documents and collaborations by geographic region.

ii. Findings

a) Publication timeline trends for research outputs tagged in Dimensions jointly with SDG3 and SDG13 and associated with zoonotic/infectious diseases and climate change were plotted.

Figure 3: Publications on climate change and zoonotic diseases, and their variants that have been linked to both SDG3 and SDG13 using Dimensions’ SDG classification system

Figure 3 highlights the trajectory over a 13-year time period for publications associated with both SDG3 and SDG13 in Dimensions. Of note, following implementation of the UN SDGs in January 2016, the number of publications rises sharply until the end of 2021, with a dip in 2022.

b) Co-authorship analysis: Collaboration by geographic region

Figure 4: 4a) One in 40 publications from researchers in high-income countries have been co-authored with researchers from a low-income country; 4b) Two in three publications from researchers in low-income countries have been co-authored with researchers from a high-income country.

Figure 4a reveals that for every 40 publications authored in a high-income country, one publication was in collaboration with a low-income country-based researcher. Figure 4b reveals that two in three publications authored by low-income country based researchers have been in collaboration with high-income country based researchers. We conclude from this that it is proportionately more likely for low-income country researchers to collaborate with researchers in the Global North than for researchers in the Global North to collaborate with researchers in the Global South. However, it is important to note here that numbers of research outputs are disproportionate between the global regions (see Table 1 below). 
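
The asymmetry between Figures 4a and 4b comes from using a different denominator for each income group. A minimal sketch of that calculation follows; the toy records and the function name are hypothetical and are not the Dimensions data.

```python
from typing import Dict, Set

# Hypothetical toy data: publication id -> income groups of its authors' countries.
PUBLICATIONS: Dict[str, Set[str]] = {
    "pub1": {"high"},
    "pub2": {"high", "low"},
    "pub3": {"low"},
    "pub4": {"low", "high"},
    "pub5": {"high"},
}


def collaboration_share(pubs: Dict[str, Set[str]], home: str, partner: str) -> float:
    """Share of publications with a `home` author that also include a `partner` author."""
    home_pubs = [groups for groups in pubs.values() if home in groups]
    if not home_pubs:
        return 0.0
    shared = sum(1 for groups in home_pubs if partner in groups)
    return shared / len(home_pubs)


high_with_low = collaboration_share(PUBLICATIONS, "high", "low")
low_with_high = collaboration_share(PUBLICATIONS, "low", "high")
print(f"high-income publications co-authored with low-income: {high_with_low:.0%}")
print(f"low-income publications co-authored with high-income: {low_with_high:.0%}")
```

The same set of joint publications is divided by a large total for one group and a small total for the other, which is why the two shares can differ so widely.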

| Income region (2010-2022) | Number and percentage of authors publishing climate change and infectious (zoonotic) diseases research | Number of authors publishing research outputs associated with SDG13 | Number of authors publishing research outputs associated with SDG3 | Total number of authors publishing in each geographic income region |
| --- | --- | --- | --- | --- |
| Global South | | | | |
| Low-income countries | 52 (0.11%) | 2,818 (6.22%) | 26,649 (58.85%) | 45,285 (100%) |
| Lower-middle-income countries | 468 (0.03%) | 85,931 (6.07%) | 409,355 (28.93%) | 1,415,019 (100%) |
| Global North | | | | |
| High-income countries | 618 (0.01%) | 365,917 (4.73%) | 2,337,971 (30.22%) | 7,736,160 (100%) |
| Upper-middle-income countries | 2,419 (0.06%) | 194,187 (4.56%) | 850,954 (19.97%) | 4,260,966 (100%) |

Table 1: Number and proportion of authors by geographic income region publishing research on climate change and infectious (zoonotic) diseases, and SDG3 and SDG13

Table 1 outlines the combined total number of authors of published research in the Global South and Global North, including the proportion of researchers against the total number of researchers in each of these regions. The figures in the table reveal that, proportionally, the number of researchers publishing research on zoonotic diseases and climate change is higher in lower-income countries than in high-income countries. We argue here that this research focus is not necessarily a niche area for Global South countries (even though their number of research outputs and activity is low in real terms). Comparing the number of authors publishing zoonotic diseases and climate change research papers with the number of authors publishing in areas associated more generally with SDG3 and SDG13 provides a glimpse of the breadth of sustainable development research, of which our topic area is just one component.

Although the crossover with SDG3 and SDG13 is not high, it shows that the engagement of researchers in low-income countries with zoonotic diseases research is notable and contributes to research progress in this area. However, the research is better represented if we look proportionally. For example, the 52 researchers in low-income countries represent about 8% of the number of zoonotic disease researchers in high-income countries (618), but the total number of researchers publishing overall in low-income countries (45,285) represents just 0.6% of all researchers in high-income countries (7.7 million). Based on the Table 1 counts, this makes the proportional contribution by low-income country researchers roughly 14 times greater than that of high-income country researchers in this research area.
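
The proportional comparison above can be recomputed directly from the Table 1 counts. The following is a minimal sketch; the dictionary layout and variable names are illustrative assumptions, while the four counts are taken from Table 1.

```python
# Author counts copied from Table 1 (low-income and high-income rows only).
TABLE_1 = {
    "low": {"topic_authors": 52, "all_authors": 45_285},
    "high": {"topic_authors": 618, "all_authors": 7_736_160},
}


def topic_share(row: dict) -> float:
    """Fraction of a region's publishing authors who work on the topic."""
    return row["topic_authors"] / row["all_authors"]


low_share = topic_share(TABLE_1["low"])
high_share = topic_share(TABLE_1["high"])

# Low-income topic authors as a fraction of high-income topic authors.
topic_ratio = TABLE_1["low"]["topic_authors"] / TABLE_1["high"]["topic_authors"]
# Low-income authors overall as a fraction of high-income authors overall.
size_ratio = TABLE_1["low"]["all_authors"] / TABLE_1["high"]["all_authors"]
# How many times larger the low-income topic share is than the high-income one.
relative_share = low_share / high_share

print(f"topic share, low-income:  {low_share:.4%}")
print(f"topic share, high-income: {high_share:.4%}")
print(f"topic authors, low vs high: {topic_ratio:.1%}")
print(f"all authors, low vs high:   {size_ratio:.2%}")
print(f"relative topic share (low / high): {relative_share:.1f}x")
```

Note that `relative_share` equals `topic_ratio / size_ratio`, so the two framings used in the text are the same calculation.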

c) Research publications by geographic region

Figure 5: Research outputs by year of publication pre- and post-SDG time period.

Figure 5 above reveals a total of 1,419 research publications pre- and post-SDG period from 2010-2022 by country income group have been captured by Dimensions. The numbers represented in the chart reveal that publications have at least one author in the country income groupings outlined. In order to incorporate collaborations, a publication is included twice if it includes an author within each income group. This only applies for the analysis of country income groups. It allows us to see any increases/decreases in collaborative behaviour. In this respect, we note the contribution (either through collaborating or writing their own publications) from low/low-medium-income (Global South) countries has risen both in number and as a proportion of the outputs from 2010.
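
The counting rule described above, where a collaborative publication is counted once for each income group represented among its authors, can be sketched as follows. The records are hypothetical toy data, not results from Dimensions.

```python
from collections import Counter
from typing import Iterable, Set, Tuple

# Hypothetical records: (publication id, year, income groups of author countries).
RECORDS = [
    ("p1", 2012, {"high"}),
    ("p2", 2012, {"high", "low"}),
    ("p3", 2019, {"lower-middle"}),
    ("p4", 2019, {"high", "lower-middle"}),
]


def count_by_year_and_group(records: Iterable[Tuple[str, int, Set[str]]]) -> Counter:
    """Count publications per (year, income group).

    A publication with authors in two income groups contributes one count
    to each group, so collaborations appear in both series.
    """
    counts: Counter = Counter()
    for _pub_id, year, groups in records:
        for group in groups:
            counts[(year, group)] += 1
    return counts


counts = count_by_year_and_group(RECORDS)
for (year, group), n in sorted(counts.items()):
    print(year, group, n)
```

Because of this rule, the per-group counts sum to more than the number of distinct publications whenever cross-group collaborations are present.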

d) Citation analysis by geographic regions

Figure 6a: Number of publications and corresponding citation counts that include authors in low- and low-medium income countries.
Figure 6b: Number of publications and corresponding citation counts that include authors in high- and high-medium income countries.

The data in Figure 6a and 6b above reveal that:

1. South-East Asia as a producer of this research is dominant in the Global South (see Fig. 6b).

2. In the Global South, South-East Asia both publishes research and favourably cites research from the same region (see Fig. 6a).

3. Research output in South-East Asia is not as highly cited by the Global North (see Fig. 6b).

What is notable, however, is the overall dominance of the Global North for both research output and citation counts. We conjecture that one reason for this is that the Global South may not have access to the same level of funding or collaboration opportunities. Differences in research focus could also account for the distinction. Moreover, interest in these areas among high-income country researchers may be less pronounced than in research areas elsewhere in the Global South (e.g., Africa) where there is more collaboration, or more ‘gain’ for Global North countries (Ebola, Zika, etc.). For example, if India’s research focus were local to aspects of zoonotic diseases that only affect that country, then higher-income countries might be less likely to cite the research. This warrants a deeper dive into the data but is outside the scope of this blog.

In conclusion, it is perhaps the case that areas which are most affected by climate change and zoonotic diseases have become publication ‘hotspots’ which are not yet attractive to researchers in Global North countries.

e) Funding – by income/geography; Funder type

Figure 7: Breakdown of Country groupings by income and type of funding organisation revealed by Dimensions. 

The general trend seen in Fig. 7 above reveals government funding to be the major driving force in zoonotic diseases and climate change research in all of the country groupings. What Dimensions reveals in this respect is that governments in the Global North provide 100% of the government funding that is held in the Dimensions database for research on these topics in the Global South. This would perhaps explain why low-income countries in the Global South, where research infrastructure isn’t as well funded, receive less government funding, as it is awarded by the Global North. Looking at funding from non-profit sources, which include organisations such as the Bill and Melinda Gates Foundation, the Wellcome Trust and the Science and Technology Development Fund, we note that such organisations provide nearly a quarter of all research funding held in Dimensions in the Global South. Similarly to government funding, 98% of all non-profit research funding in both regions comes from non-profit organisations in the Global North. It is interesting to note, given the focus of the research, that only a very small proportion of funding across all funder types is received from the healthcare sector. For all other funders included in Fig. 7, 92.5% of funding comes from the Global North (healthcare funding is included in this figure).16

f) Policy documents and their citing publications

Figure 8: Top 12 publishers of policy documents citing research on climate change and zoonotic diseases (based on our Dimensions search criteria – see above in “Search strategies”). 

In Dimensions, policy sources and document types range from government guidelines, reports or white papers; independent policy institute publications; advisory committees on specific topics; research institutes; and international development organisations. The top 12 policy publishers that are outlined in Fig. 8 above represent those publishers of policies citing research outputs associated with climate change and zoonotic diseases. It is perhaps not unexpected that the number of publications cited by the World Health Organization would be high given its global vision to eliminate the disease burden globally and to reverse climate change. Zoonotic diseases are very much on the radar of the global agencies concerned with global health which, given climate change, means that spread of these diseases in the Global North is more likely.

Takeaway findings

Using Dimensions’ capability to take a deep dive into research exploring zoonotic diseases and climate change in the context of SDGs has enabled us to uncover a number of interesting findings that are illuminating in the context of a world view.

These findings include:

  • Research publications in this area have increased more than two-fold since the implementation of the SDGs.  
  • Collaboration patterns in the Global North and Global South reveal that researchers in Global South countries are more likely to collaborate with researchers in the Global North than vice versa.
  • The total number of authors publishing research on zoonotic diseases and climate change in the lowest-income countries represents about 8% of the total number of zoonotic disease researchers in high-income countries (see Table 1). Expanding this out across all research publications, the total number of researchers publishing in low-income countries represents just 0.6% of all researchers in high-income countries. Based on the Table 1 counts, this makes the proportional representation of low-income country researchers roughly 14 times greater than that of high-income country researchers. Although actual numbers would reveal a different story, we believe that depicting the data in this way provides a balanced representation of the research output.
  • Research carried out on zoonotic diseases and climate change in the lower income countries is less well cited by higher income countries.
  • The data in Dimensions highlights that government organisations in the Global North award much of the funding for research in the Global South, and likewise for funding from non-profit agencies. What we might consider here as an explanation is that numerous organisations in the Global North such as Bill and Melinda Gates Foundation, the SCI Foundation, along with governments, are committed to the elimination of zoonotic diseases and in helping reduce carbon emissions to reverse climate change at a global level.

Conclusion

What is apparent is that governments around the world are investing large sums of money as part of the global mission to halt the spread of animal diseases and to protect the public against zoonotic disease outbreaks before they become pandemics that pose a risk globally.

Digital Science’s Dimensions database provided us with enormous opportunities for the interrogation of data to gather insights on zoonotic diseases and climate change (much more than could be included in this blog). The comprehensiveness of the database in terms of its coverage of publications, policy documents, grant funding and SDG-associated output (among others) in the Global North and Global South allows for creating the most value. As a linked research database, its possibilities for generating downstream link and flow analyses across geographies mean it is an invaluable tool for the widest possible discovery across the research ecosystem.

About Dimensions

Part of Digital Science, Dimensions is the largest linked research database and data infrastructure provider, re-imagining research discovery with access to grants, publications, clinical trials, patents and policy documents all in one place. www.dimensions.ai

White House OSTP public access recommendations: Maturing your institutional Open Access strategy – Digital Science

While the global picture of Open Access remains something of a patchwork (see our recent blog post The Changing Landscape of Open Access Compliance), trends are nevertheless moving in broadly the same direction, with the past decade seeing a move globally from 70% of all publishing being closed access to 54% being open access.

The White House OSTP’s new memo (aka the Nelson Memo) will see this trend advance rapidly in the United States, stipulating that federally-funded publications and associated datasets should be made publicly available without embargo.

In this blog post, Symplectic‘s Kate Byrne and Figshare‘s Andrew Mckenna-Foster start to unpack what the Nelson Memo means, along with some of the impacts, considerations and challenges that research institutions and librarians will need to consider in the coming months.

Demystifying the Nelson Memo’s recommendations

The focus of the memo is upon ensuring free, immediate, and equitable access to federally funded research. 

The first clause of the memo is focused on working with the funders to ensure that they have policies in place to provide embargo-free, public access to research. 

The second clause encourages the development of transparent procedures to ensure scientific and research integrity is maintained in public access policies. This is a complex and interesting space, which goes beyond the remit of what we would perhaps traditionally think of as ‘Open Access’ to incorporate elements such as transparency of data, conflicts of interest, funding, and reproducibility (the latter of which is of particular interest to our sister company Ripeta, who are dedicated to building trust in science by benchmarking reproducibility in research).  

The third clause recommends that federal agencies coordinate with the OSTP in order to ensure equitable delivery of federally-funded research results and data. While the first clause mentions making supporting data available alongside publications, this clause takes a broader stance toward sharing results. 

What does this mean for institutions and faculty?

The Nelson memo introduces a clear set of challenges for research institutions, research managers, and librarians, who now need to consider how to put in place internal workflows and guidance that will enable faculty to easily identify eligible research and make it openly available, how to support multiple pathways to open access, and how to best engage and incentivize researchers and faculty. 

However, the OSTP has made very clear that this is not in fact a mandate, but rather a non-binding set of recommendations. While this certainly relieves some of the potential immediate pressure and panic around getting systems and processes in place, it is clear that what this move does represent is the direction of travel that has been communicated to federal funders. 

Funders will look at the Nelson Memo when reviewing their own policies, and seek alignment when setting their own policy requirements that drive action for faculty members across the US. So while the memo does not in itself mandate compliance for institutions, universities, and research organizations, it will have a direct impact on the activities faculty are being asked to complete – increasing the need for institutions to offer faculty services and support to help them easily comply with their funders’ requirements.

How have funders responded so far? 

We are already seeing clear indications that funders are embracing the recommendations and preparing next steps. Shortly after the announcement, the NIH published a statement of support for the policy, noting that it has “long championed principles of transparency and accessibility in NIH-funded research and supports this important step by the Biden Administration”, and over the coming months will “work with interagency partners and stakeholders to revise its current Public Access Policy to enable researchers, clinicians, students, and the public to access NIH research results immediately upon publication”. 

Similarly, the USDA tweeted their support for the guidance, noting that “rapid public access to federally-funded research & data can drive data-driven decisions & innovation that are critical in our fast-changing world.”

How big could the impact be?

While it will take some time for funders to begin to publish their updated OA Policies, there have been some early studies which seek to assess how many publications could potentially fall under such policies. 

A recent preprint by Eric Schares of Iowa State University [Impact of the 2022 OSTP Memo: A Bibliometric Analysis of U.S. Federally Funded Publications, 2017-2021] used data from Dimensions to identify and analyse publications with federal funding sources. Schares found that: 

  • 1.32 million publications in the US were federally funded between 2017-2021, representing 33% of all US research outputs in the same period. 
  • 32% of federally funded publications were not openly available to the public in 2021 (compared to 38% of worldwide publications during the same period). 
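
As a rough back-of-envelope check on these two figures, the short Python sketch below derives the implied size of total US research output for the period. It also estimates how many federally funded papers were not open, on the assumption that the 32% closed share (which the source ties to 2021) applies across the whole 2017-2021 set. The variable names are illustrative, and the results are approximations rather than figures taken from the preprint.

```python
# Figures quoted above from Schares' preprint.
federally_funded = 1_320_000   # US federally funded publications, 2017-2021
federal_share = 0.33           # their share of all US research outputs
closed_share = 0.32            # federally funded works not openly available (2021)

# Implied total US research output for 2017-2021.
implied_us_total = federally_funded / federal_share

# ASSUMPTION: applies the 2021 closed share to the full five-year set,
# so this is only a rough order-of-magnitude estimate.
approx_closed = federally_funded * closed_share

print(f"Implied US outputs, 2017-2021: {implied_us_total:,.0f}")
print(f"Approx. federally funded but not open: {approx_closed:,.0f}")
```

On these numbers, total US output comes to roughly 4 million publications, of which something in the region of 420,000 federally funded works may still have been closed.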

Schares’ study included 237 federal funding agencies – due to the removal of the $100m threshold, many more funders now fall under the Nelson memo than under the previous 2013 Holdren memo. This makes it likely that disciplines that previously were not impacted will now find themselves grappling with public access requirements.

Source: Impact of the 2022 OSTP Memo: A Bibliometric Analysis of U.S. Federally Funded Publications, 2017-2021: https://ostp.lib.iastate.edu

In Schares’ visualization here, where each dot represents a research institution, we can see that two main groupings emerge. The first is a smaller group made up of the National Laboratories. They publish a smaller number of papers overall, but are heavily federally funded (80-90% of their works). The second group is a much larger cluster, representing universities across the US. For these organisations, 30–60% of publications are federally funded, but from a much larger base number of publications, meaning that they will likely have many faculty members who will now need support.

Where do faculty members need support?

According to the 2022 State of Open Data Report, institutions and libraries have a particularly essential role to play in meeting new top-down initiatives, not only by providing sufficient infrastructure but also support, training and guidance for researchers. It is clear from the findings of the report that the work of compliance is wearing on researchers, with 35% of respondents citing lack of time as a reason for not adhering to data management plans and 52% citing finding time to curate data as the area they need the most help and support with. 72% of researchers indicated they would rely on an internal resource (either colleagues, the Library or the Research Office) were they to require help with managing or making their data openly available.

How to start?

Institutions who invest now in building capacity in these areas to support open access and data sharing for researchers will be better prepared for the OSTP’s 2025 deadline, helping to avoid any last-minute scramble to support their researchers in meeting this guidance.

Beginning to think about enabling open access can be a daunting task, particularly for institutions who don’t yet have internal workflows or appropriate infrastructure set up, so we recommend breaking down your approach into more manageable chunks: 

1. Understand your own Open Access landscape 

  • Find out where your researchers are publishing and what OA pathways they are currently using. You can do this by reviewing your scholarly publishing patterns and the OA status of those works.
  • Explore the data you have for your own repositories – not only your own existing data sets, but also those from other sources such as data aggregators or tools like Dimensions.
  • Begin to overlay publishing data with grants data, to benchmark where you are now and work to identify the kinds of drivers that your researchers are likely to see in the future. 
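
As one possible way to start the overlay described in the last point, the sketch below joins a handful of publication records to grant records and reports the share of open works per funder. All records, field names and the `open_share_by_funder` helper are hypothetical; a real analysis would draw on exports from your own repository or research information system.

```python
from collections import Counter

# HYPOTHETICAL sample rows; in practice these would come from your own
# repository or research information system exports.
publications = [
    {"id": "pub-1", "oa_status": "gold", "grant_ids": ["grant-1"]},
    {"id": "pub-2", "oa_status": "closed", "grant_ids": ["grant-1"]},
    {"id": "pub-3", "oa_status": "green", "grant_ids": ["grant-2"]},
    {"id": "pub-4", "oa_status": "closed", "grant_ids": []},
]
grants = {
    "grant-1": {"funder": "Funder A"},
    "grant-2": {"funder": "Funder B"},
}


def open_share_by_funder(publications, grants):
    """Return the fraction of each funder's linked publications that are open."""
    totals, open_counts = Counter(), Counter()
    for pub in publications:
        # Resolve each grant to its funder, ignoring unknown grant IDs.
        funders = {grants[g]["funder"] for g in pub["grant_ids"] if g in grants}
        for funder in funders:
            totals[funder] += 1
            if pub["oa_status"] != "closed":
                open_counts[funder] += 1
    return {funder: open_counts[funder] / totals[funder] for funder in totals}


baseline = open_share_by_funder(publications, grants)
print(baseline)
```

A per-funder baseline like this is one way to see which funder policies are likely to affect the most researchers first.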

2. Review your system capabilities

  • Is your repository ready for both publications and data?
  • Do you have effective monitoring and reporting capabilities that will help you track engagement and identify areas where your community may need more support?
  • Are your systems researcher-friendly; how quickly and easily can a researcher make their work openly available?

3. Consider how you will support your research ecosystem 

  • Identify how you plan to support and incentivize researchers, considering how you will provide guidance about compliant ways of making work openly available, as well as practical support where relevant.
  • Plan communication points between internal stakeholders (e.g. Research Office, Library, IT) to create a joined-up approach that will provide a shared and seamless experience to your researchers.
  • Review institutional policies and procedures relating to publishing and open access, considering where you are at present and where you’d like to get to.

How can Digital Science help? 

Symplectic Elements was the first commercially available research information management system to be “open access aware”, connecting to institutional digital repositories in order to enable frictionless open access deposit for publications and accompanying datasets. Since 2009, beginning with an initial integration with DSpace and later expanding our repository support to Figshare, EPrints, Hyrax, and custom home-grown systems, we have partnered with and guided many research institutions around the globe as they work to evolve and mature their approach to open access. We have deep experience in building out tools and processes which will help universities meet mandates set by national governments or funders, report on fulfilment and compliance, and engage researchers in increasing levels of deposit. 

Our sister company Figshare is a leading provider of cloud repository software and has been working for over a decade to make research outputs, of all types, more discoverable and reusable and lower the barriers of access. Meeting and exceeding many of the ‘desirable characteristics’ set out by the OSTP themselves for repositories, Figshare is the repository of choice for over 100 universities and research institutions looking to ensure their researchers are compliant with the rising tide of funder policies.

Below is an example of the type of Open Access dashboard that can be configured and run using the various collated and curated scholarly data held within Symplectic Elements.

In this example, we are using Dimensions as a data source, building on data from Unpaywall about the open access status of works within an institution’s Elements system. Using the data visualizations within this dashboard, you can start to look at open access trends over time, such as the different sorts of open access pathways being used, and how that pattern changes when you look across different publishers or different journals, or for different departments within your organization. By gaining this powerful understanding of where you are today, you can begin to think about how to best prioritise your efforts for tomorrow as you continue to mature your approach to open access. 

Growing maturity of OA initiatives over time – not a “one and done”.

You might find yourself at Level 1 right now where you have a publications repository along with some metadata, and you’re able to track a number of deposits and do some basic reporting, but there are a number of ways that you can build this up over time to create a truly integrated OA solution. By bringing together publications and data repositories and integrating them within a research management solution, you can enter a space where you can monitor proactively, with an embedded engagement and compliance strategy across all publications and data. 

For more information or if you’d like to set up time to speak to the Digital Science team about how Symplectic Elements or Figshare for Institutions can support and guide you in your journey to a fully embedded and mature Open Access strategy, please get in touch – we’d love to hear from you.

This blog post was originally published on the Symplectic website.



Why is it so difficult to understand the benefits of research infrastructure? – Digital Science


Persistent identifiers – or PIDs – are long-lasting references to digital resources. In other words, they are a unique label to an entity: a person, place, or thing. PIDs work by redirecting the user to the online resource, even if the location of that resource changes. They also have associated metadata which contains information about the entity and also provide links to other PIDs. For example, many scholars already populate their ORCID records, linking themselves to their research outputs through Crossref and DataCite DOIs. As the PID ecosystem matures, to include PIDs for grants (Crossref grant IDs), projects (RAiD), and organisations (ROR), the connections between PIDs form a graph that describes the research landscape. In this post, Phill Jones talks about the work that the MoreBrains cooperative has been doing to show the value of a connected PID-based infrastructure.

Over the past year or so, we at MoreBrains have been working with a number of national-level research supporting organisations to develop national persistent identifier (PID) strategies: Jisc in the UK; the Australian Research Data Commons (ARDC) and Australian Access Federation (AAF) in Australia; and the Canadian Research Knowledge Network (CRKN), Digital Research Alliance of Canada (DRAC), and Canadian Persistent Identifier Advisory Committee (CPIDAC) in Canada. In all three cases, we’ve been investigating the value of developing PID-based research infrastructures, and using data from various sources, including Dimensions, to quantify that value. In our most recent analysis, we found that investing in five priority PIDs could save the Australian research sector as much as 38,000 person days of work per year, equivalent to $24 million (AUD), purely in direct time savings from rekeying of information into institutional research management systems.
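
To put those two headline numbers side by side, the small Python sketch below works out the value per person-day that they imply. It uses only the figures quoted above; the variable names are illustrative, and because both figures are stated as upper bounds ("as much as"), the result is indicative rather than a rate reported in the analysis.

```python
# Upper-bound figures quoted above for the Australian research sector.
person_days_saved = 38_000      # person-days of rekeying avoided per year
value_saved_aud = 24_000_000    # stated equivalent value in AUD

# Implied value of one person-day of administrative effort.
aud_per_person_day = value_saved_aud / person_days_saved

print(f"Implied cost per person-day: about ${aud_per_person_day:,.0f} AUD")
```

That works out at a little over $630 (AUD) per person-day.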

Investing in infrastructure makes a lot of sense, whether you’re building roads, railways, or research infrastructure. But wise investors also want evidence that their investment is worthwhile – that the infrastructure is needed, that it will be used, and, ideally, that there will be a return of some kind on their investment. Sometimes, all of this is easy to measure; sometimes, it’s not.

In the case of PID infrastructure, there has long been a sense that investment would be worthwhile. In 2018, in his advice to the UK government, Adam Tickell recommended:

Jisc to lead on selecting and promoting a range of unique identifiers, including ORCID, in collaboration with sector leaders with relevant partner organisations

More recently, in Australia, the Minister for Education, Jason Clare, wrote a letter of expectations to the Australian Research Council in which he stated:

Streamlining the processes undertaken during National Competitive Grant Program funding rounds must be a high priority for the ARC… I ask that the ARC identify ways to minimise administrative burden on researchers

In the same letter, Minister Clare even suggested that preparations for the 2023 ERA be discontinued until a plan to make the process easier has been developed. While he didn’t explicitly mention PIDs in the letter, organisations like ARDC, AAF, and ARC see persistent identifiers as a big part of the solution to this problem.

A problem of chickens and eggs?

With all the modern information technology available to us it seems strange that, in 2022, we’re still hearing calls to develop basic research management infrastructure. Why hasn’t it already been developed? Part of the problem is that very little work has been done to quantify the value of research infrastructure in general, or PID-based infrastructure in particular. Organisations like Crossref, DataCite, and ORCID are clear success stories but, other than some notable exceptions like this, not much has been done to make the benefits of investment clear at a policy level – until now.

It’s very difficult to analyse the costs and benefits of PID adoption without being able to easily measure what’s happening in the scholarly ecosystem. So, in these recent analyses that we were commissioned to do, we asked questions like:

  • How many research grants were awarded to institutions within a given country?
  • How many articles have been published based on work funded by those grants?
  • What proportion of researchers within a given country have ORCID IDs?
  • How many research projects are active at any given time?

All these questions proved challenging to answer because, fundamentally, it’s extremely difficult to quantify the scale of research activity and the connections between research entities in the absence of universally adopted PIDs. In other words, we need a well-developed network of PIDs in order to easily quantify the benefits of investing in PIDs in the first place! (see Figure 1.)

Luckily, the story doesn’t end there. Thanks to data donated by Digital Science, and other organisations including ORCID, Crossref, Jisc, ARDC, AAF, and several research institutions in the UK, Canada, and Australia, we were able to piece together estimates for many of our calculations.

Take, for example, the Digital Science Dimensions database, which provided us with the data we needed for our Australian and UK use cases. It uses advanced computation and sophisticated machine learning approaches to build a graph of research entities like people, grants, publications, outputs, institutions etc. While other similar graphs exist, some of which are open and free to use – for example, the DataCite PID graph (accessed through DataCite commons), OpenAlex, and the ResearchGraph foundation – the Dimensions graph is the most complete and accessible so far. It enabled us to estimate total research activity in both the UK and Australia.

However, all our estimates are… estimates, because they involve making an automated best guess of the connections between research entities, where those connections are not already explicit. If the metadata associated with PIDs were complete and freely available in central PID registries, we could easily and accurately answer questions like ‘How many active researchers are there in a given country?’ or ‘How many research articles were based on funding from a specific funder or grant program?’

The five priority PIDs

As a starting point towards making these types of questions easy to answer, we recommend that policy-makers work with funders, institutions, publishers, PID organisations, and other key stakeholders around the world to support the adoption of five priority PIDs:

  • DOIs for funding grants
  • DOIs for outputs (eg publications, datasets, etc)
  • ORCIDs for people
  • RAiDs for projects
  • ROR for research-performing organisations

We prioritised these PIDs based on research done in 2019, sponsored by Jisc and in response to the Tickell report, to identify the key PIDs needed to support open access workflows in institutions. Since then, thousands of hours of research and validation across a range of countries and research ecosystems have verified that these PIDs are critical not just for open access but also for improving research workflows in general.

Going beyond administrative time savings

In our work, we have focused on direct savings from a reduction in administrative burden because those benefits are the most easily quantifiable; they’re easiest for both researchers and research administrators to relate to, and they align with established policy aims. However, the actual benefits of investing in PID-based infrastructure are likely far greater.

Evidence given to the UK House of Commons Science and Technology Committee in 2017 stated that every £1 spent on Research and Innovation in the UK results in a total benefit of £7 to the UK economy. The same is likely to be true for other countries, so the benefit to national industrial strategies of increased efficiency in research is potentially huge.

Going a step further, the universal adoption of the five priority PIDs would also enable institutions, companies, funders, and governments to make much better research strategy decisions. At the moment, bibliometric and scientometric analyses to support research strategy decisions are expensive and time-consuming; they rely on piecing together information based on incomplete evidence. By using PIDs for entities like grants, outputs, people, projects, and institutions, and ensuring that the associated metadata links to other PIDs, it’s possible to answer strategically relevant questions by simply extracting and combining data from PID registries.
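
To illustrate what "extracting and combining data from PID registries" might look like once the links are explicit, here is a minimal Python sketch that counts outputs per funder by following grant DOIs. Every identifier, field name and the `outputs_per_funder` helper is invented for illustration; the snippet does not reflect the API or schema of any real registry.

```python
from collections import defaultdict

# HYPOTHETICAL records: every identifier below is invented for illustration
# and does not resolve to a real grant, output, person or organisation.
outputs = [
    {"doi": "10.0000/output-1", "grant_dois": ["10.0000/grant-a"],
     "orcids": ["0000-0000-0000-0001"]},
    {"doi": "10.0000/output-2", "grant_dois": ["10.0000/grant-a", "10.0000/grant-b"],
     "orcids": ["0000-0000-0000-0002"]},
    {"doi": "10.0000/output-3", "grant_dois": [],
     "orcids": ["0000-0000-0000-0001"]},
]
grants = {
    "10.0000/grant-a": {"funder_ror": "https://ror.org/example-funder-1"},
    "10.0000/grant-b": {"funder_ror": "https://ror.org/example-funder-2"},
}


def outputs_per_funder(outputs, grants):
    """Count distinct outputs linked to each funder via grant DOIs."""
    linked = defaultdict(set)
    for record in outputs:
        for grant_doi in record["grant_dois"]:
            grant = grants.get(grant_doi)
            if grant is not None:  # skip links to grants missing from the registry
                linked[grant["funder_ror"]].add(record["doi"])
    return {funder: len(dois) for funder, dois in linked.items()}


counts = outputs_per_funder(outputs, grants)
print(counts)
```

With complete metadata, a question such as "how many articles were based on funding from a specific funder?" reduces to a simple join like this one, rather than an automated best guess.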

Final thoughts

According to UNESCO, global spending on R&D has reached US$1.7 trillion per year, and with commitments from countries to address the UN sustainable development goals, that figure is set to increase. Given the size of that investment and the urgency of the problems we face, building and maintaining the research infrastructure makes sound sense. It will enable us to track, account for, and make good strategic decisions about how that money is being spent.


About the Author

Phill Jones, Co-founder, Digital and Technology | MoreBrains Cooperative

Phill is a product innovator, business strategist, and highly qualified research scientist. He is a co-founder of the MoreBrains Cooperative, a consultancy working at the forefront of scholarly infrastructure and research dissemination. Phill has been the CTO at Emerald Publishing, Director of Publishing Innovation at Digital Science and the Editorial Director at JoVE. In a previous career, he was a bio-physicist at Harvard Medical School and holds a PhD in Physics from Imperial College, London.

The MoreBrains Cooperative is a team of consultants that specialise in and share the values of open research with a focus on scholarly communications, and research information management, policy, and infrastructures. They work with funders, national research supporting organisations, institutions, publishers and startups. Examples of their open reports can be found here: morebrains.coop/repository


