CentAUR statistics for 2016

Ever wondered how the Reading University CentAUR repository is used by people looking for research articles?
This infographic gives you a summary of the activity around our repository in 2016.

Infographic showing statistics from usage of the CentAUR repository

Some details of the activity around our CentAUR repository

Posted in CentAUR, Open Access, Statistics

5 things to do with data in 2017

Let us not curse them with the name of New Year’s Resolutions. It’s a bit late for that anyway. But here are five simple and positive things you can do with your research data this year.

1. Be ‘as open as possible, as closed as necessary’

Make this your mantra. Say it once a day, when you brush your teeth in the morning. It is the governing principle of the European Commission’s Open Research Data Pilot, which from the start of this year has been extended from its pilot focus to cover all thematic areas of the Horizon 2020 programme. It is a good principle: let it inform how you think about the data you collect, and how you manage them. Consider what actions will enable you to share the data you collect as openly as possible, while honouring your ethical and legal obligations and any contractual restrictions. If you collect data from participants, ensure you obtain consent for data sharing, and use robust methods of anonymisation to make data safe for sharing. The UK Data Service offers excellent guidance on legal and ethical issues.
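By way of illustration only, here is a minimal Python sketch of one common de-identification step: replacing direct identifiers with non-reversible pseudonymous codes. The file and column names (and the salt) are hypothetical, and this is only one part of a proper anonymisation strategy – quasi-identifiers such as postcode or date of birth usually need aggregating or removing as well.

    import csv
    import hashlib

    SALT = "replace-with-a-project-specific-secret"  # hypothetical; keep it out of the shared files

    def pseudonymise(participant_id):
        # Derive a stable, non-reversible code from a direct identifier
        return hashlib.sha256((SALT + participant_id).encode("utf-8")).hexdigest()[:10]

    # Assumed input: a CSV with columns participant_id, name, email, response, ...
    with open("interviews_raw.csv", newline="") as src, \
         open("interviews_shareable.csv", "w", newline="") as dst:
        reader = csv.DictReader(src)
        keep = [f for f in reader.fieldnames if f not in ("name", "email")]  # drop direct identifiers
        writer = csv.DictWriter(dst, fieldnames=keep)
        writer.writeheader()
        for row in reader:
            row["participant_id"] = pseudonymise(row["participant_id"])
            writer.writerow({field: row[field] for field in keep})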

2. Have a data spring-clean

Clear out the cupboards: those USB sticks lying in your drawer, that external hard drive gathering dust on the shelf, your Dropbox folder, your personal drive on the University network, your project fileshare. Get rid of what you don’t need. If the data support published research and/or have long-term value, archive them in a data repository. If they are part of your working capital, make sure they are properly stored and backed up. Use your institutional network as the primary storage for working data: data held there are automatically replicated to separate data centres, backed up daily, and recoverable in case of disaster.

While you’re sorting things out, why not also rationalise that monstrous proliferation of folders in your network drive? Organise your data so that you can navigate them and find what you want: arrange them by project, by experiment, by date, etc.; use folder and file names that make sense and help you manage versions, e.g. by including the date.
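If you prefer to script the tidy-up rather than create folders by hand, a small sketch along these lines can lay out a consistent project/experiment/date structure to file things into. The project and experiment names, and the base location, are purely illustrative:

    from datetime import date
    from pathlib import Path

    base = Path("research-data")  # e.g. a folder on your network drive (location is illustrative)
    projects = {
        "soil-moisture-2017": ["fieldwork", "lab-analysis", "manuscripts"],
        "pollinator-survey": ["site-a", "site-b"],
    }

    today = date.today().isoformat()  # ISO dates (YYYY-MM-DD) sort correctly in file listings
    for project, experiments in projects.items():
        for experiment in experiments:
            folder = base / project / experiment / today
            folder.mkdir(parents=True, exist_ok=True)
            print("created", folder)

    # A matching file-name pattern keeps versions easy to track, e.g.
    # soil-moisture-2017_lab-analysis_2017-02-14_v02.csv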

3. Plan for data management

When you prepare a new research project, one of the first documents you start should be your data management plan. Data are the foundation of your research – so don’t build on sand. Start with an outline plan with the bare essentials, and fill it in as your research proposal evolves and your collaborators contribute their input. Your plan should identify: what data will be collected; how data will be managed during the project; and how they will be preserved and shared after the end of the project. Use DMPonline or the Checklist for a Data Management Plan to help you write the plan and make sure you cover everything.

If you will be applying for funding, your funder may ask you to complete a data management plan as part of the application, so starting to develop one early in proposal development will improve the quality of the plan and make the application process easier.

4. Don’t rely on supplementary information – use a repository

When you submit your next paper, don’t submit supporting primary data as supplementary information files to be published alongside your article on the publisher’s website, but deposit the data into a suitable data repository and link to them from your article. Here are some reasons why:

  • What is provided as supplementary information is often not primary data, but selected derived data in the form of graphs, charts, tables reporting mean values, etc. Without access to the full primary dataset, your results cannot be properly validated or replicated, and the data themselves have limited re-use value.
  • Supplementary data are often provided in PDF, one of the least user-friendly data formats ever invented: numerical and textual data cannot be manipulated within the file, or easily extracted and imported into other formats (e.g. tabular formats for numerical data, or simple text formats) where they are amenable to manipulation and further analysis (a short sketch after this list illustrates the difference).
  • While many publishers allow access to supplementary information even where the articles themselves are concealed behind a paywall, this is not always the case, and even where the data are made freely accessible, publishers may require you to transfer copyright to them and may not allow others to reproduce or redistribute the data. Most data repositories, on the other hand, will simply ask for a licence to manage the data on the rights-holder’s behalf.
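To make the first two points concrete, here is a minimal sketch of the difference between primary and derived data. The file and column names are invented for the example: the full measurement-level dataset is kept in an open tabular format for deposit in a repository, and the summary table that might otherwise end up in a supplementary PDF is generated from it.

    import csv
    from statistics import mean

    # Primary data: one row per measurement -- this is what belongs in a data repository.
    with open("trial_measurements.csv", newline="") as f:
        rows = list(csv.DictReader(f))

    # Derived data: the group means that typically appear as a table in the paper.
    groups = {}
    for row in rows:
        groups.setdefault(row["treatment"], []).append(float(row["yield_t_ha"]))

    with open("summary_means.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["treatment", "mean_yield_t_ha", "n"])
        for treatment, values in sorted(groups.items()):
            writer.writerow([treatment, round(mean(values), 2), len(values)])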

5. Make the data FAIR

When you do make your primary data available, make sure they meet the FAIR Data Principles:

  • Findable: a detailed metadata record is published and indexed online describing the data and including a unique persistent identifier assigned to the data.
  • Accessible: the data are retrievable and accessible, preferably openly, or with as few intermediate steps or restrictions as possible.
  • Interoperable: the data are made available and described using open and/or widely-used formats and metadata standards, enabling the greatest possible opportunities for integration and interoperation with other data and systems.
  • Re-usable: the data are well-described and documented, so that the conditions in which they were collected or generated can be clearly understood, and they are accompanied by a licence stating the terms of use.

This may seem a lot to achieve, but in practice most of it is done for you when you archive your data in a suitable data repository. As a matter of course the repository will create and publish a standards-compliant metadata record describing the data, including a Digital Object Identifier (DOI) or other unique persistent identifier; store the data in suitable formats for access and re-use, with relevant documentation; and make the data accessible under an appropriate licence.
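For a sense of what such a metadata record contains, here is an illustrative sketch. The field names loosely follow the DataCite metadata kernel and all of the values are hypothetical; your repository will generate the real record for you when you deposit.

    import json

    record = {
        "identifier": {"identifierType": "DOI", "identifier": "10.1234/example-dataset"},  # Findable
        "creators": [{"name": "Researcher, A.", "affiliation": "University of Reading"}],
        "title": "Example dataset supporting 'An example article'",
        "publisher": "University of Reading Research Data Archive",
        "publicationYear": 2017,
        "resourceType": "Dataset",
        "formats": ["text/csv"],                       # open, widely used formats (Interoperable)
        "rights": "Creative Commons Attribution 4.0",  # explicit licence (Re-usable)
        "descriptions": [
            "Collection methods and variable definitions are documented in README.txt"
        ],
    }

    print(json.dumps(record, indent=2))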

Posted in Research Data Management, Uncategorized

Happy 7th birthday, CentAUR!

Today CentAUR, our open access institutional repository, lists over 37,000 research outputs, and has had over one million downloads since opening at the end of January 2010. Here are some of CentAUR’s key milestones.

Graph showing the increase in downloads from CentAUR from 2010 to 2017

Downloads from CentAUR as measured by IRstats2.


2010

  • Researchers are automatically registered on CentAUR to begin adding publications.
  • Publications lists on School and personal profile pages are generated from CentAUR. This means that researchers don’t need to maintain separate lists anymore.

2011

  • The ‘Request a copy from Reading author’ button is enabled. Authors can choose to share their publications ahead of embargo expiry dates. In January 2017 the button was used 28 times.

2012

infographic for CentAUR repository statistics

CentAUR statistics

  • CentAUR joins IRUS-UK. We can now view and analyse CentAUR downloads and benchmark them against other repositories. We use IRUS-UK and IRStats for the monthly infographics published on this blog, the Opening Research at Reading Blog (ORRB).


2013

  • CentAUR features in the University’s half-day conference on open access in June 2013. Four years on, open access is just one aspect of our upcoming ‘Open in Practice’ conference for academic and research students on 30th March 2017. Don’t forget to register!

2014

  • The HEFCE policy for open access in the next REF is introduced by the University ahead of the 1st April 2016 deadline. The deposit of full texts increases to the extent that, in January 2017, 75% of all items deposited that month included a full text.

2015

  • We begin to add University of Reading e-theses to CentAUR and EThOS harvests them. Did you know that our theses are some of the most downloaded items in CentAUR?
  • CentAUR links up with the newly established University of Reading Research Data Archive. Upload your research data into the Archive and link it to your relevant publication in CentAUR!

2016

  • The policy for open access in the REF came into force on 1st April 2016. Don’t forget to add the final author versions of your articles and conference items to CentAUR as soon as they are accepted for publication!
  • CentAUR is harvested by Altmetric Explorer so that we can identify social media attention around Reading’s research publications. In the last week of January 2017, Altmetric identified 573 new posts about our research publications, comprising 28 news stories, 9 blog posts, 499 tweets, 24 Facebook posts, 5 Wikipedia pages, 6 Google+ posts and 2 videos.
  • CentAUR is one of the earliest adopters of the Publications Router service, which delivers records from a growing number of publishers directly to repositories, including CentAUR. With significant engagement from publishers, the Router could source and deposit most of CentAUR’s article content.

2017

  • CentAUR starts to tweet about CentAUR! – giving service information, promoting open access, highlighting new research papers added to the repository and our high download and Altmetric scores. Don’t forget to follow us on Twitter!


Posted in CentAUR, Open Access, Statistics

Have you checked who you are recently?

I’m sure we’ve all Googled our own name at some point and been interested or surprised to see what comes up in the search results. When it comes to your academic profile online, it is always a good idea to keep an eye on which publications are being credited to you – the results can be equally surprising!

Check out your digital researcher identity

If you’ve published a research output in a book, conference proceedings or a journal, you may have an online identity that you are not aware of – and it might not be accurate. Why not do a quick identity health check in the Scopus database to make sure your details are correct?

What is Scopus?

The Scopus database collates outputs from thousands of journals and other publications and tracks citations to them. The database is useful for finding articles relevant to your research, deciding where to publish, identifying potential collaborators, and discovering who is citing your work and how often it is being cited.

In the Scopus database, outputs from the same author are aggregated into a Scopus Author ID. As the information is collated automatically, you may find that the wrong articles have been attributed to you or that your articles have been split across several duplicate IDs.

Why is it important to check your author ID?

If your details in Scopus are incorrect, your publication record will be incomplete and possibly confusing to those interested in reading or citing your research. Checking your Scopus Author ID also matters because the bibliometric data used in the University of Reading’s Research Outputs Support System (ROSS) dashboards are taken from the Scopus database: if your details are wrong, unreliable data will be pulled through into the University’s reporting process.

How do I find my author ID?

Find yourself in the Scopus database by using the ‘Author Search’ tab

To check your ID, visit the Scopus website www.scopus.com (available when using the University’s IP range). Choose the ‘Author Search’ tab from the Search menu and enter your details. If you’ve worked at several institutions, it is best to leave the affiliation information blank. When the search results appear, it is worth choosing the ‘Show profile matches with one document’ option, as publications can sometimes fail to aggregate under one author ID.
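If you prefer to script the check, Elsevier also offers an Author Search API that can be queried instead of the web interface. The sketch below is indicative only: it assumes you have obtained an API key from the Elsevier developer portal, the author name is a placeholder, and the exact response fields should be verified against Elsevier’s own API documentation.

    import requests

    API_KEY = "your-elsevier-api-key"  # placeholder; request a key via the Elsevier developer portal

    response = requests.get(
        "https://api.elsevier.com/content/search/author",
        params={"query": "AUTHLAST(Smith) AND AUTHFIRST(J)"},  # example name, not a real profile
        headers={"X-ELS-APIKey": API_KEY, "Accept": "application/json"},
        timeout=30,
    )
    response.raise_for_status()

    for entry in response.json().get("search-results", {}).get("entry", []):
        # 'dc:identifier' usually takes the form 'AUTHOR_ID:1234567890'
        print(entry.get("dc:identifier"),
              entry.get("preferred-name", {}).get("surname"),
              entry.get("affiliation-current", {}).get("affiliation-name"))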

If your details are right

Great! Take a look at how your papers/articles are being cited, view your h-graph and analyse your author output. You might want to link your Scopus ID to your ORCID ID if you have one – check out our ORCID library guide for help on how to do this. Revisit your Scopus Author ID from time to time to make sure that new publications are being added.

If your details are wrong

If you have several Author IDs or there are publications in your profile that do not belong to you, you can ask Scopus to correct this. You can do so by using the ‘Request author detail corrections’ link (or contact Karen Rowlett, the University’s Research Publications Adviser, who can do this on your behalf). It is also worth checking that any missing publications have not been attributed to another researcher with a similar name. Corrections are usually made within two weeks.

Screenshot from Scopus database

Select the duplicate profiles and choose ‘Request to Merge Authors’ to correct duplicate author IDs

Missing Publications

Publications might be missing from your Author ID either because they have been attributed to someone else or because Scopus does not cover the journal/book/conference in which your article appeared. You can check the Scopus coverage by consulting their guide or downloading the Book and Journal source lists.

Wrong affiliation

If you have recently moved from another institution, it may take a while for your new affiliation to be reflected in your Scopus Author ID. You have to publish three outputs with your new address before it will change. If your affiliation is showing as somewhere that you’ve never worked, you can request a correction.

Help and support

If you are not sure how to check your Scopus Author ID or need help in sorting out your profile, please contact Karen Rowlett, Research Publications Adviser. There are also some regular sessions running through People Development on Managing your digital researcher profile and ORCID. You can check when the next course is running by searching the People Development course database.


Posted in Bibliometrics, Publications

CentAUR statistics for December 2016

infographic for CentAUR repository statistics

A selection of statistics for the CentAUR repository

Posted in CentAUR, Statistics

Towards Open Research: a new report from the Wellcome Trust

October this year saw publication of Towards Open Research: Practices, experiences, barriers and opportunities, a study of researchers’ attitudes and behaviours in respect of open research, commissioned by the Wellcome Trust and based on surveys of, and focus groups with, researchers funded by the Trust and the Economic and Social Research Council (ESRC).

The aim of the study was to identify practical actions the Wellcome Trust can implement to remove barriers to, and maximise the opportunities for, practising open science. Under the aegis of open science the study considered Open Access publishing, data sharing and re-use, and code-sharing and re-use.

Both the Wellcome Trust and ESRC are strong proponents of open research. The Wellcome Trust has long been at the forefront of policy initiatives to advance an open science agenda through Open Access and open data practices. It mandates both Open Access publication of funded research outputs and data sharing to the fullest achievable extent, and supports these activities through its research grants. The report commissioned by the Trust coincides with the launch of Wellcome Open Research, a platform on which its funded researchers can rapidly publish any results they wish to share, including study protocols and null and negative results, as well as articles and data, which are all made available without editorial intervention for open peer review.

ESRC similarly mandates both Open Access publication and data sharing wherever possible; Open Access publishing for its funded researchers is supported through the RCUK block grant to institutions, and it manages the UK Data Archive, the UK’s largest collection of social, economic and population data and a service that its funded researchers can use to preserve and share the data arising from their research. The UK Data Archive has been in existence since 1967, and ESRC was one of the earliest among funders to adopt a data sharing policy, in the mid-1990s, making it an informed and progressive force in the promotion of open and accessible data, with particular expertise in the management of controlled access to disclosive and sensitive data. It also offers a wealth of invaluable supporting information and resources for research data management via the UK Data Service website, covering topics such as participant consent for data sharing, anonymisation of datasets, and dealing with rights in data.

General findings

The study finds that open research is widely practised and on the increase, with researchers not only using Open Access and data sharing, but engaging in growing numbers in other emerging open practices, such as code sharing and open peer review, and experiencing the benefits of increased citation rates, and accelerated communication of a broader range of research outputs beyond the traditional peer-reviewed journal paper.

Open Access

  • Over 70% of Wellcome-funded papers are published as Open Access and a third of researchers publish all their papers as Open Access.
  • Researchers appreciate the value of Wellcome’s new Open Research platform in enabling the rapid dissemination of research materials and results, enabling data visualisation in papers, and providing a forum for open and constructive peer review of methods and findings.

Data sharing

  • Half of researchers make their data available for use by others, largely via institutional and community data repositories.
  • The drivers for researchers to share data largely come from funder and journal requirements; although researchers generally accept the case for sharing data, very few report any direct benefits from doing so, and many are still concerned about possible misuse or misinterpretation of their data and loss of first-use privilege to competitors, and are deterred by the effort required to prepare and deposit data.
  • Early career researchers in particular may show reluctance to share their data for fear of losing future publication and career progression opportunities by releasing their research capital. Their supervisors and seniors have a role to play in encouraging good practice.
  • On the plus side, very few researchers have had negative experiences from sharing their data, and many of the fears reported by researchers are largely unfounded.
  • The key things that those who fund and support researchers can do to encourage sharing of data are: provide funding to cover the cost of data preparation, and create incentive systems that reward and recognise researchers for sharing data.
  • Approximately three quarters of researchers have re-used research data, mostly to provide background information and context to research, for research validation, and to help develop methodologies for new analyses.
  • Data sharing can be complex and effort-intensive: researchers need to be given training and guidance, to have easy-to-use data repository services, and to be supported in producing and curating high-quality data and addressing challenges such as making disclosive and confidential data safe for sharing, or sharing large resources, e.g. imaging data.

Code sharing

  • Code-sharing is less well-established than other open practices, at least in part because fewer researchers create software code in their research: two-fifths of researchers do so (mostly those using surveys, secondary analysis and simulation), but less than half of them make it available for access and re-use by others.
  • A significant amount of code use may be hidden and the opportunity for code sharing not realised: some researchers may think of research code in terms of software outputs, and not consider processing scripts (such as Stata .do files and batch files) within this definition, even though they may be essential to the replication and validation of research results.
  • Where code-sharing takes place, it is driven far less than data sharing by the requirements of funders and journals, and more by a desire to engage in good research practice and to enable other researchers to collaborate and contribute to the work.
  • There are no significant barriers to code-sharing, although lack of skills and funding and rapid changes in software can disincline researchers to invest the time and effort, especially where code sharing is not widely funded, incentivised or rewarded by funders and research organisations, and where norms of code citation and acknowledgement of re-use are not well-established. Code, unlike data in many cases, must be actively maintained and supported once it has been distributed and established a user community, and this demands both commitment from code developers and significant resources. This needs to be recognised by funders and properly supported.
  • Code re-use is currently limited, with just over a third of researchers having used existing code in their research.
  • Many researchers pick up software skills ad hoc, and lack formalised training or knowledge of best practice in code development and sharing. Software skills acquisition needs to be better integrated into standard researcher training models, and researchers should be able to draw on the support offered, e.g. through the Software Sustainability Institute’s Software Carpentry activities.
  • While many code repositories are available for hosting and development, such as GitHub and Bitbucket, it is not clear that these are reliable long-term preservation solutions, and there may be a need for better provision of repositories dedicated to preservation and maintenance of research software. Wellcome is exploring the possibility of setting up such a code repository.

Open research in general

The overall message for funders is that they need to incentivise and reward not just the production of original and interesting research, but the whole ensemble of ‘open’ practices that together ensure the quality, accessibility and usability of all research. The communication of research outputs is not auxiliary or incidental to research, but at its very heart. However inherently valuable a piece of research may be, its value is diminished if it is not communicated at all because it is a negative or null result, or because the researcher sees no benefit in communicating it; or if it is communicated but accessible only to a few selected by ability to pay; or if its methods are not transparent and open to replication and critique; or if the tools it used are unavailable to others. If researchers are to be incentivised to engage with Open Research practices, then those who fund and reward research need to ensure the incentives and resources are in the right places.

Data from the surveys of 583 Wellcome Trust-funded researchers and 259 ESRC-funded researchers, and from the 5 focus group discussions, are available from the UK Data Archive.


Posted in Open Access, Research Data Management, Uncategorized

CentAUR statistics for November 2016

Key statistics from the CentAUR repository

Posted in CentAUR, Open Access, Statistics

An open data snapshot

The State of Open Data, a selection of articles and analyses based on a survey of over 2,000 researchers, was published in October this year by Digital Science, owners of the popular figshare research outputs repository.

Key survey findings are summarised in this infographic:

Digital Science, The State of Open Data Infographic

Three findings stand out for me:

1. There is broad interest in and enthusiasm for open data practices among researchers across disciplines and career stages and throughout the world, and many researchers (approximately 75% of those surveyed) have experience of sharing data and place value on the credit they receive for doing so.

It’s always good to hear this: the means of sharing data are many and various and it is not easy to get an overview of the totality of data sharing practices.

But I wonder if the picture is quite as rosy as the Digital Science headline suggests, for two reasons.

First, neither the report nor the underlying data explain how the survey sample was obtained or provide any evidence of how representative it is. The survey dataset does not contain a protocol or any explanation of sampling methodology. The survey report merely states: ‘Figshare has garnered many insights from its users in the past, from formal surveys and informal feedback […] Working with Springer Nature and Digital Science, we surveyed researchers […] over 2,000 researchers responded to the survey, spread across continents and disciplines, from all types of institution and researchers at different career stages’ (p. 12), which would suggest (although it is not clear) that researchers were selected from the companies’ contact lists. Given that figshare is a data sharing service and Springer Nature has a strong data policy for its journals, one might expect survey respondents to be more active in data sharing than the global average. One might compare the 76% of researchers in this survey who shared data with the 51% of Wellcome Trust- and ESRC-funded researchers recently surveyed who had made data available to the research community by one means or another (see survey report, p. 27).

Secondly, the report does not define data sharing with sufficient precision. The survey asked respondents how often they made data ‘freely available’. It’s not clear how ‘freely available’ was defined in the survey, if at all. A survey question about tools researchers used to share data reported responses in the following categories: Email; Google Drive; Dropbox; Figshare; GitHub; Other. This seems a curious list to me, as it identifies tools that I would associate primarily with restricted sharing (e.g. within a project team or among selected peers), such as email and cloud file-sharing services, and does not specify the key categories of open data sharing vehicles: data repositories and journal platforms (which may publish data as supplementary information alongside articles). Only one data repository is identified: figshare, which is owned by… Digital Science, the publisher of this report. Presumably, all the other data repositories in the world are subsumed under the Other category. I would have liked to see in the report a clearer definition of what survey respondents were given to understand ‘freely available’ meant, and whether their responses did fully justify the claim that ‘approximately three quarters of researchers have made research data openly available at some point’ (my emphasis). I would again make a comparison with the Wellcome Trust report (see above), which arrived at its 51% figure by asking researchers if they had made data ‘available to the research community’ (my emphasis) and specifically excluded informal sharing or sharing on request – since if you have to ask for the data it clearly isn’t open or ‘freely available’.

I believe the rate of open data sharing is in fact considerably lower than the Digital Science report suggests.

2. Where open data practices are adopted, there are likely to be positive correlations with overall research quality as well as with good practice in management and documentation of data.

This is something I find interesting, as it indicates that basics of good research practice, such as internal discussion and challenge of assumptions and methods, documentation of methods and values, and rigorous quality control in collection and processing, can be reinforced where it is known that data will be made publicly available and open to the same level of scrutiny as peer-reviewed papers, ultimately resulting in findings that are higher in quality, contain fewer errors, and prove more reliable in the long term. It is an effect that has been reported in the literature, and which I think merits greater emphasis as we seek to persuade our researchers of the benefit to them of being open with their data. For evidence on this point, see Wicherts JM, Bakker M, Molenaar D (2011) Willingness to Share Research Data Is Related to the Strength of the Evidence and the Quality of Reporting of Statistical Results. PLoS ONE 6(11): e26828. http://dx.doi.org/10.1371/journal.pone.0026828.

3. Researchers can be uncertain of the benefits of sharing data, may be unsure how to manage their data effectively or obtain resources for open data practices, and would welcome more support in these areas from their funders and institutions.

This is definitely the case in my experience: it can be hard to persuade researchers of the benefits of sharing their data, when there is rarely a direct correlation between the effort invested and the return to the researcher in terms of recognition and reward. It has been said many times, but in spite of funders’, institutions’ and publishers’ data sharing policies, the systemic incentives to share are weak: this is why, out of nearly 191,000 research outputs submitted to the 2014 REF, only 68 outputs – that is 0.04% of the total – were research datasets and databases (see this presentation from Ben Johnson of HEFCE and the REF Research outputs submissions data).

Professionals such as myself providing institutional services supporting research data management need to persuade researchers of the benefits to themselves, to scholarship and to society of sharing their research data; and we need to deliver services that meet their needs in intelligent and efficient ways.

But funders, policy-makers and research organisations also need to restructure the incentive frameworks that define how researchers progress in their careers, and receive recognition and reward for the communication of research. We need a much broader focus in the academic reward systems beyond the published peer-reviewed paper reporting positive, novel and exciting results, both to the papers reporting the less headline-grabbing outcomes (the negative, the null and the apparently nugatory), and to other kinds of output, including the datasets that can serve to validate research results or establish a foundation for future research.

Posted in Research Data Management, Uncategorized

What’s new in SciVal? – Media mentions

A new release of SciVal, the research intelligence tool from Elsevier, was launched in mid-November.

The tool now features two additional sources of information that can be mined: Awarded Grants and Mass Media Mentions. This post covers Mass Media Mentions and how you can make the best use of these data. An earlier post covered the Awarded Grants feature.

The societal impact of an institution can now be measured in SciVal via mass media mentions of its research outputs. SciVal tracks only English-language media sources at present, covering 39,000 online sources and 6,000 print sources. The data cover two full years plus the current year for online sources, and five full years plus the current year for print sources. Currently, most of the media sources being tracked are based in the USA, and this should be borne in mind when looking at the data for a particular institution.

Overview module

Mass media mentions in SciVal

Image from SciVal (redacted)

To look at mass media mentions for an institution, use the ‘Societal Impact’ tab in the Overview module in SciVal. Use the button to choose which media type you are interested in. You can also use the subject filter to narrow down your search to a particular area of research.

Breakdown of media exposure

Image from SciVal (redacted)

You can also get a breakdown of the kinds of media sources that picked up the research and whether they were internationally recognised (e.g. the BBC) or local interest sources. Below is an example of a Media Exposure graph.

There is also a field-weighted graph that makes comparisons of institutions in the same country possible regardless of their subject areas (medical research is often picked up by the media more than other subjects).

Benchmarking against other institutions

Graph comparing different institutions

Use the Benchmarking module to compare media mentions between institutions in the same country (image from SciVal, redacted)

The data can also be displayed in a table format and downloaded as a PDF, image file or CSV file.
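If you download the CSV export, a few lines of scripting are enough to re-sort or reshape the figures for your own reporting. This is only a sketch: the file name and column headers below are assumptions, so check them against the actual header row of your export.

    import csv

    # Assumed export file and column names -- adjust to match the real header row.
    with open("scival_media_mentions_export.csv", newline="") as f:
        rows = list(csv.DictReader(f))

    # Rank institutions by number of media mentions and print the top ten.
    rows.sort(key=lambda r: int(r["Media mentions"]), reverse=True)
    for row in rows[:10]:
        print(row["Institution"], row["Media mentions"])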

It is anticipated that links to the media mentions will be added in early 2017.

Further information

SciVal is available for all users at the University of Reading. You have to register for an account to use the tool. Access is only available when on campus (or using the VPN). For help and support with SciVal and to gain access to Reading University’s customised structures, contact the Research Publications Adviser.

Posted in Bibliometrics, Research intelligence

CentAUR statistics for October 2016

Infographic featuring key statistics from the CentAUR repository

Key statistics from the University of Reading’s CentAUR repository

Posted in CentAUR, Open Access, Statistics