Mon 27 Jan 2020 3:52PM

Preprint hosting requirements and priorities discussion

Bruce Caron

We did this before, when we selected COS. Probably need to have some new priorities about long-term financial stability, etc. (lesson learned).


Bruce Caron Mon 27 Jan 2020 3:54PM

The software platform should be open, standardized, and supported by existing efforts (e.g., existing, widely used repository platforms).


Bruce Caron Mon 27 Jan 2020 3:55PM

The organization should be committed to Earth science, with funded staff (e.g., a university department/research unit) in this area.


Stéphanie Girardclos Tue 28 Jan 2020 9:30AM

For me this point is less important. The organization doesn't need to be related to Earth science if it has strong support for preprints. However, it would be nice.


Bruce Caron Tue 28 Jan 2020 3:30PM

That's true. However, the conversations I've had with universities suggest that their local earth scientists can make an internal argument about why that university might want to host EarthArXiv, and maintain it for the long term. Having inside support helps the library/repository folks. But... there is also an issue where the local folks may want to have more of a say in governance.


Bruce Caron Mon 27 Jan 2020 3:57PM

The organization should already have a commitment to maintaining a repository for science outcomes, with a staff that is committed to this and funded for this.


Rebecca Williams Mon 27 Jan 2020 4:38PM

Can you copy over the original list for new members of the council to consider? I don't think I've seen it as I think it was discussed before I joined over a year ago.


Bruce Caron Mon 27 Jan 2020 5:19PM

List... hmmm. Not sure we had a list. Here are some of the considerations we did focus on. Beyond the basic repository platform with stable support, we needed: standards-compliant metadata and DOIs; an input curation service we could manage; indexing through Google Scholar; a method to revise submissions as needed...


Rebecca Williams Mon 27 Jan 2020 5:26PM

OK, some additional things to think about...
*ability to inform what the front page/submission pages look like (so we can e.g. ensure the moderation code goes upfront to try and avoid the huge number of rejections for non-compliance we have)
*support to maintain DOIs and move the database in the event of the platform ending
*tech support that both authors and moderators can access
*assurances on future commercialisation (e.g. a non-profit platform won't suddenly turn for-profit)
*assurances on where future fundraising responsibilities will lie


Bruce Caron Mon 27 Jan 2020 6:44PM

great points!


Daniel Pastor Galán Tue 28 Jan 2020 2:46AM

Well, @Rebecca Williams had great points. About the assurance on fundraising, we should probably add that if we are to fundraise for specific items, those are the only ones we fundraise for, and that the costs are always clear.


Bruce Caron Tue 28 Jan 2020 5:15PM

It could be better if the host does not rely on commercial cloud services for the platform, but has already made a commitment to running their own servers. This would mean that the preprints are just a minor load on existing capacity. Again, preprints are tiny, compared with data.


Bryan Lougheed Fri 31 Jan 2020 5:27PM

What is the approx expected bandwidth per day?


Bruce Caron Fri 31 Jan 2020 5:54PM

Here's another issue. We hope to build the volume of new preprints and of researchers downloading preprints over time. Let's say we start with 1000 new preprints a year, and 10 times that number of people grabbing these (or more?). In five years, we will want to have 3-4 times that number of new preprints, and 10-20 times the number of downloads. (I could be conservative on these numbers).
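To turn these volumes into a rough answer to the bandwidth question, here is a back-of-the-envelope sketch. The average PDF size of 2 MB is an assumption, not a number from this thread, and the calculation treats each person "grabbing" a preprint as one download; only the 1000 preprints, the factor of 10, and the 10-20 times growth come from the estimate above.

```python
def daily_megabytes(downloads_per_year: int, avg_pdf_mb: float) -> float:
    """Average download traffic per day, in megabytes."""
    return downloads_per_year * avg_pdf_mb / 365


AVG_PDF_MB = 2.0  # ASSUMPTION: average preprint PDF size, not from the thread

PREPRINTS_PER_YEAR = 1000                      # starting volume from the post
DOWNLOADS_PER_YEAR = 10 * PREPRINTS_PER_YEAR   # "10 times that number"

# Traffic at the starting volume.
now = daily_megabytes(DOWNLOADS_PER_YEAR, AVG_PDF_MB)

# Five-year scenario: 10 to 20 times the starting number of downloads.
low = daily_megabytes(10 * DOWNLOADS_PER_YEAR, AVG_PDF_MB)
high = daily_megabytes(20 * DOWNLOADS_PER_YEAR, AVG_PDF_MB)

print(f"now: {now:.0f} MB/day")
print(f"in five years: {low:.0f}-{high:.0f} MB/day")
```

Changing AVG_PDF_MB or the download multiplier rescales the result linearly, so the same sketch can be rerun once real usage numbers are available.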


Dasapta Erwin Irawan Wed 29 Jan 2020 12:28PM

Hi all. Thank you for inviting me into this conversation. Many important points. We have similar considerations with INArxiv, and we would go with a local Indonesian server hosted by the National Research Institute (LIPI), rin.lipi.go.id. It's still a long way to go, but we have had some global discussions, and they are interested in contributing.


Dasapta Erwin Irawan Wed 29 Jan 2020 12:30PM

Did you notice that AfricaXiv has an agreement with ScienceOpen? I do not have the details yet, but next week I will have a chat with Jo Haveman, the founder of AfricaXiv.


Bruce Caron Wed 29 Jan 2020 4:02PM

One reason not to choose a common preprint server (like Zenodo or Figshare) may be to increase the number of services and build a more diverse ecosystem of providers. This article from a while back is a good start for this idea <https://thewinnower.com/papers/4172-a-healthy-research-ecosystem-diversity-by-design>.


Bruce Caron Fri 6 Mar 2020 3:00PM

From the AC: "maybe we should request a way to put submissions we are discussing like this on hold/flagged."


Daniel Ibarra Fri 6 Mar 2020 5:25PM

Just to chime in with my perspective (as one of the AC moderators): the problem we have been having is that right now, if a preprint doesn't fulfill our moderation policy (which is posted on our GitHub, but which not all authors seem to read), we have to 'reject' the preprint and send the authors a link to the moderation policy. The authors then re-upload a new version, fix the metadata, etc. Whatever system we choose, it would be helpful to have a sort of hold/flagged category that the AC moderators are looking at, so that we can work with the authors more quickly to fix the submission and thus reduce the time from submission to the server to the preprint being fully posted on EarthArXiv.
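As a rough illustration of what such a hold/flagged category could look like in a moderation workflow, here is a minimal sketch. The status names and the transition table are hypothetical; they do not describe how OSF or any other platform actually works.

```python
from enum import Enum


class Status(Enum):
    PENDING = "pending"      # submitted, waiting for a moderator
    ON_HOLD = "on_hold"      # flagged: moderators are working with the authors
    ACCEPTED = "accepted"    # posted on the server
    REJECTED = "rejected"    # closed; authors would have to resubmit


# Hypothetical transition table; names are illustrative only.
ALLOWED = {
    Status.PENDING: {Status.ON_HOLD, Status.ACCEPTED, Status.REJECTED},
    Status.ON_HOLD: {Status.ACCEPTED, Status.REJECTED},
    Status.ACCEPTED: set(),
    Status.REJECTED: set(),
}


def move(current: Status, new: Status) -> Status:
    """Return the new status, or raise if the transition is not allowed."""
    if new not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {new.value}")
    return new


# A submission with a metadata problem is held, fixed, then accepted,
# without a reject-and-resubmit round trip.
status = move(Status.PENDING, Status.ON_HOLD)
status = move(status, Status.ACCEPTED)
print(status.value)
```

The point of the extra state is that a fixable submission never has to pass through REJECTED, which is the round trip described above.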


Victor Venema Fri 6 Mar 2020 9:36PM

Is my guess right that what goes wrong most often are requirements people do not expect?

Such as printing on the first page that the manuscript is not peer reviewed (it is a pre-print server after all). Or printing the name of the journal one is submitting to (this does not have to be decided yet, one may try multiple ones, and why should we force people to submit to journals?).

If that is what creates the work, that would be an additional reason to reconsider these rules.


Daniel Pastor Galán Mon 9 Mar 2020 2:33AM

It is a pre-print but also a post-print server. The disclaimer is necessary: it is good to know whether you are reading a non-submitted pre-print, a pre-print already submitted and under review, or a pre-print that has been accepted and already includes corrections and modifications from peer review. Maybe we should make these rules more visible, for example on the submission page, or literally ask authors to fill in a simple form that generates that page. But I think it is absolutely necessary.


Daniel Nüst Mon 9 Mar 2020 7:34AM

I was asked by an EarthArXiv moderator to add that information to the PDF and had not considered that before, but it's a very valid concern AND extremely useful. As for requirements, I think the authors should be helped by the preprint platform, and at the same time there is a chance for branding and integrating useful information.

So, when moving to the next platform, I think we should ask for either a) generation of an overlay or watermark (like arXiv) or b) automatic generation of a first page from the form metadata.

When the PDF is processed anyway, proper metadata can also be added. This information could then ideally include the final DOI, so there is a good connection between a PDF someone downloads today and the official record in 5 years' time.
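To make option b) a bit more concrete, here is a minimal sketch of building the first-page text from submission form metadata. Everything in it is an assumption for illustration: the field names, the three status labels (taken from the cases discussed above), the wording of the disclaimers, and the placeholder DOI. Stamping the text onto the PDF itself would need a PDF library and is left out.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical status labels and wording, based on the three cases in the thread.
STATUS_TEXT = {
    "not_submitted": "This is a non-peer-reviewed preprint that has not been submitted to a journal.",
    "under_review": "This is a non-peer-reviewed preprint submitted to {journal}.",
    "accepted": "This is a peer-reviewed postprint accepted by {journal}.",
}


@dataclass
class Submission:
    title: str
    status: str                    # one of the keys in STATUS_TEXT
    doi: str                       # DOI assigned by the platform
    journal: Optional[str] = None  # only needed once a journal is involved


def cover_text(sub: Submission) -> str:
    """Build the disclaimer text for a generated first page or watermark."""
    if sub.status not in STATUS_TEXT:
        raise ValueError(f"unknown status: {sub.status}")
    if sub.status != "not_submitted" and not sub.journal:
        raise ValueError("a journal name is required for this status")
    disclaimer = STATUS_TEXT[sub.status].format(journal=sub.journal)
    return f"{sub.title}\n{disclaimer}\nhttps://doi.org/{sub.doi}"


# Example with placeholder values (the DOI is not a real one).
example = Submission(
    title="Example title",
    status="under_review",
    doi="10.0000/example",
    journal="Example Journal",
)
print(cover_text(example))
```

Because the text comes from the form, authors would not need to regenerate their PDF, and the status line could be updated when a preprint becomes a postprint.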


Christopher Jackson Mon 9 Mar 2020 2:35PM

I so agree with this Daniel! Automated addition of preprint/postprint status is one of the key things to get. I spend a decent amount of time rejecting submissions for this reason. Also: automated/quicker checking of journal conditions around preprinting/self-archiving of post-prints.


Victor Venema Mon 9 Mar 2020 3:21PM

Would it be enough to add this information as metadata (not forcing people to print it on the first page of the PDF)?

I once had permission from a colleague to put our manuscript on EarthArXiv, but all I had was the PDF and he did not want to go to the trouble of generating a new PDF. Then he started saying "we publish open access anyway", "it is confusing to have two versions", anything not to do the additional work. The easier we make it for people, the more likely people will move to pre-printing.

(And I maintain that we should not force people to submit to journals.)


Daniel Nüst Mon 9 Mar 2020 3:59PM

I would not ask users to do that at all. I would just add it from the system, see e.g. the left-hand margin of articles on arXiv.

Hope this clarifies.


Leonardo Uieda Mon 9 Mar 2020 6:39PM

I like the automatic watermark that adds our DOI and the disclaimer. For now, it might be worth putting a link to the instructions somewhere very very visible (impossible to ignore). I have the feeling that a lot of people just don't find them when submitting.


Daniel Ibarra Mon 9 Mar 2020 7:43PM

Yes, we've asked for this via @Tom Narock I believe, but COS never followed through. I like this watermark DOI addition if that's something that a future preprint hosting service could add.


Christopher Jackson Mon 9 Mar 2020 7:59PM

We did indeed ask, and didn’t get anywhere...:-(


Bruce Caron Mon 9 Mar 2020 8:16PM

One (hopefully) great thing about moving to a dedicated preprint platform like Open Preprint System is that feature requests might get more attention. Adding new preprint features to the OSF jams these into a larger queue with other priorities.