
Image courtesy of National Cancer Institute, Unsplash

What is ArXiv?

You may have seen ArXiv pop up in your science-related or open access repository searches. For those of us who are still wondering how to say it, it is pronounced ‘archive’. The X represents the Greek letter chi (χ), which is pronounced ‘ch’, and thus spells out archive.

ArXiv is an open access repository for pre-prints[1] and post-prints[2] that have been moderated but not peer reviewed. It was the first freely available, open access repository, established years before Creative Commons or other mechanisms were available or the Internet became ubiquitous. ArXiv was established in 1991 as a way for physicists and mathematicians to circulate their research for comment before peer review and publication in a journal. It was developed with distribution tools few people use today: File Transfer Protocol (FTP), Gopher, and Mosaic (an early and widely popular web browser). According to Wikipedia:[3]

In many fields of mathematics and physics, almost all scientific papers are self-archived on the arXiv repository before publication in a peer-reviewed journal. Some publishers also grant permission for authors to archive the peer-reviewed postprint.

ArXiv has become tremendously important for scientists worldwide. On this, its 30th birthday, there are almost 2 million articles posted on the site in physics, mathematics, computer science, quantitative biology, quantitative finance, statistics, electrical engineering, systems science, and economics. 

In the 30 years since ArXiv’s  founding, many additional servers for different disciplines and regions have been established, all on the Xiv model.  AfricArXiv, for Africa, is discussed in detail below. Two preprint servers in the biomedical sciences have received considerable attention because of the COVID-19 pandemic, including the publication of several highly quoted articles in the mass media and research publications.[4]  But how much relevance do ArXiv and the other servers have for Africa?

How relevant are preprint servers to Africa?

ArXiv and the disciplinary servers that followed aim to help scientists worldwide share results and get feedback quickly without waiting for publication in journals. Although bibliometric surveys have not been done for all of the servers, two in the biomedical sciences point to the paucity of African research appearing in them. A 2020 study published in eLife, covering more than 67,000 articles posted on bioRxiv, found that international authorship and collaboration by African researchers was scant and that most African researchers were not the principal author. Table 1 shows the 11 African countries that published the most in bioRxiv.[5]

Table 1: Which African countries publish the most in bioRxiv?

Figure 1: The good news[6]

When the researchers dug deeper into subject matter and where the research was carried out, they found:

Figure 2: Digging deeper[7]

This is important research for those of us who care about the contribution of African scientists to the global knowledge pool.  The role of scientists has evolved over the course of the pandemic – they are becoming more public-facing, which can significantly improve dissemination of accurate information in any given country.  Examples include Dr. Anthony Fauci[8] and Dr. John Nkengasong.[9] It is also important to encourage representation so that we can learn from other approaches and develop solutions that cater for diverse populations. 

 

AfricArXiv

In 2018, African scientists established an open access African preprint server, called AfricArXiv, to promote better visibility for African research and enhanced collaboration throughout the continent. AfricArXiv’s African focus has a special set of objectives, among them:

  • It is an African-owned open scholarly repository, a knowledge commons of African scholarly works to catalyze the African Renaissance.
  • Submissions must be relevant to Africa, with at least one African author.
  • Language is important in Africa, where AfricArXiv estimates that over 2,000 languages are spoken, a number which has been confirmed elsewhere.[10] All submissions must be accompanied by a summary in English and French to ease language gaps. Automated translation is allowed and must be acknowledged because these translations are not always accurate. AfricArXiv also uses volunteer translators. In addition, AfricArXiv encourages postings in African languages and is partnering with Masakhane to undertake human translation of articles into African languages. AfricArXiv writes about the significance of translation as follows:[11]

We encourage submissions in languages that are commonly used by the scientific community in the respective country, such as English, French, Swahili, Zulu, Afrikaans, Igbo, Akan, or other native African languages. Manuscripts submitted in non-English languages will be held in the moderation queue until we can get them verified. We herewith encourage you to suggest people who could assist in moderating in your language.

For those who want to know more about the significance of preprint servers, we encourage you to read Ten simple rules to consider regarding preprint submission, which was published in May 2017 in PLOS Computational Biology, an open access and highly prestigious journal. The article has been viewed over 44,000 times and cited 61 times.[12]

For those who would like more information on the lack of representation of African science in the journal literature, please see Where there is no local author: a network bibliometric analysis of authorship parasitism among research conducted in sub-Saharan Africa, published on 27 October 2021 in BMJ Global Health.[13]  It demonstrates how few African biomedical researchers receive recognition for research results from their own countries.  See also the journal’s editorial on ‘parachute’ research: Using scientific authorship criteria as a tool for equitable inclusion in global health research.[14]

 

This has been one of OER Africa’s communications on open knowledge, which we will continue to explore in future communications.



 


[1] Wikipedia contributors. (2021b, October 8). Preprint. Wikipedia. Retrieved October 25, 2021, from https://en.wikipedia.org/wiki/Preprint (CC BY-SA)

[2] Wikipedia contributors. (2021b, October 1). Postprint. Wikipedia. Retrieved October 25, 2021, from https://en.wikipedia.org/wiki/Postprint (CC BY-SA)

[3] Wikipedia contributors. (2021, September 7). ArXiv. Wikipedia. Retrieved 13 October 2021 from https://en.wikipedia.org/wiki/ArXiv#Moderation_process_and_endorsement (CC BY)

[4] Ginsparg, P. Lessons from arXiv’s 30 years of information sharing. Nat Rev Phys 3, 602–603 (2021). https://doi.org/10.1038/s42254-021-00360-z (Freely available but copyright protected. Springer Nature has a content-sharing initiative, which does not permit printing; the link for this article is https://rdcu.be/czHnW.)

[5] Abdill RJ, Adamowicz EM, Blekhman R. International authorship and collaboration across bioRxiv preprints. Elife. 2020 Jul 27;9:e58496. doi: 10.7554/eLife.58496. PMID: 32716295; PMCID: PMC7384855. (CC BY)

[6] Guleid, F. H., Oyando, R., Kabia, E., Mumbi, A., Akech, S., & Barasa, E. (2021, March 17). A bibliometric analysis of COVID-19 research in Africa. MedRxiv. Retrieved 19 October 2021 from https://www.medrxiv.org/content/10.1101/2021.03.15.21253589v1 (CC BY)

[7] Guleid FH, Oyando R, Kabia E, et al, A bibliometric analysis of COVID-19 research in Africa, BMJ Global Health 2021; https://gh.bmj.com/content/6/5/e005690. (CC BY)

[10] Wikipedia contributors. (2021d, October 23). Languages of Africa. Wikipedia. Retrieved November 3, 2021, from https://en.wikipedia.org/wiki/Languages_of_Africa (CC BY)

[11] Languages – AfricArXiv. (n.d.). AfricArXiv. Retrieved October 20, 2021, from https://info.africarxiv.org/languages/ (CC BY)

[12] Bourne PE, Polka JK, Vale RD, Kiley R (2017) Ten simple rules to consider regarding preprint submission. PLoS Comput Biol 13(5): e1005473. Retrieved 20 October 2021 from https://doi.org/10.1371/journal.pcbi.1005473 (CC0)

[13] Rees CA, Ali M, Kisenge R, et al Where there is no local author: a network bibliometric analysis of authorship parasitism among research conducted in sub-Saharan Africa BMJ Global Health 2021. https://gh.bmj.com/content/6/10/e006982. (CC BY-NC)

[14] Sam-Agudu NA, Abimbola S. Using scientific authorship criteria as a tool for equitable inclusion in global health research. BMJ Global Health 2021. https://gh.bmj.com/content/6/10/e007632 (CC BY-NC)

 


Introduction: Why is open access publishing beneficial to academics?

Why might you want to publish your research in an open access journal? Open access journals use Creative Commons licences, which lay out the terms under which they can be used and distributed. All Creative Commons licences require full attribution. Open access can benefit scholars because wider access to their research enhances visibility and citations.[1]

Figure 1 shows some of the possible benefits of OA publishing, many of which are relevant to researchers around the world, including those in Africa.

Figure 1: Benefits of open access publishing

What are predatory journals?

Most open access journals are highly respected and entirely legitimate. The Directory of Open Access Journals (DOAJ) lists more than 20,000 journals, many without an author processing fee:

Figure 2: DOAJ coverage[2]

The Department of Higher Education and Training (DHET) in South Africa includes the DOAJ journals amongst its list of accredited journals. Academics, researchers, and librarians are sure to find a reliable open access journal on the DOAJ database or any of the others that DHET lists.[3]

Even so, there are scores of journals that can be classified as ‘predatory’; they prey on the unwary who want to publish or to read a reliable article.

What is a predatory journal?  In 2019, a group of legal experts and publishers agreed on this definition:

"Predatory journals and publishers are entities that prioritize self-interest at the expense of scholarship and are characterized by false or misleading information, deviation from best editorial and publication practices, a lack of transparency, and/or the use of aggressive and indiscriminate solicitation practices."

Though it might seem straightforward, there are so many forms of predatory practices that this group of specialists had trouble agreeing on a definition to describe how predation manifests itself.[4]

Experts [5] believe that there are now more than 15,000 predatory journals, which promise:

  • Peer review with a fast turnaround time.
  • Low author processing fees—low in comparison to some of the top tier journals, but high in terms of what authors get for their money.
  • Online publication and visibility.
  • Indexing in platforms such as Scopus and Web of Science.

Figure 3: How to spot a predatory journal[6]

OER Africa has a free online tutorial on open access publishing, which includes suggestions on how to verify a journal’s legitimacy.[7] There is also a discussion in Open Knowledge Primer for African Universities on the ways in which DOAJ tries to ensure that the journals in its database are legitimate.[8]

All researchers are under pressure to publish to keep their jobs and become eligible for promotion. The pressure on African scholars is increased because they cannot afford the high publication fees some journals charge, and some may not be familiar with the steps necessary to evaluate journals.

Two researchers are quoted in a 2022 article in the Africa Edition of University World News to illustrate the dilemmas facing African scientists who must publish but have neither the funds to pay the article processing charges (APCs) of top-tier journals nor the knowhow to distinguish the legitimate from the predatory.[9]

One scientist, Euclides Sacomboio of Agostinho Neto University in Angola, had two articles published in disreputable journals. His preference would have been high-impact journals, but, as he told University World News:

"I earn US$500, and the article processing fee in reputable journals is about US$2,180. Where do I get the money without any support?"

Sacomboio added:

"To me, it was important to share my data. Worse, it was difficult to choose [where to publish] because some of these journals we call predatory have peer review processes."

The second scientist, Moses Samje of the University of Bamenda, Cameroon and a member of the African Academy of Sciences Chapter of Affiliates, was also taken in—this time because the journal’s focus was on research like his and because of the journal’s allegedly high impact factor. Samje said:

"The impact factor was quite attractive. It was too good to be true … We had to try and we submitted a paper and, in the space of 24 hours, they [the publishers] asked for the processing charge, which was getting way more affordable. In less than 48 hours, we received an e-mail [saying] our paper was online. I was quite excited."

Samje subsequently went online and discovered that the journal’s peer review process was not as it seemed; he believes that the journal is a sham.

‘Plagiarism, fraud, and predatory publishing’

The noted bioethicist, Arthur Caplan, wrote those words in 2015 and called predatory journals ‘polluting journals.’[10]

Although the points in Figure 3 cover the major ways to identify a predatory journal, there are two further warning signs. Predatory journals are noted for accepting plagiarized articles and articles that have already been published elsewhere. Even though predatory journals may report that they check for plagiarism, they typically don’t.

A 2018 blog post in the Indian newsletter, The Wire, succinctly described the situation in India and gave examples. The authors wrote:[11]

"Fake journals and plagiarism in academics go hand-in-hand. The lack of peer review and a complete absence of quality checking provides a safe channel to publish plagiarised articles. It is therefore no coincidence that along with fake journals, almost all academic fields have also seen an epidemic of plagiarism."

Sometimes plagiarism is intentional; other times it is the result of a researcher’s lack of understanding of what plagiarism means.

It isn’t always easy to find specific examples of plagiarism. Science Integrity Digest is one source of information. In 2020, it reported on a clear case of plagiarism in which the work of the OstrowskiLab was stolen and published in a predatory journal.[12] In 2019 in the Journal of Nursing Scholarship, authors wrote about numerous instances of plagiarism in three predatory nursing journals.[13]

In South Africa, Professor Nicki Tiffin, a former researcher at the University of Cape Town (UCT) found that not only had she been plagiarized in a predatory journal, but her name had been stolen too.[14]

Unwary researchers are also trapped because some predatory journals have titles very similar to those of reputable journals. The three journals in the figure below all have similar titles but the similarity ends there.

What's in a name?

The first journal, Plant Physiology and Biochemistry is published by Elsevier, a reputable scientific publisher. The second, Journal of Plant Biochemistry and Physiology, is published by Longdom Press. Note it has phone numbers in Great Britain and in Spain and a registered address in Brussels. The journal is not included in any of the major indexing services that have quality controls, such as Web of Science, Scopus, or PubMed. The third, Journal of Plant Biochemistry & Physiology, is published by Omics, a publisher that was sued by the US Federal Trade Commission for predatory practices and ordered to pay a fine of more than $50 million.[15]
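For readers who maintain journal lists, a rough way to flag lookalike titles such as these is to compare them after removing word order and filler words. The following is only a minimal sketch using Python's standard library; the function names and the list of filler words are our own assumptions, not part of any tool mentioned here. A high score means only that a title deserves a manual check of its publisher and indexing; it says nothing on its own about whether a journal is predatory.

```python
import re
from difflib import SequenceMatcher

# Filler words ignored when comparing titles (an illustrative assumption).
FILLER_WORDS = {"journal", "of", "and", "the"}


def normalize(title):
    """Lowercase the title, treat '&' as 'and', drop filler words, sort the rest."""
    words = re.findall(r"[a-z]+", title.lower().replace("&", " and "))
    return " ".join(sorted(w for w in words if w not in FILLER_WORDS))


def title_similarity(a, b):
    """Return a score from 0.0 to 1.0; 1.0 means the same significant words."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


if __name__ == "__main__":
    reputable = "Plant Physiology and Biochemistry"
    candidates = [
        "Journal of Plant Biochemistry and Physiology",
        "Journal of Plant Biochemistry & Physiology",
    ]
    for candidate in candidates:
        print(candidate, round(title_similarity(reputable, candidate), 2))
```

Once word order and filler words are ignored, the three titles discussed above reduce to the same three words, which is exactly why they are so easy to confuse.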

How to help researchers distinguish between the fake and the real

Above, we outlined several ways to distinguish legitimate journals from predatory ones. The two OER Africa publications we cited offer detailed help to students, researchers, and librarians.

Intellectual property rights, plagiarism, and referencing are taught in the Use of Libraries or embedded in the Use of English course, which is an integral part of the compulsory General Studies (GS) for first year students in Nigerian universities. However, the effect of the course on students has been found to be minimal.[16] Traditionally, African academic libraries run library orientation activities for new students. This window of opportunity could be widened to include provision of information packs or tutorials (online and physically) on information literacy, copyright, and plagiarism issues (including an introduction to plagiarism detecting software), as well as information about predatory journals.

Figure 4: AfLIA poster for use in libraries

Academic libraries can play an important role in raising awareness of the need to be wary of predatory practices. But universities as a whole should be engaged in preventing staff and students from falling prey to these journals. They can list the open access journals that academics associated with their institution can use for purposes of promotion, tenure, and contracts. The DHET site discussed above would be a good place to start. Supervisors can advise their PhD students about conducting a literature review without including predatory journals. Sarah Elaine Eaton of the University of Calgary wrote the following about the need for universities to support their students and academics against predation:[17]

"There are implications for mentors of graduate students and early-career stage academics, as well as for institutions as a whole. The issue of questionable conferences and publications is so complex that early-stage academics require support and mentorship to cultivate a deeper understanding of how to share their work in a credible way."

Dr. Eaton’s statement is valid around the world, particularly in circumstances such as Drs. Sacomboio and Samje described—insufficient funds to pay fees and insufficient guidance within the institution.


[1] See Sharing Africa’s knowledge through openly licensed publishing for more information on open access. https://www.oerafrica.org/content/sharing-africa’s-knowledge-through-openly-licensed-publishing

[3] See https://www.up.ac.za/news/post_3048195-the-department-of-higher-education-and-training-2022-accredited-journals-

[4] Grudniewicz, A., Moher, D., Cobey, K.D. et al. (2019). Predatory journals: no definition, no defence. Nature, Vol. 576: 210-212. Retrieved from https://media.nature.com/original/magazine-assets/d41586-019-03759-y/d41586-019-03759-y.pdf

[14] Simon, N. (2023). Protecting research integrity from predatory journals. University of Cape Town. Retrieved from https://www.news.uct.ac.za/article/-2023-11-09-protecting-research-integrity-from-predatory-journals

[17] See Sarah Elaine Eaton’s Resource Guide Avoiding Predatory Journals and Questionable Conferences. https://files.eric.ed.gov/fulltext/ED579189.pdf

 


Image courtesy of Michael Anderson, Unsplash, Unsplash licence

Education systems around the world have traditionally been characterized by closed knowledge systems, overly prescriptive curricula, narrow conceptions of success and achievement, and a failure to fully empower teachers as facilitators of learning. This inhibits their ability to develop a full spectrum of human learning capabilities amongst learners, especially in their formative schooling years. A recently published paper by Neil Butcher & Associates (NBA) argues that, while there may be various reasons for these issues, one critical problem is that many education systems are inhibited by complex policy environments that, most likely unintentionally, impede meaningful learning and create educational closure.

Education policies often create new rules that accumulate over time, giving rise to inefficiencies and unnecessary constraints that do not support (and often obstruct) learner success. One manifestation of policy complexity within education systems is the growing granularization and rigidity of the formal national curriculum, which has led to the proclivity to use standardized testing and high-stakes examinations as a proxy for learner success. This complexity has also eroded autonomy for teachers, constraining what they can do in the classroom and increasing the tendency to ‘teach to the curriculum’ (or, worse even, to the examination). Standardized testing and high-stakes examinations have also increased anxiety and tension amongst learners, parents, and teachers, who perceive a false equivalence between test performance and success in later life.

The paper argues that despite the diverse nature of education systems around the world, many share a common problem of complex policy environments. Increased use of standardized testing models and resulting curriculum rigidity does not lead to better quality education but can have a deleterious effect on learner achievement. As complexity filters down into the classroom, another consequence is that the teachers who are tasked with delivering curricula are increasingly constrained and disempowered by these central policies. The consequences of this are far reaching as they emphasize rigidity and closure in knowledge acquisition, leaving little space for substantive learner-teacher engagement, contextual adaptation, and discovery.

In response to these challenges, we can use the principles of open learning as a tool to reflect on policy complexity in education systems, including the extent to which a policy environment is facilitating openness or promoting closure. A useful mechanism to tackle policy creep and ensure that education systems are geared toward a broader definition of learner success is to adopt and systematically implement the concept of openness within education systems, which begins at the policy level. Prioritizing openness offers significant opportunities for teachers and learners to reclaim what happens in the classroom and become more engaged members of society.

Integrating open learning principles into policy discourse would be a step forward in reducing unnecessary complexity and closure within education systems.

 


Much has been written about Artificial Intelligence (AI), mainly in English, including by OER Africa.[1] English is the predominant language on the Internet, in research and publications, and in education.  African languages are vastly underrepresented in the global knowledge pool, even though scholars at Harvard University believe that with between 1,000 and 2,000 languages, Africa is home to about one third of the world’s languages.[2]

Artificial intelligence (AI) can play an important role in mitigating these language challenges. Already, international search engines, such as Google, play a large role in using AI to translate English into African languages and vice-versa. Efforts are constrained, however, by the paucity of documents on the web written in most African languages. Additionally, networks of African researchers have become actively engaged in looking for ways to increase the data on the web in African languages, including documenting scientific terms in the African languages where no such terms currently exist. Such data will then be available for use by AI to improve access to African languages. Importantly, they are trying to grow the field of African AI researchers by building networks and finding AI language technology solutions.  

Many of us think about Google Translate when we want to understand what has been written in a language that we do not understand. Google Translate now supports 25 African languages: Afrikaans, Amharic, Arabic, Bambara, Chichewa, Ewe, Hausa, Igbo, Kinyarwanda, Krio, Lingala, Luganda, Malagasy, Oromo, Sepedi, Swahili, Sesotho, Shona, Somali, Tigrinya, Tsonga, Twi, Xhosa, Yoruba, and Zulu. Several of these languages are spoken across borders. The good news is that the number keeps increasing. The bad news is that there does not seem to be any one place to ascertain which African languages are covered; this can only be determined through searches within Google Translate. Furthermore, Google Translate uses machine translation, which is mostly accurate, but not entirely.

The Nigerian linguist, Aremu Adeola, uses an interesting example about why context matters in many languages, including Yoruba:[3]

"Most translations done by machines render some words wrong, especially words that are culturally nuanced. For example, Yorùbá words ayaba and obabìnrin have their meanings situated in a cultural context. Most machines translate both words as queen. However, from a traditional-cum-cultural vantage point, it is essential to note that the meanings of ayaba and obabìnrin are different: Ọbabìnrin means queen in English while ayaba is wife of the king."

Using AI as a translation tool is not straightforward. Most AI tools:[4]

"Rely on a field of AI called natural language processing, a technology that enables computers to understand human languages. Computers can master a language through training, where they pick up on patterns in speech and text data. However, they fail when data in a particular language is scarce, as seen in African languages."

The South African science journalist, Sibusiso Biyela, gives an excellent example of just how difficult it can be to make scientific discoveries understandable and relatable in African languages, such as isiZulu. Biyela was given an assignment to write about the discovery of a new species of dinosaur, Ledumahadi mafube, in isiZulu. He explained:[5]

"But there’s no word for “dinosaur” in Zulu. Nor are there words for “Jurassic,” “fossilization,” or “evolution.” Despite the fact that Zulu—or isiZulu, as the language is called in South Africa—is spoken by some 10 million people, it simply doesn’t have the words for communicating science.

So my news piece wasn’t just a news piece. It was an attempt to tell a science story in a language that science overlooked—to help right a societal wrong. It was a small contribution among an increasing number that aim to help decolonize South African science writing. And it was rife with more pitfalls than I could have imagined. The task of describing science clearly, concisely, and accurately—already challenging in English—became exponentially more difficult in my native tongue."

At the end of his article, Biyela gives a lexicon of some of the English-isiZulu scientific terms that he used. Biyela uses technology joined with his expertise in science for his work on conveying scientific terms from English to isiZulu. He was one of the partners in Masakhane, which is discussed below.[6]

The underrepresentation of African languages online makes it more difficult to use AI as a translating tool because computers have trouble identifying datasets with which to work. Several organizations are trying to mitigate this challenge, among them the Masakhane Research Foundation. Masakhane is collaborating with the African scientific preprint server, AfricArXiv,[7] to find a way to translate the papers that AfricArXiv receives into African languages.

Masakhane is a grassroots natural language processing (NLP) network that was formed for NLP research in African languages, for Africans, by Africans. The Masakhane community consists of:[8]

 ">1000 participants from 30 African countries with diverse educations and occupations, and >3 countries outside Africa. As of February 2020, over 49 translation results for over 38 African languages have been published by over 35 contributors on GitHub."

Masakhane has a trial translation page, but the translation results do not always match those of Google Translate. For example, ‘kisukuku’ is how ‘fossil’ is translated in Google Translate. ‘Mabaki ya Wanyama’ is the translation given by Masakhane. (Most online translations use kisukuku).

Figure 1: What is the correct translation?

These efforts are just getting started. If Africa is going to join the global knowledge pool, its languages must be represented too. Both AfricArXiv and Masakhane welcome volunteers; there are other such organizations that would also appreciate assistance.

And for those who are interested in the interrelationship between AI and library and information studies, the African Library and Information Associations and Institutions (AfLIA) will host a webinar on this topic on 25 October 2023. Visit the webinar’s information page for more information.




References and attribution

[3] Lost in Translation: Why Google Translate Often Gets Yorùbá-and Other Languages-Wrong. Aremu Adeola. Rising Voices. 20 November 2020. https://rising.globalvoices.org/blog/2020/11/20/lost-in-translation-why-google-translate-often-gets-yoruba-and-other-languages-wrong/

[4] A roadmap to help AI technologies speak African languages. 11 August 2023. https://www.sciencedaily.com/releases/2023/08/230811115430.htm

[5] Decolonizing Science Writing in South Africa. Sibusiso Biyela. 12 February 2019. https://www.theopennotebook.com/2019/02/12/decolonizing-science-writing-in-south-africa/

Image at the top of the article courtesy of albyantoniazzi, Flickr, CC BY-NC-SA