Today we’re excited to release a big update to the Galaxy visualization, an interactive UMAP plot of graph embeddings of books and articles assigned in the Open Syllabus corpus! (This is using the new v2.5 release of the underlying dataset, which also comes out today.) The Galaxy is an attempt to give a 10,000-meter view of the “co-assignment” patterns in the OS data – basically, which books and articles are assigned together in the same courses. By training node embeddings on the citation graph formed from (syllabus, book/article) edges, we can get really high-quality representations of books and articles that capture the ways in which professional instructors use them in the classroom – the types of courses they’re assigned in, the other books they’re paired with, etc.
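To make the “co-assignment” idea concrete, here is a minimal sketch of how (syllabus, book/article) edges could be turned into counts of works that share a syllabus. It is not the OS pipeline: the edge list, the IDs, and the `co_assignment_counts` helper are hypothetical, and the embedding and UMAP steps are not shown.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical (syllabus_id, work_id) edges; the real citation graph is far larger.
edges = [
    ("syl-1", "book-a"), ("syl-1", "book-b"), ("syl-1", "book-c"),
    ("syl-2", "book-a"), ("syl-2", "book-b"),
    ("syl-3", "book-c"),
]


def co_assignment_counts(edges):
    """Count how often each unordered pair of works appears on the same syllabus."""
    works_by_syllabus = defaultdict(set)
    for syllabus_id, work_id in edges:
        works_by_syllabus[syllabus_id].add(work_id)

    counts = Counter()
    for works in works_by_syllabus.values():
        # Sorting gives each pair one canonical (smaller, larger) ordering.
        for pair in combinations(sorted(works), 2):
            counts[pair] += 1
    return counts


counts = co_assignment_counts(edges)
for pair, n in counts.most_common():
    print(pair, n)
```

Pairs with high counts are works that instructors tend to assign together, which is the signal a node-embedding model would then compress into vectors.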
The new version is a pretty big upgrade from before, both in terms of the size of the slice of the underlying citation graph that we’re operating on, and the capabilities of the front-end plot viewer. The plot now contains the 1,138,841 most frequently assigned books and articles in the dataset (up from 160k before) and shows 500,000 points on the screen at once (up from 30k before).
Under the hood, this is a pretty straightforward transformation of the raw citation graph that comes out of the OS data pipeline. The citation extractor identifies references to books and articles in the syllabus, which can take a few different forms – lists of required books, week-by-week reading assignments, bibliographies, etc. E.g., from “Statistical
Olga Tokarczuk won the Nobel Prize in Literature in 2018. She appears on 22 syllabi in the OS dataset. Peter Handke won in 2019 and appears on 221. Louise Glück, who won this past September, appears on 91. These are low numbers (even assuming, in Glück’s case, that we structurally undercount poetry, which we probably do). None of these authors are widely taught. Curious, I spent some time exploring the place of Nobel Prize-winners in the curriculum. The results are pretty striking. Here are the past forty Literature winners.
I’ve struggled somewhat to make generalizations here. There is clearly a lot of variation in how often the prize winners are taught, ranging from the ubiquitous Toni Morrison to a whole raft of writers who are almost never assigned. The notions of literary reputation and value that animate the Nobel committee appear to have little connection to the judgements that faculty make in assigning texts. Nor, by all appearances, is winning a prize a guarantee of more teaching attention.
For Nobel watchers, this probably isn’t a surprise. Generations of commentators have written about the byzantine politics of literary reputation and influence that shape the prize, about its varieties of regional and gender bias and more recent politics of outreach, and about the diverse uses of the prize for social and political commentary. There are endless arguments about the uneven judgement of the committee, focused mostly on the literary giants left unrecognized and the winners who were (and remain) obscure. (For an entertaining, polemical
Recently we’ve been exploring the place of ‘non-traditional’ materials in the curriculum: newspaper and magazine articles, TV and radio episodes, podcasts, blogs, and so on. Such materials are, of course, both very common on syllabi and largely invisible to traditional approaches to curricular design. They have been invisible in Open Syllabus, too, which relies on library catalogs to describe the range of titles that we can search for in the syllabus collection. Over the summer we decided to address this by extracting URLs in the collection, walking them back to their sources, and filtering for work used in instructional contexts. We now have a very interesting catalog of non-traditional classroom materials.
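As a rough illustration of the first step, here is a minimal sketch of pulling URLs out of syllabus text and grouping them by source. It assumes that a “source” can be approximated by the URL’s host name. The regular expression, the helper names, and the sample texts are hypothetical, and the later filtering for instructional use is not shown.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Simplified pattern (an assumption): http(s) URLs up to whitespace or a closing delimiter.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def extract_urls(text):
    """Return the URLs found in text, with trailing sentence punctuation removed."""
    return [match.rstrip(".,;:") for match in URL_PATTERN.findall(text)]


def source_of(url):
    """Approximate the source of a URL by its host name, without a leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def count_sources(syllabi):
    """Count the number of syllabi that link to each source at least once."""
    counts = Counter()
    for text in syllabi:
        sources = {source_of(url) for url in extract_urls(text)}
        counts.update(sources - {""})
    return counts


# Hypothetical syllabus snippets using placeholder domains.
syllabi = [
    "Watch https://www.example.org/films/episode-1 before class.",
    "Readings: http://example.org/a, https://news.example.com/story.",
]
source_counts = count_sources(syllabi)
print(source_counts.most_common())
```

Counting each source once per syllabus, rather than once per link, keeps a single link-heavy document from dominating the totals.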
This is a big topic that will probably be the subject of several posts, but let’s look at some fun stuff first: TV/radio shows and podcasts that do serious long-form exploration of topics. Many of these are assigned with some frequency on syllabi — though we suspect more often as supplements to conventional assignments than as primary materials. It would be interesting to test this via a deeper dive into the documents. In any case, there is some standout programming that has a strong presence in the classroom. For example, Frontline:
Open Access (OA) monographs and Open Educational Resource (OER) textbooks are works that are ‘openly licensed’ — that is, they can be used and distributed for free. In a world of $200 textbooks, OA/OER plays a fairly high-profile role in efforts to reduce the cost of education.
But free circulation makes it difficult to track classroom adoption, which in turn makes it difficult to understand the shape of demand for OA/OER work – either overall or with respect to particular subjects. The link between supply and demand established in the commercial book market by a sale doesn’t exist in the OA/OER world. Our thought is that this delinking is one reason – and maybe a significant reason – for the relatively low rate of adoption of OA/OER in teaching, despite over a decade of efforts. It’s still too hard to characterize demand for these titles to faculty, curricular designers, publishers, and investors. It’s hard to tell what’s popular and what’s been effectively adopted in peer institutions.
So we’re eager to see what happens when we partially close this information loop by measuring demand via syllabi. Here’s a normalized US trendline for OA/OER adoption based on the OS collection (drawing on catalog information from the Open Textbook Library and the Directory of Open Access Books). It shows rapid OER textbook growth in recent years – but from a very low baseline. In 2017, roughly 1 in 300 classes used OER textbooks and around 1 in 400 assigned an OA monograph (the lighter blue is for textbooks; darker for