


February 10, 2021, by David McClure 

Today we’re excited to release a big update to the Galaxy visualization, an interactive UMAP plot of graph embeddings of books and articles assigned in the Open Syllabus corpus! (This is using the new v2.5 release of the underlying dataset, which also comes out today.) The Galaxy is an attempt to give a 10,000-meter view of the “co-assignment” patterns in the OS data – basically, which books and articles are assigned together in the same courses. By training node embeddings on the citation graph formed from (syllabus, book/article) edges, we can get really high-quality representations of books and articles that capture the ways in which professional instructors use them in the classroom – the types of courses they’re assigned in, the other books they’re paired with, etc.
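To make the idea of "co-assignment" concrete, here is a minimal sketch that counts how often two works appear on the same syllabus, starting from (syllabus, work) edges. It is not the embedding pipeline behind the Galaxy. The function name, the edge format, and the toy data are assumptions made for illustration only.

```python
from collections import Counter, defaultdict
from itertools import combinations


def co_assignment_counts(edges):
    """Count how often each pair of works shares a syllabus.

    `edges` is an iterable of (syllabus_id, work_id) pairs; this format
    is a hypothetical stand-in for the real citation graph.
    """
    works_by_syllabus = defaultdict(set)
    for syllabus_id, work_id in edges:
        works_by_syllabus[syllabus_id].add(work_id)

    counts = Counter()
    for works in works_by_syllabus.values():
        # Sort so each unordered pair always has the same key.
        for a, b in combinations(sorted(works), 2):
            counts[(a, b)] += 1
    return counts


# Toy data, invented for the example.
edges = [
    ("s1", "Beloved"),
    ("s1", "Disgrace"),
    ("s2", "Beloved"),
    ("s2", "Disgrace"),
    ("s2", "The Tin Drum"),
]
counts = co_assignment_counts(edges)
print(counts.most_common(1))
```

Pair counts like these are one simple view of the graph; node embeddings compress the same kind of signal into vectors that can then be projected into two dimensions.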

The new version is a pretty big upgrade from before, both in terms of the size of the slice of the underlying citation graph that we’re operating on, and the capabilities of the front-end plot viewer. The plot now contains the 1,138,841 most frequently-assigned books and articles in the dataset (up from 160k before) and shows 500,000 points on the screen at once (up from 30k before).

February 10, 2021, by Joe Karaganis 

Today we’re releasing a big update to Open Syllabus data and websites. Here’s a rundown:

The Co-Assignment Galaxy

The Galaxy has received a massive upgrade in scale and functionality. The previous version mapped 164,000 titles and could display 30,000 at a time. The new version maps 1.1 million titles and can display 500,000 at a time. The resolution of fields and subfields is vastly improved as a result.

The Galaxy also implements a much-requested ‘search by topic’ function, which searches against the full text of syllabi rather than titles and authors–though you can still do that too. Results are now heat mapped to help users zoom in on areas of interest. David McClure has written up a detailed technical post on the new Galaxy for those who want a look under the hood.

OER Metrics

OER Metrics is a new subsite for investigating trends and adoption patterns for openly-licensed books and textbooks (i.e., Open Educational Resources). It provides the first tools for mapping the demand side of the OER ecosystem and–we hope–can help inform adoption decisions by instructors and programs and investment decisions by authors, publishers, and funders.

Link Lab

Link Lab is an exploration of ‘non-traditional’ teaching materials in the collection identified by URLs in the syllabi. These links are then walked back to their source to collect titles, authors, and other metadata. Link Lab picks up newspaper and magazine stories, videos and documentaries, blog posts, and other materials that are frequently taught but rarely recognized or curated as teaching materials. We are working on integrating URL identification into the main dataset. In the meantime, the preliminary data is presented here as a Lab.

The 2.5 Dataset

The Syllabus Explorer, Galaxy, and other services are now using version 2.5 of the OS dataset, which represents a big improvement over the previous 2.0 version. Among the highlights:

It adds 2018 data, bringing the total to 7.2 million syllabi.

This number reflects significant overall growth due to better collecting techniques, offset in part by better deduplication techniques. Because the former outweighed the latter, we saw a net gain of 500K syllabi through 2017 (the period of the 2.0 collection).

The 2.5 collection has a much larger reference catalog that enables the identification of many more titles: 4.6 million compared to 1.7 million in 2.0.

And it does a better job of identifying dates and fields.

The result is a richer and more accurate portrait of the curriculum of higher education. Is it perfect? No. Spend some time browsing the data and you will find errors. But it is bigger and better–and we hope more useful and interesting to faculty, students, lifelong learners, and bibliophiles of all kinds.

November 13, 2020, by Joe Karaganis 

Olga Tokarczuk won the Nobel Prize in Literature in 2018. She appears on 22 syllabi in the OS dataset. Peter Handke won in 2019 and appears on 221. Louise Glück, who won this past October, appears on 91. These are low numbers (even assuming, in Glück’s case, that we structurally undercount poetry, which we probably do). None of these authors are widely taught. Curious, I spent some time exploring the place of Nobel Prize-winners in the curriculum. The results are pretty striking. Here are the past forty Literature winners.

I’ve struggled somewhat to make generalizations here. There is clearly a lot of variation in how often the prize winners are taught, ranging from the ubiquitous Toni Morrison to a whole raft of writers who are almost never assigned. The notions of literary reputation and value that animate the Nobel committee appear to have little connection to the judgements that faculty make in assigning texts. Nor–by all appearances–is winning a prize a guarantee of more teaching attention.

For Nobel watchers, this probably isn’t a surprise. Generations of commentators have written about the byzantine politics of literary reputation and influence that shape the prize, about its varieties of regional and gender bias and more recent politics of outreach, and about the diverse uses of the prize for social and political commentary. There are endless arguments about the uneven judgement of the committee, focused mostly on the literary giants left unrecognized and the winners who were (and remain) obscure. (For an entertaining, polemical summary, see Myer 2007.)

These factors surely play some role in the different teaching fates of the winners, but also lead into a thicket of subjective critique. We can perhaps tease out a couple of simpler patterns.

Putting aside Morrison’s outlier popularity for a moment, the Anglo-American winners–Ishiguro, Munro, Lessing, Pinter, Heaney, Golding–are pretty well and consistently represented in teaching. Counts between 1000 and 2000 put them in the company of canonical writers like Livy (1800 appearances), Carlyle (1850), Proust (1594), and Alcott (1612), though not among the highest scorers that all literature majors and many other students will encounter at least once in their studies.

Non-British European writers, in contrast, account for a lot of Nobel Prizes but are nearly invisible in the curriculum. Only Günter Grass (the 1999 winner) cracks 500 appearances, and his most assigned title, The Tin Drum, appears only 211 times. None of the other continental European winners–Modiano, Tranströmer, Müller, Le Clézio, Jelinek, Saramago, Fo, Cela, Simon, Seifert, and so on–are likely to be encountered outside the rare regional or country-focused literature class. The two Chinese winners are also rarely taught.

One simple explanation is that the Nobel Prize committee works with a concept of ‘world’ literary culture that has no equivalent in university teaching. Most teaching continues to pass through national traditions, which in our mostly US-based sample favors British and American writers and pushes the study of most other literatures to the edges of the curriculum.

There is, nonetheless, a strong framework for cross-cultural literary comparison in the Nobel results. The ascendancy of post-colonial literary studies is visible throughout, from Coetzee’s 3675 appearances, to Naipaul’s 1375 and Walcott’s 1353; from Vargas Llosa’s 1073 to Paz’s 1012, Soyinka’s 1563, and García Márquez’s 2342. And arguably through Morrison’s 10,414. The most visible division in the results is between authors who fit easily within a successfully-institutionalized post-colonial teaching enterprise and those (mostly continental European authors) who don’t.

Obviously a handful of prizes provides a limited view of this topic. But there is the beginning of a story here about competing concepts of literary value, their different forms of institutionalization (in the case of post-colonial studies, across thousands of classes), and the zero-sum nature of the choices they create.

October 30, 2020, by Joe Karaganis 

Recently we’ve been exploring the place of ‘non-traditional’ materials in the curriculum: newspaper and magazine articles, TV and radio episodes, podcasts, blogs, and so on. Such materials are, of course, both very common on syllabi and largely invisible to traditional approaches to curricular design. They have been invisible in Open Syllabus, too, which relies on library catalogs to describe the range of titles that we can search for in the syllabus collection. Over the summer we decided to address this by extracting URLs in the collection, walking them back to their sources, and filtering for work used in instructional contexts. We now have a very interesting catalog of non-traditional classroom materials.
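The first step described above, pulling URLs out of syllabus text, can be sketched roughly as follows. This is an illustrative approximation, not the production pipeline: the regular expression, the function names, and the sample text are all assumptions, and the later steps (walking links back to their sources and filtering for instructional use) are not shown.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Deliberately simple pattern; real syllabi will need more careful handling.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def extract_urls(text):
    """Return URLs found in text, with trailing punctuation trimmed."""
    return [url.rstrip(".,;:") for url in URL_PATTERN.findall(text)]


def count_domains(syllabi):
    """Count the number of syllabi that link to each domain."""
    counts = Counter()
    for text in syllabi:
        # Use a set so a domain counts once per syllabus.
        domains = {urlparse(url).netloc.lower() for url in extract_urls(text)}
        counts.update(domains)
    return counts


# Invented sample text using placeholder domains.
syllabi = [
    "Listen to http://example.org/episode/1, then read (https://example.com/story).",
    "Week 3: watch https://example.org/film/2 before class.",
]
domain_counts = count_domains(syllabi)
print(domain_counts.most_common())
```

Counting per syllabus rather than per link keeps a single document with many links to one site from dominating the totals.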

This is a big topic that will probably be the subject of several posts, but let’s look at some fun stuff first: TV/radio shows and podcasts that do serious long-form exploration of topics. Many of these are assigned with some frequency on syllabi — though we suspect more often as supplements to conventional assignments than as primary materials. It would be interesting to test this via a deeper dive into the documents. In any case, there is some standout programming that has a strong presence in the classroom. For example, Frontline:

“Spying on the Home Front,” “On Our Watch” (which is about the genocide in Darfur), and “Ghosts of Rwanda” play central roles in the teaching of their respective topics. For comparison, the other major Rwanda titles–Prunier’s The Rwanda Crisis, Power’s Bystanders to Genocide, and Kuperman’s Rwanda in Retrospect–are assigned 290, 237, and 201 times respectively.

“This American Life” episodes are also widely assigned. ‘The Giant Pool of Money’ won a Peabody Award and plays a significant role in teaching the financial crisis. ‘The Problem We All Live With’ is about the fate of school desegregation.

Last Week Tonight has some frequently assigned episodes–notably one on bad journalistic coverage of scientific studies.

Radiolab has one strong performer: an episode on the mind-body relationship called ‘Where Am I?’

There is more to explore, and it will be a while yet before this data is integrated into the larger OS dataset. Next stop: magazines and newspapers.

October 6, 2020, by Joe Karaganis 

Open Access (OA) monographs and Open Educational Resource (OER) textbooks are works that are ‘openly licensed’ — that is, they can be used and distributed for free. In a world of $200 textbooks, OA/OER plays a fairly high-profile role in efforts to reduce the cost of education.

But free circulation makes it difficult to track classroom adoption, which in turn makes it difficult to understand the shape of demand for OA/OER work–either overall or with respect to particular subjects. The link between supply and demand established in the commercial book market by a sale doesn’t exist in the OA/OER world. Our thought is that this delinking is one reason–and maybe a significant reason–for the relatively low rate of adoption of OA/OER in teaching, despite over a decade of efforts. It’s still too hard to characterize demand for these titles to faculty, curricular designers, publishers, and investors. It’s hard to tell what’s popular and what’s been effectively adopted in peer institutions.

So we’re eager to see what happens when we partially close this information loop by measuring demand via syllabi. Here’s a normalized US trendline for OA/OER adoption based on the OS collection (drawing on catalog information from the Open Textbook Library and the Directory of Open Access Books). It shows rapid OER textbook growth in recent years–but from a very low baseline. In 2017, roughly 1 in 300 classes used OER textbooks and around 1 in 400 assigned an OA monograph (the lighter blue is for textbooks; darker for monographs).

That’s across all ‘US syllabi with citations,’ which requires caveats since some types of classes don’t have assigned materials.
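As a rough illustration of what "normalized" means here, the sketch below divides the number of syllabi citing at least one OER title by the number of syllabi with any citations. The function names and the counts are hypothetical, chosen only so that the result matches the "1 in 300" figure quoted above; they are not actual OS data.

```python
def adoption_rate(syllabi_with_oer, syllabi_with_citations):
    """Share of citing syllabi that assign at least one OER title."""
    if syllabi_with_citations == 0:
        return 0.0
    return syllabi_with_oer / syllabi_with_citations


def one_in_n(rate):
    """Express a rate as '1 in N', rounded to the nearest whole number."""
    return round(1 / rate)


# Hypothetical counts for a single year, invented for the example.
rate = adoption_rate(syllabi_with_oer=1_000, syllabi_with_citations=300_000)
print(f"{rate:.2%} of syllabi with citations, or about 1 in {one_in_n(rate)}")
```

Using syllabi with citations as the denominator keeps classes without assigned materials from pulling the rate down.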

The availability of good OER materials also varies significantly by field and topic. Studies that have focused narrowly on classes for which there are good OER equivalents find significantly higher rates of adoption–around 6% in this 2018 study. Recent surveys of OER use also suggest higher and rising numbers. We can’t zero in on specific classes but we can explore differences between fields. In Math–a field with a number of widely used OER textbooks–we put US adoption at 1.5% of ‘syllabi with citations’ in 2017 (and climbing).

There are other ways to slice this data. The chart below focuses on US 2-year colleges. The data is choppier (we may drop the early 2000s from charts, which are noisy when normalized) but pretty clearly shows sharp recent growth in textbook adoption–and virtually no role for OA monographs, which are typically more advanced scholarly work.

There is a decent case, in other words, that OER is at a takeoff point in higher ed–though still at very low levels of adoption across the curriculum. These charts come from a dashboard that we’ll publish in a couple of weeks. Then you can explore.